Neural Network basics - Artificial Intelligence using AutoHotkey!

05 May 2018, 01:39

Well, if we really wanted to program neural networks in AHK, we would probably need to implement a framework like the one iseahound mentioned.
And that seems like a lot of work for a single person.

blue83 · 15 May 2018, 01:32

Hi, I have one more question.

Here are examples of how we can use AI when we use some conditions to receive something back.

Since AHK is a scripting language for automating tasks in Windows, my question is whether something can be done about predicting our moves, clicks, etc.

I mean that the UI could learn how we use Windows, and if something happens that our script does not cover, the UI could learn and act accordingly in that new situation.

Also, there is the issue of unstructured data.

Thanks

16 May 2018, 15:40

blue83 wrote: how can we use AI if we use some conditions to receive something back.

(..) my question is whether something can be done about predicting our moves, clicks, etc.

I mean that the UI could learn how we use Windows, and if something happens that our script does not cover, the UI could learn and act accordingly in that new situation.

In my opinion, something like that is certainly possible, although it will most likely take the form of a full project, and its scope must be planned carefully. In this video, a neural network was trained to play a certain game (Mario Kart) based on the player's inputs in relation to the positions of the objects on screen. The key point is that the network does not even know what winning a race is, yet it was able to win an entire cup by attempting to predict and mimic the player's movements in each new situation based on previous data.

To this end, the scope of the project must be very well planned. A network that predicts any possible human action on a computer will certainly be too costly for any practical implementation, but a network that decides when a pop-up window is probably going to be closed immediately by the user, and then closes it for the user, is much more feasible.

It is also important to realise that some tasks are done more efficiently by ordinary programming. Opening certain programs on boot, based on how likely the user is to open each program as soon as the PC starts, can be achieved with simpler statistical rules than a neural network.

Prediction accuracy demands are VERY important as well. Neural networks are unlikely to be 100% precise in their judgements. If an accuracy of 95-98% is too troublesome (e.g., there is a certain pop-up window that should never ever be closed, such as an online test that counts each session as an attempt), I would suggest considering the project too complicated, unless previous experience tells you otherwise or some form of "easy undo" routine is present (e.g., suppose that the flagged pop-ups are not closed immediately, but rather hidden for some seconds before closing, and the user can unhide them during this time).
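
As a rough illustration of the boot-time case mentioned above, a plain frequency rule might be enough. This is only a sketch: the function name ShouldAutoLaunch, the 0.9 threshold and the counts are all made up for the example, and a real script would have to log the launches itself.

```autohotkey
; Hypothetical helper: decide from simple counts whether a program should be auto-started.
; launchesAfterBoot = number of boots on which the user opened the program right after startup
; totalBoots        = number of boots observed so far
; threshold         = minimum observed frequency (0..1) needed to auto-start (assumed value)
ShouldAutoLaunch(launchesAfterBoot, totalBoots, threshold := 0.9)
{
	if (totalBoots <= 0)
		return false ; no data yet, so do nothing
	return (launchesAfterBoot / totalBoots) >= threshold
}

; Made-up counts for illustration only.
MsgBox % "Editor (19 of 20 boots): " ShouldAutoLaunch(19, 20) "`nCalculator (3 of 20 boots): " ShouldAutoLaunch(3, 20)
```

With these counts the first call returns 1 and the second returns 0, without any training at all.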

blue83 · 17 May 2018, 03:18

Hi Gio,

Thank you for your answer and clarification.

29 May 2018, 12:33

Just watched an excellent YouTube video by SciShow that deals with the hardships of developing self-driving cars. It is a perfect example of how overly complicated abstract models will still require a lot of programmers' work for years to come. I guess it is pretty safe to say this is a great opportunity.

Joe Glines · 01 Jun 2018, 04:37

@Gio- Thanks for sharing that video! Nice to know humans are safe from extinction for a while... lol

16 Jun 2018, 15:04

I worked a little on Speedmaster's example grid for section 2 of your tutorial.
Here is the updated version:
-On clicking the VALIDATION CASE the resulting values will be updated automatically
-extracted the network creation and training code and put it into a class
-added a second output
-clicking calculate ANN won't reset the neurons
-added bias to each neuron

Code: Select all

; Neural Network basics tutorial (section 2)
; tutorial by Gio 
; interactive display grid by Speedmaster
; modified by nnnik to make things easier
; topic: https://autohotkey.com//boards/viewtopic.php?f=7&t=42420
class HiddenLayerNetwork {
	
	__New( inputCount, neuronCount, outputCount ) {
		This.inputCount  := inputCount
		This.neuronCount := neuronCount
		This.outputCount := outputCount
		This.hiddenLayer := []
		This.outputLayer := []
		This.resetNeurons()
	}
	
	train( dataSet, iterations, callBack := "" ) {
		splitData := {}
		for each, data in dataSet {
			If ( data.MaxIndex() != This.inputCount + This.outputCount )
				Throw exception("bad data count at index " each, -1)
			i := [data.clone()]
			i.1.removeAt(This.inputCount + 1, This.outputCount)
			i.1.push(1)
			itrans := TRANSPOSE_MATRIX(i)
			o := [data.clone()]
			o.1.removeAt(1, This.inputCount)
			splitData.Push({input:i, output:o, trans:itrans})
		}
		hiddenLayer := TRANSPOSE_MATRIX( This.hiddenLayer )
		outputLayer := TRANSPOSE_MATRIX( This.outputLayer )
		Loop %iterations% {
			for each, data in splitData {
				HL := SIGMOID_OF_MATRIX(MULTIPLY_MATRICES(data.input, hiddenLayer))
				HL.1.push(1)
				output := SIGMOID_OF_MATRIX(MULTIPLY_MATRICES(HL, outputLayer))
				
				outputError := DEDUCT_MATRICES(data.output, output)
				outputDelta := MULTIPLY_MEMBER_BY_MEMBER(outputError, DERIVATIVE_OF_SIGMOID_OF_MATRIX(output))
				outputAdjust := MULTIPLY_MATRICES(TRANSPOSE_MATRIX(HL), outputDelta)
				
				HLError := MULTIPLY_MATRICES(outputError, TRANSPOSE_MATRIX(outputLayer)) ; use the weights currently being trained, not the stale This.outputLayer
				HL.1.Pop()
				HLError.1.Pop()
				HLDelta := MULTIPLY_MEMBER_BY_MEMBER(HLError, DERIVATIVE_OF_SIGMOID_OF_MATRIX(HL))
				HLAdjust := MULTIPLY_MATRICES(data.trans, HLDelta)
				
				hiddenLayer := ADD_MATRICES(hiddenLayer, HLAdjust)
				outputLayer := ADD_MATRICES(outputLayer, outputAdjust)
				
				totalError := averageArray(outputError.1)
			}
		} Until %callBack%(A_Index, totalError, iterations)
		This.hiddenLayer := TRANSPOSE_MATRIX( hiddenLayer )
		This.outputLayer := TRANSPOSE_MATRIX( outputLayer )
	}
	
	calculate( input ) {
		input := [input.clone()]
		input.1.push(1)
		HL := SIGMOID_OF_MATRIX(MULTIPLY_MATRICES(input, TRANSPOSE_MATRIX( This.hiddenLayer )))
		HL.1.push(1)
		return SIGMOID_OF_MATRIX(MULTIPLY_MATRICES(HL, TRANSPOSE_MATRIX( This.outputLayer))).1
	}
	
	resetNeurons() {
		Loop % This.neuronCount {
			neuronNr := A_Index
			Loop % This.inputCount + 1 { ;inputs + bias
				Random, weight, -1.0, 1.0
				This.hiddenLayer[neuronNr, A_Index] := weight
			}
		}
		Loop % This.outputCount {
			outputNr := A_Index
			Loop % This.neuronCount + 1 {
				Random, weight, -1.0, 1.0
				This.outputLayer[outputNr, A_Index] := weight
			}
		}
		
	}
}



;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

; The function below applies a sigmoid function to a single value and returns the results.
Sigmoid(x)
{
	return  1 / (1 + exp(-1 * x))
}

; The function below returns the derivative of the sigmoid function for a single value. Note that x must already be a sigmoid output, i.e. x = Sigmoid(z), so that the derivative at z equals x * (1 - x).
Derivative(x)
{
	Return x * (1 - x)
}


; The function below applies the sigmoid function to all the members of a matrix and returns the results as a new matrix.
SIGMOID_OF_MATRIX(A)
{
	RESULT_MATRIX := Object()
	Loop % A.MaxIndex()
	{
		CURRENT_ROW := A_Index
		Loop % A[1].MaxIndex()
		{
			CURRENT_COLUMN := A_Index
			RESULT_MATRIX[CURRENT_ROW, CURRENT_COLUMN] := 1 / (1 + exp(-1 * A[CURRENT_ROW, CURRENT_COLUMN]))
		}
	}
	Return RESULT_MATRIX
}


; The function below applies the derivative of the sigmoid function to all the members of a matrix and returns the results as a new matrix. The members of A must already be sigmoid outputs (see SIGMOID_OF_MATRIX).
DERIVATIVE_OF_SIGMOID_OF_MATRIX(A)
{
	resultingMatrix := {}
	For rowNr, rows in A
		For columnNr, value in rows
			resultingMatrix[rowNr, columnNr] := value * (1 - value)
	Return resultingMatrix
}


; The function below multiplies the individual members of two matrices with the same coordinates one by one (This is NOT equivalent to matrix multiplication).
MULTIPLY_MEMBER_BY_MEMBER(A,B)
{
	If ((A.MaxIndex() != B.MaxIndex()) OR (A[1].MaxIndex() != B[1].MaxIndex()))
		Throw exception("You cannot multiply matrices member by member unless both matrices are of the same size!", -1)
	resultingMatrix := {}
	For rowNr, rows in A
		For columnNr, value in rows
			resultingMatrix[rowNr, columnNr] := value * B[rowNr, columnNr]
	Return resultingMatrix
}


; The function below transposes a matrix. I.E.: Member[2,1] becomes Member[1,2]. Matrix dimensions ARE affected unless it is a square matrix.
TRANSPOSE_MATRIX(A)
{
	TRANSPOSED_MATRIX := Object()
	Loop % A.MaxIndex()
	{
		CURRENT_ROW := A_Index
		Loop % A[1].MaxIndex()
		{
			CURRENT_COLUMN := A_Index
			TRANSPOSED_MATRIX[CURRENT_COLUMN, CURRENT_ROW] := A[CURRENT_ROW, CURRENT_COLUMN]
		}
	}
	Return TRANSPOSED_MATRIX
}


; The function below adds a matrix to another.
ADD_MATRICES(A,B)
{
	If ((A.MaxIndex() != B.MaxIndex()) OR (A[1].MaxIndex() != B[1].MaxIndex()))
		Throw exception("You cannot add matrices member by member unless both matrices are of the same size!", -1)
	RESULT_MATRIX := Object()
	Loop % A.MaxIndex()
	{
		CURRENT_ROW := A_Index
		Loop % A[1].MaxIndex()
		{
			CURRENT_COLUMN := A_Index
			RESULT_MATRIX[CURRENT_ROW, CURRENT_COLUMN] := A[CURRENT_ROW,CURRENT_COLUMN] + B[CURRENT_ROW,CURRENT_COLUMN]
		}
	}
	Return RESULT_MATRIX
}


; The function below deducts a matrix from another.
DEDUCT_MATRICES(A,B)
{
	If ((A.MaxIndex() != B.MaxIndex()) OR (A[1].MaxIndex() != B[1].MaxIndex()))
		Throw exception("You cannot subtract matrices member by member unless both matrices are of the same size!", -1)
	RESULT_MATRIX := Object()
	Loop % A.MaxIndex()
	{
		CURRENT_ROW := A_Index
		Loop % A[1].MaxIndex()
		{
			CURRENT_COLUMN := A_Index
			RESULT_MATRIX[CURRENT_ROW, CURRENT_COLUMN] := A[CURRENT_ROW,CURRENT_COLUMN] - B[CURRENT_ROW,CURRENT_COLUMN]
		}
	}
	Return RESULT_MATRIX
}


; The function below multiplies two matrices according to matrix multiplication rules.
MULTIPLY_MATRICES(A,B)
{
	If (A[1].MaxIndex() != B.MaxIndex())
		Throw exception("Number of Columns in the first matrix must be equal to the number of rows in the second matrix.", -1)
	RESULT_MATRIX := Object()
	Loop % A.MaxIndex() ; Rows of A
	{
		CURRENT_ROW := A_Index
		Loop % B[1].MaxIndex() ; Cols of B
		{
			CURRENT_COLUMN := A_Index
			RESULT_MATRIX[CURRENT_ROW, CURRENT_COLUMN]  := 0
			Loop % A[1].MaxIndex()
			{
				RESULT_MATRIX[CURRENT_ROW, CURRENT_COLUMN] += A[CURRENT_ROW, A_Index] * B[A_Index, CURRENT_COLUMN]
			}
		}
	}
	Return RESULT_MATRIX
}

;returns the average of an array
averageArray(arr) {
	for each,  errorValue in arr
		totalError += errorValue
	return totalError / arr.length()
}

SetBatchLines, -1

gui, -dpiscale
gui, font, s12
gui, add, text, cblack x20 y20, (click a cell to change its value)
gui, add, text, cred x20 y60, Training rules
gui, add, text, cgreen x280 yp, Expected output


;create a display grid
network := new HiddenLayerNetwork(3, 8, 2)
rows:=8, cols:=5, cllw:=100, cllh:=30,
clln:="", cmj:=0
gpx:=20, gpy:=90, wsp:=-1, hsp:=-1,
r:=0,  c:=0 , opt:="0x200 center BackGroundTrans border gclickcell "
While r++ < rows {
     while c++ < cols{
          gui 1: add, text, % opt " w"cllw " h"cllh " v" ((cmj) ? (clln c "_" r " Hwnd" clln  c "_" r):(clln r "_" c " Hwnd" clln  r "_" c)) ((c=1 && r=1) ? " x"gpx " y"gpy " section"
               : (c!=1 && r=1) ? " x+"wsp " yp" : (c=1 && r!=1) ? " xs" " y+"hsp : " x+"wsp " yp"),
       } c:=0
} r:=0, c:=0


gui, add, text, cblue x20 y+1, Validation case
gui, add, text, cpurple x310 yp, Final solution

gui, font, s12
gui, add, button,  h30 x155 gcalculate, ANN calculation

gui, add, text, cblack x20 y+10 vProgr, Training Progress:`t`t`t`t
gui, add, text, cblack x20 y+10 vError, Total Error:`t`t`t`t`t`t`t

gui, show,, Neural Network (SECTION 2)

TRAINING_INPUTS := array([0, 0, 1], [0, 1, 1], [1, 0, 1], [0, 1, 0], [1, 0, 0], [1, 1, 1], [0, 0, 0]) ; We will also feed the net creator code with the values of the inputs in the training samples (all organized in a matrix too). MATRIX 7 x 3.
EXPECTED_OUTPUTS := Array([0, 1],[1, 1],[1, 1],[1, 0],[1, 0],[0, 1],[0, 0]) ; And we will also provide the net creator with the expected answers to our training samples so that the net creator can properly train the net.
VALIDATION_CASE := Array([1,1,0])

drawTRAINING(TRAINING_INPUTS)
drawEXPECTED(EXPECTED_OUTPUTS)
drawValidation(VALIDATION_CASE)
return

PREPARATION_STEPS:
calculate:
dataSet := TRAINING_INPUTS.clone()
for each, row in dataSet {
	newRow := row.clone()
	newRow.Push(EXPECTED_OUTPUTS[each]*)
	dataSet[each] := newRow
}
network.train(dataSet, 10000, "displayProgress")
gosub, calculateValidation
drawValidation(VALIDATION_CASE)
return

calculateValidation:
input := VALIDATION_CASE.1.clone()
if (input.length() > 3)
	input.removeAt(4, input.length() - 3)
output := network.calculate(input)
input.push(output*)
VALIDATION_CASE.1 := input
return


clickcell:
if (debug)
	msgbox, % a_guicontrol

s:=strsplit(a_guicontrol, "_")

if !(s.1=rows) && !(s.2>=cols-1)  {
	TRAINING_INPUTS[s.1,s.2] := !TRAINING_INPUTS[s.1,s.2]
	drawTRAINING(TRAINING_INPUTS)
}

if !(s.1=rows) && (s.2>=cols-1)  {
	EXPECTED_OUTPUTS[s.1,s.2-3] := !EXPECTED_OUTPUTS[s.1,s.2-3]
	drawEXPECTED(EXPECTED_OUTPUTS)
}

if (s.1=rows) {
	if (s.2=cols)
		return
	VALIDATION_CASE[1,s.2] := !VALIDATION_CASE[1,s.2]
	if ( VALIDATION_CASE.1.4 != "" )
		gosub, calculateValidation
} else {
	VALIDATION_CASE.1.4 := ""
	VALIDATION_CASE.1.5 := ""
}
drawVALIDATION(VALIDATION_CASE)
return

drawTRAINING(array){
 for i, r in array
 for j, c in r
 drawchar( i "_" j , c, color:="red")
}


drawEXPECTED(array){
 global cols
 for i, r in array
 for j, c in r
 drawchar( i "_" 3 + j , c, color:="green")
}


drawValidation(array){
 global rows
 for i, r in array
 for j, c in r
 drawchar( rows "_" j , c, color:="blue")
}


drawchar(varname, chartodraw:="@", color:=""){
 guicontrol,, %varname%, %chartodraw%
 if color
 colorcell(varname, color)
}

ColorCell(cell_to_paint, color:="red"){
 GuiControl, +c%color%  , %cell_to_paint%
 GuiControl, MoveDraw, % cell_to_paint
}

CellFont(cell, params:="", fontname:="") {
 Gui, Font, %params%, %fontname% 
 GuiControl, font , %cell% 
 guicontrol, movedraw, %cell%
}

displayProgress( iterationNr, totalError, maxIterations ) {
	static lastIterationNr := -100000
	if ( iterationNr < lastIterationNr ) {
		lastIterationNr := iterationNr - 600
	}
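	if ( iterationNr >= lastIterationNr + 600 || iterationNr = maxIterations ) {
		lastIterationNr := iterationNr ; remember this update so the GUI is only refreshed every 600 iterations (and on the last one)
		GUIControl , , Progr, % "Training Progress:	" . Format( "{:u}%", iterationNr / maxIterations * 100 )
		GUIControl , , Error, % "Total Error:		" . Format( "{:}%", totalError )
	}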
}

~f7::
debug:=!debug
return

guiclose: 
esc:: 
exitapp 
return

18 Jun 2018, 10:08

nnnik wrote: I worked a little on Speedmaster's example grid for section 2 of your tutorial.
Here is the updated version:
-On clicking the VALIDATION CASE the resulting values will be updated automatically
-extracted the network creation and training code and put it into a class
-added a second output
-clicking calculate ANN won't reset the neurons
-added bias to each neuron

Excellent, nnnik, the new class style is very nice.

I will link your version in the tutorial as well.
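
To illustrate what the "added bias to each neuron" change does, here is a minimal single-neuron sketch. The function name NeuronOutput and the weights are invented for the example; the only thing taken from nnnik's class is the idea of appending a constant 1 to the inputs so that the last weight acts as the bias.

```autohotkey
; Hypothetical single neuron: the last entry of `weights` is the bias weight.
NeuronOutput(inputs, weights)
{
	row := inputs.Clone()
	row.Push(1) ; constant input that the bias weight is multiplied by
	sum := 0
	for index, value in row
		sum += value * weights[index]
	return 1 / (1 + Exp(-sum)) ; sigmoid activation
}

noBias   := NeuronOutput([0, 0], [0.5, -0.3, 0]) ; bias weight 0: sigmoid(0) = 0.5 whatever the other weights are
withBias := NeuronOutput([0, 0], [0.5, -0.3, 2]) ; bias weight 2: sigmoid(2), roughly 0.88
MsgBox % "Without bias: " noBias "`nWith bias: " withBias
```

Without a bias, an all-zero input always produces 0.5, so a rule such as [0, 0, 0] -> [0, 0] in the training grid could never be matched closely; the extra weight lets the neuron shift its output for that case.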

18 Jun 2018, 11:22

There is a bug in it though.
I tried training a deeper neural network using a part of the MNIST database (28x28 inputs, 3 hidden layers with 16 neurons each, 10 outputs):
http://yann.lecun.com/exdb/mnist/
However, the performance is abysmal. I think I might update the entire library to use a Matrix class which uses binary data storage and MCode to do the calculations.
Additionally, splitting the input and output inside the train method itself is unnecessary. I think I will just accept 2 parameters and possibly also allow for binary data there.
Although the script was still at 0% progress after running for a night, it was already down to 16.6% total error.
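
The post does not say how the planned Matrix class would store its data. One possible reading of "binary data storage" is to keep all members as 8-byte doubles in a single buffer instead of nested AHK objects. The sketch below only shows that storage idea; the helper names MatCreate, MatSet and MatGet are hypothetical and this is not nnnik's implementation.

```autohotkey
; Hypothetical helpers: a matrix stored as rows*cols 8-byte doubles in one contiguous buffer.
MatCreate(ByRef buf, rows, cols)
{
	VarSetCapacity(buf, rows * cols * 8, 0) ; reserve the memory and fill it with zeros
}

MatSet(ByRef buf, cols, row, col, value)
{
	; row-major layout: member [row, col] lives at offset ((row-1)*cols + (col-1)) * 8
	NumPut(value, buf, ((row - 1) * cols + (col - 1)) * 8, "Double")
}

MatGet(ByRef buf, cols, row, col)
{
	return NumGet(buf, ((row - 1) * cols + (col - 1)) * 8, "Double")
}

rows := 2, cols := 3
MatCreate(weights, rows, cols)
MatSet(weights, cols, 2, 3, 0.25)
MsgBox % "Row 2, column 3: " MatGet(weights, cols, 2, 3) "`nRow 1, column 1: " MatGet(weights, cols, 1, 1)
```

A buffer like this can also be handed to machine code by address, which is presumably where the MCode part would come in.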

If anybody is interested:
The script expects 2 files from the MNIST database in its folder (see my link above).
The training images should be called "images" with no extension.
The training labels should be called "labels" with no extension.
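
For anyone wondering why the script below skips 16 bytes of "images" and 8 bytes of "labels": the MNIST files start with a header of big-endian 32-bit integers (magic number, item count and, for images, rows and columns). A small sketch that reads the image header instead of skipping it; the function name ReadBigEndianUInt is made up for this example.

```autohotkey
; Reads a 32-bit big-endian unsigned integer from an open File object.
ReadBigEndianUInt(file)
{
	value := 0
	Loop 4
		value := (value << 8) | file.ReadUChar() ; most significant byte comes first
	return value
}

images := FileOpen("images", "r")
if !IsObject(images)
{
	MsgBox % "Could not open the file 'images'."
	ExitApp
}
magic := ReadBigEndianUInt(images) ; 2051 for image files, 2049 for label files
count := ReadBigEndianUInt(images) ; number of images in the file
rows  := ReadBigEndianUInt(images)
cols  := ReadBigEndianUInt(images)
images.Close()
MsgBox % "Magic: " magic "`nImages: " count "`nSize: " rows "x" cols
```

After these four reads the file position is 16, which is where the pixel data begins. The label file only has the magic number and the count, hence the 8.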

Code: Select all

SetBatchLines, -1

gui, add, text, cblack x20 y+10 vProgr, Training Progress:`t`t`t`t
gui, add, text, cblack x20 y+10 vError, Total Error:`t`t`t`t`t`t`t
gui, show,, MNIST

stream1 := FileOpen("images", "r")
stream2 := FileOpen("labels", "r")
stream1.pos := 16
stream2.pos := 8

dataSet := []
lastIterationNr := 0
Loop 60000 {
	if ( A_Index >= lastIterationNr + 1000 ) {
		GUIControl , , Progr, % "Loading Progress:	" . Round( A_Index / 60000 * 100, 2 ) . "%"
		lastIterationNr := A_Index
	}
	dataSet.Push(data := [])
	Loop % 28 * 28
		data.push(stream1.readUChar()/128-1)
	out := [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
	out[stream2.readUChar()+1] := 1
	data.push(out*)
}

net := new DeepNeuralNetwork(28*28, 16, 3, 10)
net.train(dataSet, 1000, "displayProgress")
return

displayProgress( iterationNr, totalError, maxIterations ) {
		GUIControl , , Progr, % "Training Progress:	" . Round( iterationNr / maxIterations * 100, 2 ) . "%"
		GUIControl , , Error, % "Total Error:		" . Round( totalError * 100, 2 ) . "%"
}

class HiddenLayerNeuralNetwork {
	
	__New( inputCount, neuronCount, outputCount ) {
		This.inputCount  := inputCount
		This.neuronCount := neuronCount
		This.outputCount := outputCount
		This.hiddenLayer := []
		This.outputLayer := []
		This.resetNeurons()
	}
	
	train( dataSet, iterations, callBack := "" ) {
		splitData := {}
		for each, data in dataSet {
			If ( data.MaxIndex() != This.inputCount + This.outputCount )
				Throw exception("bad data count at index " each, -1)
			i := [data.clone()]
			i.1.removeAt(This.inputCount + 1, This.outputCount)
			i.1.push(1)
			itrans := TRANSPOSE_MATRIX(i)
			o := [data.clone()]
			o.1.removeAt(1, This.inputCount)
			splitData.Push({input:i, output:o, trans:itrans})
		}
		hiddenLayer := TRANSPOSE_MATRIX( This.hiddenLayer )
		outputLayer := TRANSPOSE_MATRIX( This.outputLayer )
		Loop %iterations% {
			totalError := 0
			for each, data in splitData {
				HL := SIGMOID_OF_MATRIX(MULTIPLY_MATRICES(data.input, hiddenLayer))
				HL.1.push(1)
				output := SIGMOID_OF_MATRIX(MULTIPLY_MATRICES(HL, outputLayer))
				
				outputError := DEDUCT_MATRICES(data.output, output)
				totalError += averageArray(outputError.1)				
				outputDelta := MULTIPLY_MEMBER_BY_MEMBER(outputError, DERIVATIVE_OF_SIGMOID_OF_MATRIX(output))
				outputAdjust := MULTIPLY_MATRICES(TRANSPOSE_MATRIX(HL), outputDelta)
				HLError := MULTIPLY_MATRICES(outputError, TRANSPOSE_MATRIX(outputLayer))
				HL.1.Pop()
				HLError.1.Pop()
				HLDelta := MULTIPLY_MEMBER_BY_MEMBER(HLError, DERIVATIVE_OF_SIGMOID_OF_MATRIX(HL))
				HLAdjust := MULTIPLY_MATRICES(data.trans, HLDelta)
				
				hiddenLayer := ADD_MATRICES(hiddenLayer, HLAdjust)
				outputLayer := ADD_MATRICES(outputLayer, outputAdjust)
			}
			totalError := totalError / splitData.length()
		} Until %callBack%(A_Index, totalError, iterations)
		This.hiddenLayer := TRANSPOSE_MATRIX( hiddenLayer )
		This.outputLayer := TRANSPOSE_MATRIX( outputLayer )
	}
	
	calculate( input ) {
		input := [input.clone()]
		input.1.push(1)
		HL := SIGMOID_OF_MATRIX(MULTIPLY_MATRICES(input, TRANSPOSE_MATRIX( This.hiddenLayer )))
		HL.1.push(1)
		return SIGMOID_OF_MATRIX(MULTIPLY_MATRICES(HL, TRANSPOSE_MATRIX( This.outputLayer))).1
	}
	
	resetNeurons() {
		Loop % This.neuronCount {
			neuronNr := A_Index
			Loop % This.inputCount + 1 { ;inputs + bias
				Random, weight, -1.0, 1.0
				This.hiddenLayer[neuronNr, A_Index] := weight
			}
		}
		Loop % This.outputCount {
			outputNr := A_Index
			Loop % This.neuronCount + 1 {
				Random, weight, -1.0, 1.0
				This.outputLayer[outputNr, A_Index] := weight
			}
		}
		
	}
}


class DeepNeuralNetwork {
	
	__New( inputCount, neuronCount, neuronDepth, outputCount ) {
		This.inputCount   := inputCount
		This.neuronCount  := neuronCount
		This.neuronDepth  := neuronDepth
		This.outputCount  := outputCount
		This.layers := []
		This.resetNeurons()
	}
	
	train( dataSet, iterations, callBack := "" ) {
		
		splitData := {}
		for each, data in dataSet {
			If ( data.MaxIndex() != This.inputCount + This.outputCount )
				Throw exception("bad data count at index " each, -1)
			i := [data.clone()]
			i.1.removeAt(This.inputCount + 1, This.outputCount)
			i.1.push(1)
			o := [data.clone()]
			o.1.removeAt(1, This.inputCount)
			splitData.Push({input:i, output:o})
		}
		
		layers := []
		for layerNr, layerMatrix in This.layers
			layers[layerNr] := TRANSPOSE_MATRIX( layerMatrix )
		
		Loop %iterations% {
			totalError := 0
			for each,  data in splitData {
				
			;calculate output
				layerIO := data.input
				outputs := {0:layerIO}
				for each, layer in layers {
					layerIO := SIGMOID_OF_MATRIX(MULTIPLY_MATRICES(layerIO, layer))
					layerIO.1.push(1)
					outputs.Push(layerIO)
				}
				
			;calculate error for output
				layerIO.1.pop()
				layerError := DEDUCT_MATRICES(data.output, layerIO)
				totalError += averageArray(layerError.1)
				layerDelta := MULTIPLY_MEMBER_BY_MEMBER(layerError, DERIVATIVE_OF_SIGMOID_OF_MATRIX(layerIO))
				layerAdjust := MULTIPLY_MATRICES(TRANSPOSE_MATRIX(outputs[outputs.maxIndex()-1]), layerDelta)
				
			;propagate the error backwards
				Loop % outputs.Length() - 1 {
					currentIndex := outputs.maxIndex() - A_Index
					previousIndex := currentIndex + 1
					layerError := MULTIPLY_MATRICES(layerError, TRANSPOSE_MATRIX(layers[previousIndex]))
					layers[previousIndex] := ADD_MATRICES(layers[previousIndex], layerAdjust)
					layerIO := outputs[currentIndex]
					layerIO.1.pop()
					layerError.1.pop()
					layerDelta := MULTIPLY_MEMBER_BY_MEMBER(layerError, DERIVATIVE_OF_SIGMOID_OF_MATRIX(layerIO))
					layerAdjust := MULTIPLY_MATRICES(TRANSPOSE_MATRIX(outputs[currentIndex-1]), layerDelta)
				}
				layers[currentIndex] := ADD_MATRICES(layers[currentIndex], layerAdjust)
			}
			totalError := totalError / splitData.length()
		} Until %callBack%(A_Index, totalError, iterations)
		for layerNr, layerMatrix in layers
			This.layers[layerNr] := TRANSPOSE_MATRIX( layerMatrix )
	}
	
	calculate( input ) {
		layerIO := [input.clone()]
		layerIO.1.push(1)
		for each, layer in This.layers {
			layerIO := SIGMOID_OF_MATRIX(MULTIPLY_MATRICES(layerIO, TRANSPOSE_MATRIX(layer)))
			layerIO.1.push(1)
		}
		layerIO.1.pop()
		return layerIO.1
	}
	
	resetNeurons() {
		Loop % This.neuronCount {
			neuronNr := A_Index
			Loop % This.inputCount + 1 {
				Random, weight, -1.0, 1.0
				This.layers[1, neuronNr, A_Index] := weight
			}
		}
		Loop % This.neuronDepth - 1 {
			layerNr := A_Index + 1
			Loop % This.neuronCount {
				neuronNr := A_Index
				Loop % This.neuronCount + 1 { ;inputs + bias
					Random, weight, -1.0, 1.0
					This.layers[layerNr, neuronNr, A_Index] := weight
				}
			}
		}
		Loop % This.outputCount {
			outputNr := A_Index
			Loop % This.neuronCount + 1 {
				Random, weight, -1.0, 1.0
				This.layers[This.neuronDepth+1, outputNr, A_Index] := weight
			}
		}
	}
}


; The function below applies the sigmoid function to all the members of a matrix and returns the results as a new matrix.
SIGMOID_OF_MATRIX(A)
{
	resultingMatrix := {}
	For rowNr, rows in A
		For columnNr, value in rows
			resultingMatrix[rowNr, columnNr] := 1 / (1 + exp(-1 * value))
	Return resultingMatrix
}


; The function below applies the derivative of the sigmoid function to all the members of a matrix and returns the results as a new matrix. 
DERIVATIVE_OF_SIGMOID_OF_MATRIX(A)
{
	resultingMatrix := {}
	For rowNr, rows in A
		For columnNr, value in rows
			resultingMatrix[rowNr, columnNr] := value * (1 - value)
	Return resultingMatrix
}


; The function below multiplies the individual members of two matrices with the same coordinates one by one (This is NOT equivalent to matrix multiplication).
MULTIPLY_MEMBER_BY_MEMBER(A,B)
{
	If ((A.MaxIndex() != B.MaxIndex()) OR (A[1].MaxIndex() != B[1].MaxIndex()))
		Throw exception("You cannot multiply matrices member by member unless both matrices are of the same size!", -1)
	resultingMatrix := {}
	For rowNr, rows in A
		For columnNr, value in rows
			resultingMatrix[rowNr, columnNr] := value * B[rowNr, columnNr]
	Return resultingMatrix
}


; The function below transposes a matrix. I.E.: Member[2,1] becomes Member[1,2]. Matrix dimensions ARE affected unless it is a square matrix.
TRANSPOSE_MATRIX(A)
{
	resultingMatrix := {}
	For rowNr, rows in A
		For columnNr, value in rows
			resultingMatrix[columnNr, rowNr] := value
	Return resultingMatrix
}


; The function below adds a matrix to another.
ADD_MATRICES(A,B)
{
	If ((A.MaxIndex() != B.MaxIndex()) OR (A[1].MaxIndex() != B[1].MaxIndex()))
		Throw exception("You cannot add matrices member by member unless both matrices are of the same size!", -1)
	resultingMatrix := {}
	For rowNr, rows in A
		For columnNr, value in rows
			resultingMatrix[rowNr, columnNr] := value + B[rowNr, columnNr]
	Return resultingMatrix
}


; The function below deducts a matrix from another.
DEDUCT_MATRICES(A,B)
{
	If ((A.MaxIndex() != B.MaxIndex()) OR (A[1].MaxIndex() != B[1].MaxIndex()))
		Throw exception("You cannot subtract matrices member by member unless both matrices are of the same size!", -1)
	resultingMatrix := {}
	For rowNr, rows in A
		For columnNr, value in rows
			resultingMatrix[rowNr, columnNr] := value - B[rowNr, columnNr]
	Return resultingMatrix
}


; The function below multiplies two matrices according to matrix multiplication rules.
MULTIPLY_MATRICES(A,B)
{
	If (A[1].MaxIndex() != B.MaxIndex())
		Throw exception("Number of Columns in the first matrix must be equal to the number of rows in the second matrix.", -1)
	resultingMatrix := {}
	Loop % A.MaxIndex() {
		rowNr := A_Index
		Loop % B.1.MaxIndex() {
			columnNr := A_Index
			resultingMatrix[rowNr, columnNr] := 0
			Loop % A.1.MaxIndex() {
				resultingMatrix[rowNr, columnNr] += A[rowNr, A_Index] * B[A_Index, columnNr]
			}
		}
	}
	Return resultingMatrix
}

;returns the average of an array
averageArray(arr) {
	for each,  errorValue in arr
		totalError += abs(errorValue)
	return totalError / arr.length()
}

18 Jun 2018, 12:47

nnnik wrote: Although the script was still at 0% progress after running for a night, it was already down to 16.6% total error.

This is pretty awesome news actually. 83.4% accuracy is very exciting for a first shot. Random guessing is only 10% accurate, which means you managed to get the AHK-coded network to learn a lot about the images. Regarding the processing times, it reminds me of a barcode reader I once developed. At first, the code took around 2-3 minutes to scan an image. After I changed the method of retrieving the pixel colors, though, it scanned whole images in around 3-5 seconds. Optimization can bring huge speed improvements when dealing with bulk data.
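
One caveat: the total error printed by the script is an average of the absolute differences between outputs and targets, so a 16.6% error may not translate directly into 83.4% classification accuracy. A minimal sketch of how accuracy could be measured directly, by comparing the strongest output with the expected digit (the helper names ArgMax and Accuracy and the sample numbers are made up for illustration):

```autohotkey
; Returns the index of the largest value in a linear array.
ArgMax(arr)
{
	best := 1
	for index, value in arr
		if (value > arr[best])
			best := index
	return best
}

; Fraction of samples whose strongest output matches the strongest target.
Accuracy(predictions, targets)
{
	correct := 0
	for index, prediction in predictions
		if (ArgMax(prediction) = ArgMax(targets[index]))
			correct++
	return correct / predictions.Length()
}

; Made-up outputs for three samples with three classes each.
predictions := [[0.1, 0.7, 0.2], [0.6, 0.3, 0.1], [0.2, 0.3, 0.5]]
targets     := [[0, 1, 0], [0, 0, 1], [0, 0, 1]]
MsgBox % "Accuracy: " Accuracy(predictions, targets) ; 2 of the 3 samples are classified correctly
```

In the MNIST script, the predictions would come from net.calculate() and the targets from the one-hot label arrays.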

Also, maybe we can use Michael Nielsen's Python code as a rough estimate of how long well-optimized code will take to run the MNIST database. After running a translation of the code to Python 3.x, I saw it achieve 94% accuracy in under 5 minutes. Curiously though, a second run of the same Python code only got to 83.53% accuracy after 10 minutes.

Michael probably did a lot of testing until he got to that specific code though (simplicity of code often hides the hard work of optimizing it).

First run:

PYTHON RESULTS.png (image attachment)

Second run:

PYTHON RESULTS2.png (image attachment)

The Python version I ran the code on was 3.6.4, and the files can be downloaded here. To run the code, open the file Nielsen Code 3x by Unknown.Py inside the TEST CASE folder in Python 3.6.4 IDLE. NumPy must be installed as well.

I think I might update the entire library to use a Matrix class which uses binary data storage and MCode to do the calculations.
Additionally, splitting the input and output inside the train method itself is unnecessary. I think I will just accept 2 parameters and possibly also allow for binary data there.

Those are some very good ideas.

18 Jun 2018, 13:09

Well, it's not hard to get ideas for optimisation from the current implementation - there is a lot of room for it.

I also plan to create different types of neural networks, such as convolutional neural nets and LSTMs for RNNs.
I might want to look into GANs, but right now that's far in the future, since finals are coming up and I have a lot of projects coming from my uni.

blue83 · 04 Oct 2018, 04:43

Hi,

I have finally watched the first hour of your webinar with Joe and put your 2 sample Excel sheets into AHK code.

SINGLE SAMPLE

Code: Select all

Macro1:
input1 := 1
weight1 := -60
input2 := 1
weight2 := -5
Loop
{
    calculated := input1*weight1+input2*weight2
    If calculated >= 0
    {
        SAresult := 1
    }
    Else
    {
        SAresult := -1
    }
    If input2 >= %input1%
    {
        expected := 1
    }
    Else
    {
        expected := -1
    }
    totalerror := Expected-SAresult
    weight1ADJ := weight1+(totalerror*input1)
    weight2ADJ := weight2+(totalerror*input2)
    If SAresult = %expected%
    {
        grade := "CORRECT"
        weight1 := weight1ADJ
        weight2 := weight2ADJ
        Break
    }
    Else
    {
        grade := "WRONG"
        weight1 := weight1ADJ
        weight2 := weight2ADJ
    }
}
MsgBox, 0, , 
(LTrim
done

%weight1%
%weight2%
)
ExitApp
Return
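
The same update rule can also be written as a loop over a list of samples, which avoids repeating the block once per sample. This is only a sketch under assumptions: the sample grid (every pair of values from 1 to 3) and the cap of 1000 passes are my own choices, while the starting weights and the rule itself are taken from the script above.

```autohotkey
; Assumed sample grid: every pair (input1, input2) with values from 1 to 3.
samples := []
Loop 3 {
	first := A_Index
	Loop 3
		samples.Push([first, A_Index])
}

weight1 := -60, weight2 := -5 ; starting weights from the single-sample script
Loop 1000 { ; hypothetical safety cap on the number of passes
	mistakes := 0
	for index, sample in samples {
		result   := (sample.1 * weight1 + sample.2 * weight2 >= 0) ? 1 : -1
		expected := (sample.2 >= sample.1) ? 1 : -1
		delta    := expected - result ; 0 when correct, otherwise +2 or -2
		weight1  += delta * sample.1
		weight2  += delta * sample.2
		mistakes += (delta != 0)
	}
	passes := A_Index
} Until (mistakes = 0)
MsgBox % "Passes: " passes "`nMistakes in last pass: " mistakes "`nweight1: " weight1 "`nweight2: " weight2
```

Training stops as soon as a full pass over all samples produces no mistakes. The cap is only a guard against looping forever if the samples were changed to something the two weights cannot separate.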

MANY SAMPLES

Code: Select all

Macro1:
weight1 := 89
weight2 := -29
Loop
{
    input1 := 1
    input2 := 1
    calculated := input1*weight1+input2*weight2
    If calculated >= 0
    {
        SAresult := 1
    }
    Else
    {
        SAresult := -1
    }
    If input2 >= %input1%
    {
        expected := 1
    }
    Else
    {
        expected := -1
    }
    totalerror := Expected-SAresult
    weight1ADJ := weight1+(totalerror*input1)
    weight2ADJ := weight2+(totalerror*input2)
    If SAresult = %expected%
    {
        grade11 := 1
        weight1 := weight1ADJ
        weight2 := weight2ADJ
    }
    Else
    {
        grade11 := 0
        weight1 := weight1ADJ
        weight2 := weight2ADJ
    }
    input1 := 1
    input2 := 2
    calculated := input1*weight1+input2*weight2
    If calculated >= 0
    {
        SAresult := 1
    }
    Else
    {
        SAresult := -1
    }
    If input2 >= %input1%
    {
        expected := 1
    }
    Else
    {
        expected := -1
    }
    totalerror := Expected-SAresult
    weight1ADJ := weight1+(totalerror*input1)
    weight2ADJ := weight2+(totalerror*input2)
    If SAresult = %expected%
    {
        grade12 := 1
        weight1 := weight1ADJ
        weight2 := weight2ADJ
    }
    Else
    {
        grade12 := 0
        weight1 := weight1ADJ
        weight2 := weight2ADJ
    }
    input1 := 1
    input2 := 3
    calculated := input1*weight1+input2*weight2
    If calculated >= 0
    {
        SAresult := 1
    }
    Else
    {
        SAresult := -1
    }
    If input2 >= %input1%
    {
        expected := 1
    }
    Else
    {
        expected := -1
    }
    totalerror := Expected-SAresult
    weight1ADJ := weight1+(totalerror*input1)
    weight2ADJ := weight2+(totalerror*input2)
    If SAresult = %expected%
    {
        grade13 := 1
        weight1 := weight1ADJ
        weight2 := weight2ADJ
    }
    Else
    {
        grade13 := 0
        weight1 := weight1ADJ
        weight2 := weight2ADJ
    }
    input1 := 2
    input2 := 1
    calculated := input1*weight1+input2*weight2
    If calculated >= 0
    {
        SAresult := 1
    }
    Else
    {
        SAresult := -1
    }
    If input2 >= %input1%
    {
        expected := 1
    }
    Else
    {
        expected := -1
    }
    totalerror := Expected-SAresult
    weight1ADJ := weight1+(totalerror*input1)
    weight2ADJ := weight2+(totalerror*input2)
    If SAresult = %expected%
    {
        grade21 := 1
        weight1 := weight1ADJ
        weight2 := weight2ADJ
    }
    Else
    {
        grade21 := 0
        weight1 := weight1ADJ
        weight2 := weight2ADJ
    }
    input1 := 2
    input2 := 2
    calculated := input1*weight1+input2*weight2
    If calculated >= 0
    {
        SAresult := 1
    }
    Else
    {
        SAresult := -1
    }
    If input2 >= %input1%
    {
        expected := 1
    }
    Else
    {
        expected := -1
    }
    totalerror := Expected-SAresult
    weight1ADJ := weight1+(totalerror*input1)
    weight2ADJ := weight2+(totalerror*input2)
    If SAresult = %expected%
    {
        grade22 := 1
        weight1 := weight1ADJ
        weight2 := weight2ADJ
    }
    Else
    {
        grade22 := 0
        weight1 := weight1ADJ
        weight2 := weight2ADJ
    }
    input1 := 2
    input2 := 3
    calculated := input1*weight1+input2*weight2
    If calculated >= 0
    {
        SAresult := 1
    }
    Else
    {
        SAresult := -1
    }
    If input2 >= %input1%
    {
        expected := 1
    }
    Else
    {
        expected := -1
    }
    totalerror := Expected-SAresult
    weight1ADJ := weight1+(totalerror*input1)
    weight2ADJ := weight2+(totalerror*input2)
    If SAresult = %expected%
    {
        grade23 := 1
        weight1 := weight1ADJ
        weight2 := weight2ADJ
    }
    Else
    {
        grade23 := 0
        weight1 := weight1ADJ
        weight2 := weight2ADJ
    }
    input1 := 3
    input2 := 1
    calculated := input1*weight1+input2*weight2
    If calculated >= 0
    {
        SAresult := 1
    }
    Else
    {
        SAresult := -1
    }
    If input2 >= %input1%
    {
        expected := 1
    }
    Else
    {
        expected := -1
    }
    totalerror := Expected-SAresult
    weight1ADJ := weight1+(totalerror*input1)
    weight2ADJ := weight2+(totalerror*input2)
    If SAresult = %expected%
    {
        grade31 := 1
        weight1 := weight1ADJ
        weight2 := weight2ADJ
    }
    Else
    {
        grade31 := 0
        weight1 := weight1ADJ
        weight2 := weight2ADJ
    }
    input1 := 3
    input2 := 2
    calculated := input1*weight1+input2*weight2
    If calculated >= 0
    {
        SAresult := 1
    }
    Else
    {
        SAresult := -1
    }
    If input2 >= %input1%
    {
        expected := 1
    }
    Else
    {
        expected := -1
    }
    totalerror := Expected-SAresult
    weight1ADJ := weight1+(totalerror*input1)
    weight2ADJ := weight2+(totalerror*input2)
    If SAresult = %expected%
    {
        grade32 := 1
        weight1 := weight1ADJ
        weight2 := weight2ADJ
    }
    Else
    {
        grade32 := 0
        weight1 := weight1ADJ
        weight2 := weight2ADJ
    }
    input1 := 3
    input2 := 3
    calculated := input1*weight1+input2*weight2
    If calculated >= 0
    {
        SAresult := 1
    }
    Else
    {
        SAresult := -1
    }
    If input2 >= %input1%
    {
        expected := 1
    }
    Else
    {
        expected := -1
    }
    totalerror := Expected-SAresult
    weight1ADJ := weight1+(totalerror*input1)
    weight2ADJ := weight2+(totalerror*input2)
    If SAresult = %expected%
    {
        grade33 := 1
        weight1 := weight1ADJ
        weight2 := weight2ADJ
    }
    Else
    {
        grade33 := 0
        weight1 := weight1ADJ
        weight2 := weight2ADJ
    }
    finalresult := grade11+grade12+grade13+grade21+grade22+grade23+grade31+grade32+grade33
    If finalresult = 9
    {
        Break
    }
    Else
    {
        finalresult := ""
        grade11 := ""
        grade12 := ""
        grade13 := ""
        grade21 := ""
        grade22 := ""
        grade23 := ""
        grade31 := ""
        grade32 := ""
        grade33 := ""
    }
}
MsgBox, 0, , 
(LTrim
done

%weight1%
%weight2%
)
ExitApp
Return

I hope that is it.
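
The nine repeated blocks above all do the same work with different inputs, so the same training can probably be written with two nested loops. This is only a minimal sketch: the zero starting weights and the cap of 1000 passes are assumptions, not taken from the script above.

```autohotkey
; Sketch only: zero starting weights and the 1000-pass cap are assumptions.
weight1 := 0
weight2 := 0
Loop 1000 ; safety cap so the script cannot loop forever
{
    correct := 0
    Loop 3
    {
        input1 := A_Index
        Loop 3
        {
            input2 := A_Index
            calculated := input1*weight1 + input2*weight2
            SAresult := (calculated >= 0) ? 1 : -1
            expected := (input2 >= input1) ? 1 : -1
            totalerror := expected - SAresult ; 0 when right, +2 or -2 when wrong
            weight1 += totalerror * input1
            weight2 += totalerror * input2
            If (SAresult = expected)
                correct += 1
        }
    }
    If (correct = 9) ; every sample was classified correctly in this pass
        Break
}
MsgBox, % "done`n`n" . weight1 . "`n" . weight2
```

Because the error is 0 whenever a sample is already right, adding `totalerror * input` in both cases is harmless, which is why the two branches of the original If/Else can be merged. With the assumed zero start, tracing the loop by hand gives weights of -4 and 4 after three passes.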

My second goal is also to listen to the 2nd hour of your webinar and to reread what is written here on the first and every other page.

But my biggest short-term goal is to make a script for an Excel or CSV file with around 10,000 or more lines of customer details, and to make predictions for, let's say, another thousand customers.
Based on the data inside the Excel or CSV file, the predictions would be their pay grade, age, marital status, etc.

Can you help me with that?

Thank you

16 Oct 2018, 09:31

Hello Blue83.

But my biggest short-term goal is to make a script for an Excel or CSV file with around 10,000 or more lines of customer details, and to make predictions for, let's say, another thousand customers.
Based on the data inside the Excel or CSV file, the predictions would be their pay grade, age, marital status, etc.

Can you help me with that?

Move on to hour 2, and then move on to the multidimensional ANNs in the tutorial. Once you understand that code, you will be able to modify it to accommodate all the new variables and run the calculations that fine-tune your predictions.
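
As a possible first step toward that, the customer rows have to end up in the same kind of matrix the tutorial uses for its training samples. A rough sketch, assuming a CSV with a header line; the column names and the three data rows are made up for illustration, and a real script would load the text with FileRead instead:

```autohotkey
; Hypothetical sample data standing in for the exported CSV file.
csv := "age,income,married`n30,2500,1`n45,4100,0`n52,3900,1"

Rows := []
Loop, Parse, csv, `n, `r ; one iteration per line
{
    If (A_Index = 1) ; skip the header line
        Continue
    Row := []
    Loop, Parse, A_LoopField, CSV ; one iteration per field
        Row.Push(A_LoopField)
    Rows.Push(Row)
}
MsgBox, % Rows.MaxIndex() . " rows loaded, first age: " . Rows[1, 1]
```

Each entry of `Rows` can then be used like one row of the training matrix in the tutorial code.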

Joe Glines · 16 Oct 2018, 09:38

There are different models that can be used to come up with a prediction based on your data. The nice thing about a Neural Network is that, as the data changes, the model will change. The models vary widely, and the right choice depends largely on what "level" (Nominal, Ordinal, Interval, Ratio) your independent and dependent variables are. Also, even though Excel has some of these functions (like regression), it really isn't the tool normally used for this, because modeling makes a lot of assumptions about your data and Excel is not geared toward checking them.

ahketype · 15 Dec 2018, 20:59

This is very exciting. Thanks for the great intro, Gio! It's very early days for me, but I tried your first example with the three bits and began to get some idea how it works. I was surprised to find that it still gave a high probability output (about .9+) fairly reliably even on just 100 training loops.

I immediately began to think this might be useful for a problem I had in mind, classifying text examples, but then I found this https://chatbotslife.com/text-classific ... 4d50dcba45 which suggests that such tasks might be better accomplished using "multinomial naive Bayes" (or similar) methods, due to the unrelated inputs when text is analysed just for its "bag of words". The site does describe a NN version, too. I've not had enough time to study it, and I don't know python.

I also watched the video by 3Blue1Brown https://www.youtube.com/watch?v=aircAruvnKk where at the end there's a comment that sometimes the sigmoid function ("old school") doesn't work as well as ReLU, a function where anything below 0 becomes 0 and anything above 0 is passed through unchanged (linearly). That might be worth playing with on some tasks. Also, this raises a question for me - the positive end of ReLU doesn't have a maximum at 1 (or anywhere) as far as I can make out, because it's linear, and I was wondering if nodes can handle values outside the 0 to 1 (or -1 to +1) ranges that most examples use.

For my intended task, I'm thinking the number of inputs is going to increase over time (these representing found words in text samples, so at first this will increase very rapidly), and it also needs (less often) to increase the number of outputs (categories to apply to texts). I guess if you can have evolving networks as in that Mario player thing, a network should be able to handle adding inputs and outputs. Whether my AutoHotkey is up to it is another matter.

Cheers

16 Dec 2018, 02:47

The best activation function differs greatly depending on the task. You also don't need to use the same activation function on every layer.

17 Dec 2018, 09:44

Hello Ahketype.

Glad to see you are interested in the subject

I immediately began to think this might be useful for a problem I had in mind, classifying text examples

That does sound like a nice task for a Neural network to solve.

there's a comment that sometimes the sigmoid function ("old school") doesn't work as well as ReLU

The sigmoid function alone worked best in the early (shallow, or not-too-many-layers) ANN models. But when you try to apply it repeatedly to account for many layers (as you have to do in deep learning models) you get a problem called the vanishing gradient. The simplest way to describe this phenomenon is that the gradient of the sigmoid is never larger than 0.25, so when you multiply gradients of sigmoid by gradients of sigmoid over many layers, the correction that reaches the first layers is tiny compared to the one that reaches the last layers. This basically means that in a many-layer scenario, the first layers are rendered almost useless (in the sense that the error measured at the final output produces almost no adjustment to their weights), even though they still require a lot of processing time and power to be calculated.
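
To get a feel for the size of the effect, here is a rough sketch that multiplies the largest possible sigmoid gradient (0.25, reached when the output is 0.5) once per layer. It ignores the weights, which also take part in the real calculation, so it only shows an upper bound under that simplifying assumption.

```autohotkey
; Each sigmoid layer scales the backward signal by at most 0.25.
factor := 1.0
report := ""
Loop 8
{
    factor *= 0.25
    report .= A_Index . " layers: at most " . factor . "`n"
}
MsgBox, % report
```

After only a handful of layers the factor is already a small fraction of a percent, which is why the first layers barely move.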

For this reason, in deep learning the calculations of the first layers of a multiple-layer model are now mostly done using a different activation function: ReLU. This doesn't mean, however, that you cannot use sigmoid at all, as the final layers of a model can still benefit from it, as nnnik pointed out.
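
For reference, a minimal sketch of ReLU and its gradient in AutoHotkey. The function names `ReLU` and `ReLUGradient` are made up for this example and are not part of the tutorial code.

```autohotkey
MsgBox, % ReLU(-2) . "`n" . ReLU(0.5) . "`n" . ReLU(7)

; Anything at or below zero becomes 0, anything above passes through unchanged.
ReLU(x)
{
    Return (x > 0) ? x : 0
}

; The slope is 1 on the positive side and 0 elsewhere, so it does not
; shrink the way the sigmoid gradient does when layers are stacked.
ReLUGradient(x)
{
    Return (x > 0) ? 1 : 0
}
```

Note that the output has no upper limit: a node using ReLU can hand values well above 1 to the next layer, and the weights of that layer simply scale them like any other number.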

I guess if you can have evolving networks as in that Mario player thing, a network should be able to handle adding inputs and outputs.

A network creator code can surely handle new inputs. Sometimes, though, new unpredictable inputs may result in a worse network. One example is when some badly written text is fed to it for recalculation (because you are feeding it user-written inputs, for example). So I think you will need some evaluation routine to make sure the network creator code maintains the best results over time (discarding inputs that make it worse).

ahketype · 19 Dec 2018, 08:23

Thanks, Gio and nnnik, that's useful. I'm at the long-hard-think stage, but I'll do some testing of different ideas soon. I'll probably play about with ANNs for the learning anyway, even if the task I'm thinking about ends up using Bayesian or other algorithms.

ahketype · 02 Jan 2019, 20:48

I've been taking that simple example apart a bit, still trying to get my head round how it works. I thought I'd just mention one or two things I noticed.

1. Here:

Code: Select all

WEIGHTS := Array([Weight1],[Weight2],[Weight3])
EXPECTED_OUTPUTS := Array([0],[1],[1],[0])

the two arrays have an unused dimension, created by inserting the brackets, and everywhere these are referred to with [..., 1]. I even rewrote it making them linear arrays, and it works just the same (I wasn't sure if I'd missed some reason why they had to be 2D arrays). Why is that? Do more complicated examples use the other dimension?

2. In training, the ACQUIRED_OUTPUT is between 0 and 1 from using the sigmoid function, I think. But then the calculations include the SIGMOID_GRADIENT := ACQUIRED_OUTPUT * (1 - ACQUIRED_OUTPUT) to improve the values (make them less extreme, from what I gather). I tried to see what the result would be if this additional correction wasn't included, but I'm not sure how to code it, because both terms are used in the following meat of the calculations:

Code: Select all

WEIGHTS[1,1] += TRAINING_INPUTS[A_Index, 1] * ((EXPECTED_OUTPUTS[A_Index, 1] - ACQUIRED_OUTPUT) * SIGMOID_GRADIENT)
...

That is to say, would the ACQUIRED_OUTPUT term be repeated in place of the SIGMOID_GRADIENT, or the latter multiplication term just be omitted, or something else? (I tried both, and they seem to give more "confident" answers; one was actually 1.000000).

3. I'm thinking how this applies to the typical handwritten number recognition problem. I think I'm right in thinking the 10 outputs representing the numerals are just an extension of the three bits here with their weights. The weights would give a final output of a numeral, presumably by just choosing the nearest result to 1, and giving a confidence.

Cheers.
Here's my edited version. I find it helps me to have much shorter variable names, and editing it helped me start to understand it a bit. I've a way to go yet. This has the linear arrays as well.

Code: Select all

/*
1. PREPARATION STEPS
*/

Random, Weight1, -4.0, 4.0 ; We start by initializing random numbers into the weight variables
Random, Weight2, -4.0, 4.0
Random, Weight3, -4.0, 4.0
Weights := Array(Weight1,Weight2,Weight3) ; Here not made into matrix.

Train := Array([0,0,1],[1,1,1],[1,0,1],[0,1,1]) ; input training samples (all organized in a matrix too). 
Expect := Array(0,1,1,0) ; expected answers (here made linear array)


/*
2 . ACTUAL TRAINING
*/

Loop 10000 ; And now we do the net creator code (which is the training code). It will perform 10000 training cycles.
{
	Loop 4 ; For each training cycle, this net creator code will train the network of Weights using the four training samples.
	{
		Raw := 1 / (1 + exp(-1 * Matrix(Train, Weights, A_Index))) ; calculate results (sigmoid-functioned so 0-1)
		SG := Raw * (1 - Raw) ; SG is the gradient (derivative) of the sigmoid at Raw: largest (0.25) when Raw is 0.5, close to 0 when Raw is near 0 or 1.
		Weights[1] += Train[A_Index, 1] * ((Expect[A_Index] - Raw) * SG) ; Then, each weight is recalculated using the available knowledge in the samples and also the current calculated results.
		Weights[2] += Train[A_Index, 2] * ((Expect[A_Index] - Raw) * SG)
		Weights[3] += Train[A_Index, 3] * ((Expect[A_Index] - Raw) * SG)
		
		; Breaking the formula above: Weight is adjusted (we use +=, not :=) by getting the value of the input bit and multiplying it by the difference between the expected output and the calculated output (sigmoidally treated), after this difference is scaled by the gradient of the sigmoid.
	}
}


; 3. TEST VALIDATION CASE [1,0,0]:
Input1 := 1, Input2 := 0, Input3 := 0

; After recalculating the Weights in 10000 iterations of training, we apply them by multiplying these Weights to inputs
; (this new case is a validation sample, not one of the training ones: [1, 0 ,0])

WeiSol := Input1 * Weights[1] + Input2 * Weights[2] + Input3 * Weights[3]
FinSol := (1 / (1 + EXP(-1 * WeiSol)))
MsgBox, % "VALIDATION CASE: `n" . Input1 . "`, " . Input2 . "`, " . Input3 . "`n`nFINAL Weights: `nWEIGHT1: " . Weights[1] . "`nWEIGHT2: " . Weights[2] . "`nWEIGHT3: " . Weights[3] . "`n`nWEIGHTED SOLUTION: `n" . WeiSol . "`n`nFINAL SOLUTION: `n" . FinSol . "`n`nComments: `nWEIGHTED_SOLUTION: If this is positive, the net believes the answer is 1 (If zero or negative, it believes the answer is 0).`nA FINAL SOLUTION between 0 and 0.5 means the final network thinks the solution is 0. How close the value is to 0 means how certain the net is of that."

; Breaking the output numbers:
; WEIGHTED_SOLUTION: If this is positive, the net believes the answer is 1 (If zero or negative, it believes the answer is 0). The higher a positive value is, 
; the more certain the net is of its answer being 1. The lower a negative value is, the more certain the net is of its answer being 0.  
; FINAL SOLUTION: A sigmoidally treated weighted_solution. If this is above 0.50, the net believes the answer to be 1. The closer to 1, the more certain 
; the net is about that. If this is 0.50 or below it, the net believes the answer to be 0. The closer to 0, the more certain the net is about that.

Return
; The function below is just a single step in multiplying matrices (this is repeated many times to multiply an entire matrix). It is used because the input_data, Weights and expected results were set into matrices for organization purposes.
Matrix(A,B,RowOfA)
{
	If (A[RowOfA].MaxIndex() != B.MaxIndex())
	{
		msgbox, 0x10, Error, Number of Columns in the first matrix must be equal to the number of rows in the second matrix.
		Return
	}
	Result := 0
	Loop % A[RowOfA].MaxIndex()
	{
		Result += A[RowOfA, A_index] * B[A_Index]
	}
	Return Result
}

04 Jan 2019, 12:00

All right, here are my thoughts about your questions Ahketype.

1. There is never actually an "absolute need" for data to be organized in matrices and/or arrays for ANN code to work. That being said, it helps a lot. You have to deal with lots of data and do lots of math with it without losing track of positions, so matrices and arrays are very handy tools and are used extensively in ANN code. Even though matrices and arrays can be used for other purposes as well, not every programmer is an ace at coding them. Most would rather avoid them, actually. If you are studying ANNs, though, avoiding matrices and arrays is nearly impossible, since they are used extensively everywhere. So if you get used to seeing ANN code organize its data into matrices and arrays early on, you will likely have less trouble understanding ANN code in general later (or at least your questions will mostly be about ANNs and not about matrices/arrays, which can be learned elsewhere).

It seems that you have managed to remove a dimension in some of the arrays. Cheers for understanding how to do it without breaking the ANN code.

2. Let us study the actual code in question more thoroughly.
From this:

Code: Select all

WEIGHTS[1,1] += TRAINING_INPUTS[A_Index, 1] * ((EXPECTED_OUTPUTS[A_Index, 1] - ACQUIRED_OUTPUT) * SIGMOID_GRADIENT)

A simplified pseudo-code would be:

Code: Select all

NEW_WEIGHT := CURRENT_WEIGHT + (INPUTS * ERROR * SIGMOID_GRADIENT)

Explaining it:
We are adjusting the weights, so we have to start from the previous weights (thus the use of += in the code). The adjustment is done by adding some value, which can be positive or negative, to the current weight, and this value is obtained from 3 sources: the inputs, the error and the calculated_outputs (or rather, the sigmoid gradient value that comes from the calculated_output). Any of these 3 can have a bigger or smaller influence on how much the weight is adjusted, as they are simply multiplied together.

Considering the inputs is very important because they vary between samples and also because it is they that actually hold the underlying rules we are trying to get our network to absorb. So if we are to absorb anything from the inputs, we have to include their values in the calculation.

Considering the error (which is the difference between the expected_output and the calculated_output) is also very important because our goal is to recalibrate our weights to reduce the error. Therefore, if our calculation resulted in a big error, we need a big adjustment, but if it missed by very little, we should recalibrate the weight just a little. That makes sense, right?
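
To make the three factors concrete, here is one single update step worked through by hand. The starting weights of zero are an assumption chosen only to keep the arithmetic easy; the sample [1,1,1] with expected output 1 is one of the training samples from the example.

```autohotkey
Weights := [0, 0, 0]  ; assumed starting weights, chosen for easy arithmetic
Sample := [1, 1, 1]   ; one of the four training samples
Expected := 1

Sum := 0
Loop 3
    Sum += Sample[A_Index] * Weights[A_Index]

Raw := 1 / (1 + Exp(-Sum))  ; sigmoid of 0 is 0.5
Err := Expected - Raw       ; 1 - 0.5 = 0.5
SG := Raw * (1 - Raw)       ; 0.5 * 0.5 = 0.25

Loop 3
    Weights[A_Index] += Sample[A_Index] * Err * SG ; each weight grows by 1 * 0.5 * 0.25

MsgBox, % "Raw: " . Raw . "`nError: " . Err . "`nGradient: " . SG . "`nNew weight 1: " . Weights[1]
```

Each weight moves from 0 to 0.125. If an input had been 0, its weight would not have moved at all, which is the "inputs" part of the formula at work.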

Finally, considering the sigmoid_gradient is how we get our formula to do its real magic. Neural networks are not really about "finding the real result" for a new unknown sample, but rather about approximating it as closely as possible based on what can be inferred from known samples. A result of 0.987655 says a lot about the quality of the guess our network is outputting. A result of 1.000000, on the other hand, is actually a big problem, because a simple look at the known samples tells us that there is no real way to be absolutely sure that the result will be 1. We are inferring the underlying rules, not being sure about them. It may well be that the rule is slightly different and the result is not actually 1. Thus, if the network is calculating 1.000000, it may be guessing beyond what the samples are really telling, and this probably indicates a flaw in the approximation algorithm.

The example underlying rule below may sound too contrived to actually be the one in our problem, but it is nonetheless a possible scenario when all possibilities are considered (remember: we never actually tell our network what the rule is, so it may well be any rule):

Code: Select all

; Every case except 1,0,0 follows "output = first input"; 1,0,0 alone gives 0.
if ((input1 != 1) or (input2 != 0) or (input3 != 0))
{
    output := input1
}
Else
{
    output := 0
}

Therefore, it is simply bizarre that the network can tell with absolute certainty that 1,0,0 equals 1. That is not a qualitatively perfect guess, even though the probability can be very high. A model like that is unlikely to work well for guessing complex problems.
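
A quick way to convince yourself that such an exception cannot be ruled out from the samples alone is to run a rule of this kind against the four training samples. In the sketch below, `AltRule` is a hypothetical function written for this check: it returns the first input for every case except 1,0,0, where it returns 0.

```autohotkey
Train := [[0,0,1],[1,1,1],[1,0,1],[0,1,1]]
Expect := [0,1,1,0]

report := ""
Loop 4
{
    guess := AltRule(Train[A_Index, 1], Train[A_Index, 2], Train[A_Index, 3])
    report .= "Sample " . A_Index . ": rule gives " . guess . ", expected " . Expect[A_Index] . "`n"
}
report .= "Unseen 1,0,0: rule gives " . AltRule(1, 0, 0)
MsgBox, % report

; Hypothetical alternative rule: the first input decides, except for 1,0,0.
AltRule(input1, input2, input3)
{
    If (input1 = 1 and input2 = 0 and input3 = 0)
        Return 0
    Return input1
}
```

The rule agrees with all four training samples and still answers 0 for 1,0,0, so nothing in the training data can separate it from the simpler "output equals the first input" rule.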

The gradient is the derivative of the sigmoid, and therefore it represents the slope of the S-shaped curve at any specific point. The sigmoid output goes from 0 to 1, while the gradient rises from near 0 to a peak of 0.25 (at an output of 0.5) and falls back to near 0, depending on the slope at that point. Explaining the inner workings of its use is hard unless you are well acquainted with calculus, but you can take a look at gradient descent to get a basic idea of how using gradients can lead to an approximation of the ideal solution in a multiple-dimension scenario (such as the scenarios real ANNs are applied to) by increasing or decreasing the influence of the results.

This all has to be carefully planned though. In our example code, 0.5 is considered the least decided result possible, as the expected output should be either 0 or 1. The slope of the sigmoid curve at 0.5 is the highest possible, which means the gradient will make the calculation change the weights the most there. However, if we are nearing 0 or 1, we don't want to recalibrate a lot, and conveniently, the gradient of the sigmoid at those values is very low, resulting in little change to the weights. This can mean a small influence at first if the output is at the wrong side of the scale, but over time (or rather, over many iterations) the influence will increase (as the output nears 0.5) and then decrease again (as it nears the other side). The "magic" is perhaps this way of adjusting everything less and less the closer we get to the results, so as to avoid an abrupt adjustment causing the network to actually perform worse than in the previous iteration.
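
The shape described above is easy to see by evaluating the gradient at a few outputs. A small sketch (the five sample outputs are arbitrary picks):

```autohotkey
report := ""
For Index, Output in [0.01, 0.25, 0.5, 0.75, 0.99]
    report .= "output " . Output . " -> gradient " . (Output * (1 - Output)) . "`n"
MsgBox, % report
```

The gradient is 0.25 at an output of 0.5 and drops to about 0.01 at outputs of 0.01 and 0.99, so the weights are pushed hardest when the network is undecided and hardly at all when it is already close to 0 or 1.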

3. I'm thinking how this applies to the typical handwritten number recognition problem. I think I'm right in thinking the 10 outputs representing the numerals are just an extension of the three bits here with their weights. The weights would give a final output of a numeral, presumably by just choosing the nearest result to 1, and giving a confidence.

There is more than one way a network could do its job when dealing with this task. It could give a single output that approximates the actual number (1, 2, 3, etc.), or it could do as you say: output 10 numbers, and whichever is closest to 1 will be the answer.
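
For the second approach, picking the answer is just a search for the largest of the ten outputs. A minimal sketch, where the ten values are invented for illustration and stand for the digits 0 to 9:

```autohotkey
; Hypothetical outputs of a ten-output network, one per digit 0-9.
Outputs := [0.02, 0.01, 0.10, 0.91, 0.03, 0.05, 0.01, 0.20, 0.04, 0.02]

Best := 1
For Index, Value in Outputs
{
    If (Value > Outputs[Best])
        Best := Index
}
; AutoHotkey arrays start at 1, so position 4 corresponds to the digit 3.
MsgBox, % "Digit: " . (Best - 1) . "`nConfidence: " . Outputs[Best]
```

With these made-up values the fourth output is the largest, so the network's answer would be the digit 3, and the value of that output doubles as the confidence.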
