### Re: Neural Network basics - Artificial Intelligence using AutoHotkey!

Posted: 04 Jan 2019, 20:08
Wow, Gio, thanks for all that information. What I'm realising is that this is such a complex subject that I'll need to do more learning - I haven't even really studied this thread yet.

I understand the principle of adjusting the weights, and also the use of the gradient (although my calculus is a bit rusty these days); my problem is following the complex formula by which all the data gets weighted and combined and crunched, so it shows when I actually try to construct code to do something new! I didn't do matrix stuff in school, so I'm trying to catch up, but I've been a hobby programmer for 30 years so I'm used to complex data structures. I think it's just about getting familiar enough until it clicks and you kind of see it. Maybe.

I did a little experiment in the last couple of days translating your code to deal with a more complex problem, but it's highlighting how little I know. It's not giving any meaningful results, either because the problem needs a different approach, or because I've stripped the second dimension out and this is when I need to use it. Please don't feel obliged to keep schooling me through this, though, I'm just sharing the journey. I've also switched language, because I find AHK a struggle and I can make quicker progress in BASIC.

The idea was to find a simple mathematical function that I could plug in instead of the three bits and an output bit. I used the highest common factor (HCF) of two integers, which is pretty easy to calculate, and then, if successful, I'd hope the net to "infer" a correct answer to a new pair of integers. Obviously this is different because it accesses the HCF result as an integer >=1, not just a binary as in the original. Anyway, my final results are all approaching 0.99999999, which would be nice if that represented anything.
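
For reference, the HCF itself can be computed with Euclid's algorithm. The sketch below is only an illustration of how the expected outputs for such an experiment could be produced; the function name `HCF` is a hypothetical helper and does not come from the tutorial code.

```autohotkey
; Highest common factor (greatest common divisor) via Euclid's algorithm.
; HCF is a hypothetical helper name, not part of the tutorial code.
HCF(a, b)
{
    while (b != 0)
    {
        remainder := Mod(a, b)   ; what is left after dividing a by b
        a := b
        b := remainder
    }
    return a
}

MsgBox % "HCF of 12 and 18: " . HCF(12, 18)
```

Because a sigmoid output always lies between 0 and 1, a target such as 2 or 6 can never be matched directly, which is one likely reason for every result drifting toward 0.99999999.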

I'm thinking now about other ways to approach the question, for instance, having just two input "neurons" representing the integers, and normalizing these with A / Range so they're both floats between 0 and 1 - this is a little more like the inputs in the handwritten digits example 3blue1brown uses, where the brightness of each pixel is normalized first. That would obviously be the way to go if the whole shebang requires 0-1 data. But it might also require Range number of output neurons, as that example uses 10 for the different numerals.
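
A minimal sketch of that idea, assuming integers from 1 to `MAX_VALUE`: the inputs are scaled with A / Range, and the expected answer becomes a row with one output neuron per possible value. `Normalize`, `OneHot` and `MAX_VALUE` are hypothetical names chosen for this illustration.

```autohotkey
; Scale an integer in 1..range to a float between 0 and 1.
; Normalize and OneHot are hypothetical helper names.
Normalize(value, range)
{
    return value / range
}

; Build a row with `range` outputs: all 0 except a 1 at position `answer`.
OneHot(answer, range)
{
    row := []
    Loop % range
        row.Push(A_Index = answer ? 1 : 0)
    return row
}

MAX_VALUE := 10
INPUTS := [Normalize(4, MAX_VALUE), Normalize(6, MAX_VALUE)]  ; the pair (4, 6)
TARGET := OneHot(2, MAX_VALUE)                                ; the HCF of 4 and 6 is 2

MsgBox % "Inputs: " . INPUTS[1] . ", " . INPUTS[2] . "`nOutput neurons: " . TARGET.Length()
```

With this layout the network's answer would be read as the index of the output neuron with the highest value, in the same way the digit example picks one of 10 numerals.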

As I say, don't go to any trouble over this, but if you want to chip in with suggestions, that's very kind of you. I like messing about with stuff I don't understand until I either understand it or get bored and give up! I'll go back to reading for a while. Cheers.

### Re: Neural Network basics - Artificial Intelligence using AutoHotkey!

Posted: 07 Jan 2019, 11:46
Hello Ahketype,

I see that you are coming across some difficulties, but it also seems like you are doing it just right: the path of experimentation, inquiring into the specific parts of what constitutes an ANN, the idea of approaching a problem using these bases, and the attempt to have the theory sink in. That is indeed most of what goes into the study of ANNs.

To be perfectly honest, when dealing with ANNs it is not always easy to get a first working code for a problem. The theory is the basis of study precisely because everything in the code changes a lot between successful implementations. One example: in the example code of this tutorial, we set the initial weights using random numbers between -4 and +4. When I changed this to a broader range (i.e., between -10000 and +10000), I found that the code had a much harder time approximating the correct answers. Also, when I ran a Python code for recognizing handwritten digits using the MNIST database, it quickly achieved over 90% accuracy on the first run, but when I ran it again, it took a while and still could not exceed 85% accuracy. After studying the code for a while, I think this is closely related to the initial weights, as if a certain set of initial weights had a huge impact on how many iterations are required to approximate the results. The truth is that I have yet to find the actual reasoning that would tell me exactly which initial random range is best for the MNIST database, for example.

I like to think that this is somewhat similar to a classroom full of students, in which some will naturally understand the teacher's words, while others will require many more hours of studying to reach the same level of understanding. None of them is really better than the others; it may just be that some of them lucked out in having their brains more prepared for that specific type of problem when the class had just started (because of somehow unrelated life experiences?).
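
One possible explanation for the -10000 to +10000 observation, offered as an assumption rather than a confirmed diagnosis, is sigmoid saturation: with very large weights the sigmoid output sits almost exactly at 0 or 1, its gradient is close to zero, and the weight updates become tiny. The sketch below reuses the sigmoid and derivative formulas from the tutorial; the input value and the three sample weights are made up for illustration.

```autohotkey
; Sigmoid and its derivative, written as in the tutorial's code
; (the derivative takes the sigmoid's output, not the raw weighted sum).
Sigmoid(x)
{
    return 1 / (1 + Exp(-1 * x))
}

Derivative(s)
{
    return s * (1 - s)
}

X_VALUE := 1   ; illustrative input, not taken from the tutorial
REPORT := ""
For INDEX, WEIGHT in [0.5, 4, 10000]
{
    OUT := Sigmoid(X_VALUE * WEIGHT)
    REPORT .= "weight " . WEIGHT . ": output " . OUT . ", gradient " . Derivative(OUT) . "`n"
}
MsgBox % REPORT
```

The gradient shrinks toward zero as the weight grows, so a network that starts with huge weights has very little signal to learn from.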

Anyway, this is just me trying to find an analogy.

Everything in the code of the example can be changed: the initial weights, the number of neurons, the number of iterations, the way data gets fed into the net, and even the formula that updates the weights (it's not as if adding the results of multiplying inputs, error and gradient is the only way to go for every problem). Changing these will result in attempts that may or may not make the network creator code more successful at solving the problem at hand. Experimentation is likely the only way through which we can get better at creating networks (or get a more natural feel for what to try first). Or so I think. Big companies (Tesla, Google, etc.) have been trying for 10+ years to get nets that allow perfect automated car driving, and they have been very successful actually, but none of them is getting exactly the same results. The way things work now, it may well be that some small team of people will get there first (or not). That is why ANNs are such an interesting field of research for programmers, in my opinion: I think one can still make a name for themselves if they are both dedicated and lucky.
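
As a rough sketch of the update rule mentioned above (adding the product of input, error and gradient to each weight), here is one training step for a single sigmoid neuron. The input values, starting weights and variable names are made up for illustration and are not taken from the tutorial code.

```autohotkey
; One training step for a single sigmoid neuron with three inputs.
; All values below are illustrative assumptions.
INPUTS := [1, 0, 1]
WEIGHTS := [0.2, -0.4, 0.1]
EXPECTED := 1

; Forward pass: weighted sum of the inputs, then the sigmoid.
WEIGHTED_SUM := 0
Loop 3
    WEIGHTED_SUM += INPUTS[A_Index] * WEIGHTS[A_Index]
NET_OUTPUT := 1 / (1 + Exp(-1 * WEIGHTED_SUM))

; Update: weight += input * error * gradient of the sigmoid at the output.
NET_ERROR := EXPECTED - NET_OUTPUT
GRADIENT := NET_OUTPUT * (1 - NET_OUTPUT)
Loop 3
    WEIGHTS[A_Index] += INPUTS[A_Index] * NET_ERROR * GRADIENT

MsgBox % "Updated weights: " . WEIGHTS[1] . ", " . WEIGHTS[2] . ", " . WEIGHTS[3]
```

The weight attached to the input that is 0 stays unchanged, since that input contributed nothing to the error. Swapping in a different rule (for example, scaling the update by a learning rate) only changes the last loop.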

My suggestion for a beginner would be to study some working code first (before attempting to write one from scratch to deal with a new task). Studying a variety of working code can probably lessen the hardship of beginning a dive into this subject.

Also, have you taken a look at part II of the tutorial? The code in part I is somewhat limited; you will need at the very least a multi-layered vanilla network to solve most real problems.

### Re: Neural Network basics - Artificial Intelligence using AutoHotkey!

Posted: 11 Jan 2019, 20:27
Thanks again Gio. Yikes, it's too hard for me! I think I'll leave artificial intelligence and continue with my current project, natural stupidity. But it was interesting and I've learned a few things. I think I get why there were other dimensions in the arrays now, having looked at the second example.
All the best.

### Re: Neural Network basics - Artificial Intelligence using AutoHotkey!

Posted: 09 Mar 2019, 12:55
Wooo, What a nice post.
I will check it out.

Thanks !!!

### Re: Neural Network basics - Artificial Intelligence using AutoHotkey!

Posted: 09 Mar 2019, 20:20
Nice tutorial. Don't know why I didn't see it before.

### Re: Neural Network basics - Artificial Intelligence using AutoHotkey!

Posted: 27 Aug 2019, 14:51
Hi to all,

Does anyone know how to get an output result of 5 given the conditions below?

```
TRAINING_INPUTS := Array([5,3,1],[1,1,1],[5,0,7],[3,1,1])
EXPECTED_OUTPUTS := Array([5],[1],[5],[3])
Input1 := 5, Input2 := 8, Input3 := 0
```
Or maybe this example, where the result must be 0 or 1:

```
TRAINING_INPUTS := Array([5,3,1],[1,1,1],[5,0,7],[3,1,1])
EXPECTED_OUTPUTS := Array([0],[1],[1],[0])
Input1 := 5, Input2 := 8, Input3 := 0
```

### Re: Neural Network basics - Artificial Intelligence using AutoHotkey!

Posted: 29 Aug 2019, 15:21
@blue83, thanks for the reply and for your interest in the subject of neural networks.

Writing neural networks is currently a skill, so every new problem must be thoroughly analyzed, and the corresponding code usually comes from a lot of trial and error (there is no out-of-the-box code that solves everything and is efficient). The problem you present above is much more complicated than the one in this tutorial. I see it as a problem that requires the network to quantitatively identify numbers (instead of just working out a boolean test like the sample problem). For this reason, the sample code in the tutorial must be adjusted.

That being said, your problem allowed me to study the issue of having a network understand some rudimentary quantification. The following network creator code successfully creates a network that identifies a number 5 as the first number in a 2-number sequence (after some possible failed attempts, as I also experimented with weight reshuffling, and only among the first 5 integers).

Sequences from 1,1 to 5,5 (25 possibilities) make up the statistical universe:

```
SetBatchLines, -1

; The code below does a lot of matrix calculations. This matters mostly as a means of organization: without matrices we would need far too many loose variables, so we are better off using them.

; We start by putting random numbers into the weight variables (this simulates a first hypothesis of a solution and lets the training begin).
; Since we are planning a first layer with 25 neurons that have 2 inputs each and a second layer with 1 neuron that has 25 inputs, we need a total of 75 initial hypotheses (random weights).
Loop 75
{
Random, Weight_%A_Index%, -1.0, 1.0
}

WEIGHTS_1 := Object()
NEXT_NUMBER := 1
Loop 2
{
CURRENT_ROW := A_index
Loop 25
{
NUMBER_TO_USE := NEXT_NUMBER
NEXT_NUMBER++
WEIGHTS_1[CURRENT_ROW, A_Index] := Weight_%NUMBER_TO_USE%
}
}

WEIGHTS_2 := Object()
Loop 25
{
NUMBER_TO_USE := NEXT_NUMBER
NEXT_NUMBER++
WEIGHTS_2[A_index, 1] := Weight_%NUMBER_TO_USE%
}

TRAINING_INPUTS := array([5,1],[1,1],[4,1],[1,4],[1,5],[2,4],[2,3],[3,3],[5,3],[3,4],[3,5],[5,5],[5,2],[2,2],[4,2]) ; 15 out of 25 possible cases are used as training set.
EXPECTED_OUTPUTS := array([1],[0],[0],[0],[0],[0],[0],[0],[1],[0],[0],[1],[1],[0],[0])

; Below we are declaring a number of objects that we will need to hold our matrices.
OUTPUT_LAYER_1 := Object(), OUTPUT_LAYER_2 := Object(), OUTPUT_LAYER_1_DERIVATIVE := Object(), OUTPUT_LAYER_2_DERIVATIVE := Object(), LAYER_1_DELTA := Object(), LAYER_2_DELTA := Object(), OLD_INDEX := 0

Loop 1000 ; This is the training loop (the network creator code). In this loop we recalculate the weights to approximate the desired results based on the samples. We will do 1,000 training cycles (care must be taken not to overtrain!).
{
; First, we calculate an output from layer 1. This is done by multiplying the inputs and the weights.
OUTPUT_LAYER_1 := SIGMOID_OF_MATRIX(MULTIPLY_MATRICES(TRAINING_INPUTS, WEIGHTS_1))

; Then we calculate a derivative (rate of change) for the output of layer 1.
OUTPUT_LAYER_1_DERIVATIVE := DERIVATIVE_OF_SIGMOID_OF_MATRIX(OUTPUT_LAYER_1)

; Next, we calculate the outputs of the second layer.
OUTPUT_LAYER_2 := SIGMOID_OF_MATRIX(MULTIPLY_MATRICES(OUTPUT_LAYER_1, WEIGHTS_2))

; And then we also calculate a derivative (rate of change) for the outputs of layer 2.
OUTPUT_LAYER_2_DERIVATIVE := DERIVATIVE_OF_SIGMOID_OF_MATRIX(OUTPUT_LAYER_2)

; Next, we check the errors of layers 2. Since layer 2 is the last, this is just a difference between calculated results and expected results.
LAYER_2_ERROR := DEDUCT_MATRICES(EXPECTED_OUTPUTS, OUTPUT_LAYER_2)

; Now we calculate a delta for layer 2. A delta is a rate of change: how much a change will affect the results.
LAYER_2_DELTA := MULTIPLY_MEMBER_BY_MEMBER(LAYER_2_ERROR, OUTPUT_LAYER_2_DERIVATIVE)

; Then, we transpose the matrix of weights (this is just to allow matrix multiplication; we are just rearranging the dimensions of the matrix).
WEIGHTS_2_TRANSPOSED := TRANSPOSE_MATRIX(WEIGHTS_2)

; !! IMPORTANT !!
; So, we multiply (matrix multiplication) the delta (rate of change) of layer 2 and the transposed matrix of weights of layer 2.
; This is what gives us a matrix that represents the error of layer 1 (REMEMBER: the error of layer 1 is measured by the rate of change of layer 2).
; It may seem counter-intuitive at first that the error of layer 1 is calculated solely with arguments about layer 2, but you have to interpret this line alongside the LAYER_1_DELTA line below (just read it).
LAYER_1_ERROR := MULTIPLY_MATRICES(LAYER_2_DELTA, WEIGHTS_2_TRANSPOSED)

; Thus, when we calculate the delta (rate of change) of layer 1, we are finally connecting the layer 2 arguments (by means of LAYER_1_ERROR) to the layer 1 arguments (by means of OUTPUT_LAYER_1_DERIVATIVE).
; The rates of change (deltas) are the key to understanding multi-layer neural networks. Their calculation answers this: if I change the weights of layer 1 by X, how much will it change layer 2's output?
; This Delta defines the adjustment of the weights of layer 1 a few lines below...
LAYER_1_DELTA := MULTIPLY_MEMBER_BY_MEMBER(LAYER_1_ERROR, OUTPUT_LAYER_1_DERIVATIVE)

; Then, we transpose the matrix of training inputs (this is just to allow matrix multiplication; we are just rearranging the dimensions of the matrix to better suit it).
TRAINING_INPUTS_TRANSPOSED := TRANSPOSE_MATRIX(TRAINING_INPUTS)

; Finally, we calculate how much we have to adjust the weights of layer 1. The delta of layer 1 versus the inputs we used this time is the key here.
ADJUST_LAYER_1 := MULTIPLY_MATRICES(TRAINING_INPUTS_TRANSPOSED, LAYER_1_DELTA)

; Another matrix transposition to better suit multiplication...
OUTPUT_LAYER_1_TRANSPOSED := TRANSPOSE_MATRIX(OUTPUT_LAYER_1)

; And finally, we also calculate how much we have to adjust the weights of layer 2. The delta of layer 2 versus the inputs of layer 2 (which are really the outputs of layer 1) is the key here.
ADJUST_LAYER_2 := MULTIPLY_MATRICES(OUTPUT_LAYER_1_TRANSPOSED, LAYER_2_DELTA)

; And then we adjust the weights to approximate the intended results.
WEIGHTS_1 := ADD_MATRICES(WEIGHTS_1, ADJUST_LAYER_1)
WEIGHTS_2 := ADD_MATRICES(WEIGHTS_2, ADJUST_LAYER_2)

; The conditional below is just to display the current progress in the training loop.
If (A_Index >= OLD_INDEX + 10)
{
TrayTip, Status:, % "TRAINING A NEW NETWORK: " . Round(A_Index / 10, 0) . "`%"
OLD_INDEX := A_Index
}
}

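; TESTING OUR OUTPUT NETWORK!
; The loop below evaluates whether our calculated network is accurate enough to predict all possible cases. If not, the network is dropped and the script reloaded. (This is to avoid losing too much time on bad sets of initial weights...)
WRONG_ANSWERS := 0 ; Counts the cases that the network classifies incorrectly.
Loop 5
{
NUMBER_1 := A_index
Loop 5
{
NUMBER_2 := A_Index
CASE := Array([NUMBER_1,NUMBER_2])
OUT_1 := SIGMOID_OF_MATRIX(MULTIPLY_MATRICES(CASE, WEIGHTS_1))
OUT_2 := SIGMOID_OF_MATRIX(MULTIPLY_MATRICES(OUT_1, WEIGHTS_2))
If ((NUMBER_1 = 5) AND (OUT_2[1,1] >= 0.5))
{
; Correct: a 5 was recognized as a 5.
}
Else If ((NUMBER_1 = 5) AND (OUT_2[1,1] < 0.5))
{
WRONG_ANSWERS++ ; Wrong: a 5 was not recognized.
}
Else If ((NUMBER_1 < 5) AND (OUT_2[1,1] < 0.5))
{
; Correct: a number other than 5 was rejected.
}
Else If ((NUMBER_1 < 5) AND (OUT_2[1,1] > 0.5))
{
WRONG_ANSWERS++ ; Wrong: a number other than 5 was taken for a 5.
}
}
}

; Now that we have a perfect network, it's time to prove its power!
; The code below will apply the network to solve every possible case (25 possibilities) and present the net's conclusions individually.
; If the test above found any wrong answer, the script is reloaded instead, so that training starts over with new random weights.
If (WRONG_ANSWERS > 0)
{
TrayTip, Status:, The network failed the test. Reloading with new initial weights...
Sleep 3000
Reload
}
Else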
Loop 5
{
NUMBER_1 := A_index
Loop 5
{
NUMBER_2 := A_index
CASE := Array([NUMBER_1,NUMBER_2])
OUT_1 := SIGMOID_OF_MATRIX(MULTIPLY_MATRICES(CASE, WEIGHTS_1))
OUT_2 := SIGMOID_OF_MATRIX(MULTIPLY_MATRICES(OUT_1, WEIGHTS_2))
If (OUT_2[1,1] > 0.5)
{
ANSWER := " is 5"
}
else
{
ANSWER := " is NOT 5"
}
msgbox % "The final network thinks the first number of [" . NUMBER_1 . "," . NUMBER_2 . "]" . ANSWER
}
}

RETURN ; aaaand That's it !! :D The logical part of the ANN code ends here (the results are displayed above). Below are just the bodies of the functions that do the math (matricial multiplication, sigmoid function, etc). But you can have a look at them if you want, i will provide some explanation there too.

; The function below applies a sigmoid function to a single value and returns the results.
Sigmoid(x)
{
return  1 / (1 + exp(-1 * x))
}

Return
; The function below applies the derivative of the sigmoid function to a single value and returns the results.
Derivative(x)
{
Return x * (1 - x)
}

Return
; The function below applies the sigmoid function to all the members of a matrix and returns the results as a new matrix.
SIGMOID_OF_MATRIX(A)
{
RESULT_MATRIX := Object()
Loop % A.MaxIndex()
{
CURRENT_ROW := A_Index
Loop % A[1].MaxIndex()
{
CURRENT_COLUMN := A_Index
RESULT_MATRIX[CURRENT_ROW, CURRENT_COLUMN] := 1 / (1 + exp(-1 * A[CURRENT_ROW, CURRENT_COLUMN]))
}
}
Return RESULT_MATRIX
}

Return
; The function below applies the derivative of the sigmoid function to all the members of a matrix and returns the results as a new matrix.
DERIVATIVE_OF_SIGMOID_OF_MATRIX(A)
{
RESULT_MATRIX := Object()
Loop % A.MaxIndex()
{
CURRENT_ROW := A_Index
Loop % A[1].MaxIndex()
{
CURRENT_COLUMN := A_Index
RESULT_MATRIX[CURRENT_ROW, CURRENT_COLUMN] := A[CURRENT_ROW, CURRENT_COLUMN] * (1 - A[CURRENT_ROW, CURRENT_COLUMN])
}
}
Return RESULT_MATRIX
}

Return
; The function below multiplies the individual members of two matrices with the same coordinates one by one (This is NOT equivalent to matrix multiplication).
MULTIPLY_MEMBER_BY_MEMBER(A,B)
{
If ((A.MaxIndex() != B.MaxIndex()) OR (A[1].MaxIndex() != B[1].MaxIndex()))
{
msgbox, 0x10, Error, You cannot multiply matrices member by member unless both matrices are of the same size!
Return
}
RESULT_MATRIX := Object()
Loop % A.MaxIndex()
{
CURRENT_ROW := A_Index
Loop % A[1].MaxIndex()
{
CURRENT_COLUMN := A_Index
RESULT_MATRIX[CURRENT_ROW, CURRENT_COLUMN] := A[CURRENT_ROW, CURRENT_COLUMN] * B[CURRENT_ROW, CURRENT_COLUMN]
}
}
Return RESULT_MATRIX
}

Return
; The function below transposes a matrix. I.E.: Member[2,1] becomes Member[1,2]. Matrix dimensions ARE affected unless it is a square matrix.
TRANSPOSE_MATRIX(A)
{
TRANSPOSED_MATRIX := Object()
Loop % A.MaxIndex()
{
CURRENT_ROW := A_Index
Loop % A[1].MaxIndex()
{
CURRENT_COLUMN := A_Index
TRANSPOSED_MATRIX[CURRENT_COLUMN, CURRENT_ROW] := A[CURRENT_ROW, CURRENT_COLUMN]
}
}
Return TRANSPOSED_MATRIX
}

Return
; The function below adds a matrix to another.
ADD_MATRICES(A,B)
{
If ((A.MaxIndex() != B.MaxIndex()) OR (A[1].MaxIndex() != B[1].MaxIndex()))
{
msgbox, 0x10, Error, You cannot add matrices unless they are of the same size! (The number of rows and columns must be equal in both)
Return
}
RESULT_MATRIX := Object()
Loop % A.MaxIndex()
{
CURRENT_ROW := A_Index
Loop % A[1].MaxIndex()
{
CURRENT_COLUMN := A_Index
RESULT_MATRIX[CURRENT_ROW, CURRENT_COLUMN] := A[CURRENT_ROW,CURRENT_COLUMN] + B[CURRENT_ROW,CURRENT_COLUMN]
}
}
Return RESULT_MATRIX
}

Return
; The function below deducts a matrix from another.
DEDUCT_MATRICES(A,B)
{
If ((A.MaxIndex() != B.MaxIndex()) OR (A[1].MaxIndex() != B[1].MaxIndex()))
{
msgbox, 0x10, Error, You cannot subtract matrices unless they are of same size! (The number of rows and columns must be equal in both)
Return
}
RESULT_MATRIX := Object()
Loop % A.MaxIndex()
{
CURRENT_ROW := A_Index
Loop % A[1].MaxIndex()
{
CURRENT_COLUMN := A_Index
RESULT_MATRIX[CURRENT_ROW, CURRENT_COLUMN] := A[CURRENT_ROW,CURRENT_COLUMN] - B[CURRENT_ROW,CURRENT_COLUMN]
}
}
Return RESULT_MATRIX
}

Return
; The function below multiplies two matrices according to matrix multiplication rules.
MULTIPLY_MATRICES(A,B)
{
If (A[1].MaxIndex() != B.MaxIndex())
{
msgbox, 0x10, Error, Number of Columns in the first matrix must be equal to the number of rows in the second matrix.
Return
}
RESULT_MATRIX := Object()
Loop % A.MaxIndex() ; Rows of A
{
CURRENT_ROW := A_Index
Loop % B[1].MaxIndex() ; Cols of B
{
CURRENT_COLUMN := A_Index
RESULT_MATRIX[CURRENT_ROW, CURRENT_COLUMN]  := 0
Loop % A[1].MaxIndex()
{
RESULT_MATRIX[CURRENT_ROW, CURRENT_COLUMN] += A[CURRENT_ROW, A_Index] * B[A_Index, CURRENT_COLUMN]
}
}
}
Return RESULT_MATRIX
}

Return
; The function below does a single step in matrix multiplication (THIS IS NOT USED HERE).
MATRIX_ROW_TIMES_COLUMN_MULTIPLY(A,B,RowA)
{
If (A[RowA].MaxIndex() != B.MaxIndex())
{
msgbox, 0x10, Error, Number of Columns in the first matrix must be equal to the number of rows in the second matrix.
Return
}
Result := 0
Loop % A[RowA].MaxIndex()
{
Result += A[RowA, A_index] * B[A_Index, 1]
}
Return Result
}
```
I consider the code above a mere first step towards making a network that solves the problems you proposed (quantitatively identifying the first number of a 3-number sequence). The network is a simple 2-layer network (25 neurons in the first layer, 1 neuron in the second).

### Re: Neural Network basics - Artificial Intelligence using AutoHotkey!

Posted: 29 Aug 2019, 16:03

You know why I am asking this: because if this is resolved, the sample below will also be possible to resolve.

The problem is this: we have all these parameters for training, and when we enter new parameters, the neural network needs to decide whether someone will have an income of more or less than 50k (the example is from MS Azure; there are 32 thousand rows in that .csv file, so plenty of examples to train our network).

Finally, that could also be useful in many other cases, for every business.

What do you think about that?

| age | workclass | fnlwgt | education | education-num | marital-status | occupation | relationship | race | sex | capital-gain | capital-loss | hours-per-week | native-country | income |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 39 | State-gov | 77516 | Bachelors | 13 | Never-married | Adm-clerical | Not-in-family | White | Male | 2174 | 0 | 40 | United-States | <=50K |
| 50 | Self-emp-not-inc | 83311 | Bachelors | 13 | Married-civ-spouse | Exec-managerial | Husband | White | Male | 0 | 0 | 13 | United-States | <=50K |
| 38 | Private | 215646 | HS-grad | 9 | Divorced | Handlers-cleaners | Not-in-family | White | Male | 0 | 0 | 40 | United-States | <=50K |
| 53 | Private | 234721 | 11th | 7 | Married-civ-spouse | Handlers-cleaners | Husband | Black | Male | 0 | 0 | 40 | United-States | <=50K |
| 28 | Private | 338409 | Bachelors | 13 | Married-civ-spouse | Prof-specialty | Wife | Black | Female | 0 | 0 | 40 | Cuba | <=50K |
| 37 | Private | 284582 | Masters | 14 | Married-civ-spouse | Exec-managerial | Wife | White | Female | 0 | 0 | 40 | United-States | <=50K |
| 49 | Private | 160187 | 9th | 5 | Married-spouse-absent | Other-service | Not-in-family | Black | Female | 0 | 0 | 16 | Jamaica | <=50K |
| 52 | Self-emp-not-inc | 209642 | HS-grad | 9 | Married-civ-spouse | Exec-managerial | Husband | White | Male | 0 | 0 | 45 | United-States | >50K |
| 31 | Private | 45781 | Masters | 14 | Never-married | Prof-specialty | Not-in-family | White | Female | 14084 | 0 | 50 | United-States | >50K |
| 42 | Private | 159449 | Bachelors | 13 | Married-civ-spouse | Exec-managerial | Husband | White | Male | 5178 | 0 | 40 | United-States | >50K |
| 37 | Private | 280464 | Some-college | 10 | Married-civ-spouse | Exec-managerial | Husband | Black | Male | 0 | 0 | 80 | United-States | >50K |
| 30 | State-gov | 141297 | Bachelors | 13 | Married-civ-spouse | Prof-specialty | Husband | Asian-Pac-Islander | Male | 0 | 0 | 40 | India | >50K |
| 23 | Private | 122272 | Bachelors | 13 | Never-married | Adm-clerical | Own-child | White | Female | 0 | 0 | 30 | United-States | <=50K |
| 32 | Private | 205019 | Assoc-acdm | 12 | Never-married | Sales | Not-in-family | Black | Male | 0 | 0 | 50 | United-States | <=50K |
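
Before rows like these can be fed to a sigmoid network, the text columns have to become numbers and the large numeric columns have to be scaled down; otherwise a value such as 77516 would saturate the sigmoid immediately. The sketch below shows one common way to do this for three of the columns, under stated assumptions: the function name `EncodeRow`, the short category list and the divisor of 100 are illustrative choices, not values derived from the full dataset.

```autohotkey
; Turn a few columns of one income row into numeric network inputs.
; The category list and the scaling divisor are illustrative assumptions.
WORKCLASSES := ["Private", "State-gov", "Self-emp-not-inc"]

EncodeRow(age, workclass, hoursPerWeek, income, workclasses)
{
    row := [age / 100, hoursPerWeek / 100]      ; scale numbers to roughly 0..1
    For index, name in workclasses              ; one input per category: 1 if it matches, else 0
        row.Push(name = workclass ? 1 : 0)
    target := (income = ">50K") ? 1 : 0         ; expected output of the network
    return {inputs: row, target: target}
}

; First row of the table: age 39, State-gov, 40 hours per week, <=50K.
SAMPLE := EncodeRow(39, "State-gov", 40, "<=50K", WORKCLASSES)

TEXT := ""
For index, value in SAMPLE.inputs
    TEXT .= value . " "
MsgBox % "Inputs: " . TEXT . "`nTarget: " . SAMPLE.target
```

Each encoded row could then take the place of one entry in `TRAINING_INPUTS`, with the target as the matching entry in `EXPECTED_OUTPUTS`; the first weight matrix would need one row per input value.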