Chapter 3: How To Train The Network?

[TODO: spellcheck]

[TODO: proof-read]


More Complex Example

I wish there were a way to determine the weights for each neuron in the network manually. Unfortunately, that is not the case in real life. Real networks usually have so many neurons that it is impossible to hand-pick weights for them. In fact, it is usually impossible even to explain exactly how one neuron affects the network's output. Let's modify our initial example to create a slightly more complex one (and a more complex network).

[TODO: image of the prev network]

We are going to start with the same task: determine whether we should go to the party based on three input factors: favorite drink, friend and weather. However, in our updated case I want to introduce one overriding factor: we should never, ever go to the party if there will be drinks and the weather is not sunny! This means that the new factor outweighs even the case where you have drinks, rain and a friend waiting for you there. That last example would give a positive prediction with our previous network, but in our updated case it should not. Take a moment to think about whether it is possible to assign weights to our current network that would satisfy the new requirement.

TODO

I hope I have convinced you that picking weights manually is a no-go. What should we do then? The answer is training. We are going to train our neural network, which means that the network will find the weights itself. At a high level, the process includes the following steps:

  1. Initialize the network with random weights;

  2. Run the network on real data;

  3. Calculate the error;

  4. If the error is unreasonably high:

     1. identify, based on the error, how much you need to update the weights and in which direction;

     2. update the weights;

     3. repeat the process starting from step #2.
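To make the steps above a bit more tangible, here is a minimal sketch of such a loop in Python for a single neuron. Everything specific in it is my assumption for illustration: the toy data set `DATA` (go to the party when at least two of the three factors are present), the names `predict` and `train`, the learning rate, and the simple perceptron-style update rule. That update rule is only a stand-in for the proper method, which this chapter builds up to later.

```python
import random

# Hypothetical toy data: (drink, friend, sunny) -> 1 (go) or 0 (stay home).
# The rule "go when at least two factors are present" is made up for this sketch.
DATA = [
    ((0, 0, 0), 0),
    ((0, 0, 1), 0),
    ((0, 1, 0), 0),
    ((1, 0, 0), 0),
    ((0, 1, 1), 1),
    ((1, 0, 1), 1),
    ((1, 1, 0), 1),
    ((1, 1, 1), 1),
]


def predict(weights, bias, inputs):
    """Single neuron with a step activation: fire when the weighted sum is positive."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def train(data, learning_rate=0.1, max_epochs=1000, seed=0):
    rng = random.Random(seed)
    # Step 1: initialize the network with random weights.
    weights = [rng.uniform(-1, 1) for _ in range(3)]
    bias = rng.uniform(-1, 1)

    total_error = 0
    for _ in range(max_epochs):
        total_error = 0
        for inputs, target in data:
            # Step 2: run the network on the data.
            output = predict(weights, bias, inputs)
            # Step 3: calculate the error (here simply target minus output).
            error = target - output
            total_error += abs(error)
            # Step 4: the sign of the error gives the direction of the update,
            # the learning rate controls how big the update is.
            if error != 0:
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, inputs)]
                bias += learning_rate * error
        # Stop once a full pass over the data produced no errors.
        if total_error == 0:
            break
    return weights, bias, total_error


weights, bias, remaining_error = train(DATA)
print("weights:", weights, "bias:", bias, "remaining error:", remaining_error)
```

This toy rule is simple enough for a single neuron to learn, so the loop stops by itself once every example is predicted correctly. A harder rule may never reach zero error with one neuron, which is why the loop also has a `max_epochs` limit.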

There is a reason why the initial weights need to be random; we will discuss it later. For now, let's focus on step #3, and more precisely on the following question:

What Is The Error?

Indeed, how are we going to measure the error? For example, when someone says that an apple is a vegetable, it is an error, but when another person claims that an apple is a steel bridge for trains, it is a whole new level of error. If this someone is a neural network, we probably need to let the network know somehow when its answer is getting closer to the correct one and when it is actually moving further away.
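One way to turn "how wrong is the answer" into a number is sketched below. It uses the squared difference between the predicted and the expected value. This is my assumption for illustration (a common choice, not necessarily the measure used later in this chapter), and both the function name `squared_error` and the example numbers are made up.

```python
def squared_error(predicted, expected):
    """Squared difference: zero for a perfect answer, larger the further off we are."""
    return (predicted - expected) ** 2


expected = 1.0      # hypothetical correct answer: "yes, go to the party"
close_guess = 0.8   # the "apple is a vegetable" kind of mistake
far_guess = 0.1     # the "apple is a steel bridge" kind of mistake

print("close guess error:", squared_error(close_guess, expected))
print("far guess error:  ", squared_error(far_guess, expected))
```

The useful property is that the error shrinks as the answer gets closer to the correct one and grows as it moves away, so the network gets a signal about the size of its mistake and not just "right" or "wrong".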

Derivative

TODO

Back Prop (SGD)

TODO

Full Picture

TODO

Why Do Initial Weights Need To Be Random?
