
Back Propagation Neural Networks

Back Propagation Neural Network (BPN) was first introduced in the 1960s. It is a multilayer neural network consisting of an input layer, at least one hidden layer, and an output layer. It is a multi-layer feed-forward network trained with the gradient descent approach, which exploits the chain rule to optimize the parameters. The error, calculated at the output layer by comparing the target output with the actual output, is propagated back towards the input layer.

The main feature of backpropagation is that it is an iterative, recursive, and efficient method for computing the weight updates that improve the network until it is able to perform the task for which it is being trained.



Back Propagation Neural Networks Training Algorithm

The Back Propagation Network will use the binary sigmoid activation function. The training will have the following three phases.

  • Phase 1 − Feed Forward Phase
  • Phase 2 − Back Propagation of error
  • Phase 3 − Updating of weights

Step 0 − Initialize the weights and the biases (for easy calculation and simplicity, take small random values, but not zero). Also initialize the learning rate α, with 0 < α ≤ 1.
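As a minimal sketch of Step 0 in Python with NumPy (the layer sizes n = 3, p = 4, m = 2, the uniform range (−0.5, 0.5), and the learning rate value are assumptions made for illustration), where v, w, b0j and b0k mirror the symbols $v_{ij}$, $w_{jk}$, $b_{0j}$ and $b_{0k}$ used in the steps below:

```python
import numpy as np

# Hypothetical layer sizes for illustration: n inputs, p hidden units, m output units.
n, p, m = 3, 4, 2
alpha = 0.2                                 # learning rate, 0 < alpha <= 1

rng = np.random.default_rng(seed=1)

# Step 0 - small non-zero random weights and biases (the range is an assumption).
v   = rng.uniform(-0.5, 0.5, size=(n, p))   # input-to-hidden weights v_ij
b0j = rng.uniform(-0.5, 0.5, size=p)        # hidden-unit biases b_0j
w   = rng.uniform(-0.5, 0.5, size=(p, m))   # hidden-to-output weights w_jk
b0k = rng.uniform(-0.5, 0.5, size=m)        # output-unit biases b_0k
```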

Step 1 − Continue steps 2-10 while the stopping condition is not true.

Step 2 − Continue steps 3-9 for every training pair.

Phase 1

Step 3 − Each input unit receives an input signal $x_i$ and sends it to the hidden units, for all i = 1 to n.

Step 4 − Calculate the net input at each hidden unit using the following relation:

$$Q_{inj} = \sum_{i=1}^{n} x_i v_{ij} + b_{0j}, \quad j = 1 \text{ to } p$$

Now calculate the net output by applying the following sigmoidal activation function:

$$Q_j = f(Q_{inj})$$
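Steps 3-4 could be sketched as follows, assuming NumPy and placeholder values for the input signals, weights and biases (all hypothetical):

```python
import numpy as np

def binary_sigmoid(z):
    # Binary sigmoid activation f(z) = 1 / (1 + exp(-z))
    return 1.0 / (1.0 + np.exp(-z))

x   = np.array([1.0, 0.0, 1.0])      # input signals x_i (hypothetical training pair)
v   = np.full((3, 4), 0.1)           # placeholder input-to-hidden weights v_ij
b0j = np.full(4, 0.1)                # placeholder hidden biases b_0j

# Step 4 - net input and activation of each hidden unit j.
Q_in = x @ v + b0j                   # Q_inj = sum_i x_i * v_ij + b_0j
Q    = binary_sigmoid(Q_in)          # Q_j = f(Q_inj)
```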

Step 5 − Calculate the net input at each output layer unit using the following relation:

$$y_{ink} = \sum_{j=1}^{p} Q_j w_{jk} + b_{0k}, \quad k = 1 \text{ to } m$$

Calculate the net output by applying the following sigmoidal activation function:

$$y_k = f(y_{ink})$$
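Step 5 could then be sketched the same way, again with hypothetical placeholder values:

```python
import numpy as np

def binary_sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

Q   = np.array([0.6, 0.5, 0.55, 0.7])   # hidden activations Q_j from Step 4 (hypothetical)
w   = np.full((4, 2), 0.1)              # placeholder hidden-to-output weights w_jk
b0k = np.full(2, 0.1)                   # placeholder output biases b_0k

# Step 5 - net input and activation of each output unit k.
y_in = Q @ w + b0k                      # y_ink = sum_j Q_j * w_jk + b_0k
y    = binary_sigmoid(y_in)             # y_k = f(y_ink)
```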

Phase 2

Step 6 − Compute the error-correcting term at each output unit, in correspondence with the target pattern received at that unit, as follows:

$$\delta_k = (t_k - y_k)\, f'(y_{ink})$$

The derivative $f'(y_{ink})$ can be calculated as follows:


If the binary sigmoid function is used: $f'(y_{ink}) = f(y_{ink})\,(1 - f(y_{ink}))$

If the bipolar sigmoid function is used: $f'(y_{ink}) = \frac{1}{2}\,(1 + f(y_{ink}))\,(1 - f(y_{ink}))$

Then, send $\delta_k$ back to the hidden layer.
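The two derivative forms above can be expressed directly in terms of the activations already computed during the feed forward phase; a small sketch with hypothetical values:

```python
import numpy as np

def binary_sigmoid_derivative(f_val):
    # f'(y_ink) = f(y_ink) * (1 - f(y_ink)) for the binary sigmoid
    return f_val * (1.0 - f_val)

def bipolar_sigmoid_derivative(f_val):
    # f'(y_ink) = 0.5 * (1 + f(y_ink)) * (1 - f(y_ink)) for the bipolar sigmoid
    return 0.5 * (1.0 + f_val) * (1.0 - f_val)

y = np.array([0.62, 0.58])              # f(y_ink) from the forward pass (hypothetical)
print(binary_sigmoid_derivative(y))     # values used in the delta_k formula of Step 6
```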

Step 7 − Each hidden unit now sums its delta inputs from the output units:

$$\delta_{inj} = \sum_{k=1}^{m} \delta_k w_{jk}$$

The error term can then be calculated as follows:

$$\delta_j = \delta_{inj}\, f'(Q_{inj})$$
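Steps 6 and 7 together could look like the following sketch, assuming the binary sigmoid and hypothetical values carried over from the forward pass:

```python
import numpy as np

t = np.array([1.0, 0.0])                 # target pattern t_k (hypothetical)
y = np.array([0.62, 0.58])               # actual outputs y_k = f(y_ink)
Q = np.array([0.6, 0.5, 0.55, 0.7])      # hidden activations Q_j = f(Q_inj)
w = np.full((4, 2), 0.1)                 # hidden-to-output weights w_jk

# Step 6 - error-correcting term at each output unit.
delta_k = (t - y) * y * (1.0 - y)        # delta_k = (t_k - y_k) * f'(y_ink)

# Step 7 - propagate the deltas back and form the hidden error terms.
delta_in = w @ delta_k                   # delta_inj = sum_k delta_k * w_jk
delta_j  = delta_in * Q * (1.0 - Q)      # delta_j = delta_inj * f'(Q_inj)
```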

Phase 3

Step 8 − Each output unit $y_k$ (k = 1 to m) updates its weight and bias as follows:

$$w_{jk}(new) = w_{jk}(old) + \Delta w_{jk}$$

$$b_{0k}(new) = b_{0k}(old) + \Delta b_{0k}$$

where

$$\Delta w_{jk} = \alpha\, \delta_k\, Q_j$$

$$\Delta b_{0k} = \alpha\, \delta_k$$
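A sketch of the Step 8 update, with the error terms and activations taken as hypothetical values from the previous steps:

```python
import numpy as np

alpha   = 0.2                              # learning rate
delta_k = np.array([0.09, -0.14])          # output error terms from Step 6 (hypothetical)
Q       = np.array([0.6, 0.5, 0.55, 0.7])  # hidden activations Q_j
w       = np.full((4, 2), 0.1)             # current weights w_jk
b0k     = np.full(2, 0.1)                  # current biases b_0k

# Step 8 - corrections and updates for the output layer.
delta_w   = alpha * np.outer(Q, delta_k)   # delta_w_jk = alpha * delta_k * Q_j
delta_b0k = alpha * delta_k                # delta_b_0k = alpha * delta_k

w   = w + delta_w                          # w_jk(new) = w_jk(old) + delta_w_jk
b0k = b0k + delta_b0k                      # b_0k(new) = b_0k(old) + delta_b_0k
```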

Step 9 − Each hidden unit $Q_j$ (j = 1 to p) updates its weight and bias as follows:

$$v_{ij}(new) = v_{ij}(old) + \Delta v_{ij}$$

$$b_{0j}(new) = b_{0j}(old) + \Delta b_{0j}$$

where

$$\Delta v_{ij} = \alpha\, \delta_j\, x_i$$

$$\Delta b_{0j} = \alpha\, \delta_j$$
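And the corresponding sketch of the Step 9 update, again with hypothetical carried-over values:

```python
import numpy as np

alpha   = 0.2                                   # learning rate
delta_j = np.array([0.01, -0.02, 0.015, 0.0])   # hidden error terms from Step 7 (hypothetical)
x       = np.array([1.0, 0.0, 1.0])             # input signals x_i
v       = np.full((3, 4), 0.1)                  # current weights v_ij
b0j     = np.full(4, 0.1)                       # current biases b_0j

# Step 9 - corrections and updates for the hidden layer.
delta_v   = alpha * np.outer(x, delta_j)        # delta_v_ij = alpha * delta_j * x_i
delta_b0j = alpha * delta_j                     # delta_b_0j = alpha * delta_j

v   = v + delta_v                               # v_ij(new) = v_ij(old) + delta_v_ij
b0j = b0j + delta_b0j                           # b_0j(new) = b_0j(old) + delta_b_0j
```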

Step 10 − Check for the stopping condition, which may be either that the specified number of epochs has been reached or that the actual output matches the target output.
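Putting all the steps together, one possible end-to-end training loop is sketched below; the XOR data set, layer sizes, learning rate, epoch limit and error threshold are all assumptions chosen only for illustration:

```python
import numpy as np

def binary_sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical toy task (XOR), used only to exercise the algorithm end to end.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

n, p, m = 2, 4, 1                       # input, hidden and output layer sizes
alpha = 0.5                             # learning rate
rng = np.random.default_rng(seed=1)

# Step 0 - small random weights and biases.
v, b0j = rng.uniform(-0.5, 0.5, (n, p)), rng.uniform(-0.5, 0.5, p)
w, b0k = rng.uniform(-0.5, 0.5, (p, m)), rng.uniform(-0.5, 0.5, m)

for epoch in range(10000):              # Step 1 - repeat until a stopping condition holds
    max_error = 0.0
    for x, t in zip(X, T):              # Step 2 - for every training pair
        # Phase 1: feed forward (Steps 3-5)
        Q = binary_sigmoid(x @ v + b0j)
        y = binary_sigmoid(Q @ w + b0k)

        # Phase 2: back propagation of error (Steps 6-7)
        delta_k = (t - y) * y * (1.0 - y)
        delta_j = (w @ delta_k) * Q * (1.0 - Q)

        # Phase 3: updating of weights and biases (Steps 8-9)
        w += alpha * np.outer(Q, delta_k)
        b0k += alpha * delta_k
        v += alpha * np.outer(x, delta_j)
        b0j += alpha * delta_j

        max_error = max(max_error, float(np.max(np.abs(t - y))))

    # Step 10 - stop once every output is close enough to its target,
    # or when the epoch limit above is reached.
    if max_error < 0.1:
        break

print(binary_sigmoid(binary_sigmoid(X @ v + b0j) @ w + b0k).round(2))
```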

