
This indicates that the batch method requires more training time than the SGD method to reach a similar level of accuracy.


In other words, the batch method learns slowly.

Comparison of the SGD and the Batch


In this section, we practically investigate the learning speeds of the SGD and the batch.


The errors of these methods are compared at the end of the training processes for the entire training data.


The following program listing shows the SGDvsBatch.m file, which compares the mean error of the two methods.


To make the comparison fair, the weights of both methods are initialized with the same values.

clear all

% Training inputs, one sample per row (the third column is the bias input)
X = [ 0 0 1;
      0 1 1;
      1 0 1;
      1 1 1;
    ];

D = [ 0 0 1 1];          % target outputs

E1 = zeros(1000, 1);     % mean error per epoch, SGD
E2 = zeros(1000, 1);     % mean error per epoch, batch

W1 = 2*rand(1, 3) - 1;   % random initial weights in [-1, 1]
W2 = W1;                 % same starting point for both methods

for epoch = 1:1000       % train
   W1 = DeltaSGD(W1, X, D);
   W2 = DeltaBatch(W2, X, D);

   es1 = 0;
   es2 = 0;
   N   = 4;
   for k = 1:N
      x = X(k, :)';
      d = D(k);

      v1  = W1*x;
      y1  = Sigmoid(v1);
      es1 = es1 + (d - y1)^2;

      v2  = W2*x;
      y2  = Sigmoid(v2);
      es2 = es2 + (d - y2)^2;
   end
   E1(epoch) = es1 / N;
   E2(epoch) = es2 / N;
end

plot(E1, 'r')
hold on
plot(E2, 'b:')
ylabel('Average of Training error')
legend('SGD', 'Batch')


This program trains the neural network 1,000 times for each function, DeltaSGD and DeltaBatch.
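The two training functions are not listed in this section. As a rough sketch only, assuming a single-layer network with a sigmoid activation, the delta rule, and a learning rate of 0.9 (these details, and the names `sigmoid`, `alpha`, `Wsgd`, and `Wbatch`, are assumptions made for illustration), one epoch of each scheme might look like this:

```matlab
% Sketch of one training epoch under each scheme (assumed details, see above).
sigmoid = @(v) 1 ./ (1 + exp(-v));   % assumed activation function
alpha   = 0.9;                       % assumed learning rate

X = [0 0 1; 0 1 1; 1 0 1; 1 1 1];    % training inputs, one sample per row
D = [0 0 1 1];                       % target outputs
N = size(X, 1);

W0     = 2*rand(1, 3) - 1;           % shared starting weights
Wsgd   = W0;
Wbatch = W0;

% SGD: update the weights immediately after every training sample.
for k = 1:N
    x = X(k, :)';
    y = sigmoid(Wsgd*x);
    delta = y*(1 - y)*(D(k) - y);    % delta rule for a sigmoid output
    Wsgd  = Wsgd + alpha*delta*x';
end

% Batch: accumulate the updates over all samples, then apply their average once.
dWsum = zeros(1, 3);
for k = 1:N
    x = X(k, :)';
    y = sigmoid(Wbatch*x);
    delta = y*(1 - y)*(D(k) - y);
    dWsum = dWsum + alpha*delta*x';
end
Wbatch = Wbatch + dWsum/N;
```

In this sketch the SGD weights change N times per epoch, while the batch weights change only once, by the average of the individual updates.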

At each epoch, it inputs the training data into the neural network and calculates the mean square error (E1, E2) of the output.
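The per-epoch error is the average of the squared differences between the targets and the network outputs over the four training samples. As a small illustration, the same quantity can be computed without a loop; the weights `W` and the `sigmoid` helper below are hypothetical values chosen only for this example:

```matlab
% Mean square error over the training set, computed without a loop.
sigmoid = @(v) 1 ./ (1 + exp(-v));   % hypothetical stand-in for Sigmoid
X = [0 0 1; 0 1 1; 1 0 1; 1 1 1];    % training inputs, one sample per row
D = [0 0 1 1];                       % target outputs
W = [0 0 0];                         % hypothetical weights, for illustration

Y   = sigmoid(X*W')';                % 1x4 row of network outputs
mse = mean((D - Y).^2);              % average of the squared errors
```

With all-zero weights every output is 0.5, so each squared error is 0.25 and `mse` is 0.25.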


Once the program completes 1,000 epochs of training, it generates a graph that shows the mean error at each epoch.


As Figure 2-20 shows, the SGD reduces the learning error faster than the batch; that is, the SGD learns faster.


Figure 2-20. The SGD method learns faster than the batch method

This article is translated from "Matlab Deep Learning" by Phil Kim.

