In this post, we apply two regularization techniques: dropout and early stopping. Their purpose is to prevent overfitting, where the model fits the training data very well but cannot generalize to new data. When the model overfits, it has high accuracy on the training set but low accuracy on the test set. In addition to the regularization techniques, we also show how to save the model so that it can be used later.
# First, load the data. Please check the previous post on how to save the MNIST data
import numpy as np
MNIST_data = np.load('/home/vietanh/data/MNIST/MNIST_data.npz')
train_data = MNIST_data['train_data']
train_labels = MNIST_data['train_labels']
validation_data = MNIST_data['validation_data']
validation_labels = MNIST_data['validation_labels']
test_data = MNIST_data['test_data']
test_labels = MNIST_data['test_labels']
print 'MNIST train data shape is: ', train_data.shape
print 'MNIST train data label shape is: ', train_labels.shape
# Print a slice of pixel values from the first training image
print train_data[0,200:250]
We can see that there are 55000 training samples; each sample is an image of a digit from 0 to 9. Each image is 28 pixels by 28 pixels, so we can flatten it into a vector of 28 x 28 = 784 numbers.
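As a quick hedged sketch (the array raw_image below is hypothetical, just for illustration), flattening one 28 x 28 image with NumPy would look like:
# Illustrative sketch only: flatten a hypothetical 28x28 image into a 784-vector
raw_image = np.zeros((28, 28), dtype=np.float32)  # stands in for one raw MNIST image
flat_image = raw_image.reshape(784)               # same pixel values, now a single vector
print flat_image.shape                            # (784,)
In the .npz file loaded above, this flattening has already been done, so each row of train_data is a 784-vector and can be fed directly to the placeholder x defined below.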
import tensorflow as tf
# Reset the graph
tf.reset_default_graph()
# Create new session
sess = tf.Session()
# x: input images as flattened 784-vectors; W and b: weights and bias of the softmax layer
x = tf.placeholder(tf.float32, [None, 784], name='x')
W = tf.Variable(tf.zeros([784, 10]), name='W')
b = tf.Variable(tf.zeros([10]), name='b')
y = tf.nn.softmax(tf.matmul(x, W) + b, name='y')
# Dropout:
keep_prob = 1.0
y_drop_out = tf.nn.dropout(y, keep_prob, name='y_drop_out')
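# Note (usage sketch, not in the original code): with keep_prob = 1.0 the dropout above
# keeps every unit, so it has no regularizing effect. A common pattern is to define
# keep_prob as a placeholder, feed it a value such as 0.5 during training and 1.0 at
# evaluation time, for example:
#   keep_prob = tf.placeholder(tf.float32, name='keep_prob')
#   sess.run(train_op, {x: train_batch, y_: train_label, keep_prob: 0.5})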
# Saver used to write the best model found so far to disk:
saver_best = tf.train.Saver(name='best_model')
# Early stopping: stop training if the validation accuracy has not improved for more than this many epochs
epoch_early_stopping = 3
# y_ has shape [None, 10], where None is the batch size and 10 is the label size.
# Each label is a one-hot vector; for example, if the digit is 3 then the corresponding
# one-hot vector is [0,0,0,1,0,0,0,0,0,0]
y_ = tf.placeholder(tf.float32, [None, 10], name='y_')
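# Cross-entropy loss, averaged over the batch: -sum_i y_[i] * log(p[i]), where p is the
# (dropout-applied) softmax output y_drop_out.
# Aside (not in the original code): tf.nn.softmax_cross_entropy_with_logits computes the
# same quantity from the raw logits in a numerically safer way.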
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y_drop_out), reduction_indices=[1]))
train_op = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy, name = 'train_op')
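# A prediction is correct when the index of the largest softmax output (argmax of y)
# matches the position of the 1 in the one-hot label (argmax of y_)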
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32), name = 'accuracy')
# Initialize the variables: this sets each variable to its initial value.
# In this case, variable W is set to a zeros array with shape [784, 10]
# and variable b to a zeros array with shape [10]
init_op = tf.global_variables_initializer()
sess.run(init_op)
epoch = 100
batch_size = 5000
no_train_batch = len(train_labels) // batch_size  # number of mini-batches per epoch
# Track the best validation accuracy seen so far and the epoch at which it occurred
best_accuracy_dev = -5; best_epoch = 0
for i in range(epoch):
    print '--------------'
    print 'Epoch ', i
    ptr = 0
    for j in range(no_train_batch):
        train_batch, train_label = train_data[ptr:ptr+batch_size,:], train_labels[ptr:ptr+batch_size,:]
        feed_dict_train = {x: train_batch, y_: train_label}
        # We train the network by running train_op, with the inputs supplied in feed_dict_train
        sess.run(train_op, feed_dict_train)
        accuracy_train = sess.run(accuracy, feed_dict_train)  # training accuracy on the current batch
        ptr += batch_size
    # We measure the accuracy on the validation set once per epoch
    accuracy_dev = sess.run(accuracy, feed_dict={x: validation_data, y_: validation_labels})
    print 'accuracy in dev set is', accuracy_dev
    # Save the best epoch
    if (accuracy_dev > best_accuracy_dev):
        # Save the best model seen so far
        saver_best.save(sess, '/home/vietanh/data/MNIST/model')
        print 'Saved best model'
        best_accuracy_dev = accuracy_dev
        best_epoch = i
    # Early stopping: if the validation accuracy has not improved for a fixed
    # number of epochs, we stop training.
    if (i - best_epoch > epoch_early_stopping):
        print 'Early Stopping'
        # Break out of the epoch loop.
        break
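Because the loop may stop after the validation accuracy has already started to degrade, the variables left in sess at this point are not necessarily the best ones. A minimal hedged sketch of restoring the saved best model and evaluating it on the test set (reusing the same graph, session, and checkpoint path as above; the test-set evaluation itself is not part of the original code) could look like:
# Hedged sketch: load the best checkpoint back into the current session
saver_best.restore(sess, '/home/vietanh/data/MNIST/model')
# Evaluate the restored (best) model on the held-out test set
accuracy_test = sess.run(accuracy, feed_dict={x: test_data, y_: test_labels})
print 'accuracy on test set is', accuracy_test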