Introduction to TensorFlow

TensorFlow is a library from Google for building neural networks. The installation guide is at https://www.tensorflow.org/install/. The code below was written for Python 2.7 and TensorFlow r1.2. Newer versions of Python and TensorFlow may need small modifications: for example, print is a function in Python 3, and in TensorFlow 2.x the session API used here lives under tf.compat.v1. First, let's start with a simple program:

In [1]:
import tensorflow as tf
# Create session with the TensorFlow run time
sess =  tf.Session()
# Define some ops
a = tf.constant(2.0)
b = tf.constant(3.0)
c = a * b
print 'c is',c
print 'when using sess.run(c): c is', sess.run(c)
# Close session
sess.close()
c is Tensor("mul:0", shape=(), dtype=float32)
when using sess.run(c): c is 6.0

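The code prints the result of 2 x 3. Note that printing c directly shows only a Tensor object, a symbolic handle to the result of the multiplication; the value 6.0 appears only after calling sess.run(c). In plain Python, defining an operation also executes it. In TensorFlow, with the session API used here, we need the following steps: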

  1. Create a session: sess = tf.Session()
  2. Define the computational graph
  3. Run the computational graph within this session to get the result
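
The split between defining and running in the steps above can be imitated in plain Python, without TensorFlow. The following is only a toy sketch: the helper functions constant and multiply are hypothetical names, and returning small functions as "nodes" is an analogy, not how TensorFlow represents a graph. Building c performs no arithmetic; the multiplication happens only when c is called, much as sess.run(c) triggers the computation.

```python
# Toy analogy of "define first, run later".
# constant() and multiply() are hypothetical helpers, not TensorFlow API.

def constant(value):
    """Return a 'node': a function that yields value when it is called."""
    return lambda: value


def multiply(left, right):
    """Return a node that multiplies the results of two other nodes."""
    return lambda: left() * right()


# Define step: build the "graph". Nothing is multiplied here.
a = constant(2.0)
b = constant(3.0)
c = multiply(a, b)

# c is only a description of the computation, not a number.
print(callable(c))

# Run step: calling c plays the role of sess.run(c).
result = c()
print(result)
```

Here the first print shows True (c is something to be run, not a value) and the second shows 6.0.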

To feed data into the network, we can use a placeholder: a graph node whose value is supplied at run time through a feed dictionary. With placeholders, the same graph can be fed different data in the training and testing phases. Let's return to the previous example, this time using placeholders:

In [2]:
import tensorflow as tf
sess =  tf.Session()
x = tf.placeholder(tf.float32, shape=[1],name='input1') 
y = tf.placeholder(tf.float32, shape=[1],name='input2') 
z = x * y
feed_dict = {x:[2.0],y:[3.0]}
print sess.run(z, feed_dict)
# Close session
sess.close()
[ 6.]
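
To make the role of the feed dictionary more concrete, here is a minimal plain-Python sketch of the same idea. The classes Placeholder and Product and the function run are hypothetical stand-ins written for this illustration; they are not TensorFlow's implementation. The point is that the graph z is built once and can then be run several times with different inputs.

```python
# Toy model of placeholders and feed dictionaries (hypothetical classes,
# not TensorFlow API).

class Placeholder(object):
    """An input slot whose value is supplied when the graph is run."""

    def __init__(self, name):
        self.name = name

    def __mul__(self, other):
        # Building a product node; no arithmetic happens yet.
        return Product(self, other)

    def evaluate(self, feed):
        if self not in feed:
            raise KeyError('no value fed for placeholder %r' % self.name)
        return feed[self]


class Product(object):
    """Element-wise product of two nodes."""

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def evaluate(self, feed):
        lhs = self.left.evaluate(feed)
        rhs = self.right.evaluate(feed)
        return [l * r for l, r in zip(lhs, rhs)]


def run(node, feed):
    """Plays the role of sess.run(node, feed_dict)."""
    return node.evaluate(feed)


x = Placeholder('input1')
y = Placeholder('input2')
z = x * y  # define the graph once

first = run(z, {x: [2.0], y: [3.0]})   # same inputs as the example above
second = run(z, {x: [4.0], y: [0.5]})  # same graph, different data
print(first)
print(second)
```

Running z without feeding every placeholder raises an error in this sketch, which mirrors the fact that a placeholder has no value of its own.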

Now, let's build a simple feed-forward network. We use the example from the TensorFlow website: https://www.tensorflow.org/get_started/mnist/beginners.

In [3]:
# First, load the MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
In [4]:
# Save data to folder as numpy array
import numpy as np
train_data = mnist.train.images
train_labels = mnist.train.labels 
validation_data = mnist.validation.images
validation_labels = mnist.validation.labels 
test_data = mnist.test.images
test_labels = mnist.test.labels 
np.savez('/home/vietanh/data/MNIST/MNIST_data',
         train_data=train_data, train_labels=train_labels,
         validation_data=validation_data, validation_labels=validation_labels,
         test_data=test_data, test_labels=test_labels)
In [5]:
print 'MNIST train data shape is: ', train_data.shape
print 'MNIST train data label shape is: ', train_labels.shape
print train_data[0,200:250]
MNIST train data shape is:  (55000, 784)
MNIST train data label shape is:  (55000, 10)
[ 0.          0.          0.          0.          0.          0.          0.
  0.38039219  0.37647063  0.3019608   0.46274513  0.2392157   0.          0.
  0.          0.          0.          0.          0.          0.          0.
  0.          0.          0.          0.          0.          0.
  0.35294119  0.5411765   0.92156869  0.92156869  0.92156869  0.92156869
  0.92156869  0.92156869  0.98431379  0.98431379  0.97254908  0.99607849
  0.96078438  0.92156869  0.74509805  0.08235294  0.          0.          0.
  0.          0.          0.          0.        ]

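We can see that there are 55000 training samples. Every sample is an image of a handwritten digit from 0 to 9 and measures 28 x 28 pixels. In this dataset each image is already flattened into a vector of 28 x 28 = 784 numbers, which is why train_data has shape (55000, 784). The labels are one-hot vectors of length 10, one entry per digit.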
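
The flattening step can be sketched in plain Python. This is an illustration under one assumption: the rows of the image are concatenated one after another (row-major order). The function names and the pixel value below are hypothetical. Under that assumption, the slice train_data[0,200:250] printed above would start at row 7, column 4 of the image.

```python
# Sketch of flattening a 28 x 28 image into a 784-vector,
# assuming row-major order (rows are concatenated one after another).

ROWS, COLS = 28, 28


def flatten(image):
    """Concatenate the rows of a 2-D image (list of lists) into one list."""
    return [pixel for row in image for pixel in row]


def flat_index(row, col, cols=COLS):
    """Position of pixel (row, col) inside the flattened vector."""
    return row * cols + col


def grid_position(index, cols=COLS):
    """Inverse of flat_index: (row, col) of a flattened position."""
    return divmod(index, cols)


# A blank image with a single non-zero pixel (hypothetical value).
image = [[0.0] * COLS for _ in range(ROWS)]
image[7][7] = 0.5

vector = flatten(image)
print(len(vector))                 # 784
print(vector[flat_index(7, 7)])    # 0.5
print(grid_position(200))          # (7, 4)
```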

In [6]:
import tensorflow as tf
# Reset the graph 
tf.reset_default_graph()
# Create new session
sess = tf.Session()
x = tf.placeholder(tf.float32, [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, W) + b)
# y_ holds the target labels and has shape [None, 10]: None stands for the
# batch size and 10 is the number of classes. Each label is a one-hot vector;
# for example, if the digit is 3 then the corresponding
# one-hot vector is [0,0,0,1,0,0,0,0,0,0]
y_ = tf.placeholder(tf.float32, [None, 10])
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
train_op = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
# Initialize the variables: this sets each variable to its initial value.
# In this case, W is set to an all-zeros array with shape [784, 10] and
# b is set to an all-zeros array with shape [10]
init_op = tf.global_variables_initializer()  
sess.run(init_op)
# We run 2 epochs
epoch = 2
batch_size = 10000
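# Number of full batches per epoch; // keeps the result an integer.
# With 55000 samples and batch_size 10000 this is 5, so the last 5000 samples are skipped
no_train_batch = len(train_labels) // batch_size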
for i in range(epoch):
    print '--------------'
    print 'Epoch ', i
    ptr = 0
    for j in range(no_train_batch): 
        print 'Batch ', i * no_train_batch + j
        train_batch, train_label = train_data[ptr:ptr+batch_size,:],train_labels[ptr:ptr+batch_size,:]  
        feed_dict_train={x: train_batch, y_: train_label}
        # We run one training step with sess.run(train_op), feeding the input images and their labels
        sess.run(train_op,feed_dict_train)
        train_accuracy = sess.run(accuracy, feed_dict_train)
        if i == 0 and j == 0:
            # These values are computed after the first training step
            print 'target y_ is',train_label[0]
            print 'prediction y is', sess.run(y, feed_dict_train)[0]
            print 'tf.argmax(y,1) is', sess.run(tf.argmax(y,1), feed_dict_train)[0]
            print 'tf.argmax(y_,1) is', sess.run(tf.argmax(y_,1), feed_dict_train)[0]
            print 'correct_prediction is:', sess.run(correct_prediction,feed_dict_train)[0]
        print 'train accuracy in this batch is ',train_accuracy 
        ptr+=batch_size
        
    # After each epoch, we measure the accuracy on the development (validation) set
    print 'development accuracy is', sess.run(accuracy, feed_dict={x: validation_data, y_: validation_labels})
--------------
Epoch  0
Batch  0
target y_ is [ 0.  0.  0.  0.  0.  0.  0.  1.  0.  0.]
prediction y is [ 0.12373896  0.04600484  0.11918675  0.09749474  0.09378932  0.05513489
  0.11043836  0.13019302  0.10341849  0.12060074]
tf.argmax(y,1) is 7
tf.argmax(y_,1) is 7
correct_prediction is: True
train accuracy in this batch is  0.6252
Batch  1
train accuracy in this batch is  0.7814
Batch  2
train accuracy in this batch is  0.7565
Batch  3
train accuracy in this batch is  0.8021
Batch  4
train accuracy in this batch is  0.795
development accuracy is 0.8032
--------------
Epoch  1
Batch  5
train accuracy in this batch is  0.7943
Batch  6
train accuracy in this batch is  0.8222
Batch  7
train accuracy in this batch is  0.8186
Batch  8
train accuracy in this batch is  0.8297
Batch  9
train accuracy in this batch is  0.835
development accuracy is 0.8474
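
To see what the softmax, cross-entropy, and accuracy operations compute, here is a small plain-Python sketch that works on one sample at a time. The function names and the 3-class batch are hypothetical, and the sketch leaves out the averaging over the batch that tf.reduce_mean performs. It also shows one consequence of initializing W and b to zeros: before any training step every logit is 0, so the prediction is uniform (0.1 per class) and the loss for any label is ln(10), approximately 2.3026.

```python
# Plain-Python sketch of softmax, cross-entropy and accuracy
# (hypothetical helper functions, one sample at a time).
import math


def softmax(logits):
    """Turn a list of scores into probabilities that sum to 1."""
    peak = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, predicted):
    """-sum(target * log(predicted)); assumes every predicted value is > 0."""
    return -sum(t * math.log(p) for t, p in zip(target, predicted))


def argmax(values):
    """Index of the largest entry (the first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(predictions, targets):
    """Fraction of samples whose predicted class matches the target class."""
    hits = [argmax(p) == argmax(t) for p, t in zip(predictions, targets)]
    return sum(hits) / float(len(hits))


# With W and b all zeros, every logit is 0, so the prediction is uniform.
uniform = softmax([0.0] * 10)
target = [0.0] * 10
target[7] = 1.0  # one-hot label for the digit 7
loss = cross_entropy(target, uniform)
print(uniform[0])      # 0.1
print(round(loss, 4))  # 2.3026

# A hypothetical 3-class batch: the first sample is classified correctly,
# the second is not, so the accuracy is 0.5.
predictions = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]
targets = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
print(accuracy(predictions, targets))  # 0.5
```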