Logistic regression to recognize handwritten digits

Overview

This is a quick example of how a single layer of neurons classifies handwritten digits. You can download or run this code from here.

Download data

The first thing you want to do is download your data. MNIST comes pre-loaded with TensorFlow; you can download it using the following command:

    from tensorflow.examples.tutorials.mnist import input_data
    
    mnist = input_data.read_data_sets("MNIST_data/", reshape=False)
    X_train, y_train           = mnist.train.images, mnist.train.labels
    X_validation, y_validation = mnist.validation.images, mnist.validation.labels
    X_test, y_test             = mnist.test.images, mnist.test.labels
    

Visualize data

Knowing the size of your data, what it looks like, and the image dimensions helps you decide which ML algorithms to use and what kind of preprocessing is needed.

    import random
    import matplotlib.pyplot as plt
    import numpy as np
    %matplotlib inline
    
    index = random.randint(0, len(X_train) - 1)  # randint includes both endpoints
    image = X_train[index].squeeze()
    
    plt.figure(figsize=(1,1))
    plt.imshow(image, cmap="gray")
    print("label for this image: {}".format(y_train[index]))
    
    img_shape = (X_train[0].shape[0], X_train[0].shape[1])
    img_size_flat = (img_shape[0] * img_shape[1])
    print("image size {}".format(img_size_flat))
    
    num_classes = len(np.unique(y_train))
    print(num_classes)
    

Results

TensorFlow basics:

A placeholder is a graph input whose value we supply (through `feed_dict`) when we run our TensorFlow session. It reserves a slot for data that will be provided later. In this case, `x` stands for our images and `y` for our labels.

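Logits are the raw scores the model produces: a matrix with one row per input image and one column per class, where a higher score means the image is more likely to belong to that class. For these scores to look like probabilities, we normalize each row with softmax so that the values lie between zero and one and sum to one.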
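
To make the normalization step concrete, here is a minimal NumPy sketch of a row-wise softmax. It is not the TensorFlow implementation; the function name `softmax` and the example logits are made up for illustration.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax: turn raw scores into probabilities."""
    # Subtracting the row maximum leaves the result unchanged but avoids overflow in exp.
    shifted = logits - np.max(logits, axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=1, keepdims=True)

# Hypothetical logits for two images and three classes (illustrative values only).
example_logits = np.array([[2.0, 1.0, 0.1],
                           [0.0, 0.0, 0.0]])
probs = softmax(example_logits)
print(probs)
print(probs.sum(axis=1))  # each row sums to 1
```

The second row shows what happens when all logits are equal: every class gets the same probability.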

Architecture

    import tensorflow as tf
    from tensorflow.contrib.layers import flatten
    tf.reset_default_graph()
    
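    # Placeholders: inputs that are fed in when the session runs
    x = tf.placeholder(tf.float32, (None, img_shape[0], img_shape[1], 1))  # batch of images
    y = tf.placeholder(tf.int32, (None,))                                  # batch of integer labels
    
    # Trainable parameters: one weight per (pixel, class) pair and one bias per class
    weights = tf.Variable(tf.zeros([img_size_flat, num_classes]))
    biases = tf.Variable(tf.zeros([num_classes]))
    
    def model(x):
        # Flatten each image to a vector, then apply the linear layer
        fc = flatten(x)
        logits = tf.matmul(fc, weights) + biases
        return logits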
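
The shapes in this model are easy to lose track of, so here is a small NumPy sketch of the same forward pass. It assumes 28x28 MNIST images and 10 classes; `model_np` and the random batch are hypothetical stand-ins, not part of the TensorFlow graph.

```python
import numpy as np

img_shape = (28, 28)       # MNIST image height and width
img_size_flat = 28 * 28    # 784 pixels per image
num_classes = 10

# A hypothetical batch of 5 random "images" shaped like the placeholder x.
rng = np.random.default_rng(0)
batch = rng.random((5, img_shape[0], img_shape[1], 1)).astype(np.float32)

weights = np.zeros((img_size_flat, num_classes), dtype=np.float32)
biases = np.zeros(num_classes, dtype=np.float32)

def model_np(x):
    fc = x.reshape(len(x), -1)      # flatten: (5, 28, 28, 1) -> (5, 784)
    return fc @ weights + biases    # logits: (5, 784) @ (784, 10) -> (5, 10)

logits = model_np(batch)
print(logits.shape)  # (5, 10)
```

With all-zero weights and biases every logit starts at zero, so before training the model treats all ten classes as equally likely.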
    

Training Graph

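    EPOCHS = 20        # defined for reference; the training loop further down runs 15 epochs
    BATCH_SIZE = 128   # number of examples per gradient step
    rate = 0.001       # learning rate for the Adam optimizer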
    
    one_hot_y = tf.one_hot(y, num_classes)
    
    # Get logits
    logits = model(x)
    
    y_pred = tf.nn.softmax(logits)
    y_pred_cls = tf.argmax(y_pred, axis=1)
    
    # Softmax cross-entropy between the logits and the one-hot labels (one value per example)
    cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels=one_hot_y, logits=logits)
    # Loss: mean cross-entropy over the batch
    loss_operation = tf.reduce_mean(cross_entropy)
    # Adam takes one gradient step on the loss each time training_operation runs
    optimizer = tf.train.AdamOptimizer(learning_rate=rate)
    training_operation = optimizer.minimize(loss_operation)
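
As a rough illustration of what the loss in this graph computes, the sketch below reproduces one-hot encoding and mean softmax cross-entropy in NumPy. The helper name `cross_entropy_loss` and the labels are hypothetical, and this is not how TensorFlow implements the op internally.

```python
import numpy as np

def cross_entropy_loss(logits, labels, num_classes):
    """Mean softmax cross-entropy for integer labels."""
    one_hot = np.eye(num_classes)[labels]                 # (N,) -> (N, num_classes)
    shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    per_example = -(one_hot * log_probs).sum(axis=1)      # one loss value per example
    return per_example.mean()

labels = np.array([3, 7, 1])      # hypothetical integer labels
zero_logits = np.zeros((3, 10))   # what a zero-initialized model outputs
loss = cross_entropy_loss(zero_logits, labels, 10)
print(loss, np.log(10))
```

For an untrained model with zero logits the loss equals ln(10), roughly 2.30, which is a handy sanity check for the first training step on a 10-class problem.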
    

Validation Graph

    correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(one_hot_y, 1))
    accuracy_operation = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    
    def evaluate(X_data, y_data, sess):
        """Return the accuracy over X_data, computed batch by batch."""
        num_examples = len(X_data)
        total_accuracy = 0
        for offset in range(0, num_examples, BATCH_SIZE):
            batch_x, batch_y = X_data[offset:offset+BATCH_SIZE], y_data[offset:offset+BATCH_SIZE]
            accuracy = sess.run(accuracy_operation, feed_dict={x: batch_x, y: batch_y})
            # Weight by batch size so a smaller final batch is counted correctly
            total_accuracy += (accuracy * len(batch_x))
        return total_accuracy / num_examples
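
The multiplication by `len(batch_x)` in `evaluate` matters because the last batch is usually smaller than the others. The NumPy sketch below shows the same weighting on made-up data; `weighted_accuracy` and the `correct` array are hypothetical.

```python
import numpy as np

def weighted_accuracy(correct, batch_size):
    """Average per-batch accuracies, weighting each batch by its size."""
    total = 0.0
    for offset in range(0, len(correct), batch_size):
        batch = correct[offset:offset + batch_size]
        total += batch.mean() * len(batch)  # same weighting as evaluate()
    return total / len(correct)

# Hypothetical results for 10 examples: True where the prediction was right.
correct = np.array([1, 1, 1, 1, 0, 0, 0, 0, 1, 1], dtype=bool)

print(weighted_accuracy(correct, batch_size=4))  # batches of 4, 4 and 2
print(correct.mean())                            # plain accuracy over all examples
```

Both values agree. A plain average of the three per-batch accuracies (1.0, 0.0 and 1.0) would not, because it would give the two-example batch as much weight as the four-example ones.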
    

Start tensorflow training session

    import time
    
    sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
    sess.run(tf.global_variables_initializer())
    
    def train():
        """Run one epoch over the training set, then report validation accuracy."""
        beginTime = time.time()
        for offset in range(0, len(X_train), BATCH_SIZE):
            end = offset + BATCH_SIZE
            batch_x, batch_y = X_train[offset:end], y_train[offset:end]
            sess.run(training_operation, feed_dict={x: batch_x, y: batch_y})
    
        validation_accuracy = evaluate(X_validation, y_validation, sess)
        endTime = time.time()
        print("Total time {:5.2f}s accuracy:{}".format(endTime - beginTime, validation_accuracy))
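
The slicing pattern used in `train()` can be tried out in isolation. This sketch uses small made-up arrays instead of MNIST, and `iterate_batches` is a hypothetical helper name.

```python
import numpy as np

def iterate_batches(X, y, batch_size):
    """Yield consecutive (batch_x, batch_y) slices, as the loop in train() does."""
    for offset in range(0, len(X), batch_size):
        end = offset + batch_size
        yield X[offset:end], y[offset:end]  # slicing past the end just returns fewer rows

# Hypothetical stand-in data: 300 "images" of 4 pixels each.
X = np.arange(300 * 4).reshape(300, 4)
y = np.arange(300) % 10

sizes = [len(bx) for bx, _ in iterate_batches(X, y, batch_size=128)]
print(sizes)  # [128, 128, 44]
```

Note that this loop visits the examples in the same order on every pass. Shuffling the data before each epoch is a common variation, but it is not something the code in this post does.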
    

Run training

    for i in range(15):   # 15 epochs
        train()
    plot_weights()        # helper that is not defined in this post
    
Output:

    Total time  0.46s accuracy:0.9104
    Total time  0.45s accuracy:0.9158
    Total time  0.44s accuracy:0.9184
    Total time  0.45s accuracy:0.9204
    Total time  0.50s accuracy:0.9226
    Total time  0.51s accuracy:0.924
    Total time  0.51s accuracy:0.9248
    Total time  0.50s accuracy:0.9264
    Total time  0.51s accuracy:0.926
    Total time  0.54s accuracy:0.926
    Total time  0.49s accuracy:0.9264
    Total time  0.50s accuracy:0.9274
    Total time  0.49s accuracy:0.9278
    Total time  0.49s accuracy:0.929
    Total time  0.51s accuracy:0.9294
    -1.6156687
    1.2737342
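
The `plot_weights()` helper called above is not shown in this post. The following is only a sketch of what such a helper might look like, assuming it draws one image per class from the weight matrix and prints the smallest and largest weight. The last two numbers in the output may be those values, but that is a guess. The name `plot_weight_images` and its parameters are hypothetical.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_weight_images(w, img_shape=(28, 28)):
    """Plot one weight image per class; w has shape (pixels, num_classes)."""
    w_min, w_max = w.min(), w.max()
    print(w_min)
    print(w_max)

    # A 3x4 grid has room for the 10 classes; the two spare cells stay empty.
    fig, axes = plt.subplots(3, 4, figsize=(8, 6))
    for i, ax in enumerate(axes.flat):
        if i < w.shape[1]:
            # Column i holds the weights for class i; view it as an image.
            image = w[:, i].reshape(img_shape)
            ax.imshow(image, vmin=w_min, vmax=w_max, cmap="seismic")
            ax.set_xlabel("Weights: {}".format(i))
        ax.set_xticks([])
        ax.set_yticks([])
    return w_min, w_max
```

With the session from this post, it could be called as `plot_weight_images(sess.run(weights))`. Using a shared color scale (`vmin`, `vmax`) keeps the ten images comparable with each other.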

Conclusion

Without using any sophisticated architecture, a single-layer network reached about 93% validation accuracy on the MNIST dataset.

Results


Reference

Download or run this code from here

Hvass Laboratories tutorial

Manuel Cuevas

Hello, I'm Manuel Cuevas, a software engineer with a background in machine learning and artificial intelligence.