Build a TensorFlow Traffic Sign Classifier at 95% Accuracy

with machinehorizon


Overview of TensorFlow Classifier Approach

What are we Building?

From the ground up, we are going to create a TensorFlow convolutional neural network classifier that can recognize 43 different German traffic signs at approximately 95% accuracy. The model will also be able to make predictions on new images pulled from the web. It is small enough to train on a CPU, but training on a GPU will make your life easier.

Dependencies

  • Python 3.5
  • TensorFlow >= 0.12 (the code uses tf.global_variables_initializer)
  • OpenCV 3


Exploratory Analysis of the German Traffic Signs Dataset

The first image is the raw 32x32 photo from the dataset; the second has been converted to grayscale and then histogram equalized in OpenCV. This is the basic preprocessing pattern we will follow.

Preprocessing Images with OpenCV

When you start with this dataset, you should have the training and testing data loaded into NumPy arrays with a shape of (None, 32, 32, 3), where None represents the number of photos in the dataset. In this example, the training data is held in X_train and y_train, and the testing data in X_test and y_test.
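If you are starting from scratch, a minimal loading sketch is shown below. The file names and the 'features'/'labels' dictionary keys are assumptions based on common pickled distributions of this dataset; adjust them to match your copy.

import pickle

# Hypothetical file names -- change these to wherever your copy of the
# dataset lives.
with open('train.p', 'rb') as f:
    train = pickle.load(f)
with open('test.p', 'rb') as f:
    test = pickle.load(f)

X_train, y_train = train['features'], train['labels']
X_test, y_test = test['features'], test['labels']

print(X_train.shape)  # e.g. (39209, 32, 32, 3)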

Converting images to grayscale reduces the total size of the dataset by 66% by going from 3 color channels to just 1. This improves the speed of the algorithm without sacrificing much information, since the main features of a traffic sign are present in its shapes and markings. OpenCV provides helper functions to make these transformations with just a few lines of code. Also note that we are expanding the dimensions of each image from (32, 32) to (32, 32, 1), which is required for 2D convolutions in TensorFlow.

import cv2
import numpy as np

def preprocess(data):
    """Convert to grayscale, histogram equalize, and expand dims"""
    imgs = np.ndarray((data.shape[0], 32, 32, 1), dtype=np.uint8)
    for i, img in enumerate(data):
        img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # 3 color channels -> 1
        img = cv2.equalizeHist(img)                  # spread out pixel intensities
        img = np.expand_dims(img, axis=2)            # (32, 32) -> (32, 32, 1)
        imgs[i] = img
    return imgs

X_train = preprocess(X_train)
X_test = preprocess(X_test)

def center_normalize(data, mean, std):
    """Center normalize images"""
    data = data.astype('float32')
    data -= mean  # center around zero
    data /= std   # scale to unit variance
    return data

mean = np.mean(X_train)
std = np.std(X_train)

X_train = center_normalize(X_train, mean, std)
X_test = center_normalize(X_test, mean, std)

One-Hot-Encoding the Labels

In this step, we are going to one-hot-encode the labels to work with the softmax activation in TensorFlow. There are many ways to do this, but I am using scikit-learn's LabelBinarizer to get the job done.

from sklearn.preprocessing import LabelBinarizer
ohe = LabelBinarizer().fit(y_train)
y_train_ohe = ohe.transform(y_train)
y_test_ohe = ohe.transform(y_test)
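The fitted binarizer can also invert the encoding later, which is handy for turning one-hot network outputs back into integer class ids:

# inverse_transform maps one-hot rows back to the original labels
print(ohe.inverse_transform(y_train_ohe))  # same values as y_train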

Helper Methods for TensorFlow Convolutional Neural Networks

TensorFlow neural networks can become verbose when dealing with many layers. In this section, I provide a few helper methods to keep the code more concise and readable. For more great examples of helpers, check out this TensorFlow examples repo.

import tensorflow as tf

def conv2d(x, W, b, strides=1):
    # Conv2D wrapper, with bias and relu activation
    x = tf.nn.conv2d(x, W, strides=[1, strides, strides, 1], padding='SAME')
    x = tf.nn.bias_add(x, b)
    return tf.nn.relu(x)

def maxpool2d(x, k=2):
    # Max pooling wrapper that halves the spatial dimensions by default
    return tf.nn.max_pool(
        x,
        ksize=[1, k, k, 1],
        strides=[1, k, k, 1],
        padding='SAME')

Building and Tuning the Classifier

Now it's time to build the actual model. This model is similar to the VGG-style convnets that have been very successful at image classification tasks. 

Model Structure Layer-by-Layer

The list below shows each layer and the shape of the tensor it outputs. In total, we have four convolutional layers and two fully connected layers, followed by an output layer of 43 logits (one for each of the 43 different classes of traffic signs).

  • Input: (32, 32, 1)
  • Conv: (32, 32, 64)
  • Pool: (16, 16, 64)
  • Conv: (16, 16, 128)
  • Pool: (8, 8, 128)
  • Conv: (8, 8, 256)
  • Conv: (8, 8, 256)
  • Pool: (4, 4, 256)
  • Flatten: (4096)
  • FullyConnected: (400)
  • Dropout 0.5
  • FullyConnected: (200)
  • Dropout 0.5
  • Output: (43)

Coding Up the Model

Below is the code for the actual model. Note that the weights and biases have not been defined yet; we will define them in the next section.

def conv_net(x, weights, biases, dropout):

    # Block 1: conv -> pool, (32, 32, 1) -> (16, 16, 64)
    conv1 = conv2d(x, weights['layer_1'], biases['layer_1'])
    conv1 = maxpool2d(conv1)

    # Block 2: conv -> pool, (16, 16, 64) -> (8, 8, 128)
    conv2 = conv2d(conv1, weights['layer_2'], biases['layer_2'])
    conv2 = maxpool2d(conv2)

    # Block 3: conv -> conv -> pool, (8, 8, 128) -> (4, 4, 256)
    conv3 = conv2d(conv2, weights['layer_3'], biases['layer_3'])
    conv4 = conv2d(conv3, weights['layer_4'], biases['layer_4'])
    conv4 = maxpool2d(conv4)

    # Flatten to feed the fully connected layers
    fc1 = tf.reshape(conv4, [-1, weights['dense_1'].get_shape().as_list()[0]])

    fc1 = tf.add(tf.matmul(fc1, weights['dense_1']), biases['dense_1'])
    fc1 = tf.nn.relu(fc1)
    fc1 = tf.nn.dropout(fc1, dropout)

    fc2 = tf.add(tf.matmul(fc1, weights['dense_2']), biases['dense_2'])
    fc2 = tf.nn.relu(fc2)
    fc2 = tf.nn.dropout(fc2, dropout)

    # Output layer: raw logits, one per class
    out = tf.add(tf.matmul(fc2, weights['out']), biases['out'])

    return out

Defining Hyperparameters, Weights, Biases for a TensorFlow Session

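This part of the lesson is locked in the original, so what follows is a minimal sketch of definitions that are consistent with the layer names and shapes used above, not the author's exact code. The kernel sizes (3x3) and the hyperparameter values (learning rate, batch size, epochs, keep probability) are assumptions.

# A minimal sketch, not the original locked code. 3x3 kernels and all
# hyperparameter values below are assumptions chosen to match the
# layer-by-layer list above.
learning_rate = 0.001   # assumed value
training_epochs = 30    # assumed value
batch_size = 128        # assumed value
dropout_prob = 0.5      # keep probability used during training
n_classes = 43
n_train = X_train.shape[0]

weights = {
    'layer_1': tf.Variable(tf.truncated_normal([3, 3, 1, 64], stddev=0.05)),
    'layer_2': tf.Variable(tf.truncated_normal([3, 3, 64, 128], stddev=0.05)),
    'layer_3': tf.Variable(tf.truncated_normal([3, 3, 128, 256], stddev=0.05)),
    'layer_4': tf.Variable(tf.truncated_normal([3, 3, 256, 256], stddev=0.05)),
    'dense_1': tf.Variable(tf.truncated_normal([4 * 4 * 256, 400], stddev=0.05)),
    'dense_2': tf.Variable(tf.truncated_normal([400, 200], stddev=0.05)),
    'out': tf.Variable(tf.truncated_normal([200, n_classes], stddev=0.05))
}

biases = {
    'layer_1': tf.Variable(tf.zeros([64])),
    'layer_2': tf.Variable(tf.zeros([128])),
    'layer_3': tf.Variable(tf.zeros([256])),
    'layer_4': tf.Variable(tf.zeros([256])),
    'dense_1': tf.Variable(tf.zeros([400])),
    'dense_2': tf.Variable(tf.zeros([200])),
    'out': tf.Variable(tf.zeros([n_classes]))
}

# Placeholders for images, one-hot labels, and the dropout keep probability
x = tf.placeholder(tf.float32, [None, 32, 32, 1])
y = tf.placeholder(tf.float32, [None, n_classes])
keep_prob = tf.placeholder(tf.float32)

logits = conv_net(x, weights, biases, keep_prob)
cost = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

correct = tf.equal(tf.argmax(logits, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))

saver = tf.train.Saver()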

Running the Model with Shuffled Batches

Now it's time to run a TensorFlow session to train the model and generate predictions. Here's what's going on step-by-step in the code below. 

  1. Initializing global variables (just resetting everything).
  2. Starting a TensorFlow session. Training and prediction will always happen within the session scope.
  3. Iterating over the number of epochs. The data is reshuffled at each epoch with NumPy.
  4. Iterating over each batch and feeding the data to the optimizer. This is where the training happens: for each batch of images, the neural network makes predictions on the data, then adjusts the weights based on the cost function.
  5. Checking the training progress by predicting on the test data. You can get predictions in different formats; in this case we return the cross-entropy loss and accuracy.
  6. Saving the model for use in later TensorFlow sessions once all epochs are finished.

init = tf.global_variables_initializer()

with tf.Session() as sess:
    
    sess.run(init)
    for epoch in range(training_epochs):
        total_batch = int(n_train/batch_size)
        
        # shuffle data index for each epoch
        rand_idx = np.random.permutation(n_train)

        for i in range(total_batch):
            offset = i*batch_size
            off_end = offset+batch_size
            batch_idx = rand_idx[offset:off_end]
            
            batch_x = X_train[batch_idx]
            batch_y = y_train_ohe[batch_idx]

            sess.run(optimizer, feed_dict={x: batch_x, y: batch_y, keep_prob: dropout_prob})
        
        cost_ts, acc_ts = sess.run([cost, accuracy], feed_dict={x: X_test, y: y_test_ohe, keep_prob: 1.})

        print("Cost:  {:.5f}  | Accuracy:  {:.5f}".format(cost_ts, acc_ts))
    
    save_path = saver.save(sess, "models/model.ckpt")
    print("Training Complete! Model saved in file: %s" % save_path)

Validating the Model with Photos from the Web

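This part of the lesson is locked in the original. One plausible approach, sketched below under assumptions: load each web image with OpenCV, resize it to 32x32, push it through the same preprocessing pipeline (reusing the training mean and std), then restore the saved model and predict. The file names are hypothetical placeholders.

# A sketch of one plausible approach, not the original locked code.
# 'sign1.jpg' etc. are hypothetical file names.
web_files = ['sign1.jpg', 'sign2.jpg', 'sign3.jpg']
web_imgs = np.ndarray((len(web_files), 32, 32, 3), dtype=np.uint8)
for i, path in enumerate(web_files):
    img = cv2.imread(path)                   # cv2 loads images as BGR
    web_imgs[i] = cv2.resize(img, (32, 32))  # match the training resolution

# Apply the exact same preprocessing as the training data
X_web = center_normalize(preprocess(web_imgs), mean, std)

with tf.Session() as sess:
    saver.restore(sess, "models/model.ckpt")
    web_logits = sess.run(logits, feed_dict={x: X_web, keep_prob: 1.})
    print(np.argmax(web_logits, axis=1))  # predicted class id per image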

Processed Traffic Signs from the Web

Here are a few public domain images of traffic signs that will be fed into our trained network. Unless the model is overfitting badly, we should get most of these correct.

Analyzing the Top K Predictions

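This part of the lesson is locked in the original. A minimal sketch of one way to do it with TensorFlow's tf.nn.top_k op, assuming the web images and logits from the previous section:

# A minimal sketch using tf.nn.top_k, not the original locked code
top_k = tf.nn.top_k(logits, k=5)

with tf.Session() as sess:
    saver.restore(sess, "models/model.ckpt")
    values, indices = sess.run(top_k, feed_dict={x: X_web, keep_prob: 1.})
    # values[i] holds the 5 highest logit scores for image i,
    # indices[i] the corresponding class ids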

Visualization of Logits for each Prediction

Fcrlg8tuqcm0liryowz7
The top 5 logit scores for each of the model's predictions. The predicted classes fall on the x-axis, while the logit score is on the y-axis. As you can see, the model is able to generalize well to new photos taken from the web.
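A chart like the one described above can be reproduced with matplotlib. A minimal sketch, assuming the values and indices arrays from the top-k step:

# A minimal plotting sketch (not the original code), assuming 'values'
# and 'indices' come from the tf.nn.top_k run above
import matplotlib.pyplot as plt

fig, axes = plt.subplots(1, len(values), figsize=(15, 3))
for ax, vals, idxs in zip(axes, values, indices):
    ax.bar(range(5), vals, tick_label=idxs)  # class ids on x, logit scores on y
    ax.set_xlabel('class id')
    ax.set_ylabel('logit score')
plt.tight_layout()
plt.show()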