Project 2: Train Your Own Artificial Neural Network

CS 6501 - Large-scale Data-Driven Graphics and Vision

Due: Tues, Oct 20 (11:59 PM)

This project involves training your own neural networks. In the first part, you will implement back-propagation and gradient descent yourself on a fully-connected feedforward neural network. In the second part, you will use Caffe, which you have already installed in the first project, to train a more sophisticated convolutional neural network.

Dataset

For this project, we will classify handwritten digits from the MNIST dataset. Start by downloading and extracting the 4 files from the MNIST website; these contain the images and labels for both the training and testing sets. As a reminder, in machine learning one typically trains on one dataset and tests on an independent dataset. Final performance is usually reported on the test dataset, because repeatedly fitting parameters to the same training dataset can result in overfitting, which manifests as artificially low training error.

The dataset is stored in a custom binary (IDX) format, which you can read into the programming language of your choice with a loader. Loaders are available for Python (slightly modified from the blog post where it is documented, so that it returns floating-point arrays), MATLAB, and C.
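If you prefer to parse the files yourself rather than use one of the loaders above, the IDX format is simple enough to read directly. Below is a minimal Python sketch; it assumes the four extracted files are in the working directory, and the function names are my own, not part of any library:

    import struct
    import numpy as np

    def load_mnist_images(path):
        """Read an MNIST image file (IDX format) into floats in [0, 1]."""
        with open(path, 'rb') as f:
            magic, count, rows, cols = struct.unpack('>IIII', f.read(16))
            assert magic == 2051, 'not an MNIST image file'
            pixels = np.frombuffer(f.read(), dtype=np.uint8)
        # One row of rows*cols pixels per image, scaled to [0, 1].
        return pixels.reshape(count, rows * cols).astype(np.float32) / 255.0

    def load_mnist_labels(path):
        """Read an MNIST label file (IDX format) into an integer vector."""
        with open(path, 'rb') as f:
            magic, count = struct.unpack('>II', f.read(8))
            assert magic == 2049, 'not an MNIST label file'
            return np.frombuffer(f.read(), dtype=np.uint8)

    train_images = load_mnist_images('train-images-idx3-ubyte')
    train_labels = load_mnist_labels('train-labels-idx1-ubyte')
    test_images = load_mnist_images('t10k-images-idx3-ubyte')
    test_labels = load_mnist_labels('t10k-labels-idx1-ubyte')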

Part 1: Building Your Own Neural Network from Scratch

In this part of the project, you will implement your own fully-connected feedforward neural network. The benefit of doing this is that you will understand in detail how neural networks work and are trained. You will implement a neural network with the following structure:

[Figure: A multi-layer feedforward neural network. Image from Wikipedia user Paskari.]
Your task is to implement forward propagation, back-propagation, and gradient-descent training for this network; a minimal sketch of this structure appears below.
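To make the structure concrete, here is a minimal Python sketch of such a network: a single sigmoid hidden layer, a softmax output, and the cross-entropy gradients needed for back-propagation and gradient descent. The layer sizes, activation function, learning rate, and the class name TwoLayerNet are illustrative assumptions, not a required design:

    import numpy as np

    class TwoLayerNet:
        """Sketch of a fully-connected feedforward net:
        784 inputs -> sigmoid hidden layer -> 10-way softmax output."""

        def __init__(self, n_in=784, n_hidden=30, n_out=10, seed=0):
            rng = np.random.default_rng(seed)
            # Small random initial weights, zero biases.
            self.W1 = rng.normal(0.0, 0.1, (n_in, n_hidden))
            self.b1 = np.zeros(n_hidden)
            self.W2 = rng.normal(0.0, 0.1, (n_hidden, n_out))
            self.b2 = np.zeros(n_out)

        def forward(self, X):
            # X: (batch, 784). Cache activations for the backward pass.
            self.a1 = 1.0 / (1.0 + np.exp(-(X @ self.W1 + self.b1)))  # sigmoid
            z2 = self.a1 @ self.W2 + self.b2
            z2 -= z2.max(axis=1, keepdims=True)            # numerical stability
            e = np.exp(z2)
            self.probs = e / e.sum(axis=1, keepdims=True)  # softmax
            return self.probs

        def backward(self, X, y):
            # Gradients of the mean cross-entropy loss; y holds integer labels.
            n = X.shape[0]
            d2 = self.probs.copy()
            d2[np.arange(n), y] -= 1.0        # dL/dz2 for softmax + cross-entropy
            d2 /= n
            dW2, db2 = self.a1.T @ d2, d2.sum(axis=0)
            d1 = (d2 @ self.W2.T) * self.a1 * (1.0 - self.a1)  # through sigmoid
            dW1, db1 = X.T @ d1, d1.sum(axis=0)
            return dW1, db1, dW2, db2

        def train_step(self, X, y, lr=0.5):
            self.forward(X)
            dW1, db1, dW2, db2 = self.backward(X, y)
            # Plain gradient-descent update.
            self.W1 -= lr * dW1; self.b1 -= lr * db1
            self.W2 -= lr * dW2; self.b2 -= lr * db2

    # Hypothetical usage with the loader above; batch size and epoch count
    # are arbitrary choices.
    # net = TwoLayerNet()
    # for epoch in range(10):
    #     order = np.random.permutation(len(train_images))
    #     for i in range(0, len(order), 50):
    #         b = order[i:i+50]
    #         net.train_step(train_images[b], train_labels[b])
    # accuracy = np.mean(net.forward(test_images).argmax(axis=1) == test_labels)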
General suggestions:

Parameters:

What to submit for Part 1:

Part 2: Training More Sophisticated Networks with Caffe

The algorithms for training significantly larger networks, or networks with other weight structures such as those created by convolutions, are no different: one simply uses back-propagation and stochastic gradient descent. However, engineering efficient code for such "deep learning" is a practical challenge, which is why people use libraries such as Caffe or Torch.

In Part 2 you will be using Caffe to see if you can improve the performance over the small neural network you built in Part 1.

Installation and Setup of Caffe

For the programming environment, you can use the Caffe package you already installed in the first project, with some slight modifications:

Your Tasks

  1. Visualize the test results for the MNIST digit images, and plot the train loss and test accuracy as in Caffe's digit-recognition (LeNet) example; a sketch of such a training loop is given after this list.
  2. Report the overall test accuracy, and experiment with the network configuration to maximize it. Try modifying the activation functions, pooling sizes, convolution kernel sizes, and so forth. You can also try different learning rates, different gradient-descent variants, and other solver parameters to see if they improve test accuracy.
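One way to produce the loss/accuracy plot is through Caffe's Python interface, mirroring the LeNet MNIST notebook that ships with Caffe. The sketch below assumes the stock examples/mnist/lenet_solver.prototxt (adjust the path to your installation, run from the Caffe root) and the blob names 'loss' and 'accuracy' from that example; the iteration counts are illustrative:

    import caffe
    import numpy as np
    import matplotlib.pyplot as plt

    caffe.set_mode_cpu()  # or caffe.set_mode_gpu() if your build supports it

    # Path assumes the bundled MNIST example, run from the Caffe root.
    solver = caffe.SGDSolver('examples/mnist/lenet_solver.prototxt')

    niter, test_interval = 1000, 100
    train_loss = np.zeros(niter)
    test_acc = []

    for it in range(niter):
        solver.step(1)  # one SGD iteration: forward, backward, update
        train_loss[it] = solver.net.blobs['loss'].data
        if it % test_interval == 0:
            # 100 forward passes x test batch size 100 covers the 10k test images.
            acc = np.mean([float(solver.test_nets[0].forward()['accuracy'])
                           for _ in range(100)])
            test_acc.append((it, acc))

    plt.plot(train_loss, label='train loss')
    its, accs = zip(*test_acc)
    plt.plot(its, accs, 'o-', label='test accuracy')
    plt.xlabel('iteration')
    plt.legend()
    plt.savefig('loss_accuracy.png')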

Policies

Feel free to collaborate on solving the problem, but write your code individually.

Submission

Submit your assignment as a zip file named yourname_project2.zip. Include your source code from Part 1, as well as a document named writeup.doc describing the best accuracy you achieved in Parts 1 and 2, the neural network architectures you used to achieve that accuracy, and your answers to the last question of Part 1.

Finally, submit your zip file to UVA Collab.