
AI & Cognitive Science Neural Net Assignment

For this assignment, we will tackle making a neural net class in Python. I will provide some parameters and shell code to limit the scope of your class, and so that its input/output can be used w/ the Project Malmo system we'll use (I will introduce this next class, after we've focused on getting a working neural net!)

We will create a neural network class that can solve the four simple boolean functions we discussed in class (AND, XOR, OR, NAND), and much more if you have the time to train it!
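For reference, the four truth tables can be written as 2D lists in the same shape train_net expects (the variable names here are just illustrative):

# Truth tables for the four boolean functions, written as 2D lists
# of input rows and matching output rows (illustrative names)
bool_inputs  = [[0, 0], [0, 1], [1, 0], [1, 1]]
and_outputs  = [[0], [0], [0], [1]]
or_outputs   = [[0], [1], [1], [1]]
nand_outputs = [[1], [1], [1], [0]]
xor_outputs  = [[0], [1], [1], [0]]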

At minimum, your Neural Net should have a constructor that allows you to set the following:

(Note that I provide a shell for your class including the constructor below)

Neural Net Shell Code


# Some potentially useful modules
# Whether or not you use these (or others) depends on your implementation!
import random
import numpy
import math
import matplotlib.pyplot as plt

class NeuralMMAgent(object):
    '''
    Class for Neural Net Agents
    '''

    def __init__(self, num_in_nodes, num_hid_nodes, num_hid_layers, num_out_nodes, \
                learning_rate = 0.2, max_epoch=10000, max_sse=.01, momentum=0.2, \
                creation_function=None, activation_function=None, random_seed=1):
        '''
        Arguments:
            num_in_nodes -- total # of input nodes for Neural Net
            num_hid_nodes -- total # of hidden nodes for each hidden layer
                in the Neural Net
            num_hid_layers -- total # of hidden layers for Neural Net
            num_out_nodes -- total # of output nodes for Neural Net
            learning_rate -- learning rate to be used when propagating error
            creation_function -- function that will be used to create the
                neural network given the input
            activation_function -- list of two functions:
                1st function will be used by network to determine activation given a weighted summed input
                2nd function will be the derivative of the 1st function
            random_seed -- used to seed object random attribute.
                This ensures that we can reproduce results if wanted
        '''
        assert num_in_nodes > 0 and num_hid_layers > 0 and num_hid_nodes > 0 and \
            num_out_nodes > 0, "Illegal number of input, hidden, or output nodes/layers!"


    def train_net(self, input_list, output_list, max_num_epoch=100000, \
                    max_sse=0.1):
        ''' Trains neural net using incremental learning
            (update once per input-output pair)
            Arguments:
                input_list -- 2D list of inputs
                output_list -- 2D list of outputs matching inputs
        '''
            #Some code...#
            all_err.append(total_err)

            if (total_err < max_sse):
                break
        #Show us how our error has changed
        plt.plot(all_err)
        plt.show()


    def _calculate_deltas(self):
        '''Used to calculate all weight deltas for our neural net
        '''

        #Calculate error gradient for each output node & propagate error
        #   (calculate weight deltas going backward from output_nodes)




    def _adjust_weights_thetas(self):
        '''Used to apply deltas
        '''


    @staticmethod
    def create_neural_structure(num_in, num_hid, num_hid_layers, num_out, rand_obj):
        ''' Creates the structures needed for a simple backprop neural net
        This method creates random weights [-0.5, 0.5]
        Arguments:
            num_in -- total # of input nodes for Neural Net
            num_hid -- total # of hidden nodes for each hidden layer
                in the Neural Net
            num_hid_layers -- total # of hidden layers for Neural Net
            num_out -- total # of output nodes for Neural Net
            rand_obj -- the random object that will be used to select
                random weights
        Outputs:
            Tuple w/ the following items
                1st - 2D list of initial weights
                2nd - 2D list for weight deltas
                3rd - 2D list for activations
                4th - 2D list for errors
                5th - 2D list of thetas for threshold
                6th - 2D list for thetas deltas
        '''

    #-----Begin ACCESSORS-----#
    #-----End ACCESSORS-----#


    @staticmethod
    def sigmoid_af(summed_input):
        '''Sigmoid function'''

    @staticmethod
    def sigmoid_af_deriv(sig_output):
        '''The derivative of the sigmoid function'''

test_agent = NeuralMMAgent(2, 2, 1, 1, random_seed=5, max_epoch=1000000, \
                            learning_rate=0.2, momentum=0)
test_in = [[1,0],[0,0],[1,1],[0,1]]
test_out = [[1],[0],[0],[1]]
test_agent.set_weights([[-.37,.26,.1,-.24],[-.01,-.05]])
test_agent.set_thetas([[0,0],[0,0],[0]])
test_agent.train_net(test_in, test_out, max_sse = test_agent.max_sse, \
                     max_num_epoch = test_agent.max_epoch)


Let's go through some of this code

First things first, our constructor

This constructor expects the user to specify the number of inputs, hidden nodes, hidden layers, and output nodes.
To keep things simple, we assume that all of our hidden layers have the same number of nodes.
Most of the other arguments are, essentially, optional. However, the creation_function and the activation_function are important, and you will have to create/specify those. For your activation function, you must at least implement the sigmoid function (a sketch follows the constructor shell below).

def __init__(self, num_in_nodes, num_hid_nodes, num_hid_layers, num_out_nodes, \
            learning_rate = 0.2, max_epoch=10000, max_sse=.01, momentum=0.2, \
            creation_function=None, activation_function=None, random_seed=1):
    '''
    Arguments:
        num_in_nodes -- total # of input nodes for Neural Net
        num_hid_nodes -- total # of hidden nodes for each hidden layer
            in the Neural Net
        num_hid_layers -- total # of hidden layers for Neural Net
        num_out_nodes -- total # of output nodes for Neural Net
        learning_rate -- learning rate to be used when propagating error
        creation_function -- function that will be used to create the
            neural network given the input
        activation_function -- list of two functions:
            1st function will be used by network to determine activation given a weighted summed input
            2nd function will be the derivative of the 1st function
        random_seed -- used to seed object random attribute.
            This ensures that we can reproduce results if wanted
    '''
    assert num_in_nodes > 0 and num_hid_layers > 0 and num_hid_nodes > 0 and \
        num_out_nodes > 0, "Illegal number of input, hidden, or output nodes/layers!"
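For the activation pair, here is a minimal sketch of the two static methods from the shell. This is the standard logistic function; note that the derivative is written in terms of the sigmoid's output, matching the sig_output parameter name:

@staticmethod
def sigmoid_af(summed_input):
    '''Sigmoid function: maps a weighted sum onto (0, 1)'''
    return 1.0 / (1.0 + math.exp(-summed_input))

@staticmethod
def sigmoid_af_deriv(sig_output):
    '''Derivative of the sigmoid, given the sigmoid's output'''
    return sig_output * (1.0 - sig_output)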

train_net is where the action is

You will use this method to do the heavy lifting, that is, train your neural net and report your error (a sketch of the loop structure follows the shell below).
The network configuration should be saved within the network itself at the end of training.

def train_net(self, input_list, output_list, max_num_epoch=100000, \
                max_sse=0.1):
    ''' Trains neural net using incremental learning
        (update once per input-output pair)
        Arguments:
            input_list -- 2D list/array of inputs
            output_list -- 2D list/array of outputs matching inputs
    '''
        #Some code...#
        all_err.append(total_err)

        if (total_err < max_sse):
            break
    #Show us how our error has changed
    plt.plot(all_err)
    plt.show()
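Concretely, the method body might be shaped like the sketch below. Here _feed_forward is a hypothetical helper (not part of the shell) that propagates one input row through the network and returns the output activations; how you pass the current target to _calculate_deltas is up to your design:

# Sketch of the incremental-learning loop; _feed_forward is a
# hypothetical helper that propagates one input row forward and
# returns the output activations
all_err = []
for epoch in range(max_num_epoch):
    total_err = 0
    for row in range(len(input_list)):
        outputs = self._feed_forward(input_list[row])
        # Accumulate the sum of squared errors across output nodes
        for node, out in enumerate(outputs):
            total_err += (output_list[row][node] - out) ** 2
        # Incremental learning: update once per input-output pair
        self._calculate_deltas()
        self._adjust_weights_thetas()
    all_err.append(total_err)
    if (total_err < max_sse):
        break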

Learning in the network

Your network will be learning using the functions above, together with the two methods below.
As noted in the code comments, you'll want to calculate all of your deltas before updating anything. Thus, we have two methods: one to handle the calculations and one to apply their results (a sketch of the output-layer math follows the shells below).

def _calculate_deltas(self):
    '''Used to calculate all weight deltas for our neural net
    '''

    #Calculate error gradient for each output node & propagate error
    #   (calculate weight deltas going backward from output_nodes)


def _adjust_weights_thetas(self):
    '''Used to apply deltas
    '''
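For reference, with sigmoid units the standard backprop quantities are: output-node error = (target - actual) * f'(output); hidden-node error = f'(activation) * (weighted sum of the errors that node feeds into); and each weight delta = learning_rate * downstream error * upstream activation (plus momentum * the previous delta). Here is a sketch of the output-layer step only; self.errors, self.activations, and target are hypothetical names standing in for whatever your create_neural_structure returns:

# Sketch of the output-layer gradient only; self.errors,
# self.activations, and target are hypothetical structure names
for node, act in enumerate(self.activations[-1]):
    # Error gradient: (target - actual) * derivative of the activation
    self.errors[-1][node] = (target[node] - act) * \
        NeuralMMAgent.sigmoid_af_deriv(act)
# Hidden-layer errors are then computed backward layer by layer, and
# all weight/theta deltas are stored; only _adjust_weights_thetas
# should actually modify the weights and thetas.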

This will initialize our neural network.
For the initial version of your code, you should return the outputs I indicate in the docstring (doing so makes it easier for me to diagnose any issues in your code); a sketch of one possible layout follows the shell below.

@staticmethod
def create_neural_structure(num_in, num_hid, num_hid_layers, num_out, rand_obj):
    ''' Creates the structures needed for a simple backprop neural net
    This method creates random weights [-0.5, 0.5]
    Arguments:
        num_in -- total # of input nodes for Neural Net
        num_hid -- total # of hidden nodes for each hidden layer
            in the Neural Net
        num_hid_layers -- total # of hidden layers for Neural Net
        num_out -- total # of output nodes for Neural Net
        rand_obj -- the random object that will be used to select
            random weights
    Outputs:
        Tuple w/ the following items
            1st - 2D list/array of initial weights
            2nd - 2D list/array for weight deltas
            3rd - 2D list/array for activations
            4th - 2D list/array for errors
            5th - 2D list/array of thetas for threshold
            6th - 2D list/array for thetas deltas
    '''

#-----Begin ACCESSORS-----#
#-----End ACCESSORS-----#
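As one possible shape for those structures, here is a sketch of the weight initialization only, assuming a layer-by-layer layout of flat lists (this assumption matches the shape of the set_weights call in the test code below):

# Sketch of weight initialization for a net with layer sizes
# num_in, num_hid (x num_hid_layers), num_out; weights between
# adjacent layers are kept in one flat list per layer pair
layer_sizes = [num_in] + [num_hid] * num_hid_layers + [num_out]
weights = []
for layer in range(len(layer_sizes) - 1):
    weights.append([rand_obj.uniform(-0.5, 0.5)
                    for _ in range(layer_sizes[layer] * layer_sizes[layer + 1])])
# Matching all-zeros structure for the weight deltas
weight_deltas = [[0] * len(layer_wts) for layer_wts in weights]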

Now let's initialize & run our network

Finally, note that while I've gone through the normal constructor process to initialize my neural net in the code above, I also artificially set the weights and the thetas (the bias/threshold values) to what we used as our example in class, via the set_weights and set_thetas accessors (sketched after the test code below). This should help you ensure that your network at least runs correctly on one pass.
It will be up to you to ensure that your neural network works as it should!

test_agent = NeuralMMAgent(2, 2, 1, 1, random_seed=5, max_epoch=1000000, \
                            learning_rate=0.2, momentum=0)
test_in = [[1,0],[0,0],[1,1],[0,1]]
test_out = [[1],[0],[0],[1]]
test_agent.set_weights([[-.37,.26,.1,-.24],[-.01,-.05]])
test_agent.set_thetas([[0,0],[0,0],[0]])
test_agent.train_net(test_in, test_out, max_sse = test_agent.max_sse, \
                     max_num_epoch = test_agent.max_epoch)
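The set_weights and set_thetas calls above are accessors you'll write in the ACCESSORS section of the shell. A minimal sketch, assuming the weights and thetas are stored directly on the agent as layer-by-layer lists (the attribute names are assumptions):

# Minimal accessor sketches; the attribute names self.weights and
# self.thetas are assumptions about how you store your structures
def set_weights(self, weights):
    '''Replace the network's weights with the given 2D list'''
    self.weights = weights

def set_thetas(self, thetas):
    '''Replace the network's thetas (thresholds) with the given 2D list'''
    self.thetas = thetas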

Grading

Grade item                                                        Points
Neural Net implements activation functions                        0.5 pts
Neural Net implements back propagation                            0.5 pts
Reasonable design principles used                                 0.5 pts
Activation correct on 1st iteration of 1st epoch                  0.5 pts
Neural Net weights after 1st iteration correct                    0.5 pts
Neural Net outputs correct output for all boolean functions       0.5 pts
Neural Net propagates activation through network                  0.5 pts
Neural Net can use different numbers of inputs and outputs        0.5 pts
Neural Net can use different numbers of hidden nodes and layers   1 pt

Notes