In recent years, the field of Machine Learning has progressed rapidly. Much of the progress in the last 4 to 5 years has been driven by the falling cost of hardware, which has enabled scientists and researchers to train really powerful Deep Learning (Neural Network) models. Neural Networks themselves are not new: the core training techniques were already being developed in the 1980s. What changed is that the hardware needed to train them at scale became affordable. In this blog post, we will talk about how to train a Neural Network in Python.
Learn How To Program A Neural Network in Python From Scratch
To make this concrete, let us start with a problem statement: given a credit card transaction, classify it as genuine or fraudulent.
A fraudulent transaction is one made without the consent of the owner of the credit card. For instance, you may have lost your wallet at a shop, and someone picked it up and made a purchase of $10,000. You would not normally make such a large purchase. Wouldn’t it be great if your bank could identify that this transaction wasn’t made by you and block it automatically?
Today, with the advancements in Machine Learning, such things are indeed possible and in fact, they have been implemented by some of the top banks globally.
To be a bit more specific, let us look at what makes up a transaction:
- Amount of transaction – if the amount is out of line with the transactions you usually make (for instance, $10,000), it may be fraudulent.
- Time of the transaction – a transaction at 2:00 AM may be more likely to be fraudulent.
- Category of purchase (food, electronics, bills, etc) – for instance, you may never use your credit card to purchase appliances.
- City where the transaction happened (more specifically, ZIP code) – what if you live in the US while the transaction happened in Spain?
Clearly, there is a pattern in the data: the “features” of a transaction can indicate whether it is fraudulent. What if we could code a Neural Network to capture this pattern?
Before we go ahead and code the Neural Network, let us formalize the inputs and the outputs. The inputs will include various features like the following:
- Transaction amount (number)
- Transaction time – can be encoded as the digits of the time in 24-hour HH:MM format. For example, 2:41 AM would be denoted by a vector of size 4: [0, 2, 4, 1]
- Category of purchase – we can use one-hot encoding for this
- ZIP code of transaction – we can use the digits of the postal code directly, left-padded with zeros to a fixed length of 6. For instance, the US ZIP code 94306 becomes [0, 9, 4, 3, 0, 6]
- …. (other features that you can think of)
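The encodings listed above can be sketched in a few lines of plain Python. This is only an illustration: the CATEGORIES list and all function names are hypothetical, and a real system would use its own category taxonomy and would likely scale the amount as well.

```python
# Hypothetical category list; a real system would use its own taxonomy.
CATEGORIES = ["food", "electronics", "bills", "appliances"]


def encode_time(hour, minute):
    """Encode a 24-hour time as four digits, e.g. 2:41 AM -> [0, 2, 4, 1]."""
    text = f"{hour:02d}{minute:02d}"
    return [int(ch) for ch in text]


def encode_category(category):
    """One-hot encode a purchase category against CATEGORIES."""
    return [1 if category == name else 0 for name in CATEGORIES]


def encode_postal_code(code, width=6):
    """Left-pad a postal code with zeros and split it into digits."""
    return [int(ch) for ch in code.zfill(width)]


def encode_transaction(amount, hour, minute, category, postal_code):
    """Concatenate all encoded features into one flat feature vector."""
    return (
        [float(amount)]
        + encode_time(hour, minute)
        + encode_category(category)
        + encode_postal_code(postal_code)
    )


features = encode_transaction(10000, 2, 41, "electronics", "94306")
print(features)       # flat list of numbers
print(len(features))  # 1 + 4 + 4 + 6 = 15
```

With only four categories the vector has 15 entries; a longer category list or additional features would grow it further.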
Let us assume that after all the feature engineering, we have an input feature vector of size 25. We now need to formalize the output – fraud or not. This is simple – we can label fraudulent transactions as 1 and non-fraudulent ones as 0.
So, we will have the following dimensions of the data:
- Input – it will be a feature vector of size 25
- Output – it will be 0 or 1
We can design a simple Neural Network architecture comprising 2 hidden layers:
- Hidden layer 1: 16 nodes
- Hidden layer 2: 4 nodes
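To get a feel for how small this network is, we can count its trainable parameters. The sketch below assumes fully connected layers and a single output unit for the 0/1 label; the function name count_parameters is our own, not part of any library.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the width of every layer, input first, output last.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # weight matrix between the two layers
        total += n_out         # one bias per unit in the receiving layer
    return total


# 25 input features -> 16 hidden units -> 4 hidden units -> 1 output unit
layer_sizes = [25, 16, 4, 1]
print(count_parameters(layer_sizes))  # 489
```

The total breaks down as 416 parameters into the first hidden layer, 68 into the second, and 5 into the output.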
Coding such a Neural Network in Python is very simple. We will use the scikit-learn (sklearn) library for this. Check the code snippet below:
    # 1.) import the required libraries
    from sklearn.neural_network import MLPClassifier

    # 2.) read the training data
    X = read_features_from_data()  # read the feature vectors from data
    Y = read_labels_from_data()    # read the labels from data

    # 3.) read the testing data
    X_test = read_test_data_features()
    Y_test = read_test_data_labels()

    # 4.) initialize the neural network parameters
    NUM_LAYER_1 = 16  # nodes in first hidden layer
    NUM_LAYER_2 = 4   # nodes in second hidden layer
    ALPHA = 1e-5      # L2 regularization strength (sklearn's alpha, not a learning rate)

    # A Neural Network classifier is called a Multi-layer Perceptron
    # in sklearn.
    # 5.) design the classifier
    clf = MLPClassifier(solver='lbfgs', alpha=ALPHA,
                        hidden_layer_sizes=(NUM_LAYER_1, NUM_LAYER_2),
                        random_state=1)

    # 6.) fit the classifier
    clf.fit(X, Y)

    # 7.) run predictions on the test data
    Y_predicted = clf.predict(X_test)
Let us now focus on each part of the code to understand it better.
1. In this part of the code, we import the classifier we need. We use scikit-learn (sklearn), one of the most popular Machine Learning libraries in Python.

    # 1.) import the required libraries
    from sklearn.neural_network import MLPClassifier
2. Here, we read the training data. We assume that the functions read_features_from_data() and read_labels_from_data() have already been written and return the preprocessed training data.

    # 2.) read the training data
    X = read_features_from_data()
    Y = read_labels_from_data()
3. Here, we read the test data. Just like for the training data, we assume that the functions read_test_data_features() and read_test_data_labels() have already been written and return the preprocessed testing data.

    # 3.) read the testing data
    X_test = read_test_data_features()
    Y_test = read_test_data_labels()
4. Every Neural Network has certain hyper-parameters: for instance, the number of hidden layers, the number of units in each hidden layer, and the regularization strength. We define these values in this part of the code. Note that the alpha argument of sklearn's MLPClassifier is the L2 regularization strength, not a learning rate; the learning rate is a separate setting (learning_rate_init) that the ‘lbfgs’ solver does not use.

    # 4.) initialize the neural network parameters
    NUM_LAYER_1 = 16  # nodes in first hidden layer
    NUM_LAYER_2 = 4   # nodes in second hidden layer
    ALPHA = 1e-5      # L2 regularization strength (sklearn's alpha)

5. This is the crux of the code. This is where we actually create the Neural Network classifier. As can be seen from the code snippet, we use the parameter values from step 4 to initialize the classifier. ‘lbfgs’ selects L-BFGS, a limited-memory quasi-Newton optimizer available in sklearn.

    # 5.) design the classifier
    clf = MLPClassifier(solver='lbfgs', alpha=ALPHA,
                        hidden_layer_sizes=(NUM_LAYER_1, NUM_LAYER_2),
                        random_state=1)
6. Next, we train the Neural Network using the training data that we’ve read above.

    # 6.) fit the classifier
    clf.fit(X, Y)
7. Now, we are ready to run the predictions on the testing data.

    # 7.) run predictions on the test data
    Y_predicted = clf.predict(X_test)
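The walkthrough relies on data-reading functions that are assumed to exist, and it never compares the predictions against Y_test. The sketch below is one way to run all the steps end to end. It is not the original pipeline: it swaps in synthetic data from sklearn's make_classification (25 features, labels 0 or 1) as a stand-in for real transactions, raises max_iter to 500 as our own choice, and adds an evaluation step.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for real transactions: 25 features, labels 0 or 1.
features, labels = make_classification(
    n_samples=1000, n_features=25, n_informative=10, random_state=1
)

# Hold out 25% of the rows as test data, keeping the class balance.
X, X_test, Y, Y_test = train_test_split(
    features, labels, test_size=0.25, random_state=1, stratify=labels
)

clf = MLPClassifier(
    solver="lbfgs",
    alpha=1e-5,  # L2 regularization strength
    hidden_layer_sizes=(16, 4),
    max_iter=500,  # our choice; gives the optimizer more iterations
    random_state=1,
)
clf.fit(X, Y)
Y_predicted = clf.predict(X_test)

# Compare predictions with the held-out labels.
print("accuracy:", accuracy_score(Y_test, Y_predicted))
print(confusion_matrix(Y_test, Y_predicted))

# Weight matrix shapes: input -> hidden 1 -> hidden 2 -> output.
print([w.shape for w in clf.coefs_])
```

The confusion matrix is printed alongside the accuracy because fraud is usually rare in real data, and a classifier that never flags fraud can still score a high accuracy.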
The above code snippet describes an extremely simple Neural Network. Various parameters and hyper-parameters can be tweaked to build more complex Neural Network architectures that can be trained on huge datasets and make highly accurate predictions. In fact, you are encouraged to pick up a dataset of your choice and train your own Neural Networks.
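As a starting point for such experiments, here is a minimal sketch that trains two configurations on synthetic data and compares their test scores. The data, the configuration names, and all values are illustrative assumptions, not recommendations. In sklearn, alpha is the L2 regularization strength, while learning_rate_init sets the step size and is only used by the ‘sgd’ and ‘adam’ solvers.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic data with 25 features, standing in for a real dataset.
X, Y = make_classification(
    n_samples=600, n_features=25, n_informative=10, random_state=0
)
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.25, random_state=0
)

# Candidate settings; the values are illustrative only.
candidates = {
    "small_lbfgs": dict(solver="lbfgs", hidden_layer_sizes=(16, 4), alpha=1e-5),
    "wider_adam": dict(
        solver="adam",
        hidden_layer_sizes=(32, 8),
        alpha=1e-4,
        learning_rate_init=1e-3,  # step size; ignored by 'lbfgs'
    ),
}

scores = {}
for name, params in candidates.items():
    clf = MLPClassifier(max_iter=500, random_state=1, **params)
    clf.fit(X_train, Y_train)
    scores[name] = clf.score(X_test, Y_test)  # mean accuracy on test data

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

Which configuration scores higher depends on the data, so it is worth repeating the comparison on your own dataset.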