Playing games with Machine Learning and Python

Machine learning is a new technology that is poised to change human life as we know it. Gaining a deeper understanding of how it works can help one innovate with it.

In this tutorial we will build a simple AI-based agent that learns how to play a simple game using Python (and a couple of libraries).


We will be using a neural network for this task.


A neural network is a computational construct that imitates the behaviour of biological neurons. The simplest type of neural network is called a perceptron.

[Image: diagram of a perceptron]

A perceptron consists of a number of inputs (denoted by X1, X2, ..., Xn) and their associated weights (denoted by W1, W2, ..., Wn), as shown in the above diagram.

The last stage in the diagram is the activation function of the perceptron, which turns the weighted sum of the inputs into the output.
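To make the idea concrete, here is a minimal sketch of a single perceptron in plain Python. The step activation and the `bias` term are assumptions made for illustration (the diagram may use a different activation function), and the function names are hypothetical.

```python
def step_activation(z):
    """Assumed activation: output 1 if the weighted sum is positive, else 0."""
    return 1 if z > 0 else 0


def perceptron_output(inputs, weights, bias=0.0):
    """Multiply each input by its weight, add everything up, then activate."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return step_activation(weighted_sum)


# Two inputs with weights 0.5 and -0.5
first = perceptron_output([1.0, 0.0], [0.5, -0.5])   # weighted sum is 0.5
second = perceptron_output([0.0, 1.0], [0.5, -0.5])  # weighted sum is -0.5
print(first, second)
```

The first call fires because its weighted sum is positive, while the second does not.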


Training a neural network generally involves feeding it sample inputs and the corresponding sample outputs. Using these, the network adjusts the weight associated with each input in a way that minimizes the error between its output and the expected output.

As a mental model, I like to think of training a neural network as a very advanced version of curve fitting.

This tutorial doesn’t assume or require any more knowledge about neural networks. However, in case you want more information, check out this.
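As a rough illustration of how weights get adjusted, the sketch below trains a single perceptron on the logical AND function using the classic perceptron learning rule. This is only one simple training scheme, chosen here for illustration; the libraries used later in this tutorial train multi-layer networks with gradient-based methods instead. All names and the learning rate are assumptions.

```python
def predict(inputs, weights, bias):
    """Step-activated perceptron output (assumed activation)."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if z > 0 else 0


def train_perceptron(samples, targets, epochs=20, learning_rate=1):
    """Nudge each weight in the direction that reduces the prediction error."""
    weights = [0] * len(samples[0])
    bias = 0
    for _ in range(epochs):
        for inputs, target in zip(samples, targets):
            error = target - predict(inputs, weights, bias)  # -1, 0 or 1
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Sample inputs and sample outputs for logical AND
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 0, 0, 1]

weights, bias = train_perceptron(samples, targets)
predictions = [predict(s, weights, bias) for s in samples]
print(predictions)
```

After a handful of passes over the four samples, the weights stop changing and the predictions match the targets.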

OpenAI Gym and Scikit-learn

We will use OpenAI Gym to get the game environment. It provides a simplified way to control games using observation vectors and action vectors.

We will use its CartPole environment.


An agent needs to balance a pole on top of a cart. The game ends if the cart moves more than 2.4 units from the center or the pole tilts more than 15 degrees from vertical.
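Here is a minimal sketch of how an agent talks to such an environment: it receives an observation vector, sends back an action, and repeats until the game ends. It assumes the classic Gym API, where `reset()` returns the observation and `step()` returns four values; newer Gym and Gymnasium releases return extra values, so adjust accordingly. The function name and `max_steps` are hypothetical.

```python
import random


def run_random_episode(game_env, max_steps=500):
    """Play one game with random moves and return the total reward."""
    observation = game_env.reset()  # classic API: just the observation vector
    total_reward = 0.0
    for _ in range(max_steps):
        # CartPole has two actions: 0 pushes the cart left, 1 pushes it right
        action = random.randint(0, 1)
        observation, reward, done, info = game_env.step(action)
        total_reward += reward
        if done:  # cart went too far or pole tilted too much
            break
    return total_reward
```

With Gym installed, something like `run_random_episode(gym.make("CartPole-v0"))` should return the score of one random game.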

Getting Started

You will need to install the following packages:

To install OpenAI Gym, you need to clone the repository and install the package manually.

First, install all the dependencies:

Now clone and install OpenAI Gym:

Use .[all] to install all the environments available in the package.

We are going to follow the steps below in the given order:

  1. Create the initial training data: Since we do not have any previously defined data, we will start by creating a couple of data points with random input.
  2. Training the neural network: This involves training the neural network using the data captured from the random moves.
  3. Playing the game using the trained Neural Network: We will play the game again, this time using the predictions from the Neural Network as the action.

The function given below defines the order of everything.
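A minimal sketch of such a driver function is shown here. The name `run_pipeline` and its three callable parameters are hypothetical; they simply stand in for the three steps listed above.

```python
def run_pipeline(create_training_data, train_model, play_games):
    """Run the three steps in order and return the result of the final games."""
    training_data = create_training_data()  # step 1: random games
    model = train_model(training_data)      # step 2: fit the neural network
    return play_games(model)                # step 3: play using predictions
```

Passing the steps in as functions keeps each one easy to swap out or test on its own.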

Creating training data

Since we need to train the neural network to play the game but don’t have any training data, we will start by making random moves and keeping only the games where the agent reaches a threshold score. These games form our initial training data. We create the sample data by playing 50K games randomly. The code is given below:
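A minimal sketch of this step follows. It assumes the classic Gym API (`reset()` returns the observation, `step()` returns four values). The threshold of 50 and the `max_steps` cap are illustrative assumptions, and the function name is hypothetical.

```python
import random


def create_training_data(game_env, num_games=50000, score_threshold=50,
                         max_steps=500):
    """Play random games and keep the moves from games that scored well."""
    features, labels = [], []
    for _ in range(num_games):
        observation = game_env.reset()
        game_memory = []  # (observation, action) pairs for this game
        score = 0.0
        for _ in range(max_steps):
            action = random.randint(0, 1)  # 0 = push left, 1 = push right
            game_memory.append((observation, action))
            observation, reward, done, info = game_env.step(action)
            score += reward
            if done:
                break
        # Only games that reach the threshold become training data
        if score >= score_threshold:
            for past_observation, past_action in game_memory:
                features.append(list(past_observation))
                labels.append(past_action)
    return features, labels
```

Each kept sample pairs the observation the agent saw with the action it took, so the network later learns to map game states to moves that tended to appear in good games.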


Training the Neural Network

This step trains the neural network on the data returned by the above function. Training the NN is quite straightforward:
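A minimal sketch, assuming scikit-learn's `MLPClassifier` is the network being trained. The layer sizes, activation, and iteration count below are illustrative guesses, not tuned values, and the function name is hypothetical.

```python
from sklearn.neural_network import MLPClassifier


def train_model(features, labels):
    """Fit a small multi-layer perceptron that maps observations to actions."""
    model = MLPClassifier(
        hidden_layer_sizes=(32, 32),  # assumed: two hidden layers
        activation="relu",            # assumed activation function
        max_iter=500,
        random_state=0,               # fixed seed for repeatable runs
    )
    model.fit(features, labels)
    return model
```

Here `features` is the list of observation vectors and `labels` is the list of actions taken in the selected games.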

Playing the game using the Neural Network

Using the trained neural network, we can predict what the next move should be from the current state of the game environment. Code for playing the game:
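A minimal sketch of this loop, again assuming the classic Gym API and a model with a scikit-learn style `predict` method. The function name, `num_games`, and `max_steps` are hypothetical.

```python
def play_games(game_env, model, num_games=10, max_steps=500):
    """Play games using the model's predictions and return each game's score."""
    scores = []
    for _ in range(num_games):
        observation = game_env.reset()
        score = 0.0
        for _ in range(max_steps):
            game_env.render()  # draws the game window
            # predict() expects a batch, so wrap the single observation in a list
            action = int(model.predict([list(observation)])[0])
            observation, reward, done, info = game_env.step(action)
            score += reward
            if done:
                break
        scores.append(score)
    return scores
```

Comparing the average of the returned scores with the scores of random games gives a quick sense of how much the agent has learned.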

Note: All the previous games were played without rendering (which makes them very fast), whereas these games are played with the UI and hence end up being slow. If you want to test your trained NN on a large number of games (without waiting for a very long time), consider commenting out the line game_env.render() in the code above.

After this, a window should open up with the agent playing the game. The final result should look something like this:


While this is a clear improvement over the random agent, it is not the limit in terms of the quality of the player. You can start by experimenting with the activation functions and the number of layers in the neural network.

You can also find the code to do all this here.

I am a developer and tech enthusiast based out of New Delhi, India. I love Python, have a love-hate relationship with JavaScript, and feel that filling out tax forms is easier than Java.
