

    • Making predictions from 2D data

      In this example we will train a simple neural network to make predictions from 2D data. It is taken from the official TensorFlow examples. The network is trained on pairs of "Horsepower" and "Miles per Gallon" values for cars, and the goal is for it to predict the "Miles per Gallon" given the "Horsepower" of a car. Before we start with the example, we need to make sure that the required libraries are imported into the notebook. For that we can use the JavaScript import statement to import "tensorflowjs" and "plotly":



    • Load and format the input data

      The first step is to load and process the data for the machine learning model. The dataset that we are going to use contains many attributes of different cars. Our goal is to look at the data related to "Horsepower" and "Miles per Gallon" and find a relation between them. Let us first load the data and get a feeling for what we are dealing with. For that we fetch the data from a given URL and read the JSON data it contains. Note the await keyword: asynchronous functions like fetch return a promise, and await is required to obtain the value that the promise resolves to.
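
      A minimal sketch of this loading step. The URL is the one used by the official TensorFlow.js tutorial and is an assumption here, since this text does not state it. The names `CARS_URL`, `loadCarsData` and `fetchFn` are hypothetical; `fetchFn` exists only so the function can be tried out with a stand-in instead of a real network request.

```javascript
// Assumed dataset location (the one used by the official TensorFlow.js tutorial).
const CARS_URL = "https://storage.googleapis.com/tfjs-tutorials/carsData.json";

// Fetch a URL and parse the response body as JSON.
// `fetchFn` defaults to the global fetch.
async function loadCarsData(url = CARS_URL, fetchFn = fetch) {
  const response = await fetchFn(url); // wait for the HTTP response
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return await response.json(); // wait for the parsed JSON body
}

// In a notebook cell (top-level await assumed to be available):
// const carsData = await loadCarsData();
// console.log(carsData.length, carsData[0]);
```

      Without `await`, `loadCarsData()` would hand back a pending promise rather than the array of cars.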



    • In the next step we want to clean the data. That means we keep only the "Miles per Gallon" and "Horsepower" fields of each entry. Additionally, we discard entries in which either value is missing, so that all remaining entries have valid values.
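
      The cleaning step could look roughly like this. The field names `Miles_per_Gallon` and `Horsepower` are assumed from the dataset used by the official tutorial; `cleanCarsData` and the three sample entries are made up for illustration.

```javascript
// Keep only the two fields of interest and drop entries with a missing value.
function cleanCarsData(carsData) {
  return carsData
    .map((car) => ({ mpg: car.Miles_per_Gallon, horsepower: car.Horsepower }))
    .filter((car) => Number.isFinite(car.mpg) && Number.isFinite(car.horsepower));
}

// Tiny made-up sample, for illustration only.
const sampleCars = [
  { Name: "car a", Miles_per_Gallon: 18, Horsepower: 130 },
  { Name: "car b", Miles_per_Gallon: null, Horsepower: 165 },
  { Name: "car c", Miles_per_Gallon: 25, Horsepower: null },
];
console.log(cleanCarsData(sampleCars)); // only "car a" survives
```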



    • It is generally a good idea to visualize your data to get a better understanding of it. It will help you to find a structure in the data that a model can learn. Let's take advantage of the interactive capabilities of the notebook and display the data. To do this, we use the plotly-js library. The following cell will create a div element into which we will plot the data.



    • We can then plot the data as follows:
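
      As a sketch, the scatter plot can be described by one Plotly trace plus a layout. `buildScatter`, the variable `cleaned` and the div id "data-plot" are hypothetical names; the original cell may be organised differently.

```javascript
// Build the Plotly trace and layout for a scatter plot of the cleaned data.
// `cleaned` is expected to be an array of { horsepower, mpg } objects.
function buildScatter(cleaned) {
  const trace = {
    x: cleaned.map((car) => car.horsepower),
    y: cleaned.map((car) => car.mpg),
    mode: "markers",
    type: "scatter",
    name: "original data",
  };
  const layout = {
    xaxis: { title: { text: "Horsepower" } },
    yaxis: { title: { text: "Miles per Gallon" } },
  };
  return { data: [trace], layout };
}

// In the notebook, with Plotly loaded and a div with the id "data-plot":
// const { data, layout } = buildScatter(cleaned);
// Plotly.newPlot("data-plot", data, layout);
```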



    • The figure shows "Miles per Gallon" plotted against "Horsepower". We can see from the plot that there is a negative correlation between the horsepower and the MPG: as the horsepower goes up, the miles per gallon go down.


    • Define the task

      The goal of this example is to train a model to take in one number (horsepower) and predict one number (miles per gallon). The model will be trained with the available data. This is called supervised learning, since the training data contains the correct output value for every input value.


    • Define the model architecture

      In this example we are training a neural network. The model architecture describes how the neural network is structured. When using neural networks, the algorithm is a set of layers of neurons with "weights" governing their output. The training process learns the ideal values for those weights.

      For this example we need a relatively simple model. We will use a sequential model, in which the inputs flow straight through the layers to the outputs. Let's instantiate the model with the following command:



    • Add layers

      Neural networks always need an input and an output layer that fit the requirements of the problem. In our case we have one input variable and one output variable. Consequently, we need to define an input layer with dimension 1. The following command adds an input layer with inputShape=[1] to our neural network.



    • The input layer is directly connected to dense layers. A dense layer multiplies its input by a (weight) matrix and then adds a (bias) number. The parameter units sets the number of neurons.

      In this example we will use a second dense layer with 8 neurons.



    • Finally, we create our output layer with the following command:



    • Setting units to 1 makes sure we have only one output variable. We can have a look at the model by printing it to the browser console.
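
      Put together, the model described in the steps above can be sketched as follows. `buildModel` is a hypothetical helper that receives the imported TensorFlow.js namespace as `tf`. The `units: 1` of the first layer is an assumption, and no activation function is set, so every layer is linear here; the original cells may differ on both points.

```javascript
// Build the network: input layer, one hidden layer, output layer.
function buildModel(tf) {
  const model = tf.sequential();
  // Input layer: one input feature (horsepower). `units: 1` is an assumption.
  model.add(tf.layers.dense({ inputShape: [1], units: 1 }));
  // Second dense layer with 8 neurons.
  model.add(tf.layers.dense({ units: 8 }));
  // Output layer: a single value (miles per gallon).
  model.add(tf.layers.dense({ units: 1 }));
  return model;
}

// In the notebook:
// const model = buildModel(tf);
// model.summary(); // prints the layers and their parameter counts to the console
```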



    • Prepare the data for training

      To be able to use the data for training, we have to process it first. Depending on your application it is a good idea to shuffle your data before you train your network. This is done with the following command:
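
      TensorFlow.js provides `tf.util.shuffle(array)`, which shuffles an array in place. The plain-JavaScript sketch below shows the idea behind such a shuffle (Fisher-Yates); `shuffleInPlace` is a hypothetical helper and not necessarily what the original cell used.

```javascript
// Fisher-Yates shuffle: walks the array from the end and swaps each element
// with a randomly chosen element at or before its position.
function shuffleInPlace(items) {
  for (let i = items.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1)); // random index in [0, i]
    [items[i], items[j]] = [items[j], items[i]];   // swap the two entries
  }
  return items;
}

const rows = [1, 2, 3, 4, 5];
shuffleInPlace(rows);
console.log(rows); // the same five values in a random order
```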



    • Additionally, we will convert the data into tensors to obtain better performance.
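
      A possible shape for the conversion, assuming the cleaned data is an array of `{ horsepower, mpg }` objects. `toTensors` is a hypothetical helper and `tf` is the imported TensorFlow.js namespace.

```javascript
// Convert the cleaned data into two 2D tensors of shape [numExamples, 1]:
// one row per car, one value per row.
function toTensors(tf, cleaned) {
  const inputs = cleaned.map((car) => car.horsepower);
  const labels = cleaned.map((car) => car.mpg);
  return {
    inputTensor: tf.tensor2d(inputs, [inputs.length, 1]),
    labelTensor: tf.tensor2d(labels, [labels.length, 1]),
  };
}

// In the notebook:
// const { inputTensor, labelTensor } = toTensors(tf, cleaned);
```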



    • For training it is often helpful to normalize the inputs such that all values lie between 0 and 1. To use "Min-Max" scaling, we first calculate the minimum and maximum values of our tensors.
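
      Min-max scaling maps each value v to (v - min) / (max - min). The sketch below works the formula through on a plain array with made-up numbers; `minMaxNormalize` is a hypothetical helper. On tensors, the same formula can be written with the tensor methods `min()`, `max()`, `sub()` and `div()`, as shown in the closing comment.

```javascript
// Min-max scaling: the smallest value becomes 0, the largest becomes 1.
function minMaxNormalize(values) {
  const min = Math.min(...values);
  const max = Math.max(...values);
  const normalized = values.map((v) => (v - min) / (max - min));
  return { normalized, min, max };
}

const hp = [50, 100, 150, 250]; // made-up horsepower values
const scaled = minMaxNormalize(hp);
console.log(scaled.normalized); // [0, 0.25, 0.5, 1]

// Tensor version of the same formula:
// const inputMin = inputTensor.min();
// const inputMax = inputTensor.max();
// const normalizedInputs = inputTensor.sub(inputMin).div(inputMax.sub(inputMin));
```

      The minimum and maximum are returned as well, because they are needed again later to map predictions back to the original range.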



    • Train the model

      With the model architecture defined and the data stored as "tensors", we can start training our model. To do that we have to "compile" the defined model. For the compilation a couple of very important things have to be specified:

      • optimizer: the optimizer governs how the weights are updated as the model is trained with the input data. There are many different optimizers available in TensorFlow. Check out the documentation for more details.
      • loss: the loss function measures how far the model's predictions are from the correct values on each subset (batch) of data it is trained with. The meanSquaredError is a very common choice.
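
      A sketch of the compile step with these two settings. The Adam optimizer and the additional "mse" metric are assumptions; `compileModel` is a hypothetical helper and `tf` is the imported TensorFlow.js namespace.

```javascript
// Attach an optimizer, a loss function and a metric to the model.
function compileModel(tf, model) {
  model.compile({
    optimizer: tf.train.adam(),       // assumed choice, with default settings
    loss: tf.losses.meanSquaredError, // the loss named in the list above
    metrics: ["mse"],                 // also report the mean squared error
  });
  return model;
}

// In the notebook:
// compileModel(tf, model);
```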


    • Additionally, we have to define the batch size and the number of epochs used:

      • batchSize: refers to the size of the data subsets the model will see on each iteration of training. Common batch sizes tend to be in the range of 32-512.
      • epochs: refers to the number of times the model is going to look at the entire dataset that you provide it.


    • Now we are actually ready to train the model. To monitor the training process, we will first create a chart into which we will plot the loss function during training.




    • Now that the plot is defined, we can start the training process.
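
      Training could be started along these lines. The values for `batchSize` and `epochs` are assumptions (they match the official tutorial), and `trainModel`, `onEpoch`, `normalizedInputs` and `normalizedLabels` are hypothetical names. The `onEpochEnd` callback is where a loss chart would be updated.

```javascript
// Train the compiled model and report the loss after every epoch.
async function trainModel(model, inputs, labels, onEpoch) {
  return await model.fit(inputs, labels, {
    batchSize: 32, // assumed value
    epochs: 50,    // assumed value
    shuffle: true, // reshuffle the training data before each epoch
    callbacks: {
      // `logs.loss` holds the training loss of the epoch that just finished.
      onEpochEnd: (epoch, logs) => onEpoch(epoch, logs.loss),
    },
  });
}

// In the notebook:
// const lossHistory = [];
// await trainModel(model, normalizedInputs, normalizedLabels,
//   (epoch, loss) => lossHistory.push({ epoch, loss }));
```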



    • Let's open the visor and start the training process. The callbacks created earlier will update the plots for the loss and the mse metric. We can see that both go down as the training progresses.


    • Make predictions

      Now that we've trained the model, let's make some predictions! For that we'll have a look at the predictions from low to high horsepower values. However, we trained our model with normalized data, which means that we first need to obtain our predictions with normalized data and "un-normalize" them afterwards. We create a tensor with 100 entries from 0 to 1 as our input for the prediction.



    • From that we can calculate our normalized predictions.
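
      A sketch of these two steps, assuming a trained `model` and the imported TensorFlow.js namespace `tf`. `predictNormalized` is a hypothetical helper.

```javascript
// Create evenly spaced normalized inputs and run them through the model.
function predictNormalized(tf, model, numPoints = 100) {
  const xs = tf.linspace(0, 1, numPoints);                 // values from 0 to 1
  const preds = model.predict(xs.reshape([numPoints, 1])); // model expects shape [n, 1]
  return { xs, preds };
}

// In the notebook:
// const { xs, preds } = predictNormalized(tf, model);
```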



    • In order to "un-normalize" the input, we scale the input tensor to fit the range between the minimum and maximum horsepower values. To "un-normalize" the predictions, we scale them to fit the range between the minimum and maximum "Miles per Gallon" values.

      The method arraySync() obtains a regular JavaScript array from a tensor. We transform the tensors because we can then process them in JavaScript.
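
      "Un-normalizing" is the inverse of min-max scaling: a normalized value v maps back to v * (max - min) + min. The sketch below shows this on a plain array with made-up numbers; `unNormalize` is a hypothetical helper, and the tensor names in the closing comment are hypothetical as well.

```javascript
// Inverse of min-max scaling: map values from [0, 1] back to [min, max].
function unNormalize(normalized, min, max) {
  return normalized.map((v) => v * (max - min) + min);
}

// Made-up example: normalized values with a horsepower range of 50 to 250.
const restored = unNormalize([0, 0.25, 0.5, 1], 50, 250);
console.log(restored); // [50, 100, 150, 250]

// Tensor version, followed by the conversion to a JavaScript array:
// const mpgValues = preds.mul(labelMax.sub(labelMin)).add(labelMin).arraySync();
```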



    • Let's have a look at how our predictions compare to the original data. For that, let us first create a div element and then plot the data.




    • Bonus

      To be honest, our predictions could be better. Play around with the model architecture and see if you can improve the predictions. A couple of things you could try:

      • change the number of epochs
      • change the number of units in the hidden layer
      • change the number of hidden layers between the first hidden layer and the output layer
      • change the activation function

      Have fun!