My earlier TensorFlow tutorials always used built-in loss functions and layers. But TensorFlow is far more flexible than that. It lets us write our own custom loss functions and create our own custom layers, which opens up many more ways to build models.
The best way to learn is by doing. So, we will learn with exercises using a free public dataset that I used in my last tutorial on the multi-output model.
I am assuming that you already know the basics of data analysis, data cleaning, and TensorFlow. So, we will move a bit fast in the beginning.

Data Processing
The open public dataset I will use in this tutorial is fairly clean. But still, a little bit of cleaning is necessary.
Here is the link to the dataset.
I already cleaned up the dataset as necessary. Please feel free to download the clean dataset from here to follow along.
First import all the necessary packages here:
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Layer
Here is the dataset:
df = pd.read_csv("auto_price.csv")
Though I said it is a clean dataset, it still has two unnecessary columns that need to be dropped:
df = df.drop(columns=['Unnamed: 0', 'symboling'])
We will divide the dataset into three portions. One for training, one for testing, and one for validation.
from sklearn.model_selection import train_test_split
train, test = train_test_split(df, test_size=0.2, random_state=2)
train, val = train_test_split(train, test_size=0.2, random_state=23)
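Because the second split is taken from the remaining 80%, the final proportions are roughly 64% training, 16% validation, and 20% testing. Here is a small sketch to check that, using a made-up 100-row DataFrame (`demo` is a placeholder, not the real dataset):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical 100-row frame standing in for the real dataset.
demo = pd.DataFrame({"x": range(100)})

# Same two-step split as above: 20% for testing,
# then 20% of what is left for validation.
demo_train, demo_test = train_test_split(demo, test_size=0.2, random_state=2)
demo_train, demo_val = train_test_split(demo_train, test_size=0.2, random_state=23)

# Row counts for train, validation, and test.
print(len(demo_train), len(demo_val), len(demo_test))
```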
To normalize the training data using the z-score method, we need to know the mean and standard deviation of all the training features. Here is how I got it:
train_stats = train.describe()
train_stats = train_stats.transpose()
train_stats
The output includes other statistics as well, but the mean and standard deviation are the two columns we need.
This norm function takes data and normalizes it using the mean and standard deviation we received in the previous step:
def norm(x):
    return (x - train_stats['mean']) / train_stats['std']
Let’s normalize train, test, and validation data:
train_x = norm(train)
test_x = norm(test)
val_x = norm(val)
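To see what this normalization does, here is a minimal sketch on toy data. The frames `toy_train` and `toy_test` and their column names are made up for illustration. The point to notice is that the test data is scaled with the training statistics, not its own:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names are made up for illustration.
toy_train = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0],
                          "b": [10.0, 20.0, 30.0, 40.0]})
toy_test = pd.DataFrame({"a": [2.5], "b": [50.0]})

# One row per column, with 'mean' and 'std' among the statistics.
toy_stats = toy_train.describe().transpose()

def toy_norm(x):
    # Subtract the training mean and divide by the training std, column by column.
    return (x - toy_stats["mean"]) / toy_stats["std"]

normed_train = toy_norm(toy_train)
normed_test = toy_norm(toy_test)

# After z-scoring, each training column has mean ~0 and std ~1.
print(normed_train.mean().round(6).tolist())
print(normed_train.std().round(6).tolist())
# The test row is expressed in units of training standard deviations.
print(normed_test)
```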
For this exercise, the price of the automobile will be used as the target variable and the rest of the variables as the training features.
train_x = train_x.drop(columns='price')
test_x = test_x.drop(columns='price')
val_x = val_x.drop(columns='price')
train_y = train['price']
test_y = test['price']
val_y = val['price']
Training and target variables are ready.

Custom Loss and Custom Layer
Let’s start with a loss function we all know: the root mean squared error. We will define it as a function and pass that function to the model when compiling it.
def rmse(y_true, y_pred):
    return K.sqrt(K.mean(K.square(y_pred - y_true)))
Looks very familiar, right? Let’s keep this function in hand to use later. There are many other kinds of loss functions you can try.
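To convince yourself the function behaves as expected, you can call it on a few hand-picked numbers. The sketch below writes the same formula with plain `tf` ops instead of the `K` backend alias. The helper name `rmse_tf` and the sample values are my own, for illustration only:

```python
import tensorflow as tf

def rmse_tf(y_true, y_pred):
    # Same formula as rmse above, written with tf ops (hypothetical helper name).
    return tf.sqrt(tf.reduce_mean(tf.square(y_pred - y_true)))

y_true = tf.constant([1.0, 2.0, 3.0])
y_pred = tf.constant([2.0, 2.0, 5.0])

# Errors are 1, 0, 2 -> squared 1, 0, 4 -> mean 5/3 -> square root ~1.291
rmse_value = float(rmse_tf(y_true, y_pred))
print(rmse_value)
```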
Now, moving on to the custom layer. Here as well, we will keep it simple and use the linear formula Y = WX + B. It needs weights W, which are the coefficients of X, and a bias B. I will explain in more detail after you see the code:
class SimpleLinear(Layer):
    def __init__(self, units=64, activation=None):
        super(SimpleLinear, self).__init__()
        self.units = units
        self.activation = tf.keras.activations.get(activation)

    def build(self, input_shape):
        # Keras calls build() automatically the first time the layer receives an input.
        w_init = tf.random_normal_initializer()
        self.w = tf.Variable(name="kernel",
                             initial_value=w_init(shape=(input_shape[-1], self.units), dtype='float32'),
                             trainable=True)
        b_init = tf.zeros_initializer()
        self.b = tf.Variable(name="bias",
                             initial_value=b_init(shape=(self.units,), dtype='float32'),
                             trainable=True)

    def call(self, inputs):
        return self.activation(tf.matmul(inputs, self.w) + self.b)
In the code above, the constructor takes units and activation as parameters. The default for units is 64, which means 64 neurons. We will pass different numbers of units when we build the model. The default activation is None, which leaves the output unchanged. We will pass an activation in the model as well.
In the build method, we initialize the weights and the bias: the weights as random numbers and the bias as zeros. The method has to be named build, because that is the method Keras calls with the input shape before the layer is first used. A method with any other name would never be called, and self.w and self.b would not exist when call runs.
In the call method, we multiply the inputs and the weights using matrix multiplication (tf.matmul) and add the bias to the result (remember the formula WX + B).
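If you want to check that Keras really creates the weights lazily, here is a small self-contained sketch. It follows the same structure as the layer above, but the class name `TinyLinear` is hypothetical, and I swapped in `add_weight` in place of `tf.Variable`, which is another common way to create layer weights:

```python
import tensorflow as tf

class TinyLinear(tf.keras.layers.Layer):  # hypothetical name, for illustration
    def __init__(self, units=4, activation=None, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.activation = tf.keras.activations.get(activation)

    def build(self, input_shape):
        # Weight shape depends on the last input dimension, known only at build time.
        self.w = self.add_weight(name="kernel",
                                 shape=(input_shape[-1], self.units),
                                 initializer="random_normal",
                                 trainable=True)
        self.b = self.add_weight(name="bias",
                                 shape=(self.units,),
                                 initializer="zeros",
                                 trainable=True)

    def call(self, inputs):
        return self.activation(tf.matmul(inputs, self.w) + self.b)

layer = TinyLinear(units=4, activation="relu")
built_before = layer.built      # no weights exist yet

y = layer(tf.ones((2, 24)))     # first call triggers build() with the input shape
built_after = layer.built

print(built_before, built_after)
print(y.shape, len(layer.trainable_weights))
```

The first call on a batch with 24 features creates a (24, 4) kernel and a bias of length 4, so the output has one row per input row and one column per unit.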
This is the most basic one. Please feel free to try some non-linear layers, maybe a quadratic or cubic formula.
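As one possible starting point for that experiment, here is a sketch of a quadratic layer computing x²·A + x·B + C. The class name `SimpleQuadratic`, the weight names, and the choice of initializers are my assumptions. This is one of several reasonable ways to write such a layer:

```python
import tensorflow as tf

class SimpleQuadratic(tf.keras.layers.Layer):  # hypothetical name
    def __init__(self, units=32, activation=None, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.activation = tf.keras.activations.get(activation)

    def build(self, input_shape):
        in_dim = input_shape[-1]
        # a: coefficients of x^2, b: coefficients of x, c: constant term.
        self.a = self.add_weight(name="a", shape=(in_dim, self.units),
                                 initializer="random_normal", trainable=True)
        self.b = self.add_weight(name="b", shape=(in_dim, self.units),
                                 initializer="random_normal", trainable=True)
        self.c = self.add_weight(name="c", shape=(self.units,),
                                 initializer="zeros", trainable=True)

    def call(self, inputs):
        squared_term = tf.matmul(tf.square(inputs), self.a)
        linear_term = tf.matmul(inputs, self.b)
        return self.activation(squared_term + linear_term + self.c)

quad = SimpleQuadratic(units=3)
out = quad(tf.ones((5, 24)))
print(out.shape)
```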
Model development is the simpler part. We have 24 variables as training features. So the input shape is (24, ). Here is the complete model:
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(24,)),
    SimpleLinear(512, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    SimpleLinear(256, activation='relu'),
    SimpleLinear(128, activation='relu'),
    tf.keras.layers.Dense(1, activation='relu')
])
As you can see, we simply used the SimpleLinear class we defined earlier as the layers. 512, 256, and 128 are the units and the activation is ‘relu’.
It is also possible to use a custom activation function, which we will cover in the next part.
Let’s compile the model and use the loss function ‘rmse’ we defined earlier:
model.compile(optimizer='adam', loss=rmse, metrics=[tf.keras.metrics.RootMeanSquaredError()])
h = model.fit(train_x, train_y, epochs=3)
model.evaluate(val_x, val_y)
Epoch 1/3
4/4 [==============================] - 0s 3ms/step - loss: 13684.0762 - root_mean_squared_error: 13726.8496
Epoch 2/3
4/4 [==============================] - 0s 3ms/step - loss: 13669.2314 - root_mean_squared_error: 13726.8496
Epoch 3/3
4/4 [==============================] - 0s 3ms/step - loss: 13537.3682 - root_mean_squared_error: 13726.8496
In the next part, we will experiment with some custom activation functions.

Custom Activation Function
I will explain two ways to use the custom activation function here. The first one is to use a lambda layer. The lambda layer defines the function right in the layer.
For example, in the following model, each lambda layer takes the output of the preceding layer and returns its absolute value, so we do not get any negatives.
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(24,)),
    SimpleLinear(512),
    tf.keras.layers.Lambda(lambda x: tf.abs(x)),
    tf.keras.layers.Dropout(0.2),
    SimpleLinear(256),
    tf.keras.layers.Lambda(lambda x: tf.abs(x)),
    tf.keras.layers.Dense(1),
    tf.keras.layers.Lambda(lambda x: tf.abs(x)),
])
Please feel free to try any other kinds of operation in the lambda layer.
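A quick way to test an operation before putting it in a model is to call the lambda layer directly on a small tensor. A minimal sketch with made-up sample values:

```python
import tensorflow as tf

# A Lambda layer wraps any tensor-to-tensor function as a Keras layer.
abs_layer = tf.keras.layers.Lambda(lambda x: tf.abs(x))

sample = tf.constant([[-1.5, 0.0, 2.0]])
result = abs_layer(sample)

# Negative entries are flipped to positive; the rest are unchanged.
print(result.numpy())
```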
You do not have to define the operation in the lambda layer itself. It can be defined in a function and passed on to the lambda layer.
Here is a function that takes data and squares it:
def active1(x):
    return x**2
Now, this function can be simply passed into the lambda layer like this:
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(24,)),
    SimpleLinear(512),
    tf.keras.layers.Lambda(active1),
    tf.keras.layers.Dropout(0.2),
    SimpleLinear(256),
    tf.keras.layers.Lambda(active1),
    tf.keras.layers.Dense(1),
    tf.keras.layers.Lambda(active1),
])
There are so many other different functions that can be used based on your project and your needs.

Conclusion
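Here is a small sketch of the same idea on a sample tensor. The function name `square_activation` is hypothetical and does the same thing as active1. The second route relies on `tf.keras.activations.get` handing back a callable unchanged, which is my understanding of how it behaves. If that holds, a function like this could also be passed straight to a layer's `activation` argument:

```python
import tensorflow as tf

def square_activation(x):  # hypothetical name; same idea as active1
    return x ** 2

sample = tf.constant([[-2.0, 3.0]])

# Route 1: wrap the function in a Lambda layer.
via_lambda = tf.keras.layers.Lambda(square_activation)(sample)

# Route 2: resolve it through activations.get, the same call
# SimpleLinear uses on its activation argument.
resolved = tf.keras.activations.get(square_activation)
via_get = resolved(sample)

print(via_lambda.numpy(), via_get.numpy())
```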
TensorFlow is very flexible to work with, and there are many different ways to customize it. In this article, I wanted to share a few of them: a custom loss function, a custom layer, and custom activation functions. I hope it is helpful and that you try them in your own projects.