Introduction to GradientTape in TensorFlow

TensorFlow handles most of the mathematics for us, but it also lets us take control of a large part of it when we want to. In a previous article, I explained how to perform some basic operations on Tensors, the building blocks of TensorFlow. Today we will work with GradientTape, which handles the differentiation part. So even though it gives you a lot of flexibility over the mathematical operations, you still don't have to do the calculus by hand!

Let’s do some hands-on exercises. First import TensorFlow and create a Tensor of ones of 2×3 shape.

import tensorflow as tf
x = tf.ones((2, 3))
x

Output:

<tf.Tensor: shape=(2, 3), dtype=float32, numpy=
array([[1., 1., 1.],
       [1., 1., 1.]], dtype=float32)>

The sum of the elements in x is 6. If we want to do that programmatically:

y = tf.reduce_sum(x)
y

Output:

<tf.Tensor: shape=(), dtype=float32, numpy=6.0>
First, let's do the differentiation manually to refresh the process in our minds, and then I will show you how to do the same thing using GradientTape.

Taking the square of y, we create another variable z:

z = y² = 6² = 36

Finally, if we take the differential of z with respect to x, the chain rule gives:

dz/dx = dz/dy × dy/dx = 2y × 1 = 2 × 6 = 12

As x is a 2×3 matrix and y is the sum of all its elements, this value of 12 applies to every position when you take the differential with respect to x.

Here is how to do this using GradientTape in TensorFlow:

with tf.GradientTape() as t:
  t.watch(x) # keep track of x (a plain tensor is not watched automatically)
  y = tf.reduce_sum(x) # sum up the elements in x, which is 6
  z = tf.square(y) # square of y
dz_dx = t.gradient(z, x) # taking the differential of z with respect to x
print(dz_dx)

Output:

tf.Tensor(
[[12. 12. 12.]
 [12. 12. 12.]], shape=(2, 3), dtype=float32)
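
A side note: t.watch(x) is needed above because x is a plain tensor. A tf.Variable, on the other hand, is watched by the tape automatically. Here is a minimal sketch of the same computation with a variable (this variant is my addition, not part of the original example):

v = tf.Variable(tf.ones((2, 3))) # trainable variables are watched automatically
with tf.GradientTape() as t:
  z = tf.square(tf.reduce_sum(v)) # same z = y**2 computation as before
print(t.gradient(z, v)) # prints the same matrix of 12s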

Now we can try it with a more complex equation:

x = tf.constant(2.0)
with tf.GradientTape() as t:
  t.watch(x) # x is a constant, so it must be watched explicitly
  y = 2*(x**3) + 5*(x**2) + 3*x
dy_dx = t.gradient(y, x) # taking the differential of y with respect to x
print(dy_dx)

Output:

tf.Tensor(47.0, shape=(), dtype=float32)
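
We can sanity-check this result by hand: dy/dx = 6x² + 10x + 3, and at x = 2 that is 24 + 20 + 3 = 47. A quick check in plain Python:

x_val = 2.0
print(6*x_val**2 + 10*x_val + 3) # 47.0, matching the tape's result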

That was just one simple differentiation. What if we need to take a double differentiation? That is also possible, using nested tapes.

x = tf.Variable(2.0)
with tf.GradientTape() as t1:
  with tf.GradientTape() as t2:
    y = 2*(x**3) + 5*(x**2) + 3*x
  dy_dx = t2.gradient(y, x) # first differential, recorded by the outer tape t1
d2y_dx2 = t1.gradient(dy_dx, x) # second differential
print(dy_dx)
print(d2y_dx2)

Output:

tf.Tensor(47.0, shape=(), dtype=float32)
tf.Tensor(34.0, shape=(), dtype=float32)
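
Both values again match the calculus: the second differential is d²y/dx² = 12x + 10, which at x = 2 gives 34. A quick check:

x_val = 2.0
print(12*x_val + 10) # 34.0, matching d2y_dx2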

One last thing to share in this tutorial: by default, a GradientTape can only be used once. As soon as you call its gradient method, the tape is released and you cannot call it again.

For example, if I run the GradientTape again like this:

z = tf.reduce_sum(dy_dx)
dz_dx = t.gradient(z, x)

I get a runtime error like this:

RuntimeError                              Traceback (most recent call last)
<ipython-input-13-5a6319a58377> in <cell line: 2>()
      1 z = tf.reduce_sum(dy_dx)
----> 2 dz_dx = t.gradient(z, x)

/usr/local/lib/python3.10/dist-packages/tensorflow/python/eager/backprop.py in gradient(self, target, sources, output_gradients, unconnected_gradients)
   1001     """
   1002     if self._tape is None:
-> 1003       raise RuntimeError("A non-persistent GradientTape can only be used to "
   1004                          "compute one set of gradients (or jacobians)")
   1005     if self._recording:

RuntimeError: A non-persistent GradientTape can only be used to compute one set of gradients (or jacobians)

To get around that, you need to pass persistent=True when you create the GradientTape. Here is an example:

x = tf.constant(2.0)
with tf.GradientTape(persistent=True) as t:
  t.watch(x)
  y = 2*(x**3) + 5*(x**2) + 3*x
  z = y*y*y*y # z = y**4
dy_dx = t.gradient(y, x)
print(dy_dx)

Output:

tf.Tensor(47.0, shape=(), dtype=float32)

This time we can perform another differentiation using the same GradientTape we created above:

dz_dy = t.gradient(z, y)
dz_dy

Output:

<tf.Tensor: shape=(), dtype=float32, numpy=296352.0>
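
One caveat: a persistent tape holds on to its resources until the Python object is garbage-collected, so the TensorFlow documentation recommends dropping the reference once you are done with it:

del t # release the resources held by the persistent tape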

Conclusion

Hopefully this was a good learning experience for you if you didn't already know GradientTape. In my next article, I will explain how to use it to solve a problem.
