Calculating the derivative of Tanh - step by step

Aug. 16, 2017, 11:32 p.m.

Summary

I recently created a blog post outlining how to calculate the gradients for the sigmoid activation function step by step. In this post, I will do the same for the tanh function.

The Tanh Function

The tanh function is just another possible functions that can be used as a nonlinear activation function between layers of a neural network. It actually shares a few things in common with the sigmoid activation function. They both look very similar. But while a sigmoid function will map input values to be between 0 and 1, Tanh will map values to be between -1 and 1.

Image of sigmoid vs tanh functions

You will also notice that the tanh is a lot steeper.

Like the sigmoid function, one of the interesting properties of the tanh function is that the derivative can be expressed in terms of the function itself. Below is the actual formula for the tanh function along with the formula for calculating its derivative.

Image of formulas for tanh and derivative

Calculating the derivative

Below, I will go step by step on how the derivative was calculated. But before we start, here are three useful rules from calculus we will use.

Image of useful calculus rules

We can start by representing the tanh function in the following way.

Image of tanh function restructured

When represented in this way, we can make use of the product rule, and carry on as follows.

Image of derivatives being worked out

We can see that we end up with the same derivative formula.

What the derivative looks like

We can create a plot that shows the relationship between the tanh function and its derivative as follows:

import matplotlib.pyplot as plt
import numpy as np

def tanh(z):
    ez = np.exp(z)
    enz = np.exp(-z)
    return (ez - enz)/ (ez + enz)

# Calculate plot points
z = np.arange(-4., 4., 0.01)
a = tanh(z)
dz = 1 - a**2

# Setup centered axes
fig, ax = plt.subplots(figsize=(9, 5))
ax.spines['left'].set_position('center')
ax.spines['bottom'].set_position('center')
ax.spines['right'].set_color('none')
ax.spines['top'].set_color('none')
ax.xaxis.set_ticks_position('bottom')
ax.yaxis.set_ticks_position('left')

# Create and show plot
ax.plot(z,a, color="#307EC7", linewidth=3, label="tanh")
ax.plot(z,dz, color="#9621E2", linewidth=3, label="derivative")
ax.legend(loc="upper right", frameon=False)
fig.show()

Image of

Comments

Note you can comment without any login by:

  1. Typing your comment
  2. Selecting "sign up with Disqus"
  3. Then checking "I'd rather post as a guest"