
Efficient Backpropagation: Exploring the Role of Jacobian Matrices and the Chain Rule

  • shivamshinde92722
  • Oct 9, 2024
  • 2 min read

Updated: Jan 11

In this article, I will introduce the Jacobian matrix and show how it is used in the backpropagation step of deep learning.


What Is a Jacobian Matrix?

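Let’s consider the following functions of two variables, x1 and x2.

[Image: definitions of the functions f1, f2, and f3 in terms of x1 and x2]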

Now, the partial derivatives of f1, f2, and f3 with respect to x1 and x2 are:


[Image: the partial derivatives of f1, f2, and f3 with respect to x1 and x2]

If we arrange these partial derivatives in a matrix, with one row per function and one column per variable, we get the Jacobian matrix.

[Image: the Jacobian matrix of f1, f2, f3 with respect to x1, x2]
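For reference, here is the general shape of that matrix for three functions of two variables, using the common convention of one row per function and one column per variable. The symbol J is just my shorthand here.

```latex
J =
\begin{bmatrix}
\dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} \\[2ex]
\dfrac{\partial f_2}{\partial x_1} & \dfrac{\partial f_2}{\partial x_2} \\[2ex]
\dfrac{\partial f_3}{\partial x_1} & \dfrac{\partial f_3}{\partial x_2}
\end{bmatrix}
```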

The Jacobian matrix above is built from three functions (f1, f2, f3) and two variables (x1, x2). In general, there can be any number of functions and variables: m functions of n variables give an m × n Jacobian.
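To make this concrete, here is a minimal NumPy sketch that builds a 3 × 2 Jacobian by hand and checks it against a finite-difference approximation. The three functions are hypothetical ones I picked for illustration, and the helper names (`f`, `analytic_jacobian`, `numerical_jacobian`) are my own.

```python
import numpy as np


def f(x):
    """Hypothetical example: three functions of two variables (x1, x2)."""
    x1, x2 = x
    return np.array([
        x1**2 * x2,            # f1
        5.0 * x1 + np.sin(x2), # f2
        x1 * x2**2,            # f3
    ])


def analytic_jacobian(x):
    """Hand-derived Jacobian: one row per function, one column per variable."""
    x1, x2 = x
    return np.array([
        [2.0 * x1 * x2, x1**2],          # df1/dx1, df1/dx2
        [5.0,           np.cos(x2)],     # df2/dx1, df2/dx2
        [x2**2,         2.0 * x1 * x2],  # df3/dx1, df3/dx2
    ])


def numerical_jacobian(func, x, eps=1e-6):
    """Approximate the Jacobian column by column with central differences."""
    x = np.asarray(x, dtype=float)
    fx = func(x)
    J = np.zeros((fx.size, x.size))
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = eps
        J[:, j] = (func(x + step) - func(x - step)) / (2.0 * eps)
    return J


x0 = np.array([1.0, 2.0])
J_exact = analytic_jacobian(x0)
J_approx = numerical_jacobian(f, x0)

print(J_exact.shape)  # (3, 2): three functions, two variables
print(np.allclose(J_exact, J_approx, atol=1e-5))
```

The finite-difference check is only a sanity test. It is far too slow to use for training, but it is a handy way to catch mistakes in hand-derived derivatives.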


Chain Rule of Multivariate Calculus


[Image: the chain rule for a composition of functions]

Here, the partial derivatives are Jacobian matrices, and the product between them is a matrix product. The chain rule also generalizes to arbitrarily deep compositions. For example, we can find the derivative of f(g(h(x))) with respect to x, where f, g, and h are functions, by multiplying the Jacobians of f, g, and h.
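Written out in standard notation, the two-function and three-function cases look like this. The symbols J_f, J_g, and J_h for the Jacobians of f, g, and h are my own shorthand, and each Jacobian is evaluated at the input its function actually receives.

```latex
\frac{\partial f(g(x))}{\partial x}
  = \frac{\partial f}{\partial g}\,\frac{\partial g}{\partial x}
  = J_f\big(g(x)\big)\, J_g(x)

\frac{\partial f(g(h(x)))}{\partial x}
  = J_f\big(g(h(x))\big)\, J_g\big(h(x)\big)\, J_h(x)
```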


Let’s take an example:


[Image: the example problem]

We can solve this problem using the chain rule:


[Image: the chain rule applied to the example]
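The same idea can be checked numerically. The sketch below uses two hypothetical functions of my own choosing (not the ones from the example above): g maps 2 inputs to 3 outputs, and f maps 3 inputs to 2 outputs. It multiplies their Jacobians and compares the result with a finite-difference Jacobian of the composition f(g(x)).

```python
import numpy as np


def g(x):
    """Hypothetical inner function: R^2 -> R^3."""
    x1, x2 = x
    return np.array([x1 * x2, x1 + x2, x1**2])


def jac_g(x):
    """Hand-derived 3 x 2 Jacobian of g."""
    x1, x2 = x
    return np.array([
        [x2,       x1],
        [1.0,      1.0],
        [2.0 * x1, 0.0],
    ])


def f(u):
    """Hypothetical outer function: R^3 -> R^2."""
    u1, u2, u3 = u
    return np.array([u1 + u2 * u3, np.sin(u1)])


def jac_f(u):
    """Hand-derived 2 x 3 Jacobian of f."""
    u1, u2, u3 = u
    return np.array([
        [1.0,        u3,  u2],
        [np.cos(u1), 0.0, 0.0],
    ])


def numerical_jacobian(func, x, eps=1e-6):
    """Central-difference Jacobian, used here only as a reference."""
    x = np.asarray(x, dtype=float)
    fx = func(x)
    J = np.zeros((fx.size, x.size))
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = eps
        J[:, j] = (func(x + step) - func(x - step)) / (2.0 * eps)
    return J


x0 = np.array([0.5, -1.5])

# Chain rule: Jacobian of f at g(x0), times Jacobian of g at x0.
J_chain = jac_f(g(x0)) @ jac_g(x0)  # (2 x 3) @ (3 x 2) -> (2 x 2)

# Reference: differentiate the composition directly.
J_direct = numerical_jacobian(lambda x: f(g(x)), x0)

print(J_chain.shape)  # (2, 2)
print(np.allclose(J_chain, J_direct, atol=1e-5))
```

Note that the outer Jacobian is evaluated at g(x0), not at x0. Forgetting this is a common source of bugs when writing backpropagation by hand.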

Use of the Jacobian in Neural Networks


Let’s consider the following neural network:


[Image: diagram of the neural network]

The neural network training process consists of two main steps: forward propagation and backward propagation.


During backward propagation, we calculate the partial derivative of the loss function with respect to each weight and bias, and then use these derivatives to update the weights and biases. Instead of calculating these partial derivatives one parameter at a time, we use Jacobians to compute them for a whole layer in a single matrix operation. This makes the training code more efficient, since one vectorized matrix operation replaces many separate scalar computations.


To update the weights using gradient descent, we need the partial derivative of the loss with respect to each weight. For example, to update any weight in the W2 matrix, we need the partial derivative of the loss with respect to that particular weight.


Instead of calculating each of these derivatives individually, we can use the Jacobian matrix to compute all the partial derivatives of the loss with respect to the weights in the W2 matrix simultaneously.


[Image: the partial derivatives of the loss with respect to the W2 matrix]
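Here is a rough NumPy sketch of that idea. The architecture is an assumption on my part, and the network pictured above may differ: 3 inputs, a hidden layer of 4 sigmoid units, 2 linear outputs, and a squared-error loss. All variable names other than W2 are also my own. The sketch computes the gradient of the loss with respect to W2 twice, first one weight at a time and then in a single matrix operation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy setup (not necessarily the article's network):
# 3 inputs -> 4 hidden units (sigmoid) -> 2 linear outputs.
x = rng.normal(size=3)           # one input example
y = rng.normal(size=2)           # its target
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# Forward propagation.
a1 = sigmoid(W1 @ x + b1)        # hidden activations, shape (4,)
y_hat = W2 @ a1 + b2             # network output, shape (2,)
loss = 0.5 * np.sum((y_hat - y) ** 2)

# Backward propagation for W2.
# dL/dy_hat for the squared-error loss above.
dL_dyhat = y_hat - y             # shape (2,)

# Option 1: one weight at a time.
# Since y_hat[i] = sum_j W2[i, j] * a1[j] + b2[i],
# the chain rule gives dL/dW2[i, j] = dL/dy_hat[i] * a1[j].
grad_loop = np.zeros_like(W2)
for i in range(W2.shape[0]):
    for j in range(W2.shape[1]):
        grad_loop[i, j] = dL_dyhat[i] * a1[j]

# Option 2: all weights of W2 at once, as an outer product.
grad_vec = np.outer(dL_dyhat, a1)  # shape (2, 4), same as W2

print(grad_vec.shape == W2.shape)
print(np.allclose(grad_loop, grad_vec))
```

Both options should give the same gradient matrix. The second replaces the nested Python loop with one vectorized call, and that is where the efficiency gain comes from on larger layers.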

Outro


Thank you so much for reading. Follow me on Medium and LinkedIn for more such articles.


Have a nice day!
