Chain rule (MVC)

What we know about the chain rule

In another section we learned about the chain rule as it applies to functions in $\mathbb{R}^2$. We learned to take derivatives of nested functions,

$$\frac{d}{dx} f(g(x)) = \frac{df}{d(g(x))} \frac{dg}{dx}$$

For example, if $f(x) = \ln(x)$ and $g(x) = 3x^3$, then $f(g(x)) = \ln(3x^3)$. The derivative of $f(g(x))$ is

$$\frac{df}{dx} = \frac{1}{3x^3} \cdot 9x^2$$
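As a quick sanity check, this result simplifies to $3/x$. The short Python sketch below compares the chain-rule expression with a central-difference estimate. The helper names and the step size `h` are illustrative choices, not part of the original discussion.

```python
import math

def numerical_derivative(func, x, h=1e-6):
    """Central-difference estimate of func'(x); h is an assumed step size."""
    return (func(x + h) - func(x - h)) / (2 * h)

def composite(x):
    # f(g(x)) with f = ln and g(x) = 3x^3
    return math.log(3 * x**3)

def chain_rule_derivative(x):
    # (1 / g(x)) * g'(x) = (1 / (3x^3)) * 9x^2, which simplifies to 3/x
    return (1 / (3 * x**3)) * (9 * x**2)

x0 = 2.0
print(numerical_derivative(composite, x0), chain_rule_derivative(x0))
```

Both numbers should agree closely with $3/x_0$ at the chosen point.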

We also know that we can "chain" these expressions indefinitely. In the case of a function of the form $f(g(h(x)))$, for example,

$$\frac{df}{dx} = \frac{df}{d g(h(x))} \frac{d g(h(x))}{d h(x)} \frac{dh}{dx}$$

One use of the chain rule is in implicit differentiation to find derivatives of inverse functions, such as $f(x) = \sin^{-1}(x)$. To do this, we let $y = \sin^{-1}(x)$ and take the sine of both sides:

$$ \begin{align} x &= \sin(y) \tag{1} \\[5pt] \frac{dx}{dx} &= \frac{d}{dx} \sin(y) \\[5pt] 1 &= \cos(y) \cdot \frac{dy}{dx} \\[5pt] \frac{dy}{dx} &= \frac{1}{\cos(y)} \end{align}$$

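Now let's use equation (1) to build a right triangle that represents $x = \sin(y)$. In this expression, $y$ is an angle and $x$ is the length of the side opposite that angle in a right triangle with a hypotenuse of length 1.

By the Pythagorean theorem, the adjacent side has length $\sqrt{1-x^2}$, so $\cos(y) = \sqrt{1-x^2}$ and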

$$\frac{dy}{dx} = \frac{1}{\cos(y)} = \frac{1}{\sqrt{1-x^2}}$$
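The two forms of this derivative can be compared numerically. The sketch below uses only Python's standard `math` module. The function names and sample points are illustrative assumptions.

```python
import math

def slope_from_angle(x):
    # dy/dx = 1 / cos(y), with y = arcsin(x)
    return 1 / math.cos(math.asin(x))

def slope_from_triangle(x):
    # dy/dx = 1 / sqrt(1 - x^2), valid for -1 < x < 1
    return 1 / math.sqrt(1 - x**2)

for x in (-0.9, -0.3, 0.0, 0.5, 0.8):
    print(x, slope_from_angle(x), slope_from_triangle(x))
```

The two columns should match to within floating-point rounding for any $x$ strictly between $-1$ and $1$.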

Now let's try to expand these ideas to functions of higher dimensions using partial derivatives.

First, just a note about differentials like $dx$ and $\partial x$. We should remember that these are not really "things." They are ideas. Sometimes we call them infinitesimal pieces of the variable $x$. On the other hand, $\Delta x$ is a number. It is a small but measurable piece of the variable $x$. We can make a linear approximation of a function near a point $x$: $f(x + \Delta x) \approx f(x) + f'(x) \Delta x$. The error in this approximation shrinks faster than $\Delta x$ itself as $\Delta x \rightarrow 0$, and we could expand it further to match the curvature of the function, the change in that curvature, and so on, in a Taylor series.
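To see how the linear approximation $f(x + \Delta x) \approx f(x) + f'(x) \Delta x$ behaves as $\Delta x$ shrinks, here is a minimal Python sketch. The choice of $f(x) = e^x$, the point $x = 1$ and the function name are assumptions made only for illustration.

```python
import math

def linear_approximation_error(f, fprime, x, dx):
    """Difference between f(x + dx) and the linear estimate f(x) + f'(x) * dx."""
    return f(x + dx) - (f(x) + fprime(x) * dx)

x0 = 1.0
for dx in (0.1, 0.01, 0.001):
    err = linear_approximation_error(math.exp, math.exp, x0, dx)
    # err / dx also tends to zero: the error shrinks faster than dx itself
    print(f"dx = {dx:<6} error = {err:.3e}  error/dx = {err / dx:.3e}")
```

Each tenfold reduction in $\Delta x$ should cut the error by roughly a factor of 100, which is the behavior of the neglected second-order Taylor term.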

Chain rule in higher dimensions

Let's take a function of three variables, $f(x, y, z)$, and write its total differential, which encodes how changes in $x, y$ and $z$ affect $f$:

$$ \begin{align} df &= \frac{\partial f}{\partial x} \, dx + \frac{\partial f}{\partial y} \, dy + \frac{\partial f}{\partial z} \, dz \tag{1} \\[5pt] &= f_x \, dx + f_y\, dy + f_z \, dz \end{align}$$

This is analogous to the case for a 2D function:

$$\frac{dy}{dx} = f'(x) \; \rightarrow \; dy = f'(x) dx$$

Now let's think about rates of change. We'll take our $f(x, y, z)$ and assume that $x$, $y$ and $z$ are all functions of time, $t$:

$$ \begin{align} x &= x(t) \\[5pt] y &= y(t) \\[5pt] z &= z(t) \end{align}$$

Now if we divide equation (1) through by $dt$ we get our first example of the chain rule in higher-dimensional spaces:

$$\frac{df}{dt} = \frac{\partial f}{\partial x} \frac{dx}{dt} + \frac{\partial f}{\partial y} \frac{dy}{dt} + \frac{\partial f}{\partial z} \frac{dz}{dt}$$

We really just threw this together, so we might want to take a closer look and see why it works. We'll go back to our approximation formula for $\Delta f$:

$$\Delta f \approx f_x \Delta x + f_y \Delta y + f_z \Delta z$$

Now divide through by $\Delta t$ to convert to rates of change:

$$\frac{\Delta f}{\Delta t} \approx \frac{f_x \Delta x + f_y \Delta y + f_z \Delta z}{\Delta t}$$

Now in the limit that $\Delta t \, \rightarrow \, 0$ we have

$$\frac{\Delta f}{\Delta t} \, \rightarrow \, \frac{df}{dt}, \quad \frac{\Delta x}{\Delta t} \, \rightarrow \, \frac{dx}{dt}, \; \ldots \; \text{and so on}$$

Our approximation improves as we take smaller time chunks, ultimately becoming exact in the limit that $\Delta t \, \rightarrow \, 0$.
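The limiting process can be watched numerically. In the sketch below, the function $f(x, y, z) = xy + z^2$ and the parameterization $x = \cos(t)$, $y = \sin(t)$, $z = t$ are hypothetical choices made only for illustration. The difference quotient $\Delta f / \Delta t$ is compared with the chain-rule sum for shrinking $\Delta t$.

```python
import math

def f(x, y, z):
    return x * y + z**2

def f_of_t(t):
    # Sample parameterization (an assumption): x = cos(t), y = sin(t), z = t
    return f(math.cos(t), math.sin(t), t)

def chain_rule(t):
    x, y, z = math.cos(t), math.sin(t), t
    f_x, f_y, f_z = y, x, 2 * z                      # partial derivatives of f
    dx_dt, dy_dt, dz_dt = -math.sin(t), math.cos(t), 1.0
    return f_x * dx_dt + f_y * dy_dt + f_z * dz_dt

t0 = 0.3
for dt in (0.1, 0.01, 0.001):
    quotient = (f_of_t(t0 + dt) - f_of_t(t0)) / dt
    print(f"dt = {dt:<6} quotient = {quotient:.6f}  chain rule = {chain_rule(t0):.6f}")
```

The difference quotient should move steadily toward the chain-rule value as $\Delta t$ gets smaller.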

Tree diagrams are handy

It can be very useful, especially at the start, to draw tree diagrams showing the dependencies of functions on variables, sub-variables, and sometimes even deeper levels of variables. Here's a diagram of a function $f(x,y,z)$ in which $x, y$ and $z$ all depend on two independent variables $s$ and $t$.

Let's say we want to find $\frac{\partial f}{\partial t}$. We simply make all of the connections between $f$ and $t$ (shown in green), and construct derivatives for each. It looks like this:

$$\frac{\partial f}{\partial t} = \frac{\partial f}{\partial x} \cdot \frac{\partial x}{\partial t} + \frac{\partial f}{\partial y} \cdot \frac{\partial y}{\partial t} + \frac{\partial f}{\partial z} \cdot \frac{\partial z}{\partial t}$$
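The path-tracing idea can also be written as a small program. The sketch below stores the dependency tree as a dictionary and collects one product of derivative factors for every path from $f$ to the target variable. The function name and the text format of the factors are illustrative assumptions, and a plain `d` is printed for every derivative, partial or not.

```python
def chain_rule_terms(tree, start, target):
    """Return one list of derivative factors per path from start to target."""
    if start == target:
        return [[]]
    terms = []
    for child in tree.get(start, []):
        for rest in chain_rule_terms(tree, child, target):
            terms.append([f"d{start}/d{child}"] + rest)
    return terms

# f depends on x, y and z; each of those depends on s and t
tree = {"f": ["x", "y", "z"], "x": ["s", "t"], "y": ["s", "t"], "z": ["s", "t"]}

terms = chain_rule_terms(tree, "f", "t")
print(" + ".join(" * ".join(factors) for factors in terms))
```

Each path through the tree contributes one product, and the products are summed, which is the pattern in the formula above. Deeper trees need only more dictionary entries.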

Now let's take a simpler example to see a subtle difference in our chain rule expressions. This tree represents a function $f(x, y)$ for which $x$ and $y$ are both functions of a parameter $t$.

The two paths from $f$ to $t$ contribute the terms

$$\frac{\partial f}{\partial x} \frac{dx}{dt} \phantom{000} \text{and} \phantom{000} \frac{\partial f}{\partial y} \frac{dy}{dt}$$

Notice that the derivatives with respect to the independent variable $t$ aren't partial derivatives because $x$ and $y$ depend only on $t$. The total derivative of this function with respect to $t$ is

$$\frac{d f}{d t} = \frac{\partial f}{\partial x} \frac{dx}{dt}+ \frac{\partial f}{\partial y} \frac{dy}{dt}$$

Finally, these tree diagrams can really come in handy when functions contain deeper levels of dependency. Consider this tree.

From this tree, we can calculate, for example:

$$\frac{\partial f}{\partial v} = \frac{\partial f}{\partial x} \frac{\partial x}{\partial t} \frac{\partial t}{\partial v} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial t} \frac{\partial t}{\partial v} + \frac{\partial f}{\partial z} \frac{\partial z}{\partial t} \frac{\partial t}{\partial v}$$

Example 1

Find the derivative of $f(x, y, z) = x y^2 + z$, where $x = t$, $y = e^t$ and $z = -\cos(t)$.

Solution: Applying the chain rule gives

$$\begin{align} \frac{df}{dt} &= \frac{\partial f}{\partial x} \frac{dx}{dt} + \frac{\partial f}{\partial y} \frac{dy}{dt} + \frac{\partial f}{\partial z} \frac{dz}{dt} \\[5pt] &= y^2 \frac{dx}{dt} + 2xy \frac{dy}{dt} + (1) \frac{dz}{dt} \\[5pt] &= y^2 (1) + 2xy (e^t) + \sin(t) \\[5pt] &= y^2 + 2xy \, e^t + \sin(t) \end{align}$$

Plugging in the functions of $t$ gives

$$\frac{df}{dt} = e^{2t} + 2t e^{2t} + \sin(t)$$

We don't actually need the chain rule to find this derivative, so that allows us to confirm our result just by plugging in the functions of $t$ and taking the derivative directly:

$$ \begin{align} f(t) &= t e^{2t} - \cos(t) \\[5pt] f'(t) &= e^{2t} + 2t e^{2t} + \sin(t) \end{align}$$

$\color{green}{\checkmark}$ Confirmed.

Example 2

Calculate $\frac{dz}{dt}$ for $z = f(x,y) = xe^{xy}$, where $x=t^2, \; y = \frac{1}{t}$.

Solution: The chain rule gives us our pattern:

$$\begin{align} \frac{dz}{dt} &= \frac{\partial f}{\partial x} \frac{dx}{dt} + \frac{\partial f}{\partial y} \frac{dy}{dt} \\[5pt] &= \left(e^{xy}+xye^{xy} \right)2t + x^2 e^{xy} \left(-\frac{1}{t^2} \right) \end{align}$$

Now substitute for $x$ and $y$ in terms of $t$:

$$ \begin{align} \frac{dz}{dt} &= \left(e^{\frac{t^2}{t}} + \frac{t^2}{t} e^{\frac{t^2}{t}} \right) 2t + t^4 e^{\frac{t^2}{t}}\left( \frac{-1}{t^2} \right) \\[5pt] &= \left( e^t + t e^t \right)2t - t^2 e^t \\[5pt] &= 2te^t + 2t^2e^t - t^2e^t \\[5pt] &= 2te^t + t^2e^t \end{align}$$
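As in Example 1, this result can be confirmed by substituting first: $z = t^2 e^{t}$, so $dz/dt = 2te^t + t^2e^t$. A short Python sketch of that comparison follows. The function names and sample values of $t$ are illustrative assumptions.

```python
import math

def chain_rule_result(t):
    x, y = t**2, 1 / t
    f_x = math.exp(x * y) + x * y * math.exp(x * y)   # partial of f with respect to x
    f_y = x**2 * math.exp(x * y)                      # partial of f with respect to y
    return f_x * (2 * t) + f_y * (-1 / t**2)

def direct_result(t):
    # z(t) = t^2 e^t after substitution, so dz/dt = 2t e^t + t^2 e^t
    return 2 * t * math.exp(t) + t**2 * math.exp(t)

for t in (0.5, 1.0, 2.0):
    print(t, chain_rule_result(t), direct_result(t))
```

The two values on each line should agree to within rounding error.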

Example 3

Let $f(x,y) = (x^2+y^2)xy$, with $x = r \, \cos(\theta), \; y = r \, \sin(\theta)$. Calculate $\frac{df}{dr}$.

Solution: The chain rule gives us the recipe for finding this derivative:

$$\frac{df}{dr} = \frac{\partial f}{\partial x} \frac{\partial x}{\partial r} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial r}$$

So we have

$$ \begin{align} \frac{df}{dr} &= \left[ y(x^2+y^2) + 2x^2y \right]\cos(\theta) \\[5pt] &\phantom{00} + \left[ x(x^2+y^2) + 2xy^2 \right]\sin(\theta) \\[5pt] &= \left[x^2y+y^3+2x^2y \right] \cos(\theta) \\[5pt] &\phantom{00} + \left[ x^3 + xy^2+2xy^2 \right]\sin(\theta) \\[5pt] &= \left[ y^3 + 3x^2y \right] \cos(\theta) + \left[ x^3 + 3xy^2 \right] \sin(\theta) \\[5pt] &= (r^3 \sin^3(\theta) + 3r^3 \cos^2(\theta) \sin(\theta)) \cos(\theta) \tag{1} \\[5pt] &\phantom{00} + (r^3 \cos^3(\theta) + 3r^3 \cos(\theta)\sin^2(\theta)) \sin(\theta) \\[5pt] &= r^3 \left[ 4 \sin^3(\theta) \cos(\theta) + 4\cos^3(\theta) \sin(\theta) \right] \\[5pt] &= 4r^3 \left[ \sin^3(\theta) \cos(\theta) + \cos^3(\theta) \sin(\theta) \right] \\[5pt] &= 4r^3 \sin(\theta) \cos(\theta) \end{align}$$

In step $(1)$ we changed variables to our parameters, $r$ and $\theta$. We can also leave our derivative written in terms of $x$ and $y$ by using $\cos(\theta) = x/r$ and $\sin(\theta) = y/r$, as shown below. We can use whichever representation is most convenient, of course.

$$ \begin{align} \frac{df}{dr} &= \left[ y^3 + 3x^2y \right] \cos(\theta) + \left[ x^3 + 3xy^2 \right] \sin(\theta) \\[5pt] &= \frac{xy^3 + 3x^3y + x^3y + 3xy^3}{r} \\[5pt] &= \frac{4(xy^3 + x^3y)}{r} \end{align}$$
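The polar-form result, $4r^3 \left[ \sin^3(\theta) \cos(\theta) + \cos^3(\theta) \sin(\theta) \right]$, can be checked against a finite-difference estimate in $r$ with $\theta$ held fixed. This is a minimal sketch. The function names, the sample point and the step size `h` are assumptions for illustration.

```python
import math

def f_polar(r, theta):
    # f(x, y) = (x^2 + y^2) x y with x = r cos(theta), y = r sin(theta)
    x, y = r * math.cos(theta), r * math.sin(theta)
    return (x**2 + y**2) * x * y

def df_dr(r, theta):
    # Chain-rule result from Example 3
    s, c = math.sin(theta), math.cos(theta)
    return 4 * r**3 * (s**3 * c + c**3 * s)

def df_dr_numeric(r, theta, h=1e-6):
    # Central difference in r, holding theta fixed
    return (f_polar(r + h, theta) - f_polar(r - h, theta)) / (2 * h)

r0, theta0 = 1.5, 0.7
print(df_dr(r0, theta0), df_dr_numeric(r0, theta0))
```

Because $\sin^2(\theta) + \cos^2(\theta) = 1$, the same value also equals $4r^3 \sin(\theta)\cos(\theta)$.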

The chain rule

For a function $f(x(s, t), y(s, t), z(s, t))$, the partial derivatives with respect to $s$ and $t$ are

$$\frac{\partial f}{\partial s} = \frac{\partial f}{\partial x} \frac{\partial x}{\partial s} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial s} + \frac{\partial f}{\partial z} \frac{\partial z}{\partial s}$$

$$\frac{\partial f}{\partial t} = \frac{\partial f}{\partial x} \frac{\partial x}{\partial t} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial t} + \frac{\partial f}{\partial z} \frac{\partial z}{\partial t}$$
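As a brief illustration, with functions chosen here only as an example, take $f(x, y, z) = xy + z$ with $x = st$, $y = s + t$ and $z = s^2$. Each term pairs a partial derivative of $f$ with the matching derivative of the intermediate variable:

$$ \begin{align} \frac{\partial f}{\partial s} &= y \cdot t + x \cdot 1 + 1 \cdot 2s = 2st + t^2 + 2s \\[5pt] \frac{\partial f}{\partial t} &= y \cdot s + x \cdot 1 + 1 \cdot 0 = s^2 + 2st \end{align}$$

Substituting first gives $f = s^2 t + s t^2 + s^2$, and differentiating that expression directly with respect to $s$ and $t$ yields the same two results.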

Deriving the product and quotient rules

We can use the multivariable chain rule to derive these important results from calculus.

Let $f(u, v) = uv$, where $u = u(t)$ and $v = v(t)$.

Then we have

$$\frac{df}{dt} = \frac{d(uv)}{dt} = f_u \frac{du}{dt} + f_v \frac{dv}{dt} = v \, \frac{du}{dt} + u \, \frac{dv}{dt}$$

where the last result is the product rule,

$$(uv)' = uv' + u'v$$

Likewise, let $g(u, v) = \frac{u}{v}$. Then we have

$$ \begin{align} \frac{dg}{dt} = \frac{d}{dt}\left( \frac{u}{v} \right) &= g_u \frac{du}{dt} + g_v \frac{dv}{dt} \\[5pt] &= \frac{1}{v} \frac{du}{dt} - \frac{u}{v^2} \frac{dv}{dt} \\[5pt] &= \frac{v}{v^2} \frac{du}{dt} - \frac{u}{v^2} \frac{dv}{dt} \\[5pt] &= \frac{v \frac{du}{dt} - u \frac{dv}{dt}}{v^2} \end{align}$$

which is the quotient rule.
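The final expression for the derivative of $u/v$ can be spot-checked numerically. In this sketch, $u(t) = t^2 + 1$ and $v(t) = e^t$ are hypothetical choices, and the helper names are illustrative.

```python
import math

def u(t):
    return t**2 + 1

def du_dt(t):
    return 2 * t

def v(t):
    return math.exp(t)

def dv_dt(t):
    return math.exp(t)

def quotient_rule(t):
    # (v u' - u v') / v^2, the last line of the derivation
    return (v(t) * du_dt(t) - u(t) * dv_dt(t)) / v(t)**2

def numeric_derivative(t, h=1e-6):
    # Central-difference estimate of d(u/v)/dt
    return (u(t + h) / v(t + h) - u(t - h) / v(t - h)) / (2 * h)

t0 = 0.8
print(quotient_rule(t0), numeric_derivative(t0))
```

For these particular functions the derivative simplifies to $-(t-1)^2 e^{-t}$, which gives a third value to compare against.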

Practice problems

Find the indicated derivatives or critical points for each of these functions.

$f(x,y,z) = xy - z \, \sin(y), \phantom{00}$

$x = 2t, \; y = e^t, \; z = t^2$. Find $\frac{df}{dt}$.

Solution

$$ \begin{align} \frac{df}{dt} &= \frac{\partial f}{\partial x} \frac{dx}{dt} + \frac{\partial f}{\partial y} \frac{dy}{dt} + \frac{\partial f}{\partial z} \frac{dz}{dt}\\[5pt] &= y \frac{dx}{dt} + (x - z \, \cos(y)) \frac{dy}{dt} - \sin(y) \frac{dz}{dt} \\[5pt] &= y(2) + (x-z \, \cos(y))e^t - 2t \, \sin(y) \end{align}$$

Now replace $x$, $y$ and $z$ with their expressions in $t$:

$$ \begin{align} \frac{d f}{d t} &= 2 e^t + \left( 2t - t^2 \cos(e^t) \right) e^t - 2t \, \sin(e^t) \\[5pt] &= 2e^t + 2t e^t - t^2 e^t \cos(e^t) - 2t \, \sin(e^t) \end{align}$$

As a check, substituting first gives $f(t) = 2te^t - t^2 \sin(e^t)$, and differentiating that directly produces the same result.

$f(x,y) = \ln(x) + 4\ln(y) - x - 4y$. Find and classify the critical points.

Solution

$$ \begin{align} f_x &= \frac{1}{x}-1 = 0 \\[5pt] x &= 1 \\[5pt] f_y &= \frac{4}{y} - 4 = 0 \\[5pt] y &= 1 \end{align}$$

So our only critical point is $(1, 1)$.

The second partials are

$$ \begin{align} f_{xx} &= \frac{-1}{x^2} \\[5pt] f_{yy} &= \frac{-4}{y^2} \\[5pt] f_{xy} &= 0 \end{align}$$

$$D = \text{det} \left( \begin{matrix} \frac{-1}{x^2} & 0 \\ 0 & \frac{-4}{y^2} \end{matrix} \right) = \frac{4}{x^2 y^2}$$

$D$ is always positive. Then $f_{xx}(1,1) = -1$. Because this is less than zero and $D > 0$, the critical point is a local maximum.
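The decision procedure used here, the second-derivative test, is compact enough to write as a function. This is a minimal sketch. The function name and the returned labels are illustrative assumptions, and the inputs are the values of the second partials at the critical point.

```python
def classify_critical_point(f_xx, f_yy, f_xy):
    """Second-derivative test using D = f_xx * f_yy - f_xy^2."""
    D = f_xx * f_yy - f_xy**2
    if D > 0:
        return "local minimum" if f_xx > 0 else "local maximum"
    if D < 0:
        return "saddle point"
    return "inconclusive"  # D = 0: the test gives no information

# f(x, y) = ln(x) + 4 ln(y) - x - 4y at its critical point (1, 1)
x, y = 1.0, 1.0
print(classify_critical_point(-1 / x**2, -4 / y**2, 0.0))
```

The same function can be reused for any critical point once the three second partials have been evaluated there.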

$f(x,y) = x^3 + 2xy - 2y^2 - 10x$. Find and classify the critical points.

Solution

$$ \begin{align} f_x &= 3x^2+2y-10 = 0 \\[5pt] f_y &= 2x - 4y = 0 \\[5pt] 4y &= 2x \; \rightarrow \; x = 2y \end{align}$$

Plugging $x=2y$ into the $f_x$ equation gives

$$ \begin{align} 3(2y)^2 + 2y -10 &= 0 \\[5pt] 3(4y^2) + 2y -10 &= 0 \\[5pt] 12y^2 + 2y &= 10 \\[5pt] y^2 + \frac{y}{6} + \left( \frac{1}{12} \right)^2 &= \frac{5}{6} + \frac{1}{144} \tag{*} \\[5pt] \left( y + \frac{1}{12} \right)^2 &= \frac{120 + 1}{144} \tag{*} \\[5pt] y &= \frac{-1 \pm \sqrt{121}}{\sqrt{144}} \tag{*} \\[5pt] y &= \frac{-1 \pm 11}{12} = -1, \frac{5}{6} \end{align}$$

The starred steps above are solving the quadratic by completing the square. We can get the x-coordinates for each of those $y$ values using $f_y = 0$:

$$ \begin{align} f_y(y = -1) = 2x + 4 = 0 \; \rightarrow \; x &= -2 \\[5pt] f_y\left(y = \frac{5}{6} \right) = 2x - 4 \left( \frac{5}{6} \right) = 0 \; \rightarrow \; x &= \frac{5}{3} \end{align}$$

So our two critical points are $(-2, -1)$ and $\left( \frac{5}{3}, \frac{5}{6} \right)$.

The second partials are

$$ \begin{align} f_{xx} &= 6x \\[5pt] f_{yy} &= -4 \\[5pt] f_{xy} &= 2 \end{align}$$

$$D = \text{det} \left( \begin{matrix} 6x & 2 \\ 2 & -4 \end{matrix} \right) = (6x)(-4) - (2)(2) = -24x - 4$$

Here is a table of $D$ and $f_{xx}$ values for our two critical points.

Qty.	(-2,-1)	(5/3,5/6)
$D(a,b)$	44	-44
$f_{xx}(a,b)$	-12	10

Based on these quantities, $(-2,-1)$ is a local maximum and $\left( \frac{5}{3}, \frac{5}{6} \right)$ is a saddle point.

$f(x,y) = x^2 + y^2 - xy + x$. Find and classify the critical points.

Solution

The first partial derivatives are

$$ \begin{align} f_x &= 2x - y + 1 = 0 \\[5pt] f_y &= 2y -x = 0 \; \rightarrow \; y = \frac{x}{2} \\[5pt] \end{align}$$

Plugging that value of $y$ back into $f_x = 0$ gives

$$ \begin{align} 2x - \frac{x}{2} + 1 &= 0 \\[5pt] \frac{3}{2} x = -1 \; \rightarrow \; x &= -\frac{2}{3} \end{align}$$

The $y$-coordinate is $y = \frac{x}{2} = -\frac{1}{3}$, so our critical point is $\left( -\frac{2}{3}, -\frac{1}{3} \right)$.

The second partial derivatives are

$$ \begin{align} f_{xx} &= 2 \\[5pt] f_{yy} &= 2 \\[5pt] f_{xy} &= -1 \end{align}$$

That makes $D$ easy, $D = 3 \gt 0$. Now $f_{xx}(-2/3,-1/3) = 2$, also positive, so the critical point must be a local minimum.
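Because both first partials are linear in $x$ and $y$ here, the critical point $\left( -\frac{2}{3}, -\frac{1}{3} \right)$ used in the test above can be confirmed by solving the $2 \times 2$ linear system exactly. The sketch below uses Cramer's rule with Python's standard `fractions` module. The function name and argument order are illustrative assumptions.

```python
from fractions import Fraction

def solve_2x2(a, b, c, d, e, f):
    """Solve a*x + b*y = e and c*x + d*y = f by Cramer's rule (integer inputs)."""
    det = a * d - b * c
    if det == 0:
        raise ValueError("system has no unique solution")
    x = Fraction(e * d - b * f, det)
    y = Fraction(a * f - e * c, det)
    return x, y

# f_x = 2x - y + 1 = 0  ->   2x -  y = -1
# f_y = 2y - x     = 0  ->   -x + 2y =  0
x, y = solve_2x2(2, -1, -1, 2, -1, 0)
print(x, y)
```

Exact fractions avoid any rounding, so the printed coordinates can be compared directly with the hand calculation.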

xaktly.com by Dr. Jeff Cruzan is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. © 2024-2025, Jeff Cruzan. All text and images on this website not specifically attributed to another source were created by me and I reserve all rights as to their use. Any opinions expressed on this website are entirely mine, and do not necessarily reflect the views of any of my employers. Please feel free to send any questions or comments to jeff.cruzan@verizon.net.
