If I have seen further it is by standing on the shoulders of giants.
— Isaac Newton
3.1 Definition of a Derivative
Imagine you are driving in your parent’s car (or perhaps you are driving your own car). You can check the speedometer to read the current speed. Suppose we wanted to determine how the car is calculating this speed. We could start by using the typical formula learned in basic physics:
\[
v = \frac{\Delta x}{\Delta t} = \frac{x_f - x_i}{t_f - t_i} = \frac{\text{Change in Position}}{\text{Change in Time}}
\]
How do we measure the change in position or the change in time? We can calculate how far your car has moved over the past 2 seconds, the past 20 seconds, the past minute, or the past 30 minutes! Which of these times would allow us to compute the value reported by your speedometer?
We make a distinction between two types of speed:
Average Speed allows one to compute the average rate at which a car is changing position over time. To compute this value, simply divide how far the car has traveled by the time it took to travel that distance (the formula above).
Instantaneous Speed is the rate at which one is moving at a single moment in time.
What we want is the instantaneous speed, not the average speed. Computing the instantaneous speed is a bit more subtle than calculating the average speed. If we are calculating the instantaneous speed, we are essentially asking what the speed of the car will be in a single moment, so the difference in time (the denominator in the equation above) will be 0! This seems like a contradiction: how can we expect the car to travel (and thus have a speed) over no time? The way out of this conundrum is an application of the limiting process described in the previous two chapters. In particular, we need a notion called the derivative.
The interactive figure below illustrates the difference between average speed and instantaneous speed. Whenever the two points are not overlapping, that indicates we are computing the average speed, which is the slope of the line defined by two non-overlapping points on the plot. In a mathematical context, such a line is called a secant line. On the other hand, when the points overlap, then the slope of the line represents the instantaneous speed of the car at that moment. In a mathematical context, such a line is called the tangent to the curve at that point.
Now we have some context for the problem. We also have a nice picture. How do we translate these thoughts and these pictures into mathematics? Fundamentally, what we want is the slope of the line that results when we bring the points \((a, f(a))\) and \((b, f(b))\) very close together. As previously mentioned, this involves a limit. In particular, if we want the slope of the curve at a particular point on some graph (which is the instantaneous speed, or derivative) we can compute the following limit:
\[
f'(a) = \lim_{b \rightarrow a} \frac{f(b) - f(a)}{b - a}
\]
Equivalently, the derivative of \(f\) at \(x\) can be written as:
\[
f'(x) = \lim_{h \rightarrow 0} \frac{f(x + h) - f(x)}{h}
\]
This definition is identical to the one above if we let \(a = x\) and \(b = x + h\). Note that \(h\) is the horizontal distance between \(a\) and \(b\) in our interactive figure above. Therefore, the previous definition means: “Calculate the slope of the line between \((x, f(x))\) and \((x + h, f(x + h))\) as we let the distance between those points go to zero”.
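The limiting process can also be explored numerically. The following is a minimal sketch, assuming a hypothetical position function \(x(t) = 3t^2\) (not taken from the text) and a helper named `average_speed` introduced only for illustration. It computes the slope of the secant line over shorter and shorter time windows.

```python
def position(t):
    """Hypothetical position of the car (in meters) after t seconds."""
    return 3.0 * t ** 2


def average_speed(f, t, h):
    """Slope of the secant line through (t, f(t)) and (t + h, f(t + h))."""
    return (f(t + h) - f(t)) / h


# Shrink the time window around t = 2 and watch the average speed settle.
for h in (1.0, 0.1, 0.01, 0.001):
    print(h, average_speed(position, 2.0, h))
```

For this assumed position function the secant slope simplifies to \(12 + 3h\), so the printed values approach 12 as \(h\) shrinks. That limiting value is the instantaneous speed at \(t = 2\).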
Note that there are other notations which are frequently used to denote the derivative of a function. The following all denote the same concept:
\[ f'(x) \hspace{7mm} D_x f(x) \hspace{7mm} \dot{f}(x) \hspace{7mm} \frac{d}{dx}f(x)\]
The first and last are the notations that will be used most frequently throughout this book. The second is typically reserved for derivatives of functions involving multiple variables. The third notation is frequently used in physics to indicate a derivative with respect to time. Each notation type has a name: the first is Lagrange’s Notation, the second is Euler’s Notation, the third is Newton’s Notation, and the last is Leibniz’s Notation.
3.2 Examples Applying the Limit Definition
3.2.1 Derivative of Constants
3.2.2 Derivative of Lines
Now that we have a mathematical definition, we want to apply that definition to functions. Let’s start with a really simple function: a straight line!
Notice that no matter which two points are selected on a line, they always return the same slope, namely, the slope of the line itself! Therefore, if we put the formula of a line in the limit definition above, we ought to get the slope of the line back. Recall that any line can be written in slope-intercept form as \(f(x) = ax + b\), where \(a\) is the slope of the line and \(b\) is the y-intercept of that line. Therefore, we have:
\[
f(x + h) = a(x + h) + b
\]
\[
f(x) = ax + b
\]
Therefore,
\[
f'(x) = \lim_{h \rightarrow 0} \frac{f(x + h) - f(x)}{h} = \lim_{h \rightarrow 0}\frac{a(x + h) + b - (ax + b)}{h}
\]\[
= \lim_{h \rightarrow 0} \frac{ax + ah + b - ax - b}{h} = \lim_{h \rightarrow 0} \frac{ah}{h} = \lim_{h\rightarrow 0} a = a
\]
Wonderful! The definition returned the value we expected (the slope of the line)!
3.2.3 Derivative of Monomials
Next, let’s try to compute the derivative of \(f(x) = x^2\), a parabola:
\[ f'(x) = \lim_{h \rightarrow 0} \frac{f(x + h) - f(x)}{h} = \lim_{h \rightarrow 0} \frac{(x + h)^2 - x^2}{h}\]\[ = \lim_{h \rightarrow 0} \frac{(x^2 + 2xh + h^2) - x^2}{h} = \lim_{h \rightarrow 0} \frac{2xh + h^2}{h} = \lim_{h \rightarrow 0} (2x + h) = 2x.\]
The interactive graph below illustrates the relationship between the tangent lines of a parabola (on the left) and the value of the slope (the point on the line \(y = 2x\)).
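We can also find the derivative of the function \(f(x) = x^3\) using an argument identical to the one just provided. We have:
\[ f'(x) = \lim_{h \rightarrow 0} \frac{(x + h)^3 - x^3}{h} = \lim_{h \rightarrow 0} \frac{(x^3 + 3x^2h + 3xh^2 + h^3) - x^3}{h}\]\[ = \lim_{h \rightarrow 0} \frac{3x^2h + 3xh^2 + h^3}{h} = \lim_{h \rightarrow 0} (3x^2 + 3xh + h^2) = 3x^2.\]
Now suppose we wanted to find the derivatives of \(x^4\) and \(x^5\) and … and so on. It would take a long time for us to compute the derivative for all of these curves! One thing to notice is that for both \(x^2\) and \(x^3\), the exponent became a coefficient. After bringing the exponent down, we reduced the exponent by one. Therefore we might expect the following:
\[ \frac{d}{dx} x^n = nx^{n - 1} \]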
This turns out to be true. Proving that this is true requires the use of the Binomial Theorem, which the reader may not know. The reader is not obligated to understand the proof of this fact; however, the reader must be familiar with the formula, as it will arise frequently later.
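Before working through the proof, the conjectured pattern can be checked numerically. The following is only a sketch: the helper `difference_quotient`, the step size `1e-6`, and the test point `x = 1.5` are choices made for illustration. It compares the difference quotient of \(x^n\) with \(nx^{n-1}\) for a few exponents.

```python
def difference_quotient(f, x, h=1e-6):
    """Approximate f'(x) by the slope of a short secant line."""
    return (f(x + h) - f(x)) / h


x = 1.5
for n in range(2, 6):
    approx = difference_quotient(lambda t: t ** n, x)
    expected = n * x ** (n - 1)  # value predicted by the conjectured rule
    print(n, approx, expected)
```

The two columns should agree to several decimal places. This is evidence for the pattern, not a proof of it.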
We will let the exponent be \(n\), which will stand for any positive integer (for now):
\[f(x) = x^n\]
\[f(x + h) = (x + h)^n\]
Now, there is a very valuable theorem called the Binomial Theorem that you learned (but may have forgotten) in Algebra. The theorem is the following:
\[ (a + b)^n = \sum_{i = 0}^n {n \choose i} a^i b^{n - i}\]
Therefore, we have:
\[ f'(x) = \lim_{h \rightarrow 0}\frac{f(x + h) - f(x)}{h} = \lim_{h \rightarrow 0}\frac{(x + h)^n - x^n}{h} = \lim_{h \rightarrow 0}\frac{\sum_{i = 0}^{n} {n \choose i} x^i h^{n-i} - x^n}{h}\]\[ = \lim_{h \rightarrow 0}\frac{x^n + \sum_{i = 0}^{n - 1} {n \choose i} x^ih^{n - i} - x^n}{h} = \lim_{h \rightarrow 0} \frac{\sum_{i = 0}^{n - 1} {n \choose i} x^i h^{n - i}}{h} = \lim_{h\rightarrow 0}\sum_{i = 0}^{n - 1} {n \choose i} x^i h^{(n - 1) - i}\]
Now, all of the terms in the last quantity but one will vanish, since each contains a positive power of \(h\) and we are taking the limit as \(h \rightarrow 0\). The only term for which this is not true is the one with index \(i = n - 1\), since \(h^{(n - 1) - (n - 1)} = h^0 = 1\):
\[ f'(x) = {n \choose n - 1} x^{n - 1} = nx^{n - 1} \]
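The role of the single surviving term can be seen by evaluating the sum directly. The following is a minimal sketch; the function name `quotient_via_binomial` is hypothetical. It computes \(\sum_{i = 0}^{n - 1} {n \choose i} x^i h^{(n - 1) - i}\) and compares it with the difference quotient of \(x^n\).

```python
from math import comb


def quotient_via_binomial(x, h, n):
    """Difference quotient of x**n written as the binomial sum over i = 0..n-1."""
    return sum(comb(n, i) * x ** i * h ** (n - 1 - i) for i in range(n))


x, n = 2.0, 4
for h in (0.5, 0.05, 0.005):
    direct = ((x + h) ** n - x ** n) / h
    print(h, direct, quotient_via_binomial(x, h, n))

# Setting h = 0 leaves only the i = n - 1 term, which equals n * x**(n - 1).
print(quotient_via_binomial(x, 0.0, n), n * x ** (n - 1))
```

Unlike the original quotient, the binomial sum can be evaluated at \(h = 0\), because the division by \(h\) has already been carried out.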
We will illustrate the limit definition one last time, then we will proceed to rules regarding the derivative.
Suppose we seek the derivative of \(f(x) = \sqrt{x}\). Then we have the following:
\[ f(x) = \sqrt{x} \]
and
\[ f(x + h) = \sqrt{x + h}\]
Therefore, we have:
\[f'(x) = \lim_{h\rightarrow 0}\frac{f(x + h) - f(x)}{h} = \lim_{h \rightarrow 0}\frac{\sqrt{x + h} - \sqrt{x}}{h}\]
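Notice that if we naïvely allow \(h \rightarrow 0\) before manipulating the fraction, we end up with \(\frac{\sqrt{x} - \sqrt{x}}{0} = \frac{0}{0}\). This expression is undefined, so it tells us nothing about the value of the limit. We’ll have to use a little trick to cancel out the \(h\) from the denominator: multiply the numerator and denominator by the conjugate \(\sqrt{x + h} + \sqrt{x}\).
\[f'(x) = \lim_{h \rightarrow 0}\frac{\sqrt{x + h} - \sqrt{x}}{h}\cdot\frac{\sqrt{x + h} + \sqrt{x}}{\sqrt{x + h} + \sqrt{x}} = \lim_{h \rightarrow 0}\frac{(x + h) - x}{h\left(\sqrt{x + h} + \sqrt{x}\right)}\]\[ = \lim_{h \rightarrow 0}\frac{1}{\sqrt{x + h} + \sqrt{x}} = \frac{1}{2\sqrt{x}}\]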
Note that \(\frac{d}{dx}\sqrt{x} = \frac{1}{2\sqrt{x}} = \frac{1}{2}x^{1/2 - 1}\) also satisfies the formula \(\frac{d}{dx} x^n = nx^{n - 1}\), even though we assumed \(n\) to be a positive integer when proving it. In fact, the formula \(\frac{d}{dx} x^n = nx^{n - 1}\) holds for any real number \(n\), although the proof for non-integer \(n\) requires more than the finite Binomial Theorem used above.
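The conjugate trick is useful numerically as well as algebraically. The sketch below uses two illustrative helpers, `naive_quotient` and `conjugate_quotient`, which are names introduced here. It evaluates the original difference quotient for \(\sqrt{x}\) and the form obtained after multiplying by the conjugate.

```python
import math


def naive_quotient(x, h):
    """Difference quotient of sqrt as first written; undefined at h = 0."""
    return (math.sqrt(x + h) - math.sqrt(x)) / h


def conjugate_quotient(x, h):
    """Same quantity after multiplying by the conjugate; defined at h = 0."""
    return 1.0 / (math.sqrt(x + h) + math.sqrt(x))


x = 4.0
for h in (1e-2, 1e-4, 1e-6):
    print(h, naive_quotient(x, h), conjugate_quotient(x, h))

# The conjugate form can be evaluated at h = 0 and gives 1 / (2 * sqrt(x)).
print(conjugate_quotient(x, 0.0), 1.0 / (2.0 * math.sqrt(x)))
```

The two forms agree for nonzero \(h\), but only the second avoids subtracting two nearly equal numbers. The first can therefore lose precision in floating-point arithmetic when \(h\) is very small.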
3.3 Derivatives of log(x), exp(x), sin(x), and cos(x)
3.3.1 Log(x)
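First, we will provide a sketch demonstrating how to determine the derivative of the function \(f(x) = \ln(x)\) for \(x > 0\) using the definition 3.2. Using the properties of logarithms, we have:
\[\frac{d}{dx}\ln(x) = \lim_{h \rightarrow 0}\frac{\ln(x + h) - \ln(x)}{h} = \lim_{h \rightarrow 0}\frac{1}{h}\ln\left(1 + \frac{h}{x}\right) = \lim_{h \rightarrow 0}\ln\left(1 + \frac{h}{x}\right)^{1/h}\]
We take the common definition of Euler’s number (as opposed to Euler’s Constant) \(e = \lim_{h \rightarrow \infty} \left(1 + \frac{1}{h}\right)^h\). Therefore, rearranging the limit above with the substitution \(u = x/h\) (so that \(1/h = u/x\), and \(u \rightarrow \infty\) as \(h \rightarrow 0\) through positive values), we have:
\[\frac{d}{dx}\ln(x) = \lim_{u \rightarrow \infty}\frac{1}{x}\ln\left(1 + \frac{1}{u}\right)^{u} = \frac{1}{x}\ln(e) = \frac{1}{x}\]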
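Both limits in this sketch can be checked numerically. The code below is only an illustration; `e_approx` and `ln_quotient` are hypothetical helper names. It evaluates \(\left(1 + \frac{1}{h}\right)^h\) for growing \(h\) and the difference quotient of \(\ln(x)\) for shrinking \(h\).

```python
import math


def e_approx(h):
    """The expression (1 + 1/h)**h, which tends to e as h grows."""
    return (1.0 + 1.0 / h) ** h


def ln_quotient(x, h):
    """Difference quotient of the natural logarithm at x."""
    return (math.log(x + h) - math.log(x)) / h


for h in (10.0, 1e3, 1e6):
    print(h, e_approx(h), math.e)

x = 2.0
for h in (1e-2, 1e-4, 1e-6):
    print(h, ln_quotient(x, h), 1.0 / x)
```

The first loop should approach \(e \approx 2.71828\) and the second should approach \(1/x\), consistent with the derivative of the natural logarithm being \(1/x\).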
3.3.2 Exp(x)
Next, we will find the derivative of the function \(f(x) = e^x\). We will use a clever trick involving the chain rule (Theorem 3.3). The proof goes like this:
\[f(x) = e^x\]
Recall that \(\ln(e^x) = x\) since \(\ln(x)\) and \(e^x\) are inverse functions. Thus, taking the natural log of both sides of the equation above, we have:
\[\ln\left( f(x)\right) = x\]
Now we can take a derivative with respect to \(x\). By the chain rule, we first take the derivative of the natural log part, then we will take the derivative of \(f(x)\). Furthermore, the derivative of \(x\) with respect to \(x\) is just one. Therefore, we have:
\[\frac{1}{f(x)}\frac{d}{dx}f(x) = 1\]
Multiplying both sides of this equation by \(f(x)\), we have:
\[\frac{d}{dx} f(x) = f(x) = e^x\]
Therefore, the derivative of \(f(x) = e^x\) is itself! That’s an amazing thing; indeed, up to multiplication by a constant, \(e^x\) is the only function for which this holds! An illustration of this remarkable fact is provided below.
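The key step of the argument, \(\frac{1}{f(x)}\frac{d}{dx}f(x) = 1\), can be checked numerically. In this minimal sketch the helper `difference_quotient` and the step size `1e-6` are assumptions made for illustration. It approximates the derivative of \(e^x\) and divides by \(e^x\).

```python
import math


def difference_quotient(f, x, h=1e-6):
    """Approximate f'(x) by the slope of a short secant line."""
    return (f(x + h) - f(x)) / h


for x in (-1.0, 0.0, 2.0):
    slope = difference_quotient(math.exp, x)
    # The ratio f'(x) / f(x) should be close to 1 at every x.
    print(x, slope, math.exp(x), slope / math.exp(x))
```

A ratio close to 1 at each point is consistent with the slope of \(e^x\) matching its height everywhere.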
3.3.3 Sin(x)
We will now compute the derivative of the sine function, \(\sin x\). As usual, we begin with the limit definition of the derivative:
\[\frac{d}{dx}\sin x = \lim_{h\rightarrow 0}\frac{\sin(x + h) - \sin(x)}{h}\]
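We will now expand \(\sin(x + h)\). To do so, we will be using formula (1.7), the angle-sum identity \(\sin(x + h) = \sin(x)\cos(h) + \cos(x)\sin(h)\). Using this result, we have:
\[\frac{d}{dx}\sin x = \lim_{h\rightarrow 0}\frac{\sin(x)\cos(h) + \cos(x)\sin(h) - \sin(x)}{h} = \lim_{h\rightarrow 0}\left[\cos(x)\frac{\sin(h)}{h} - \sin(x)\frac{1 - \cos(h)}{h}\right]\]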
We now have two limits we must compute. The first is \(\lim_{h\rightarrow 0}\frac{\sin(h)}{h}\) and the second is \(\lim_{h\rightarrow 0}\frac{1 - \cos(h)}{h}\).
These are limits computed in the previous chapter.
\[ \frac{d}{dx} \sin(x) = \cos(x)\lim_{h\rightarrow 0}\frac{\sin(h)}{h} - \sin(x)\lim_{h\rightarrow 0}\frac{1 - \cos(h)}{h} \]
Those limits should look familiar! Indeed, we have already proved that \(\lim_{h\rightarrow 0}\frac{\cos(h) - 1}{h} = 0\), so that \(\lim_{h\rightarrow 0}\frac{1 - \cos(h)}{h} = 0\) as well, and \(\lim_{h\rightarrow 0}\frac{\sin(h)}{h} = 1\). Applying these limits, we have:
\[\frac{d}{dx}\sin(x) = \cos(x)\cdot 1 - \sin(x)\cdot 0 = \cos(x)\]
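The two limits, and the resulting derivative, can be checked numerically. The helper names below (`sin_over_h`, `one_minus_cos_over_h`, `sine_quotient`) are introduced only for this sketch.

```python
import math


def sin_over_h(h):
    """The quantity sin(h) / h, which tends to 1 as h -> 0."""
    return math.sin(h) / h


def one_minus_cos_over_h(h):
    """The quantity (1 - cos(h)) / h, which tends to 0 as h -> 0."""
    return (1.0 - math.cos(h)) / h


def sine_quotient(x, h):
    """Difference quotient of the sine function at x."""
    return (math.sin(x + h) - math.sin(x)) / h


for h in (1e-1, 1e-2, 1e-4):
    print(h, sin_over_h(h), one_minus_cos_over_h(h))

x = 1.0
print(sine_quotient(x, 1e-6), math.cos(x))
```

The first printed column should approach 1 and the second should approach 0. The last line compares the difference quotient of \(\sin x\) at \(x = 1\) with \(\cos(1)\).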
We have seen how to compute the derivatives of a variety of functions which will be used throughout this text. With the derivatives of these functions, we now consider general rules that apply to any function.
3.4.1 Constant Rule
Constant factors are not affected by a derivative:
Theorem 3.1 (Constant Rule) Constants can be removed from inside (or be pushed into) a derivative at your leisure:
\[ \frac{d}{dx}(c f(x)) = c\frac{df(x)}{dx}\]
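Proof. The proof follows simply from the definition. We have
\[ \frac{d}{dx}(c f(x)) = \lim_{h \rightarrow 0}\frac{c f(x + h) - c f(x)}{h} = c\lim_{h \rightarrow 0}\frac{f(x + h) - f(x)}{h} = c\frac{df(x)}{dx}\]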
Example 3.3 (Product Rule 1) Compute the derivative of the function \(f(x) = e^x\cos x\).
To compute the derivative of \(f(x)\), we simply apply the product rule we just learned. We will have two terms: one involving a derivative of \(e^x\) and another involving a derivative of \(\cos x\). We have:
\[\frac{d}{dx}\left(e^x\cos x\right) = \frac{d}{dx}(e^x)\cdot\cos x + e^{x}\cdot\frac{d}{dx}(\cos x)\]
We already computed the derivatives of \(e^x\) (3.3.2) and \(\cos x\) (3.3.4). In particular, we know that \(\frac{d}{dx} e^x = e^x\) and \(\frac{d}{dx} \cos x = -\sin x\). Therefore, the derivative becomes
\[\frac{df}{dx} = \frac{d}{dx}(e^x)\cos x + e^{x}\frac{d}{dx}(\cos x) = \boxed{e^x\cos x - e^x\sin x}\]
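As a sanity check on the boxed result, one can compare it with a difference quotient. In this minimal sketch the helper `difference_quotient` and the sample point `x = 0.7` are arbitrary choices made for illustration.

```python
import math


def f(x):
    """The function from the example: e^x * cos(x)."""
    return math.exp(x) * math.cos(x)


def f_prime(x):
    """The boxed derivative: e^x * cos(x) - e^x * sin(x)."""
    return math.exp(x) * math.cos(x) - math.exp(x) * math.sin(x)


def difference_quotient(g, x, h=1e-6):
    """Approximate g'(x) by the slope of a short secant line."""
    return (g(x + h) - g(x)) / h


x = 0.7
print(difference_quotient(f, x), f_prime(x))
```

The two printed values should agree to several decimal places.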
3.4.3 Chain Rule
A rule that will appear repeatedly in this text is the Chain Rule. This rule describes how to compute the derivative of two functions which are composed:
Theorem 3.3 (Chain Rule) Suppose that \(f(x)\) and \(g(x)\) are two differentiable functions, and we seek the derivative of the function \(f(g(x))\). Its derivative is given by:
\[\frac{d}{dx}f(g(x)) = f'(g(x))\cdot g'(x)\]
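Proof. We simply use the limit definition of the derivative. Let \(u = g(x)\) and \(k = g(x + h) - g(x)\), so that \(g(x + h) = u + k\), and assume that \(k \neq 0\) for small nonzero \(h\). We have the following:
\[\frac{d}{dx}f(g(x)) = \lim_{h\rightarrow 0}\frac{f(g(x + h)) - f(g(x))}{h} = \lim_{h\rightarrow 0}\left[\frac{f(u + k) - f(u)}{k}\cdot\frac{g(x + h) - g(x)}{h}\right]\]
The limit of a product is equal to the product of the limits assuming the limits exist (2.4.1). We assumed that both \(f(x)\) and \(g(x)\) are differentiable, so both limits exist. Moreover, since \(g\) is differentiable it is continuous, so \(k \rightarrow 0\) as \(h \rightarrow 0\). Therefore, we have:
\[\frac{d}{dx}f(g(x)) = \lim_{k\rightarrow 0}\frac{f(u + k) - f(u)}{k}\cdot\lim_{h\rightarrow 0}\frac{g(x + h) - g(x)}{h} = f'(u)\, g'(x)\]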
But we assumed \(u = g(x)\). Therefore, \(f'(u) = f'(g(x))\). Hence, we find
\[\frac{d}{dx} f(g(x)) = f'(g(x)) g'(x),\]
as was to be shown.
Example 3.4 (Chain Rule 1) Compute the derivative of the function \(k(x) = e^{\sin x}\).
We begin by identifying the “inner” and “outer” functions. Our outer function is \(f(x) = e^x\), while the inner function is \(g(x) = \sin x\). We must compute the derivative of each. We have:
\[f'(x) = \frac{d}{dx} e^x = e^x\]
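In other words, since \(g(x) = \sin x\), then \(f'(g(x)) = e^{\sin x}\). Furthermore, we have
\[g'(x) = \frac{d}{dx}\sin x = \cos x\]
Therefore, by the chain rule,
\[\frac{dk}{dx} = f'(g(x))\cdot g'(x) = \boxed{e^{\sin x}\cos x}\]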
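The chain rule result for this example, \(f'(g(x))\cdot g'(x) = e^{\sin x}\cos x\), can be compared against a difference quotient. In this minimal sketch the helper `difference_quotient` and the sample point are arbitrary illustrative choices.

```python
import math


def k(x):
    """The composite function from the example: exp(sin(x))."""
    return math.exp(math.sin(x))


def k_prime(x):
    """Chain rule: outer derivative at the inner function, times inner derivative."""
    return math.exp(math.sin(x)) * math.cos(x)


def difference_quotient(g, x, h=1e-6):
    """Approximate g'(x) by the slope of a short secant line."""
    return (g(x + h) - g(x)) / h


x = 0.4
print(difference_quotient(k, x), k_prime(x))
```

Agreement between the two printed values supports the chain rule computation for this particular function.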
3.4.4 Quotient Rule
The next rule is a combination of the product rule (3.2) and the chain rule (3.3). It is called the Quotient Rule, and it allows us to compute the derivative of a quotient of two functions:
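Theorem 3.4 (Quotient Rule) Suppose you are given two differentiable functions \(f(x)\) and \(g(x)\). Then their quotient is differentiable, too, wherever \(g(x) \neq 0\). Its derivative is given by:
\[ \frac{d}{dx}\left(\frac{f(x)}{g(x)}\right) = \frac{f'(x)\cdot g(x) - f(x)\cdot g'(x)}{g(x)^2} \]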
To remember this rule, you can use the following mnemonic:
\[ \frac{d}{dx}\left(\frac{f(x)}{g(x)}\right) = \frac{\text{Low}\cdot \text{D High} - \text{High}\cdot \text{D Low}}{\text{Low}^2} \]
This is read as Low D High Minus High D Low over Low Squared. “D High” is the derivative of the function in the numerator, while “D Low” is the derivative of the function in the denominator.
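The quotient rule is really just the product rule in disguise, as the following proof illustrates.
Proof. We write the quotient as a product and apply the product rule:
\[ \frac{d}{dx}\left(\frac{f(x)}{g(x)}\right) = \frac{d}{dx}\left(f(x)\cdot (g(x))^{-1}\right) = f'(x)\cdot (g(x))^{-1} + f(x)\cdot \frac{d}{dx}(g(x))^{-1}\]
To find the derivative \(\frac{d}{dx} (g(x))^{-1}\), we must use the chain rule. We have \(\frac{d}{dx} (g(x))^{-1} = - (g(x))^{-2} \cdot g'(x)\). Therefore, we have the following:
\[ \frac{d}{dx}\left(\frac{f(x)}{g(x)}\right) = f'(x)\cdot (g(x))^{-1} - f(x)\cdot g'(x)\cdot (g(x))^{-2}\]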
We need \((g(x))^{-2}\) in both terms above. Therefore, we will multiply and divide the first term by \(g(x)\) to obtain:
\[ = f'(x)\cdot g(x) (g(x))^{-2} - f(x)\cdot g'(x)\cdot (g(x))^{-2}\]\[ = \frac{f'(x)\cdot g(x) - f(x)\cdot g'(x)}{g(x)^2}\]
As was to be shown.
Example 3.5 (Quotient Rule 1) Compute the derivative of \(f(x) = \frac{e^x}{x^2}\).
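The derivative of this function is a simple application of the quotient rule. We write out each of the pieces described in the mnemonic provided above:
\[ \text{High} = e^x \hspace{7mm} \text{D High} = e^x \hspace{7mm} \text{Low} = x^2 \hspace{7mm} \text{D Low} = 2x \]
Therefore, we have:
\[ \frac{df}{dx} = \frac{x^2\cdot e^x - e^x\cdot 2x}{(x^2)^2} = \frac{e^x(x^2 - 2x)}{x^4} = \boxed{\frac{e^x(x - 2)}{x^3}} \]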
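The mnemonic can be turned directly into a small computation. The following is a sketch only: the names `high`, `d_high`, `low`, `d_low`, and `difference_quotient` are illustrative. It assembles "Low D High minus High D Low over Low squared" for \(f(x) = \frac{e^x}{x^2}\) and compares the result with a difference quotient.

```python
import math


def f(x):
    """The function from the example: e^x / x^2."""
    return math.exp(x) / x ** 2


def f_prime(x):
    """Low * D High - High * D Low, all over Low squared."""
    high, d_high = math.exp(x), math.exp(x)
    low, d_low = x ** 2, 2.0 * x
    return (low * d_high - high * d_low) / low ** 2


def difference_quotient(g, x, h=1e-6):
    """Approximate g'(x) by the slope of a short secant line."""
    return (g(x + h) - g(x)) / h


x = 1.5
print(difference_quotient(f, x), f_prime(x))
```

Note that the numerator vanishes at \(x = 2\), so the tangent line to this curve is horizontal there.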
With the quotient rule, we can now determine the derivative of the function \(f(x) = \tan x\):
3.4.5 Tan(x)
Computing the derivative of \(\tan(x)\) is much simpler than computing the derivatives of \(\sin(x)\) and \(\cos(x)\). This is because \(\tan(x) = \frac{\sin(x)}{\cos(x)}\), so the quotient rule applies directly:
\[\frac{d}{dx}\tan(x) = \frac{d}{dx}\left( \frac{\sin(x)}{\cos(x)}\right) = \frac{\cos(x)\frac{d}{dx}\sin(x) -\sin(x)\frac{d}{dx}\cos(x)}{\cos(x)^2}\]
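where the last quantity follows from the quotient rule (3.6). We just determined that \(\frac{d}{dx}\sin(x) = \cos(x)\) (3.3) and \(\frac{d}{dx}\cos(x) = -\sin(x)\) (3.4). Therefore, we have:
\[\frac{d}{dx}\tan(x) = \frac{\cos(x)\cdot\cos(x) - \sin(x)\cdot(-\sin(x))}{\cos(x)^2} = \frac{\cos^2(x) + \sin^2(x)}{\cos^2(x)} = \frac{1}{\cos^2(x)} = \sec^2(x)\]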
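Substituting the derivatives of sine and cosine into the quotient rule expression gives a numerator of \(\cos^2(x) + \sin^2(x) = 1\), so the derivative of \(\tan(x)\) is \(\frac{1}{\cos^2(x)}\). In this minimal sketch the helper names and the sample point `x = 0.3` are assumptions made for illustration. It checks the result against a difference quotient.

```python
import math


def tan_prime(x):
    """Quotient rule applied to sin(x) / cos(x)."""
    numerator = math.cos(x) * math.cos(x) - math.sin(x) * (-math.sin(x))
    return numerator / math.cos(x) ** 2


def difference_quotient(g, x, h=1e-6):
    """Approximate g'(x) by the slope of a short secant line."""
    return (g(x + h) - g(x)) / h


x = 0.3
print(difference_quotient(math.tan, x), tan_prime(x), 1.0 / math.cos(x) ** 2)
```

All three printed values should agree closely, provided \(x\) stays away from the points where \(\cos(x) = 0\).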
These derivative rules will be of great use to us throughout this text. We will find the derivatives of two more functions, then we will proceed to examples.