Calculus

Parametric Equations

1.1 Parametric Equations

🧭 Overview

🧠 One-sentence thesis

Parametric equations allow us to describe curves—including those that are not functions—by expressing both x and y as separate functions of an independent parameter, which is especially useful for modeling motion and circular patterns.

📌 Key points (3–5)

What parametric equations are: both x and y are defined as continuous functions of an independent parameter t (often representing time), generating a set of ordered pairs that trace out a curve.
Why they matter: parametric equations can describe curves that are not functions (e.g., circles, cycloids) and model real-world motion like planetary orbits or projectile paths.
Eliminating the parameter: you can often convert parametric equations into a single equation relating x and y, revealing familiar curve types (parabolas, ellipses, etc.).
Parameterization is flexible: a single curve can be represented by infinitely many different pairs of parametric equations; there is no unique parameterization.
Common confusion: the variables x and y serve two roles—as functions of t (e.g., x(t)) and as coordinate variables in ordered pairs (x, y); the parameter t itself does not appear in the final graph.

📐 What parametric equations are

📐 Definition and structure

Parametric equations: If x and y are continuous functions of t on an interval I, then the equations x = x(t) and y = y(t) are called parametric equations, and t is called the parameter.

The parameter t is an independent variable; both x and y depend on it.
As t varies over the interval I, the functions x(t) and y(t) generate a set of ordered pairs (x, y).
This set of ordered pairs forms the graph, called a parametric curve or plane curve, denoted by C.

🎯 Two roles of x and y

First role: x and y are functions of the independent variable t.
Second role: x and y designate the ordered pairs (x, y) that are plotted.
It is important to distinguish the variables x and y from the functions x(t) and y(t).

🔄 Orientation

The orientation of a curve indicates the direction a point moves along the graph as t increases.
Example: as t progresses from −3 to 2, a point travels along the curve in a specific direction, often shown with arrows on the graph.

🔄 Eliminating the parameter

🔄 What it means

Eliminating the parameter: rewriting the two parametric equations as a single equation relating x and y.
This helps identify the curve type (parabola, ellipse, line, etc.) using prior knowledge of equations in the plane.

🔧 How to eliminate the parameter

Method 1: Solve one equation for t, then substitute into the other.
- Example: If y(t) = 2t + 1, solve for t: t = (y − 1)/2. Substitute into x(t) to get x as a function of y.
Method 2: Use trigonometric identities when sine and cosine are involved.
- Example: If x(t) = 4 cos t and y(t) = 3 sin t, divide to get cos t = x/4 and sin t = y/3, then use cos² t + sin² t = 1 to obtain (x/4)² + (y/3)² = 1 (an ellipse).
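As a quick numerical check of Method 2, the sketch below samples the parameterization x = 4 cos t, y = 3 sin t and confirms that each sampled point satisfies the ellipse equation. The function name `ellipse_residual` and the sample count are illustrative choices, not part of the original text.

```python
import math

def ellipse_residual(t):
    """Return (x/4)^2 + (y/3)^2 - 1 for the point at parameter t."""
    x = 4 * math.cos(t)
    y = 3 * math.sin(t)
    return (x / 4) ** 2 + (y / 3) ** 2 - 1

# Sample t over one full revolution; every residual should be close to 0.
max_residual = max(abs(ellipse_residual(2 * math.pi * k / 100)) for k in range(100))
print(max_residual < 1e-12)
```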

⚠️ Domain restrictions

When eliminating the parameter, pay attention to the original limits on t.
These limits impose domain or range restrictions on the resulting equation.
Example: If −2 ≤ t ≤ 6 and x = √(2t + 4), then when t = −2, x = 0; when t = 6, x = 4. The graph is only the portion of the curve where 0 ≤ x ≤ 4.

🔁 Parameterizing a curve

🔁 Going the other direction

Parameterization of a curve: starting with an equation y = f(x) and finding parametric equations that represent it.
There are infinitely many ways to parameterize a given curve.

🛠️ Simple parameterization

The simplest approach: define x(t) = t, then replace x with t in the equation for y.
Example: For y = 2x² − 3, set x(t) = t and y(t) = 2t² − 3.
If there is no restriction on the domain of the original equation, there is no restriction on t.

🎨 Creative parameterization

You have complete freedom in choosing the parameterization.
Example: For y = 2x² − 3, you could also choose x(t) = 3t − 2. Then substitute into y: y(t) = 2(3t − 2)² − 3 = 2(9t² − 12t + 4) − 3 = 18t² − 24t + 5.
The only requirement: the range of x(t) must match the domain of the original equation (often all real numbers).
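The following sketch checks that two different parameterizations of y = 2x² − 3 trace points on the same parabola. The helper names (`on_parabola`, `simple`, `shifted`) and the sample values of t are assumptions made for illustration.

```python
def on_parabola(x, y, tol=1e-9):
    """True if (x, y) satisfies y = 2x^2 - 3 up to a tolerance."""
    return abs(y - (2 * x ** 2 - 3)) < tol

def simple(t):
    # x(t) = t, y(t) = 2t^2 - 3
    return t, 2 * t ** 2 - 3

def shifted(t):
    # x(t) = 3t - 2; expanding 2(3t - 2)^2 - 3 gives 18t^2 - 24t + 5
    return 3 * t - 2, 18 * t ** 2 - 24 * t + 5

samples = [-2.0, -0.5, 0.0, 1.0, 2.5]
print(all(on_parabola(*simple(t)) for t in samples))
print(all(on_parabola(*shifted(t)) for t in samples))
```

Both checks should print True: the two parameterizations reach the same curve, only at different values of t.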

🚴 Cycloids and special curves

🚴 The cycloid

Cycloid: the path traced by a point on the edge of a circle (or bicycle wheel) as the circle rolls along a straight line.

For a circle of radius a, the parametric equations are:
- x(t) = a(t − sin t)
- y(t) = a(1 − cos t)
Physical interpretation: Imagine an ant clinging to the edge of a bicycle tire rolling down a straight road; the ant's path is a cycloid.

🔽 Derivation of the cycloid

The center of the wheel moves parallel to the x-axis at height a: x(t) = at, y(t) = a.
The ant rotates around the center in a circular path (clockwise if the wheel moves left to right): x(t) = −a sin t, y(t) = −a cos t (relative to the center).
Adding these together gives the cycloid equations.
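The derivation can be checked numerically: adding the center's position to the rotating offset should reproduce the closed-form cycloid equations. This is a minimal sketch; the function names and the radius a = 2 are illustrative assumptions.

```python
import math

def center(t, a):
    # Center of the rolling wheel: moves to the right at constant height a.
    return a * t, a

def offset(t, a):
    # Position of the point relative to the center (clockwise rotation).
    return -a * math.sin(t), -a * math.cos(t)

def cycloid(t, a):
    # Closed-form cycloid equations.
    return a * (t - math.sin(t)), a * (1 - math.cos(t))

def max_mismatch(a, ts):
    """Largest coordinate difference between center + offset and the cycloid."""
    worst = 0.0
    for t in ts:
        cx, cy = center(t, a)
        ox, oy = offset(t, a)
        x, y = cycloid(t, a)
        worst = max(worst, abs(cx + ox - x), abs(cy + oy - y))
    return worst

print(max_mismatch(2.0, [0.0, 1.0, math.pi, 5.0]) < 1e-12)
```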

🔵 The hypocycloid

Hypocycloid: the path traced by a point on the edge of a smaller circle rolling inside a larger circle.

General parametric equations (larger circle radius a, smaller circle radius b):
- x(t) = (a − b) cos t + b cos((a − b)/b · t)
- y(t) = (a − b) sin t − b sin((a − b)/b · t)
The ratio a/b determines the number of cusps (pointed corners) on the graph.
Example: If a = 4 and b = 1, the hypocycloid has four cusps.
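If a/b is irrational, the hypocycloid has infinitely many cusps and never returns to its starting point. Such curves are sometimes loosely called space-filling; more precisely, the curve passes arbitrarily close to every point of a ring-shaped region without covering it.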
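For the four-cusp case a = 4, b = 1, the sketch below evaluates the general equations at t = 0, π/2, π, 3π/2, where the tracing point touches the larger circle, and also compares the curve with the simplified form x = 4 cos³ t, y = 4 sin³ t that follows from the triple-angle identities. The function name `hypocycloid` is an illustrative choice.

```python
import math

def hypocycloid(t, a, b):
    """Point on the hypocycloid for outer radius a and rolling radius b."""
    k = (a - b) / b
    x = (a - b) * math.cos(t) + b * math.cos(k * t)
    y = (a - b) * math.sin(t) - b * math.sin(k * t)
    return x, y

# The four cusps for a = 4, b = 1 lie on the outer circle of radius 4.
cusps = [hypocycloid(j * math.pi / 2, 4, 1) for j in range(4)]
cusp_radii = [math.hypot(x, y) for x, y in cusps]
print(cusp_radii)

# Compare with the simplified form 4cos^3(t), 4sin^3(t) at an arbitrary t.
t = 0.8
x, y = hypocycloid(t, 4, 1)
print(abs(x - 4 * math.cos(t) ** 3) < 1e-12, abs(y - 4 * math.sin(t) ** 3) < 1e-12)
```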

🔄 Curtate and prolate cycloids

Curtate cycloid: the path traced by a point on a spoke of a wheel, closer to the center than the edge (distance b < a from the center).
- The path has less up-and-down motion than a standard cycloid.
Prolate cycloid: the path traced by a point on the flange of a wheel, farther from the center than the edge (distance b > a).
- The path includes loops where the point actually moves backward even as the wheel rolls forward.

🌍 Real-world applications

🌍 Planetary orbits

The orbit of Earth around the Sun is elliptical, with the Sun at one focus.
The day number in a year can be treated as a parameter t that determines Earth's position.
Parametric equations x(t) and y(t) describe the coordinates of Earth's position as a function of time.

🎯 Projectile motion

Example: An airplane drops a package; the trajectory is given by x = 100t, y = −4.9t² + 4000.
The parameter t represents time; x and y give the horizontal and vertical position.
Parametric equations naturally model motion where position depends on time.
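As an illustration, the sketch below finds when and where the package reaches the ground (y = 0). The source does not state units; the coefficient 4.9 suggests meters and seconds, which is an assumption here, as are the names `position` and `t_ground`.

```python
import math

def position(t):
    """Horizontal and vertical position of the package at time t."""
    return 100 * t, -4.9 * t ** 2 + 4000

# Solve -4.9 t^2 + 4000 = 0 for t >= 0.
t_ground = math.sqrt(4000 / 4.9)
x_ground, y_ground = position(t_ground)
print(f"t = {t_ground:.2f}, x = {x_ground:.0f}")
```

Since 4000/4.9 = 40000/49, the exact landing time is t = 200/7, and the horizontal distance covered is 20000/7.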

🔬 Other applications

The witch of Agnesi curve (x = 2a cot t, y = 2a sin² t) models water waves, spectral line distributions, and the Cauchy probability distribution.
Cycloid curves appear in physics and engineering contexts.

Calculus of Parametric Curves

1.2 Calculus of Parametric Curves

🧭 Overview

🧠 One-sentence thesis

Calculus techniques—derivatives, tangent lines, arc length, area, and surface area—can be extended to parametrically defined curves by expressing rates of change and integrals in terms of the parameter.

📌 Key points (3–5)

Derivative formula for parametric curves: dy/dx equals (dy/dt) divided by (dx/dt), allowing slope calculation without eliminating the parameter.
Arc length of parametric curves: integrate the square root of the sum of squared derivatives with respect to the parameter.
Area under parametric curves: integrate y(t) times x'(t) over the parameter interval.
Common confusion: the parameter t cannot always be eliminated to get y = f(x), but calculus still works directly with parametric equations.
Second derivatives and critical points: can be found by differentiating dy/dx with respect to t and dividing by dx/dt.

📐 Derivatives of parametric equations

📐 The fundamental derivative formula

Theorem: For a plane curve defined by x = x(t) and y = y(t), if x'(t) and y'(t) exist and x'(t) ≠ 0, then dy/dx = (dy/dt) / (dx/dt) = y'(t) / x'(t).

This formula uses the Chain Rule: if the parameter can be eliminated to give y = F(x), then y(t) = F(x(t)), and differentiating gives y'(t) = F'(x(t)) · x'(t).
Rearranging: F'(x(t)) = y'(t) / x'(t), which is dy/dx.
Why it matters: you can find slopes and tangent lines even when the curve cannot be written as y = f(x).

🔍 Finding critical points

A critical point occurs where dy/dx = 0 or does not exist.
When dy/dx = 0: the numerator y'(t) = 0 (horizontal tangent).
When dy/dx is undefined: the denominator x'(t) = 0 (vertical tangent).
Example: For x(t) = t² − 3, y(t) = 2t − 1, we get dy/dx = 2/(2t) = 1/t, which is undefined at t = 0, giving the vertex of a parabola at (−3, −1).

✏️ Tangent line equations

Once you have dy/dx at a specific t-value, use point-slope form: y − y₀ = m(x − x₀).
Calculate x₀ = x(t₀) and y₀ = y(t₀) to find the point of tangency.
Example: For the parabola above at t = 2, the slope is 1/2 and the point is (1, 3), giving tangent line y = (1/2)x + 5/2.
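The tangent-line steps can be sketched in code for the curve x(t) = t² − 3, y(t) = 2t − 1. The function names are illustrative, and the derivatives are entered by hand rather than computed symbolically.

```python
def x(t):
    return t ** 2 - 3

def y(t):
    return 2 * t - 1

def dx(t):
    return 2 * t

def dy(t):
    return 2.0

def tangent_line(t0):
    """Return (slope, intercept) of the tangent line at parameter t0."""
    if dx(t0) == 0:
        raise ValueError("dx/dt = 0: dy/dx is undefined at this parameter value")
    m = dy(t0) / dx(t0)          # dy/dx = y'(t) / x'(t)
    b = y(t0) - m * x(t0)        # from y - y0 = m (x - x0)
    return m, b

print(tangent_line(2))  # (0.5, 2.5), i.e. y = (1/2)x + 5/2
```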

🔁 Second derivatives

🔁 Formula for d²y/dx²

Second derivative formula: d²y/dx² = [(d/dt)(dy/dx)] / (dx/dt)

First find dy/dx as a function of t.
Differentiate that expression with respect to t.
Divide by dx/dt.
Example: If dy/dx = 1/t, then d(1/t)/dt = −1/t², and dividing by dx/dt = 2t gives d²y/dx² = −1/(2t³).
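A numerical cross-check of the three steps, assuming the curve x = t² − 3, y = 2t − 1 (so dy/dx = 1/t and dx/dt = 2t): the derivative of dy/dx with respect to t is approximated by a central difference, which is a stand-in for exact differentiation and not the source's method.

```python
def dydx(t):
    return 1 / t          # slope of the curve as a function of t

def dxdt(t):
    return 2 * t

def second_derivative(t, h=1e-5):
    """Approximate d2y/dx2 = [d/dt (dy/dx)] / (dx/dt)."""
    d_slope_dt = (dydx(t + h) - dydx(t - h)) / (2 * h)
    return d_slope_dt / dxdt(t)

t = 2.0
approx = second_derivative(t)
exact = -1 / (2 * t ** 3)   # closed form from the example
print(approx, exact)
```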

📊 Concavity

The second derivative tells you concavity: positive means concave up, negative means concave down.
Points where d²y/dx² changes sign are inflection points.

📏 Area under parametric curves

📏 Area formula

Theorem: For a non-self-intersecting curve x = x(t), y = y(t), a ≤ t ≤ b, where x(t) is differentiable, the area under the curve is A = integral from a to b of y(t) · x'(t) dt.

This comes from approximating the area with rectangles of height y(x(t̄ᵢ)) and width x(tᵢ) − x(tᵢ₋₁).
Multiplying and dividing by Δt and taking the limit gives the integral formula.
Example: For the cycloid x(t) = t − sin t, y(t) = 1 − cos t from 0 to 2π, the area is 3π.
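The cycloid area can be confirmed numerically. The sketch below approximates the integral of y(t) · x'(t) with a midpoint rule; the helper name `area_under` and the number of subintervals are assumptions, and the midpoint rule is only one of several reasonable quadrature choices.

```python
import math

def area_under(y, dx, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of y(t) * x'(t) over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        total += y(t) * dx(t)
    return total * h

# Cycloid: y(t) = 1 - cos t and x'(t) = 1 - cos t on [0, 2*pi].
area = area_under(lambda t: 1 - math.cos(t), lambda t: 1 - math.cos(t), 0.0, 2 * math.pi)
print(area, 3 * math.pi)
```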

⚠️ Don't confuse with arc length

Area under the curve uses y(t) · x'(t).
Arc length (below) uses the square root of (x'(t))² + (y'(t))².

📐 Arc length of parametric curves

📐 Arc length formula

Theorem: For x = x(t), y = y(t), t₁ ≤ t ≤ t₂, where x(t) and y(t) are differentiable, the arc length is s = integral from t₁ to t₂ of sqrt[(dx/dt)² + (dy/dt)²] dt.

Derivation: approximate the curve with line segments of length sqrt[(x(tₖ) − x(tₖ₋₁))² + (y(tₖ) − y(tₖ₋₁))²].
By the Mean Value Theorem, x(tₖ) − x(tₖ₋₁) = x'(t̂ₖ)Δt and similarly for y.
Taking the limit as the partition gets finer gives the integral.

🔄 Connection to standard arc length

If the parameter can be eliminated to give y = F(x), then dy/dt = F'(x) · dx/dt.
Substituting into the parametric arc length formula and simplifying recovers the standard formula: integral of sqrt[1 + (dy/dx)²] dx.

🧮 Example calculation

For a semicircle x(t) = 3 cos t, y(t) = 3 sin t, 0 ≤ t ≤ π:
x'(t) = −3 sin t, y'(t) = 3 cos t.
Arc length = integral from 0 to π of sqrt[9 sin²t + 9 cos²t] dt = integral of 3 dt = 3π.
This matches the known formula πr for a semicircle of radius 3.
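The same calculation can be sketched numerically. The helper `arc_length` below is a hypothetical name; it applies a midpoint rule to the integrand sqrt[(dx/dt)² + (dy/dt)²].

```python
import math

def arc_length(dx, dy, t1, t2, n=1000):
    """Midpoint-rule approximation of the parametric arc length integral."""
    h = (t2 - t1) / n
    total = 0.0
    for i in range(n):
        t = t1 + (i + 0.5) * h
        total += math.hypot(dx(t), dy(t))   # sqrt(dx^2 + dy^2)
    return total * h

# Semicircle of radius 3: x'(t) = -3 sin t, y'(t) = 3 cos t on [0, pi].
length = arc_length(lambda t: -3 * math.sin(t), lambda t: 3 * math.cos(t), 0.0, math.pi)
print(length, 3 * math.pi)
```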

🌀 Surface area of revolution

🌀 Surface area formula

Formula: For a curve x = x(t), y = y(t), a ≤ t ≤ b revolved around the x-axis (with y(t) ≥ 0), the surface area is S = 2π · integral from a to b of y(t) · sqrt[(x'(t))² + (y'(t))²] dt.

This is analogous to the standard surface area formula S = 2π · integral of f(x) · sqrt[1 + (f'(x))²] dx.
The factor y(t) is the radius of rotation, and the square root term is the arc length element.

🌍 Example: sphere surface area

For a semicircle x(t) = r cos t, y(t) = r sin t, 0 ≤ t ≤ π:
x'(t) = −r sin t, y'(t) = r cos t.
Surface area = 2π · integral from 0 to π of r sin t · sqrt[r² sin²t + r² cos²t] dt.
Simplifying: 2π · integral of r² sin t dt = 2πr²[−cos t] from 0 to π = 4πr².
This is the well-known formula for the surface area of a sphere.
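A numerical version of this example, assuming the sample radius r = 2 and a midpoint rule for the integral (both are illustrative choices, as is the name `surface_area`):

```python
import math

def surface_area(y, dx, dy, a, b, n=10_000):
    """Approximate 2*pi * integral of y(t) * sqrt(x'(t)^2 + y'(t)^2) over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        total += y(t) * math.hypot(dx(t), dy(t))
    return 2 * math.pi * total * h

r = 2.0
area = surface_area(
    lambda t: r * math.sin(t),      # y(t), the radius of rotation
    lambda t: -r * math.sin(t),     # x'(t)
    lambda t: r * math.cos(t),      # y'(t)
    0.0,
    math.pi,
)
print(area, 4 * math.pi * r ** 2)
```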

Polar Coordinates

1.3 Polar Coordinates

🧭 Overview

🧠 One-sentence thesis

This excerpt does not contain substantive content about polar coordinates; it consists only of exercise problems from parametric equations and an introduction to calculus of parametric curves.

📌 Key points (3–5)

The excerpt does not present material on polar coordinates (section 1.3).
The visible content covers exercises on parametric curves (end of section 1.1) and calculus of parametric curves (section 1.2).
Section 1.2 focuses on derivatives, tangent lines, and critical points for parametrically defined curves.
The derivative formula for parametric equations is the central tool presented.
Common confusion: distinguishing where the derivative is zero (horizontal tangent) from where it is undefined (vertical tangent or cusp); both kinds of point are critical points.

📐 Derivatives of parametric equations

📐 The derivative formula

Derivative of Parametric Equations: For a plane curve defined by x = x(t) and y = y(t), if x'(t) and y'(t) exist and x'(t) ≠ 0, then dy/dx = (dy/dt) / (dx/dt) = y'(t) / x'(t).

This is not the same as taking the derivative of y with respect to x directly; instead, you compute the ratio of the two derivatives with respect to the parameter t.
The formula applies even when the curve cannot be written as y = f(x).
The proof uses the Chain Rule: if y(t) = F(x(t)), then y'(t) = F'(x(t)) · x'(t), so F'(x(t)) = y'(t) / x'(t).

🔍 Critical points in parametric curves

A critical point occurs where dy/dx = 0 or where dy/dx does not exist.
Zero derivative: y'(t) = 0 and x'(t) ≠ 0 → horizontal tangent.
Undefined derivative: x'(t) = 0 (and y'(t) ≠ 0) → vertical tangent or cusp.
Don't confuse: a point where both x'(t) and y'(t) are zero is not covered by the formula and requires separate analysis.

🧮 Worked examples

🧮 Parabola example

Given: x(t) = t² − 3, y(t) = 2t − 1, −3 ≤ t ≤ 4.
Compute: x'(t) = 2t, y'(t) = 2.
Result: dy/dx = 2 / (2t) = 1/t.
Critical point: derivative undefined at t = 0, which gives the point (−3, −1).
Interpretation: this is the vertex of a parabola opening to the right.

🧮 Cubic example

Given: x(t) = 2t + 1, y(t) = t³ − 3t + 4, −2 ≤ t ≤ 5.
Compute: x'(t) = 2, y'(t) = 3t² − 3.
Result: dy/dx = (3t² − 3) / 2.
Critical points: dy/dx = 0 when 3t² − 3 = 0, so t = ±1.
- At t = −1: point (−1, 6), a relative maximum.
- At t = 1: point (3, 2), a relative minimum.

🧮 Circle example

Given: x(t) = 5 cos t, y(t) = 5 sin t, 0 ≤ t ≤ 2π.
Compute: x'(t) = −5 sin t, y'(t) = 5 cos t.
Result: dy/dx = (5 cos t) / (−5 sin t) = −cot t.
Critical points:
- Derivative zero when cos t = 0 (top and bottom of circle).
- Derivative undefined when sin t = 0 (left and right edges of circle).
The table in the excerpt lists t = 0, π/2, π, 3π/2, 2π and their corresponding (x, y) coordinates: (5, 0), (0, 5), (−5, 0), (0, −5), (5, 0).

🧩 Context and scope

🧩 What this excerpt covers

The excerpt is from Chapter 1 (Parametric Equations and Polar Coordinates) but does not present section 1.3 content.
Section 1.2 learning objectives (listed but not fully covered in the excerpt):
- Determine derivatives and tangent equations for parametric curves.
- Find area under a parametric curve.
- Use arc length formula for parametric curves.
- Apply the surface area formula to a surface of revolution generated by a parametric curve.
Only the first objective (derivatives and tangents) is illustrated in the visible examples.

🧩 Missing content

The excerpt does not explain polar coordinates, polar equations, or conversion between Cartesian and polar forms.
Sections on area, arc length, and surface area for parametric curves are mentioned but not shown.
The exercises at the beginning (problems 56–61) involve sketching parametric curves using technology but do not relate to polar coordinates.

Area and Arc Length in Polar Coordinates

1.4 Area and Arc Length in Polar Coordinates

🧭 Overview

🧠 One-sentence thesis

Calculus extends to parametric curves by using the chain rule to find derivatives through the ratio of dy/dt to dx/dt, enabling calculation of slopes, critical points, and tangent lines for curves that cannot be expressed as simple y = f(x) functions.

📌 Key points (3–5)

Core derivative formula: For parametric curves x = x(t) and y = y(t), the derivative dy/dx equals (dy/dt) / (dx/dt), provided dx/dt is not zero.
Critical points: Occur when dy/dt = 0 (horizontal tangent) or when dx/dt = 0 (vertical tangent or undefined derivative).
Why parametric derivatives matter: They allow us to find slopes and tangent lines for curves that cannot be written as y = f(x), such as circles or curves that loop back on themselves.
Common confusion: The derivative dy/dx is not simply dy/dt; you must divide dy/dt by dx/dt—the parameter t acts as an intermediary through the chain rule.
Application scope: The method works for any parameterized curve where both x'(t) and y'(t) exist and x'(t) ≠ 0.

📐 The parametric derivative formula

📐 What the formula says

Theorem 1.1 (Derivative of Parametric Equations): For a plane curve defined by x = x(t) and y = y(t), if x'(t) and y'(t) exist and x'(t) ≠ 0, then dy/dx = (dy/dt) / (dx/dt) = y'(t) / x'(t).

This is not an arbitrary rule; it comes from the chain rule.
The parameter t links x and y, so changes in y with respect to x must account for how both change with respect to t.
Why the ratio: If you eliminate t to get y = F(x), then differentiating y(t) = F(x(t)) by the chain rule gives y'(t) = F'(x(t)) · x'(t), so F'(x(t)) = y'(t) / x'(t).

🔗 Chain rule justification

The excerpt proves this by assuming t can be eliminated to yield y = F(x), so:

Start with y(t) = F(x(t))
Differentiate both sides: y'(t) = F'(x(t)) · x'(t)
Solve for F'(x(t)): F'(x(t)) = y'(t) / x'(t)
Since F'(x(t)) is just dy/dx, the formula follows.

Don't confuse: dy/dx is not the same as dy/dt. You must divide by dx/dt to convert the rate of change with respect to t into the rate of change with respect to x.

🔍 Finding and interpreting critical points

🔍 What makes a point critical

A critical point of a function y = f(x) is any point x = x₀ in its domain where f'(x₀) = 0 or f'(x₀) does not exist.

For parametric curves, this translates to:

dy/dx = 0 when dy/dt = 0 (numerator zero) → horizontal tangent
dy/dx undefined when dx/dt = 0 (denominator zero) → vertical tangent or cusp

🔍 Why critical points matter

They locate relative maxima, minima, and points where the tangent is vertical.
The excerpt shows that even simple parametric curves (like circles) have critical points at natural geometric features (top, bottom, sides).

📊 Worked examples from the excerpt

📊 Parabola opening to the right

Given: x(t) = t² − 3, y(t) = 2t − 1, −3 ≤ t ≤ 4

Compute: x'(t) = 2t, y'(t) = 2
Derivative: dy/dx = 2 / (2t) = 1/t
Critical point: dy/dx is undefined when t = 0
- At t = 0: x(0) = −3, y(0) = −1 → point (−3, −1)
- This is the vertex of the parabola.

Interpretation: The derivative blows up at the vertex because the tangent line is vertical there.

📊 Cubic-like curve

Given: x(t) = 2t + 1, y(t) = t³ − 3t + 4, −2 ≤ t ≤ 5

Compute: x'(t) = 2, y'(t) = 3t² − 3
Derivative: dy/dx = (3t² − 3) / 2
Critical points: dy/dx = 0 when 3t² − 3 = 0 → t = ±1
- At t = −1: x(−1) = −1, y(−1) = 6 → point (−1, 6) is a relative maximum
- At t = 1: x(1) = 3, y(1) = 2 → point (3, 2) is a relative minimum

Interpretation: The numerator vanishing gives horizontal tangents, which correspond to local extrema.

📊 Circle

Given: x(t) = 5 cos t, y(t) = 5 sin t, 0 ≤ t ≤ 2π

Compute: x'(t) = −5 sin t, y'(t) = 5 cos t
Derivative: dy/dx = (5 cos t) / (−5 sin t) = −cot t
Critical points:
- dy/dx = 0 when cos t = 0 → t = π/2, 3π/2 (top and bottom of circle)
- dy/dx undefined when sin t = 0 → t = 0, π, 2π (left and right sides of circle)

t	x(t)	y(t)	Interpretation
0	5	0	Right edge (vertical tangent)
π/2	0	5	Top (horizontal tangent)
π	−5	0	Left edge (vertical tangent)
3π/2	0	−5	Bottom (horizontal tangent)
2π	5	0	Right edge again

Interpretation: The circle's geometric symmetry is reflected in the critical points—horizontal tangents at top/bottom, vertical tangents at left/right.

⚠️ Common pitfalls and reminders

⚠️ When the formula breaks down

The formula requires x'(t) ≠ 0.
If x'(t) = 0, the derivative dy/dx is undefined (often a vertical tangent).
If both x'(t) = 0 and y'(t) = 0 simultaneously, the point may be a cusp or require further analysis.

⚠️ Eliminating the parameter vs. using the formula

You can eliminate t to get y = f(x) and differentiate directly (as shown in the line segment example: y = (3x/2) − 17/2, so dy/dx = 3/2).
But for many curves (circles, loops, etc.), elimination is impossible or impractical—the parametric derivative formula works regardless.

⚠️ Interpreting dy/dx geometrically

dy/dx is the slope of the tangent line to the curve at the point (x(t), y(t)).
It is not the slope of the position vector or the speed; it is purely the geometric slope in the xy-plane.

Calculus of Parametric Curves

1.5 Conic Sections

🧭 Overview

🧠 One-sentence thesis

The derivative of a parametric curve can be calculated using the ratio of the derivatives of its component functions, enabling us to find slopes, critical points, and other calculus properties without eliminating the parameter.

📌 Key points (3–5)

Core formula: The derivative dy/dx for parametric equations x = x(t) and y = y(t) is given by (dy/dt) divided by (dx/dt), provided dx/dt is not zero.
Finding critical points: Critical points occur where dy/dx equals zero (when dy/dt = 0 but dx/dt ≠ 0) or where dy/dx is undefined (when dx/dt = 0).
Why this works: The formula comes from the Chain Rule applied to the relationship y(t) = F(x(t)) when the parameter can be eliminated.
Common confusion: The derivative dy/dx is not simply y'(t) or x'(t) alone—it is the ratio of these two derivatives.
Broader context: This approach extends calculus to curves that cannot be written as y = f(x), including circles, loops, and other complex paths.

📐 The derivative formula

📐 Statement of the theorem

Theorem 1.1 (Derivative of Parametric Equations): Consider a plane curve defined by parametric equations x = x(t) and y = y(t). Suppose that x'(t) and y'(t) exist, and assume that x'(t) ≠ 0. Then the derivative dy/dx is given by:

dy/dx = (dy/dt) / (dx/dt) = y'(t) / x'(t)

This formula allows you to compute the slope of the tangent line at any point on the curve without eliminating the parameter t.
The condition x'(t) ≠ 0 is necessary to avoid division by zero.

🔗 Why the formula works (Chain Rule proof)

Assume the parameter t can be eliminated to give a differentiable function y = F(x).
Then y(t) = F(x(t)).
Differentiating both sides with respect to t using the Chain Rule: y'(t) = F'(x(t)) · x'(t).
Solving for F'(x(t)): F'(x(t)) = y'(t) / x'(t).
Since F'(x(t)) = dy/dx, the theorem is proven.

⚠️ When to use it

The formula works regardless of whether the curve can be described by a function y = f(x).
Example: A circle defined parametrically cannot be written as a single function y = f(x), but the derivative formula still applies.

🔍 Finding critical points

🔍 What are critical points

A critical point of a function y = f(x) is any point x = x₀ in its domain such that either f'(x₀) = 0 or f'(x₀) does not exist.

For parametric curves, critical points occur at parameter values t where:

dy/dx = 0: This happens when dy/dt = 0 and dx/dt ≠ 0.
dy/dx is undefined: This happens when dx/dt = 0.

🎯 How to locate them

Calculate x'(t) and y'(t).
Form the ratio dy/dx = y'(t) / x'(t).
Find where the numerator y'(t) = 0 (horizontal tangents).
Find where the denominator x'(t) = 0 (vertical tangents or undefined derivative).
Substitute these t-values into x(t) and y(t) to get the actual points on the curve.
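The steps above can be sketched for the curve x(t) = 2t + 1, y(t) = t³ − 3t + 4. The roots of y'(t) are entered by hand, and the `classify` helper is an illustrative addition that applies a first-derivative sign test; that test is valid here because x'(t) = 2 > 0, so x increases with t.

```python
def x(t):
    return 2 * t + 1

def y(t):
    return t ** 3 - 3 * t + 4

def dx(t):
    return 2.0

def dy(t):
    return 3 * t ** 2 - 3

def classify(t, eps=1e-3):
    """Classify a horizontal-tangent point by the sign change of dy/dx."""
    before = dy(t - eps) / dx(t - eps)
    after = dy(t + eps) / dx(t + eps)
    if before > 0 > after:
        return "relative maximum"
    if before < 0 < after:
        return "relative minimum"
    return "neither"

# y'(t) = 3t^2 - 3 = 0 gives t = -1 and t = 1; x'(t) = 2 is never zero.
for t in (-1.0, 1.0):
    print((x(t), y(t)), classify(t))
```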

📊 Worked examples

📊 Example: Parabola opening right

Given: x(t) = t² − 3, y(t) = 2t − 1, for −3 ≤ t ≤ 4

Calculate derivatives: x'(t) = 2t, y'(t) = 2.
Form the ratio: dy/dx = 2 / (2t) = 1/t.
The derivative is undefined when t = 0.
At t = 0: x(0) = −3, y(0) = −1, giving the point (−3, −1).
This point is the vertex of the parabola.

Don't confuse: Here x'(0) = 0 while y'(0) = 2 ≠ 0, so the tangent line at the vertex is vertical. An undefined derivative does not always mean a vertical tangent, however: if x'(t) and y'(t) are both zero, the point needs separate analysis.

📊 Example: Cubic-like curve

Given: x(t) = 2t + 1, y(t) = t³ − 3t + 4, for −2 ≤ t ≤ 5

Calculate derivatives: x'(t) = 2, y'(t) = 3t² − 3.
Form the ratio: dy/dx = (3t² − 3) / 2.
The derivative is zero when 3t² − 3 = 0, i.e., t = ±1.
At t = −1: x(−1) = −1, y(−1) = 6, giving the point (−1, 6) (relative maximum).
At t = 1: x(1) = 3, y(1) = 2, giving the point (3, 2) (relative minimum).

📊 Example: Circle

Given: x(t) = 5 cos(t), y(t) = 5 sin(t), for 0 ≤ t ≤ 2π

Calculate derivatives: x'(t) = −5 sin(t), y'(t) = 5 cos(t).
Form the ratio: dy/dx = (5 cos(t)) / (−5 sin(t)) = −cot(t).
The derivative is zero when cos(t) = 0: t = π/2, 3π/2 (top and bottom of circle).
The derivative is undefined when sin(t) = 0: t = 0, π, 2π (left and right edges of circle).

t	x(t)	y(t)	Point description
0	5	0	Right edge
π/2	0	5	Top
π	−5	0	Left edge
3π/2	0	−5	Bottom
2π	5	0	Right edge (again)

Key insight: On the left and right edges, the tangent line is vertical (derivative undefined). On the top and bottom, the tangent line is horizontal (derivative equals zero).

Vectors in the Plane

2.1 Vectors in the Plane

🧭 Overview

🧠 One-sentence thesis

Vectors combine magnitude and direction into a single mathematical object that can be added, scaled, and decomposed into components, making them essential tools for representing physical quantities like forces and velocities.

📌 Key points (3–5)

What vectors are: mathematical objects that have both magnitude and direction, not just size alone.
How to combine vectors: addition uses the parallelogram or triangle method; subtraction is defined as adding the negative; scalar multiplication changes length or reverses direction.
Component form and magnitude: a vector v = 〈x, y〉 has magnitude (length) ‖v‖ = √(x² + y²).
Unit vectors: vectors with magnitude 1; any vector can be turned into a unit vector by dividing by its magnitude; standard unit vectors i and j point along the x and y axes.
Common confusion: magnitude vs. direction—magnitude is a scalar (just a number), while the vector itself carries both size and orientation.

📐 What vectors represent

📐 Definition and purpose

A vector is a mathematical object that has both magnitude and direction.

Magnitude = size or length (a scalar).
Direction = orientation in space.
Vectors are used to represent quantities that cannot be described by a single number alone.
Example: a force of 10 newtons pushing eastward—both the strength (10) and the direction (east) matter.

🔧 Applications in physics and engineering

The excerpt states vectors are "often used in physics and engineering to represent forces and velocities, among other quantities."
Forces: push or pull with a specific direction.
Velocities: speed combined with direction of motion.
Don't confuse: speed (scalar) vs. velocity (vector)—speed has no direction, velocity does.

➕ Operations on vectors

➕ Vector addition

Parallelogram method: place vectors tail-to-tail; the sum is the diagonal of the parallelogram they form.
Triangle method: place the initial point of the second vector at the terminal point of the first; the sum runs from the initial point of the first to the terminal point of the second.
The excerpt describes the triangle method: "placing the initial point of w at the terminal point of v; then the vector sum v + w is the vector with an initial point that coincides with the initial point of v, and with a terminal point that coincides with the terminal point of w."
Example: if vector v moves you 3 steps east and vector w moves you 2 steps north, v + w is the direct path from start to finish.

➖ Vector subtraction

The vector difference v − w is defined as v + (−w) = v + (−1)w.

Subtraction is addition of the negative.
The negative of a vector has the same magnitude but opposite direction.
Example: if w points north, −w points south with the same length.

✖️ Scalar multiplication

Multiplying a vector v by a scalar k scales its length by |k| and, when k is negative, reverses its direction.
Positive scalar: stretches (k > 1) or shrinks (0 < k < 1) the vector and keeps its direction.
Negative scalar: reverses the direction and scales the length by |k|.
Example: 2v doubles the length of v; −1v flips v to the opposite direction.
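The three operations can be sketched with vectors stored as tuples of components. The function names and the east/north example vectors are illustrative assumptions.

```python
def add(v, w):
    """Component-wise sum of two plane vectors."""
    return (v[0] + w[0], v[1] + w[1])

def scale(k, v):
    """Scalar multiple k * v."""
    return (k * v[0], k * v[1])

def subtract(v, w):
    # v - w is defined as v + (-1)w
    return add(v, scale(-1, w))

v = (3, 0)   # 3 steps east
w = (0, 2)   # 2 steps north
print(add(v, w))        # (3, 2)
print(subtract(v, w))   # (3, -2)
print(scale(2, v))      # (6, 0)
```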

📊 Component form and magnitude

📊 Writing vectors in components

A vector is written in component form as v = 〈x, y〉.

x = horizontal component.
y = vertical component.
This breaks the vector into its effect along each axis.
Example: v = 〈3, 4〉 means 3 units in the x-direction and 4 units in the y-direction.

📏 Calculating magnitude

The magnitude of a vector v = 〈x, y〉 is a scalar: ‖v‖ = √(x² + y²).

Magnitude is the length of the vector.
It is always a non-negative number (a scalar, not a vector).
Example: for v = 〈3, 4〉, ‖v‖ = √(9 + 16) = √25 = 5.

🧭 Unit vectors

A unit vector u has magnitude 1 and can be found by dividing a vector by its magnitude: u = (1 / ‖v‖) v.

A unit vector points in the same direction as the original vector but has length exactly 1.
Useful for isolating direction from magnitude.
Example: if v = 〈3, 4〉 with magnitude 5, the unit vector is u = 〈3/5, 4/5〉.
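A minimal sketch of the magnitude and unit-vector calculations, with vectors stored as tuples. The function names are illustrative, and the zero-vector check is an added safeguard since the zero vector cannot be normalized.

```python
import math

def magnitude(v):
    """Length of a plane vector: sqrt(x^2 + y^2)."""
    return math.hypot(v[0], v[1])

def unit(v):
    """Unit vector in the direction of v: (1 / ||v||) v."""
    m = magnitude(v)
    if m == 0:
        raise ValueError("the zero vector cannot be normalized")
    return (v[0] / m, v[1] / m)

v = (3.0, 4.0)
print(magnitude(v))   # 5.0
print(unit(v))        # (0.6, 0.8)
```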

🔤 Standard unit vectors

i = 〈1, 0〉: points along the positive x-axis.
j = 〈0, 1〉: points along the positive y-axis.
Any vector v = 〈x, y〉 can be expressed as v = x i + y j.
This notation separates the x and y contributions explicitly.
Example: v = 〈3, 4〉 = 3i + 4j.

🔄 Special vectors

🔄 The zero vector

The vector with both initial point and terminal point (0, 0).

Magnitude is 0.
No direction (or equivalently, any direction).
Acts as the additive identity: v + 0 = v for any vector v.

Vectors in Three Dimensions

2.2 Vectors in Three Dimensions

🧭 Overview

🧠 One-sentence thesis

The three-dimensional coordinate system extends vector concepts from the plane by building around three perpendicular axes that intersect at a single point, enabling representation of quantities with magnitude and direction in space.

📌 Key points (3–5)

Extension to 3D: The three-dimensional coordinate system is built around three axes that intersect at right angles at a single point.
Vector operations in space: Dot products, projections, and vector arithmetic extend from 2D to 3D; the cross product is defined only for vectors in 3D.
Lines and planes: Lines are described by vector equations using direction vectors and position vectors; planes are defined by normal vectors and points.
Common confusion: Distance formulas and equations extend from 2D (two coordinates) to 3D (three coordinates); the structure is analogous but requires an additional dimension.
Applications: Work calculations use dot products of force and displacement vectors; geometric relationships use projections and angles.

📐 Fundamental 3D structures

📐 The coordinate system

The three-dimensional coordinate system is built around a set of three axes that intersect at right angles at a single point.

This is the foundation for representing vectors in space.
The system extends the 2D plane by adding a third perpendicular axis.
All three axes meet at a single origin point.

📏 Distance in space

Distance between two points in space: d = √((x₂ − x₁)² + (y₂ − y₁)² + (z₂ − z₁)²), the square root of the sum of the squared differences in all three coordinates.
This extends the 2D distance formula by adding the z-coordinate difference.
Example: To find how far apart two points are in 3D space, compute differences in x, y, and z, square each, sum them, and take the square root.

🔵 Spheres

A sphere with center (a, b, c) and radius r is defined by: (x − a)² + (y − b)² + (z − c)² = r².
This is the 3D analog of a circle equation.
All points satisfying this equation are exactly distance r from the center.

🧮 Vector operations in 3D

🧮 Component form and magnitude

A vector in 3D is written as v = ⟨x, y, z⟩ (extending the 2D form ⟨x, y⟩).
Magnitude: ‖v‖ = √(x² + y² + z²).
Unit vectors have magnitude 1 and can be found by dividing a vector by its magnitude: u = (1/‖v‖)v.

⊙ Dot product

The dot product of u and v: u · v = u₁v₁ + u₂v₂ + u₃v₃ = ‖u‖‖v‖ cos θ.

The dot product extends to three components in 3D.
It relates to the angle θ between vectors: cos θ = (u · v) / (‖u‖‖v‖).
Used to compute work: W = F · PQ = ‖F‖‖PQ‖ cos θ, where F is force and PQ is displacement.

⊗ Cross product

The cross product of two vectors produces a new vector.

In terms of unit vectors: u × v = (u₂v₃ − u₃v₂)i − (u₁v₃ − u₃v₁)j + (u₁v₂ − u₂v₁)k.
The cross product as defined here requires three-component vectors; two 2D vectors have no cross product that stays in their own plane.
The result is a vector perpendicular to both input vectors.

📊 Projections

Type	Formula	Meaning
Vector projection of v onto u	projᵤ v = ((u · v) / ‖u‖²)u	The vector component of v in the direction of u
Scalar projection of v onto u	compᵤ v = (u · v) / ‖u‖	The magnitude of the component of v along u

Scalar projection gives a number (how much of v follows u's direction).
Vector projection gives a vector (the part of v that points along u).
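
The two formulas in the table can be sketched in a few lines of Python. The helper names and the sample vectors below are assumptions chosen for illustration, and u is assumed to be nonzero:

```python
import math

def dot(u, v):
    """Sum of products of corresponding components."""
    return sum(a * b for a, b in zip(u, v))

def scalar_projection(v, u):
    """comp_u v = (u . v) / ||u||, a signed number."""
    return dot(u, v) / math.sqrt(dot(u, u))

def vector_projection(v, u):
    """proj_u v = ((u . v) / ||u||^2) u, a vector parallel to u."""
    scale = dot(u, v) / dot(u, u)
    return tuple(scale * a for a in u)

u = (3, 4, 0)
v = (5, 10, 2)
print(scalar_projection(v, u))  # (15 + 40) / 5 = 11.0
print(vector_projection(v, u))  # approximately (6.6, 8.8, 0.0)
```

The length of the vector projection equals the absolute value of the scalar projection, which is one quick way to sanity-check a hand computation.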

📏 Lines in 3D space

📏 Vector equation of a line

r = r₀ + tv, where r₀ is the position vector of a point P on the line and v is the direction vector.

r₀ = ⟨x₀, y₀, z₀⟩ locates a known point on the line.
v = ⟨a, b, c⟩ gives the direction the line travels.
The parameter t scales the direction vector to reach any point on the line.

📏 Parametric equations

x = x₀ + ta, y = y₀ + tb, z = z₀ + tc.
These separate the vector equation into three coordinate equations.
Each coordinate changes linearly with the parameter t.

📏 Symmetric equations

(x − x₀)/a = (y − y₀)/b = (z − z₀)/c.
This form eliminates the parameter t by setting all three ratios equal.
Useful when you need a relationship without an explicit parameter.

🛩️ Planes in 3D space

🛩️ Vector equation of a plane

n · PQ = 0, where P is a given point in the plane, Q is any point in the plane, and n is a normal vector.

The normal vector n is perpendicular to the plane.
PQ is the vector from the known point P to any point Q in the plane.
The dot product being zero means PQ is perpendicular to n, so Q lies in the plane.

🛩️ Scalar equation

a(x − x₀) + b(y − y₀) + c(z − z₀) = 0.
This expands the vector equation into coordinate form.
The coefficients a, b, c come from the normal vector n = ⟨a, b, c⟩.

🛩️ Distance from a point to a plane

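d = ‖projₙ QP‖ = |compₙ QP| = |QP · n| / ‖n‖, where P is the point whose distance is being measured, Q is any point in the plane, and n is a normal vector to the plane.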
This measures the perpendicular distance from point P to the plane.
The formula uses the scalar projection of the vector QP onto the normal vector.
Don't confuse: this is the shortest (perpendicular) distance, not the distance along any arbitrary direction.
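
As a hedged sketch of this formula, the function below takes the point P, any point Q in the plane, and a normal vector n. The function name and the sample plane are assumptions made for illustration:

```python
import math

def distance_point_to_plane(p, q, n):
    """Distance from point p to the plane through q with normal vector n."""
    qp = tuple(a - b for a, b in zip(p, q))               # vector from Q to P
    numerator = abs(sum(a * b for a, b in zip(qp, n)))    # |QP . n|
    return numerator / math.sqrt(sum(c * c for c in n))   # divide by ||n||

# The plane x + 2y + 2z = 1 contains Q = (1, 0, 0) and has normal n = (1, 2, 2).
print(distance_point_to_plane((4, 2, 3), (1, 0, 0), (1, 2, 2)))  # 13 / 3, about 4.333
```

Any other point Q in the same plane gives the same distance, because changing Q only changes QP by a vector perpendicular to n.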

🔧 Physical applications

🔧 Work and force

Work is generally thought of as the amount of energy it takes to move an object; if we represent an applied force by a vector F and the displacement of an object by a vector s, then the work done by the force is the dot product of F and s.

Work formula: W = F · s = ‖F‖‖s‖ cos θ.
Only the component of force in the direction of displacement contributes to work.
Example: If force is perpendicular to displacement (θ = 90°), cos θ = 0 and no work is done.

🔧 Representing quantities

Vectors represent quantities that have both magnitude and direction.
Common uses: forces, velocities, displacements.
The 3D framework allows modeling real-world scenarios where motion and forces occur in all three spatial dimensions.

The Dot Product

2.3 The Dot Product

🧭 Overview

🧠 One-sentence thesis

The dot product provides a way to multiply two vectors that yields a scalar, enabling calculations of projections, angles, and work done by forces.

📌 Key points (3–5)

What the dot product is: a scalar (not a vector) obtained by multiplying corresponding components of two vectors and summing them.
Key property: the dot product is commutative (order doesn't matter).
Geometric interpretation: relates to the angle between vectors and the projection of one vector onto another.
Physical application: work done by a force is computed using the dot product of force and displacement vectors.
Common confusion: dot product vs cross product—dot product yields a scalar; cross product (also mentioned) yields a vector.

🔢 Definition and basic properties

🔢 What the dot product computes

The dot product, or scalar product, of two vectors u = ⟨u₁, u₂, u₃⟩ and v = ⟨v₁, v₂, v₃⟩ is u · v = u₁v₁ + u₂v₂ + u₃v₃.

Multiply each corresponding pair of components, then add all the products together.
The result is a single number (scalar), not a vector.
Example: if u = ⟨2, 3, 1⟩ and v = ⟨4, 0, 5⟩, then u · v = 2×4 + 3×0 + 1×5 = 8 + 0 + 5 = 13.
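
The worked example above can be reproduced with a short function (the name `dot` is an illustrative choice):

```python
def dot(u, v):
    """Dot product: multiply corresponding components and sum the products."""
    return sum(a * b for a, b in zip(u, v))

u = (2, 3, 1)
v = (4, 0, 5)
print(dot(u, v))  # 2*4 + 3*0 + 1*5 = 13
print(dot(v, u))  # 13 again: swapping the order does not change the result
```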

🔄 Commutativity

The dot product is commutative:

u · v = v · u

This means the order in which you take the dot product does not change the result.

📐 Geometric interpretations

📏 Scalar projection

Scalar projection of v onto u: compᵤ(v) = (u · v) / ‖u‖

This measures how much of v lies in the direction of u.
It is a signed scalar: positive if v points generally in the same direction as u, negative if opposite.
The formula divides the dot product by the magnitude (length) of u.

🧭 Relationship to angle

The dot product is tied to the angle θ between two vectors through u · v = ‖u‖‖v‖ cos θ, which is the form that appears in the work formula:

Work formula: W = ‖F‖ ‖PQ→‖ cos θ

Solving for the angle gives cos θ = (u · v) / (‖u‖‖v‖).

🛠️ Physical application: Work

⚙️ Work done by a force

Work done by a force F to move an object through displacement vector PQ→ is W = F · PQ→ = ‖F‖ ‖PQ→‖ cos θ

Work is the dot product of the force vector and the displacement vector.
Equivalently, it is the product of the magnitudes of both vectors and the cosine of the angle between them.
Physical meaning: only the component of force in the direction of motion contributes to work.
Example: if a force pushes at an angle to the direction of motion, only the part of the force aligned with the motion does work; the perpendicular part does not.

🔍 Why the dot product fits

The dot product naturally captures "how much one vector goes in the direction of another."
When force and displacement are perpendicular (θ = 90°), cos θ = 0, so W = 0 (no work done).
When they are aligned (θ = 0°), cos θ = 1, so W is maximized.
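
A small numerical check of the two forms of the work formula, using hypothetical values (a 10 N force at 60° to a 5 m displacement); the helper name and the numbers are assumptions for illustration only:

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

theta = math.radians(60)                       # angle between force and displacement
force = (10 * math.cos(theta), 10 * math.sin(theta), 0.0)   # magnitude 10
displacement = (5.0, 0.0, 0.0)                 # magnitude 5, along the x-axis

work_from_dot = dot(force, displacement)       # W = F . PQ
work_from_angle = 10 * 5 * math.cos(theta)     # W = ||F|| ||PQ|| cos(theta)

print(work_from_dot, work_from_angle)          # both approximately 25
print(dot((0.0, 10.0, 0.0), displacement))     # 0.0: a perpendicular force does no work
```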

🆚 Distinguishing dot product from cross product

🆚 Output type

Operation	Input	Output	Example use
Dot product	Two vectors	Scalar	Projection, work, angle
Cross product	Two vectors	Vector	Finding a vector perpendicular to both inputs: u × v = (u₂v₃ − u₃v₂)i − (u₁v₃ − u₃v₁)j + (u₁v₂ − u₂v₁)k

Don't confuse: the dot product collapses two vectors into a number; the cross product produces a new vector perpendicular to both inputs.
Both operations are listed here for comparison, but section 2.3 focuses on the dot product.

The Cross Product

2.4 The Cross Product

🧭 Overview

🧠 One-sentence thesis

The cross product of two vectors produces a third vector that is orthogonal to both original vectors, with magnitude and direction determined by the angle between them and the right-hand rule.

📌 Key points (3–5)

What the cross product produces: a vector orthogonal (perpendicular) to both input vectors, not a scalar.
How magnitude is determined: length equals the product of the two vector magnitudes times the sine of the angle between them.
How direction is determined: by the right-hand rule.
Common confusion: cross product vs dot product—the dot product yields a scalar and uses cosine; the cross product yields a vector and uses sine.
Key algebraic properties: the cross product is anti-commutative (order matters and reverses sign) and distributes over addition.

🔧 What the cross product is

🔧 Definition and output type

The cross product u × v of two vectors u = 〈u₁, u₂, u₃〉 and v = 〈v₁, v₂, v₃〉 is a vector orthogonal to both u and v.

Unlike the dot product (which produces a scalar), the cross product produces a vector.
"Orthogonal" means perpendicular—the result is at right angles to both input vectors.
Example: if u points east and v points north, u × v is perpendicular to the ground plane; by the right-hand rule it points straight up, while v × u points straight down.

📏 Magnitude formula

The length of the cross product is given by:

‖u × v‖ = ‖u‖ · ‖v‖ · sin θ

where θ is the angle between u and v.

Notice this uses sine, not cosine.
When vectors are parallel (θ = 0° or 180°), sin θ = 0, so the cross product has zero length.
When vectors are perpendicular (θ = 90°), sin θ = 1, so the magnitude is maximized.

🧭 Direction: the right-hand rule

The direction of u × v is given by the right-hand rule.
This is a physical convention: point your right hand's fingers along the first vector (u), curl them toward the second vector (v), and your thumb points in the direction of u × v.

🧮 Computing the cross product

🧮 Algebraic formula

For vectors u = 〈u₁, u₂, u₃〉 and v = 〈v₁, v₂, v₃〉, the cross product is:

u × v = (u₂v₃ − u₃v₂) i − (u₁v₃ − u₃v₁) j + (u₁v₂ − u₂v₁) k

Breaking this down:

The i component: u₂v₃ − u₃v₂
The j component: −(u₁v₃ − u₃v₁) (note the minus sign in front)
The k component: u₁v₂ − u₂v₁

This formula allows direct calculation from components without needing to know the angle.
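
The component formula translates directly into code. The sketch below uses illustrative helper names and sample vectors that are not from the source, and it also checks orthogonality with a dot product:

```python
def cross(u, v):
    """Cross product of 3D vectors u and v, component by component."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (
        u2 * v3 - u3 * v2,       # i component
        -(u1 * v3 - u3 * v1),    # j component (note the leading minus sign)
        u1 * v2 - u2 * v1,       # k component
    )

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

u = (1, 2, 3)
v = (4, 5, 6)
w = cross(u, v)
print(w)                     # (-3, 6, -3)
print(dot(w, u), dot(w, v))  # 0 0, so w is orthogonal to both inputs
print(cross(v, u))           # (3, -6, 3), the negative of cross(u, v)
```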

⚙️ Properties of the cross product

⚙️ Anti-commutativity

u × v = −(v × u)

Order matters: swapping the vectors reverses the sign (flips the direction).

Don't confuse with the dot product, which is commutative: u · v = v · u.

➕ Distributive and scalar properties

The cross product distributes over addition and interacts with scalars:

u × (v + w) = u × v + u × w
c(u × v) = (cu) × v = u × (cv) for any scalar c

🚫 Special cases with zero and identical vectors

u × 0 = 0 × u = 0 (crossing with the zero vector gives the zero vector)
v × v = 0 (a vector crossed with itself is zero, because the angle is 0° and sin 0° = 0)

🔄 Scalar triple product

u · (v × w) = (u × v) · w

This property shows that the dot and cross products can be combined; the result is a scalar (since the final operation is a dot product).

🔍 Cross product vs dot product

Feature	Dot Product	Cross Product
Output type	Scalar (number)	Vector
Formula with angle	‖u‖‖v‖ cos θ	‖u‖‖v‖ sin θ
Orthogonality test	u · v = 0 means u ⊥ v	u × v gives a vector ⊥ to both u and v
Commutativity	u · v = v · u	u × v = −(v × u)
When vectors are parallel	Maximum magnitude	Zero magnitude
When vectors are perpendicular	Zero	Maximum magnitude

The cross product is orthogonal to both input vectors, a key distinction from the dot product.

Equations of Lines and Planes in Space

2.5 Equations of Lines and Planes in Space

🧭 Overview

🧠 One-sentence thesis

Lines in three-dimensional space can be described using direction vectors and a point on the line, expressed through vector, parametric, or symmetric equations, and the distance from a point to a line is calculated using the cross product.

📌 Key points (3–5)

Direction vector defines a line: a line in 3D is determined by a direction vector and a point through which it passes.
Three equivalent forms: vector equation, parametric equations, and symmetric equations all describe the same line.
Distance formula uses cross product: the distance from a point not on the line to the line involves the cross product of a connecting vector and the direction vector.
Common confusion: lines in 3D can be skew (not parallel and not intersecting), unlike in 2D where non-parallel lines always intersect.

📐 Describing a line in space

📐 What defines a line

A line in three dimensions is described by a direction vector and a point through which it passes.

The direction vector v = ⟨a, b, c⟩ tells you which way the line points.
The point P = (x₀, y₀, z₀) anchors the line in space.
Together, these two pieces of information uniquely determine the line.

📝 Vector equation form

The vector equation of a line is r = r₀ + tv, where r₀ = ⟨x₀, y₀, z₀⟩ is the position vector of point P, v is the direction vector, and t is a parameter.

This equation says: start at position r₀, then move along direction v by amount t.
The parameter t can be any real number (positive, negative, or zero).
When t = 0, you get the point P itself.

🔢 Alternative equation forms

🔢 Parametric equations

The parametric equations of the line are: x = x₀ + ta, y = y₀ + tb, z = z₀ + tc.

These are obtained by writing out the vector equation component by component.
Each coordinate (x, y, z) is expressed as a function of the parameter t.
Example: if you know t, you can plug it in to find the exact coordinates of a point on the line.
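
As a sketch of "plugging in t", the function below evaluates the three parametric equations for a given parameter value. The point, direction vector, and function name are hypothetical:

```python
def point_on_line(r0, v, t):
    """Point r0 + t*v on the line, computed coordinate by coordinate."""
    x0, y0, z0 = r0
    a, b, c = v
    return (x0 + t * a, y0 + t * b, z0 + t * c)

r0 = (1, 2, 3)    # hypothetical point P on the line
v = (2, -1, 4)    # hypothetical direction vector

print(point_on_line(r0, v, 0))   # (1, 2, 3): t = 0 gives P itself
print(point_on_line(r0, v, 2))   # (5, 0, 11)
print(point_on_line(r0, v, -1))  # (-1, 3, -1)
```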

🔢 Symmetric equations

The symmetric equations of the line are: (x − x₀)/a = (y − y₀)/b = (z − z₀)/c.

These eliminate the parameter t by setting all three expressions equal to each other.
Each fraction represents the same value of t.
Don't confuse: symmetric equations are not always usable—if any component of the direction vector (a, b, or c) is zero, that fraction is undefined.

📏 Distance from a point to a line

📏 The distance formula

If L is a line passing through point P with direction vector v, and Q is any point not on L, then the distance from Q to L is d = ‖PQ⃗ × v‖ / ‖v‖.

PQ⃗ is the vector from P (on the line) to Q (the external point).
The cross product PQ⃗ × v gives a vector perpendicular to both, whose magnitude relates to the "perpendicular distance."
Dividing by ‖v‖ normalizes the result to give the actual distance.
Why this works: the cross product magnitude measures the area of a parallelogram; dividing by the base (‖v‖) gives the height, which is the perpendicular distance.
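
The parallelogram argument can be sketched in code as follows. The helper names and the sample line are assumptions for illustration:

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    return math.sqrt(sum(c * c for c in u))

def distance_point_to_line(q, p, v):
    """Distance from point q to the line through p with direction vector v."""
    pq = tuple(a - b for a, b in zip(q, p))   # vector from P to Q
    # Parallelogram area divided by its base length gives the height.
    return norm(cross(pq, v)) / norm(v)

# Line along the x-axis; Q = (7, 3, 4) lies 5 units away from it.
print(distance_point_to_line((7, 3, 4), (0, 0, 0), (2, 0, 0)))  # 5.0
```

Scaling v changes both the numerator and the denominator by the same factor, so the result does not depend on the length of the direction vector.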

🔀 Relationships between lines in 3D

🔀 Four possible configurations

Relationship	Description
Parallel but not equal	Lines point in the same direction but never meet
Equal	The same line (all points coincide)
Intersecting	Lines cross at exactly one point
Skew	Lines are not parallel and do not intersect

🔀 What makes 3D different

In two dimensions, non-parallel lines must intersect.
In three dimensions, lines can be skew: they do not point in the same direction and they never meet.
Example: imagine one line on the floor and another on the ceiling, pointing in different directions—they are skew.
Don't confuse skew with parallel: skew lines have different direction vectors, while parallel lines have proportional direction vectors.
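
One possible way to tell the four configurations apart, sketched under the assumption that each line is given by a point and a direction vector. The approach (cross product for parallelism, scalar triple product for coplanarity), the function name, and the absolute tolerance are choices made here for illustration, not taken from the source:

```python
def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def classify_lines(p1, v1, p2, v2, tol=1e-12):
    """Classify the lines p1 + t*v1 and p2 + s*v2."""
    n = cross(v1, v2)
    w = tuple(b - a for a, b in zip(p1, p2))      # vector from p1 to p2
    if all(abs(c) <= tol for c in n):             # proportional direction vectors
        if all(abs(c) <= tol for c in cross(w, v1)):
            return "equal"                        # p2 also lies on the first line
        return "parallel"
    # Non-parallel lines meet exactly when w lies in the plane spanned by v1 and v2.
    if abs(dot(w, n)) <= tol:
        return "intersecting"
    return "skew"

# One line on the "floor", one on the "ceiling", pointing in different directions.
print(classify_lines((0, 0, 0), (1, 0, 0), (0, 0, 1), (0, 1, 0)))  # skew
print(classify_lines((0, 0, 0), (1, 0, 0), (0, 1, 0), (2, 0, 0)))  # parallel
```

The fixed tolerance is adequate for small exact examples like these; inputs with large or noisy coordinates would need a scaled tolerance.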

Quadric Surfaces

2.6 Quadric Surfaces

🧭 Overview

🧠 One-sentence thesis

Quadric surfaces are three-dimensional shapes whose cross-sections (traces) are conic sections, and they can be analyzed by examining how they intersect coordinate planes.

📌 Key points (3–5)

What a cylinder is: a set of parallel lines (rulings) passing through a given curve.
What a trace is: the intersection of a 3D surface with a plane; traces help visualize the shape.
How to find traces: set one coordinate to zero (z = 0 for xy-plane, x = 0 for yz-plane, y = 0 for xz-plane).
What quadric surfaces are: three-dimensional surfaces whose traces are conic sections (circles, ellipses, parabolas, hyperbolas).
Common confusion: a trace is not the entire surface—it is only the curve where the surface meets a particular plane.

🛢️ Cylinders and rulings

🛢️ What a cylinder is

A cylinder (or cylindrical surface) is a set of lines parallel to a given line passing through a given curve.

The parallel lines are called rulings.
A cylinder is not just a "can shape"—it is any surface formed by moving a line parallel to itself along a curve.
Example: if you have a circle in the xy-plane and extend vertical lines through every point on the circle, you get a circular cylinder.

📏 Rulings

Rulings are the parallel lines that make up the cylinder.
They all point in the same direction and pass through the base curve.

✂️ Traces and how to find them

✂️ What a trace is

A trace is the intersection of a three-dimensional surface and a plane.

A trace is a 2D curve that results from "slicing" the 3D surface with a plane.
Traces help you understand the shape of the surface by examining simpler 2D cross-sections.

🔍 How to find traces in coordinate planes

Plane	What to set	Result
xy-plane	z = 0	Trace in the xy-plane
yz-plane	x = 0	Trace in the yz-plane
xz-plane	y = 0	Trace in the xz-plane

To find the trace in a coordinate plane, set the coordinate perpendicular to that plane equal to zero.
Example: to see what the surface looks like in the xy-plane, set z = 0 and simplify the equation.

⚠️ Don't confuse

A trace is only the curve where the surface meets one specific plane—it is not the entire surface.
Different planes give different traces; together, these traces help you visualize the full 3D shape.

🎯 Quadric surfaces

🎯 What quadric surfaces are

Quadric surfaces are three-dimensional surfaces with traces composed of conic sections.

Conic sections include circles, ellipses, parabolas, and hyperbolas.
Every quadric surface can be expressed with an equation whose leading terms are Ax² + By² + Cz² (the source cuts off before the remaining terms of the general form).

🧩 Why traces matter for quadric surfaces

Because the traces are conic sections, you can identify and classify quadric surfaces by examining their traces in different planes.
Example: if all traces are ellipses or circles, the surface might be an ellipsoid; if some traces are hyperbolas, it might be a hyperboloid.

🔄 Relationship between traces and the full surface

The full surface is built from infinitely many traces stacked together.
By finding traces in the xy-, yz-, and xz-planes, you get a good sense of the overall shape.

Cylindrical and Spherical Coordinates

2.7 Cylindrical and Spherical Coordinates

🧭 Overview

🧠 One-sentence thesis

Cylindrical and spherical coordinate systems provide alternative ways to represent points in three-dimensional space, each with specific conversion formulas to and from Cartesian coordinates that simplify certain geometric problems.

📌 Key points (3–5)

Two alternative 3D coordinate systems: cylindrical coordinates extend polar coordinates by adding a z-component; spherical coordinates use distance from origin plus two angles.
Cylindrical coordinates: represented as (r, θ, z) where (r, θ) are polar coordinates in the xy-plane and z is the vertical component.
Spherical coordinates: represented as (ρ, θ, φ) where ρ is distance from origin, θ is the same angle as in cylindrical, and φ is the angle from the positive z-axis.
Conversion formulas exist in both directions: from cylindrical/spherical to Cartesian, from Cartesian to cylindrical/spherical, and between cylindrical and spherical.
Common confusion: the angle φ in spherical coordinates measures from the positive z-axis (not from the xy-plane), and the range is 0 ≤ φ ≤ π.

📐 Cylindrical coordinate system

📐 What cylindrical coordinates represent

In the cylindrical coordinate system, a point in space is represented by the ordered triple (r, θ, z), where (r, θ) represents the polar coordinates of the point's projection in the xy-plane and z represents the point's projection onto the z-axis.

Think of cylindrical coordinates as "polar coordinates plus height."
The r and θ components locate the point's projection in the horizontal (xy) plane: r is its distance from the z-axis and θ is its angle from the positive x-axis.
The z component measures vertical distance, just as in Cartesian coordinates.
Example: A point directly above (3, 0) in the xy-plane at height 5 would be (3, 0°, 5) in cylindrical coordinates.

🔄 Converting cylindrical to Cartesian

The conversion formulas are:

x = r cos θ
y = r sin θ
z = z

These formulas extend the familiar polar-to-Cartesian conversion by keeping z unchanged.

🔄 Converting Cartesian to cylindrical

The conversion formulas are:

r² = x² + y²
tan θ = y/x
z = z

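The first two formulas are identical to Cartesian-to-polar conversion in two dimensions. Because tan θ = y/x is satisfied by two angles that differ by π, choose the θ that lies in the same quadrant as (x, y).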
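
A minimal sketch of both conversion directions, assuming angles in radians and r ≥ 0. The function names are illustrative; `math.atan2` is used here as one way to solve tan θ = y/x with the correct quadrant, returning θ in (−π, π]:

```python
import math

def cylindrical_to_cartesian(r, theta, z):
    return (r * math.cos(theta), r * math.sin(theta), z)

def cartesian_to_cylindrical(x, y, z):
    r = math.hypot(x, y)       # r = sqrt(x^2 + y^2), taking r >= 0
    theta = math.atan2(y, x)   # angle in the quadrant that contains (x, y)
    return (r, theta, z)

# (-1, 1) lies in the second quadrant, so theta should be 3*pi/4, not -pi/4.
print(cartesian_to_cylindrical(-1.0, 1.0, 2.0))  # about (1.4142, 2.3562, 2.0)
```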

🌐 Spherical coordinate system

🌐 What spherical coordinates represent

In the spherical coordinate system, a point P in space is represented by the ordered triple (ρ, θ, φ), where ρ is the distance between P and the origin (ρ ≠ 0), θ is the same angle used to describe the location in cylindrical coordinates, and φ is the angle formed by the positive z-axis and line segment OP, where O is the origin and 0 ≤ φ ≤ π.

ρ (rho) measures how far the point is from the origin—the "radius" in 3D.
θ (theta) is the same horizontal angle as in cylindrical coordinates.
φ (phi) measures the angle down from the positive z-axis.
Don't confuse: φ = 0 means the point is on the positive z-axis; φ = π/2 means the point is in the xy-plane; φ = π means the point is on the negative z-axis.

🔄 Converting spherical to Cartesian

The conversion formulas are:

x = ρ sin φ cos θ
y = ρ sin φ sin θ
z = ρ cos φ

Notice that the z-coordinate depends only on ρ and φ, while x and y involve all three spherical coordinates.

🔄 Converting Cartesian to spherical

The conversion formulas are:

ρ² = x² + y² + z²
tan θ = y/x
φ = arccos(z / √(x² + y² + z²))

The first formula is the 3D distance formula. The angle φ is found by taking the arccosine of the z-component divided by the distance from the origin.
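
The spherical conversions can be sketched the same way, again assuming radians and ρ ≠ 0. The function names are illustrative, and the clamp before `math.acos` is a defensive assumption against floating-point rounding rather than part of the formulas:

```python
import math

def spherical_to_cartesian(rho, theta, phi):
    x = rho * math.sin(phi) * math.cos(theta)
    y = rho * math.sin(phi) * math.sin(theta)
    z = rho * math.cos(phi)
    return (x, y, z)

def cartesian_to_spherical(x, y, z):
    rho = math.sqrt(x * x + y * y + z * z)      # 3D distance from the origin
    theta = math.atan2(y, x)                    # same horizontal angle as cylindrical
    ratio = max(-1.0, min(1.0, z / rho))        # keep the argument inside [-1, 1]
    phi = math.acos(ratio)                      # angle from the positive z-axis
    return (rho, theta, phi)

print(cartesian_to_spherical(0.0, 0.0, 3.0))    # rho = 3.0, phi = 0.0 (positive z-axis)
print(cartesian_to_spherical(1.0, 1.0, 0.0)[2]) # phi = pi/2, about 1.5708 (xy-plane)
```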

🔀 Converting between cylindrical and spherical

🔀 Cylindrical to spherical

The conversion formulas are:

ρ = √(r² + z²)
θ = θ (same angle)
φ = arccos(z / √(r² + z²))

The horizontal angle θ remains unchanged between the two systems.

🔀 Spherical to cylindrical

The conversion formulas are:

r = ρ sin φ
θ = θ (same angle)
z = ρ cos φ

Again, θ is preserved, while r and z are computed from ρ and φ.

🎯 Why alternative coordinates matter

🎯 Simplifying geometric problems

Cylindrical coordinates are natural for objects with circular symmetry around an axis (cylinders, cones).
Spherical coordinates are natural for objects with symmetry around a point (spheres, cones with apex at origin).
Example: The equation of a sphere centered at the origin is simply ρ = constant in spherical coordinates, much simpler than x² + y² + z² = constant².

🎯 Relationship between the systems

System	Components	Best for
Cartesian	(x, y, z)	Rectangular geometry, standard calculations
Cylindrical	(r, θ, z)	Circular symmetry around z-axis
Spherical	(ρ, θ, φ)	Radial symmetry from origin

All three systems can represent any point in 3D space; the choice depends on which makes the problem simplest.

Vector-Valued Functions and Space Curves

3.1 Vector-Valued Functions and Space Curves

🧭 Overview

🧠 One-sentence thesis

Vector-valued functions provide a unified way to represent plane and space curves by expressing position as a function of a parameter, enabling calculus operations component-wise to analyze motion and geometry.

📌 Key points (3–5)

What a vector-valued function is: a function that outputs vectors whose components are real-valued functions of a parameter t.
Plane curves vs space curves: two-component functions trace plane curves; three-component functions trace space curves.
How calculus works: limits, derivatives, and integrals are computed component-by-component on the real-valued functions f, g, and h.
Common confusion: the derivative r′(t) is not just a rate—it is a tangent vector to the curve at that point.
Key geometric objects: the unit tangent vector T(t) is found by normalizing r′(t); velocity and acceleration are the first and second derivatives of position.

📐 Definition and structure

📐 What a vector-valued function is

A vector-valued function is a function of the form r(t) = f(t)i + g(t)j or r(t) = f(t)i + g(t)j + h(t)k, where the component functions f, g, and h are real-valued functions of the parameter t.

The output is a vector, not a single number.
Each component (f, g, h) is an ordinary real-valued function.
The parameter t typically represents time or another independent variable.
Alternate notation: r(t) = ⟨f(t), g(t)⟩ or r(t) = ⟨f(t), g(t), h(t)⟩.

🗺️ Plane curves and space curves

The graph of a vector-valued function of the form r(t) = f(t)i + g(t)j is called a plane curve. The graph of a vector-valued function of the form r(t) = f(t)i + g(t)j + h(t)k is called a space curve.

Type	Form	Dimension
Plane curve	r(t) = f(t)i + g(t)j	2D
Space curve	r(t) = f(t)i + g(t)j + h(t)k	3D

A given plane or space curve can be represented by more than one vector-valued function; any such representation is valid.
An arbitrary plane curve can be represented by a vector-valued function.

🧮 Calculus operations

🧮 Limits of vector-valued functions

The limit of a vector-valued function is computed by taking the limit of each component separately:

For plane curves: lim(t→a) r(t) = [lim(t→a) f(t)]i + [lim(t→a) g(t)]j
For space curves: lim(t→a) r(t) = [lim(t→a) f(t)]i + [lim(t→a) g(t)]j + [lim(t→a) h(t)]k
Each component limit is an ordinary real-valued limit.

📈 Derivatives of vector-valued functions

The derivative of a vector-valued function: r′(t) = lim(Δt→0) [r(t + Δt) − r(t)] / Δt

To calculate the derivative, calculate the derivatives of the component functions, then put them back into a new vector-valued function.
Many properties of differentiation from ordinary calculus also apply to vector-valued functions.
Key geometric meaning: the derivative r′(t) is also a tangent vector to the curve.
Don't confuse: r′(t) is not just a scalar rate; it is a vector pointing in the direction of the curve at that instant.
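
To illustrate component-wise differentiation, the sketch below compares the exact derivative of a helix with a numerical difference quotient. The curve r(t) = ⟨cos t, sin t, t⟩, the function names, and the step size are assumptions chosen for illustration; the numerical version uses a symmetric (central) variant of the difference quotient in the definition:

```python
import math

def r(t):
    """Helix r(t) = <cos t, sin t, t>."""
    return (math.cos(t), math.sin(t), t)

def r_prime_exact(t):
    """Differentiate each component: <-sin t, cos t, 1>."""
    return (-math.sin(t), math.cos(t), 1.0)

def r_prime_numeric(t, h=1e-6):
    """Central difference [r(t + h) - r(t - h)] / (2h), applied per component."""
    ahead, behind = r(t + h), r(t - h)
    return tuple((a - b) / (2 * h) for a, b in zip(ahead, behind))

t0 = 1.0
print(r_prime_exact(t0))
print(r_prime_numeric(t0))  # closely matches the exact tangent vector
```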

∫ Integrals of vector-valued functions

Indefinite integral: integrate each component separately:

∫[f(t)i + g(t)j + h(t)k] dt = [∫f(t) dt]i + [∫g(t) dt]j + [∫h(t) dt]k

Definite integral: integrate each component over the interval [a, b]:

∫(from a to b)[f(t)i + g(t)j + h(t)k] dt = [∫(from a to b) f(t) dt]i + [∫(from a to b) g(t) dt]j + [∫(from a to b) h(t) dt]k
The antiderivative of a vector-valued function is found by finding the antiderivatives of the component functions, then putting them back together.

🎯 Tangent vectors and motion

🎯 Principal unit tangent vector

Principal unit tangent vector: T(t) = r′(t) / ‖r′(t)‖

The unit tangent vector T is calculated by dividing the derivative of a vector-valued function by its magnitude.
This normalizes the tangent vector to have length 1.
When the tail of r′(t₀) is placed at the point r(t₀) on the graph, the vector is tangent to curve C; T(t₀) points in the same direction with length 1.
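
A short sketch of the normalization step, using the helix r(t) = ⟨cos t, sin t, t⟩ as an assumed example (its derivative ⟨−sin t, cos t, 1⟩ has constant length √2); the function names are illustrative:

```python
import math

def r_prime(t):
    """Derivative of the helix r(t) = <cos t, sin t, t>."""
    return (-math.sin(t), math.cos(t), 1.0)

def unit_tangent(t):
    """T(t) = r'(t) / ||r'(t)||, assuming r'(t) is not the zero vector."""
    d = r_prime(t)
    length = math.sqrt(sum(c * c for c in d))
    return tuple(c / length for c in d)

T = unit_tangent(0.0)
print(T)                                  # approximately (0, 0.7071, 0.7071)
print(math.sqrt(sum(c * c for c in T)))   # approximately 1.0
```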

🚀 Velocity and acceleration

Velocity: the derivative of the position vector:

v(t) = r′(t)

Acceleration: the derivative of velocity:

a(t) = v′(t) = r″(t)

Speed: the magnitude of velocity:

speed = ‖v(t)‖ = ‖r′(t)‖ = ds/dt
Speed is a scalar (the magnitude), while velocity is a vector (direction and magnitude).
The acceleration vector can be written as a linear combination of T and N (the unit tangent and principal unit normal vectors).

📏 Arc length and curvature

📏 Arc length

Arc length of a space curve from t = a to t = b:

s = ∫(from a to b) √[(f′(t))² + (g′(t))² + (h′(t))²] dt = ∫(from a to b) ‖r′(t)‖ dt

Arc-length function:

s(t) = ∫(from a to t) √[(f′(u))² + (g′(u))² + (h′(u))²] du = ∫(from a to t) ‖r′(u)‖ du
The arc-length function measures the distance traveled along the curve from a starting point a to a variable point t.
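
As a numerical sketch of the arc-length integral, the code below approximates the length of one turn of the helix r(t) = ⟨cos t, sin t, t⟩, whose exact length is 2π√2. The curve, the midpoint rule, and the function names are assumptions made for illustration:

```python
import math

def speed(t):
    """||r'(t)|| for r(t) = <cos t, sin t, t>, where r'(t) = <-sin t, cos t, 1>."""
    return math.sqrt(math.sin(t) ** 2 + math.cos(t) ** 2 + 1.0)

def arc_length(a, b, n=1000):
    """Midpoint-rule approximation of the integral of ||r'(t)|| over [a, b]."""
    h = (b - a) / n
    return sum(speed(a + (i + 0.5) * h) for i in range(n)) * h

s = arc_length(0.0, 2 * math.pi)
print(s, 2 * math.pi * math.sqrt(2))   # both about 8.8858
```

Replacing the upper limit with a variable t gives the arc-length function s(t) for this curve.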

🌀 Curvature

Curvature measures how sharply a curve bends:

κ = ‖T′(t)‖ / ‖r′(t)‖
Alternate formula: κ = ‖r′(t) × r″(t)‖ / ‖r′(t)‖³
For plane curves given as y = y(x): κ = |y″| / [1 + (y′)²]^(3/2)
Curvature quantifies the rate of change of the tangent direction.
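
The cross-product formula is convenient for computation because it needs only r′ and r″. The sketch below applies it to the family r(t) = ⟨a cos t, a sin t, b t⟩, an assumed example for which the curvature works out to a / (a² + b²); the helper names are illustrative:

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    return math.sqrt(sum(c * c for c in u))

def curvature(r1, r2):
    """kappa = ||r' x r''|| / ||r'||^3, given the vectors r'(t) and r''(t)."""
    return norm(cross(r1, r2)) / norm(r1) ** 3

def helix_derivatives(t, a, b):
    """r'(t) and r''(t) for r(t) = <a cos t, a sin t, b t>."""
    r1 = (-a * math.sin(t), a * math.cos(t), b)
    r2 = (-a * math.cos(t), -a * math.sin(t), 0.0)
    return r1, r2

print(curvature(*helix_derivatives(0.7, 2.0, 0.0)))  # circle of radius 2: about 0.5
print(curvature(*helix_derivatives(0.7, 1.0, 1.0)))  # helix with a = b = 1: about 0.5
```

With b = 0 the curve is a circle of radius a, and the result 1/a matches the reciprocal-of-radius rule for circles.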

🧭 Principal unit normal and binormal vectors

Principal unit normal vector:

N(t) = T′(t) / ‖T′(t)‖

Binormal vector:

B(t) = T(t) × N(t)
The coefficient of the unit tangent vector T when the acceleration vector is written as a linear combination of T and N is called the tangential component of acceleration.

⚙️ Components of acceleration

⚙️ Tangential and normal components

Tangential component of acceleration:

a_T = a · T = (v · a) / ‖v‖

Normal component of acceleration:

a_N = a · N = ‖v × a‖ / ‖v‖ = √(‖a‖² − a_T²)
These components decompose acceleration into parts along the direction of motion (tangential) and perpendicular to it (normal).
The tangential component changes speed; the normal component changes direction.
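
A hedged numerical check of these formulas, using the assumed curve r(t) = ⟨t, t², 0⟩ at t = 1, where v = ⟨1, 2, 0⟩ and a = ⟨0, 2, 0⟩. The helper names are illustrative:

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    return math.sqrt(dot(u, u))

def acceleration_components(v, a):
    """Return (a_T, a_N) from the velocity v and acceleration a."""
    speed = norm(v)
    a_T = dot(v, a) / speed            # a_T = (v . a) / ||v||
    a_N = norm(cross(v, a)) / speed    # a_N = ||v x a|| / ||v||
    return a_T, a_N

v = (1.0, 2.0, 0.0)
a = (0.0, 2.0, 0.0)
a_T, a_N = acceleration_components(v, a)
print(a_T, a_N)                      # 4/sqrt(5) and 2/sqrt(5)
print(a_T ** 2 + a_N ** 2, dot(a, a))  # both about 4, consistent with a_N = sqrt(||a||^2 - a_T^2)
```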

Calculus of Vector-Valued Functions

3.2 Calculus of Vector-Valued Functions

🧭 Overview

🧠 One-sentence thesis

The calculus of vector-valued functions extends differentiation and integration to functions that output vectors, enabling the analysis of curves in space through component-wise operations.

📌 Key points (3–5)

Core operation: differentiate or integrate each component function separately, then reassemble into a vector-valued function.
Geometric meaning of the derivative: the derivative r′(t) is a tangent vector to the curve at that point.
Unit tangent vector: obtained by dividing the derivative by its magnitude, giving a standardized direction along the curve.
Properties carry over: many differentiation rules from single-variable calculus apply to vector-valued functions.
Common confusion: the derivative is itself a vector-valued function (a tangent vector), not a scalar rate of change.

📐 What vector-valued functions are

📐 Definition and forms

A vector-valued function is a function of the form r(t) = f(t)i + g(t)j or r(t) = f(t)i + g(t)j + h(t)k, where the component functions f, g, and h are real-valued functions of the parameter t.

The input is a single parameter t (often representing time or position along a path).
The output is a vector in two or three dimensions.
Alternative notation: r(t) = ⟨f(t), g(t)⟩ or r(t) = ⟨f(t), g(t), h(t)⟩.

🗺️ Geometric interpretation

The graph of a two-component vector-valued function is called a plane curve.
The graph of a three-component vector-valued function is called a space curve.
Any arbitrary plane curve can be represented by a vector-valued function.

🔄 Differentiation of vector-valued functions

🔄 How to differentiate

To calculate the derivative of a vector-valued function, calculate the derivatives of the component functions, then put them back into a new vector-valued function.

The derivative is computed component-wise:
- If r(t) = f(t)i + g(t)j + h(t)k, then r′(t) = f′(t)i + g′(t)j + h′(t)k.
The formal definition uses the limit: r′(t) = lim(Δt→0) [r(t + Δt) − r(t)] / Δt.

🧭 Tangent vector interpretation

The derivative r′(t₀) is a tangent vector to the curve at the point r(t₀).
When the tail of the vector is placed at point r(t₀) on the graph, the vector r′(t₀) is tangent to curve C.
This is the velocity vector if t represents time.

📏 Unit tangent vector

The principal unit tangent vector T(t) is calculated by dividing the derivative of a vector-valued function by its magnitude: T(t) = r′(t) / ‖r′(t)‖.

This normalizes the tangent vector to have length 1.
It gives the direction of motion along the curve without regard to speed.
Don't confuse: r′(t) has both direction and magnitude (speed); T(t) has only direction.

🔧 Properties of differentiation

Many properties from single-variable calculus carry over to vector-valued functions.
The differentiation properties from "Introduction to Derivatives" also apply here.
This means rules like the product rule, chain rule, and sum rule extend to vector-valued contexts.

∫ Integration of vector-valued functions

∫ Indefinite integrals

The antiderivative of a vector-valued function is found by finding the antiderivatives of the component functions, then putting them back together in a vector-valued function.

Indefinite integral formula:
- ∫[f(t)i + g(t)j + h(t)k] dt = [∫f(t) dt]i + [∫g(t) dt]j + [∫h(t) dt]k.
Each component is integrated separately.
The result is a new vector-valued function (plus a constant vector).

∫ Definite integrals

The definite integral of a vector-valued function is found by finding the definite integrals of the component functions, then putting them back together.

Definite integral formula from a to b:
- ∫ₐᵇ[f(t)i + g(t)j + h(t)k] dt = [∫ₐᵇ f(t) dt]i + [∫ₐᵇ g(t) dt]j + [∫ₐᵇ h(t) dt]k.
Each component is integrated over the interval [a, b].
The result is a constant vector (not a function).
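
A numerical sketch of component-wise definite integration, using the assumed integrand r(t) = ⟨2t, 3t², 1⟩ on [0, 1], whose exact integral is ⟨1, 1, 1⟩. The midpoint rule and the function name are illustrative choices:

```python
def integrate_components(r, a, b, n=1000):
    """Midpoint-rule approximation of the definite integral of r over [a, b]."""
    h = (b - a) / n
    dim = len(r(a))
    totals = [0.0] * dim
    for i in range(n):
        value = r(a + (i + 0.5) * h)
        for k in range(dim):
            totals[k] += value[k] * h   # each component is integrated separately
    return tuple(totals)

result = integrate_components(lambda t: (2 * t, 3 * t ** 2, 1.0), 0.0, 1.0)
print(result)   # approximately (1.0, 1.0, 1.0), a constant vector
```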

📊 Key equations reference

📊 Limits

The limit of a vector-valued function is computed component-wise:
- lim(t→a) r(t) = [lim(t→a) f(t)]i + [lim(t→a) g(t)]j + [lim(t→a) h(t)]k.
Each component function's limit is calculated separately.

📊 Arc length

Arc length of a space curve from a to b:
- s = ∫ₐᵇ √([f′(t)]² + [g′(t)]² + [h′(t)]²) dt = ∫ₐᵇ ‖r′(t)‖ dt.
This measures the total distance traveled along the curve.
The arc-length function s(t) gives the distance from a fixed starting point a to parameter value t.

📊 Velocity, acceleration, and speed

Concept	Formula	Meaning
Velocity	v(t) = r′(t)	The derivative of the position vector
Acceleration	a(t) = v′(t) = r″(t)	The derivative of velocity
Speed	‖v(t)‖ = ‖r′(t)‖ = ds/dt	The magnitude of the velocity vector (a scalar)

Speed is the rate of change of arc length with respect to time.
Don't confuse velocity (a vector) with speed (a scalar magnitude).

📊 Curvature and related vectors

Curvature κ measures how sharply a curve bends; formulas include:
- κ = ‖T′(t)‖ / ‖r′(t)‖
- κ = ‖r′(t) × r″(t)‖ / ‖r′(t)‖³
- For plane curves y = f(x): κ = |y″| / [1 + (y′)²]^(3/2)
Principal unit normal vector N(t) = T′(t) / ‖T′(t)‖ points toward the center of curvature.
Binormal vector B(t) = T(t) × N(t) is perpendicular to both T and N.
Tangential component of acceleration a_T = a · T = (v · a) / ‖v‖ measures acceleration along the direction of motion.
Normal component of acceleration a_N = a · N = ‖v × a‖ / ‖v‖ measures acceleration perpendicular to motion (centripetal).

Arc Length and Curvature

3.3 Arc Length and Curvature

🧭 Overview

🧠 One-sentence thesis

Arc length and curvature quantify how far you travel along a curve and how sharply it bends at each point, using the derivative of the vector-valued function and the arc-length parameterization.

📌 Key points (3–5)

Arc-length function: calculated by integrating the magnitude of the derivative of the vector-valued function; works in both two and three dimensions.
Curvature definition: the curvature of a curve at a point equals the curvature of the inscribed circle at that point; uses arc-length parameterization.
Multiple formulas for curvature: several different formulas exist; for a circle, curvature is the reciprocal of its radius.
Principal unit normal vector: defined as the derivative of the unit tangent vector divided by its magnitude; points toward the concave side of the curve.
Common confusion: the unit tangent vector T(t) is the derivative of r(t) divided by its magnitude, while the principal unit normal vector N(t) is the derivative of T(t) divided by its magnitude—don't mix up which derivative and which magnitude.

📏 Arc Length

📏 The arc-length function

Arc-length function for a vector-valued function: s(t) = ∫ₐᵗ ‖r′(u)‖ du.

This formula measures the total distance traveled along the curve from parameter value a to parameter value t.
It works by summing up infinitesimal lengths of the curve, each given by the magnitude of the velocity vector r′(u).
Valid in both two and three dimensions.
Example: if r(t) traces a path in space, s(t) tells you how far you've walked along that path by time t.
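
As a rough numerical illustration, the sketch below approximates the arc-length integral for a helix r(t) = (cos t, sin t, t). The helix and the helper names `speed` and `arc_length` are assumptions chosen for illustration. Because ‖r′(t)‖ = √2 for this curve, one full turn should have length 2π√2:

```python
import math


def speed(t):
    """‖r′(t)‖ for the helix r(t) = (cos t, sin t, t), where r′(t) = (-sin t, cos t, 1)."""
    return math.sqrt(math.sin(t) ** 2 + math.cos(t) ** 2 + 1.0)


def arc_length(a, t, n=1000):
    """Midpoint-rule approximation of the integral of ‖r′(u)‖ from u = a to u = t."""
    h = (t - a) / n
    return sum(speed(a + (k + 0.5) * h) for k in range(n)) * h


s_one_turn = arc_length(0.0, 2 * math.pi)
print(s_one_turn, 2 * math.pi * math.sqrt(2))  # the two values should agree closely
```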

🌀 Curvature

🌀 What curvature measures

Curvature at a point: defined to be the curvature of the inscribed circle at that point.

The inscribed circle is the circle that best fits the curve at that point—it is tangent to the curve and bends the same way.
Curvature quantifies "how sharply the curve is bending" at each point.
The definition uses arc-length parameterization to ensure the measure is independent of how fast you traverse the curve.

🔢 Formulas for curvature

The excerpt states there are several different formulas for curvature, though it does not list them all.
For a circle, curvature is simple: it equals one divided by the radius.
- A small circle (small radius) has high curvature (sharp bend).
- A large circle (large radius) has low curvature (gentle bend).
Example: a circle of radius 5 has curvature 1/5; a circle of radius 10 has curvature 1/10, so it bends less sharply.
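
The reciprocal-radius rule can be checked against the cross-product formula κ = ‖r′(t) × r″(t)‖ / ‖r′(t)‖³ from the key equations. The sketch below assumes the standard parameterization r(t) = (R cos t, R sin t, 0); the function names are hypothetical:

```python
import math


def cross(u, v):
    """Cross product of two 3D vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def norm(u):
    """Euclidean length of a vector."""
    return math.sqrt(sum(c * c for c in u))


def circle_curvature(radius, t):
    """Curvature of r(t) = (R cos t, R sin t, 0) via ‖r′ × r″‖ / ‖r′‖³."""
    r1 = (-radius * math.sin(t), radius * math.cos(t), 0.0)   # r′(t)
    r2 = (-radius * math.cos(t), -radius * math.sin(t), 0.0)  # r″(t)
    return norm(cross(r1, r2)) / norm(r1) ** 3


print(circle_curvature(5.0, 0.7))   # close to 1/5
print(circle_curvature(10.0, 2.0))  # close to 1/10
```

The result does not depend on t, which matches the fact that a circle bends equally sharply everywhere.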

🧭 The Frenet Frame

🧭 Unit tangent vector T(t)

Unit tangent vector T(t): the derivative of a vector-valued function divided by its magnitude, T(t) = r′(t) / ‖r′(t)‖ (defined wherever r′(t) ≠ 0).

The derivative r′(t) gives the velocity vector; dividing by its magnitude normalizes it to length 1.
T(t) points in the direction the curve is heading at time t.
It is tangent to the curve at each point.

🔄 Principal unit normal vector N(t)

Principal unit normal vector at t: N(t) = T′(t) / ‖T′(t)‖.

This is the derivative of the unit tangent vector, normalized to length 1.
N(t) points toward the concave side of the curve—the direction in which the curve is bending.
Don't confuse: T(t) uses the derivative of r(t); N(t) uses the derivative of T(t).

⚡ Binormal vector B(t)

Binormal vector at t: B(t) = T(t) × N(t).

The cross product of the unit tangent and principal unit normal vectors.
B(t) is perpendicular to both T(t) and N(t), completing a three-dimensional coordinate system.
The Frenet frame of reference is formed by T(t), N(t), and B(t) together.
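
A small sketch of how the three vectors might be computed in practice, assuming the helix r(t) = (cos t, sin t, t) as an illustrative curve. T(t) comes from the exact derivative r′(t), while T′(t) is approximated with a central difference (a numerical shortcut chosen here, not part of the definition). All function names are hypothetical:

```python
import math


def unit(v):
    """Scale a nonzero vector to length 1."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def cross(u, v):
    """Cross product of two 3D vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def tangent(t):
    """T(t) = r′(t)/‖r′(t)‖ for the helix, with r′(t) = (-sin t, cos t, 1)."""
    return unit((-math.sin(t), math.cos(t), 1.0))


def frenet_frame(t, h=1e-5):
    """Return (T, N, B) at parameter t; T′ is a central-difference estimate."""
    T = tangent(t)
    T_prime = tuple((p - m) / (2 * h)
                    for p, m in zip(tangent(t + h), tangent(t - h)))
    N = unit(T_prime)  # assumes T′(t) is nonzero, true for the helix
    B = cross(T, N)
    return T, N, B


T, N, B = frenet_frame(0.0)
print(T)  # approximately (0, 0.7071, 0.7071)
print(N)  # approximately (-1, 0, 0)
print(B)  # approximately (0, -0.7071, 0.7071)
```

At t = 0 the helix sits at (1, 0, 0), so N pointing along the negative x-axis means it points back toward the axis of the helix, the concave side of the curve.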

🔵 The Osculating Circle

🔵 What the osculating circle is

Osculating circle: the circle that is tangent to a curve at a point and has the same curvature as the curve at that point.

"Osculating" means "kissing"—the circle kisses the curve at that point.
It is the best-fitting circle at that point: it is tangent and bends exactly as sharply as the curve does.
The radius of the osculating circle is the reciprocal of the curvature at that point.
Example: if the curvature at a point is 1/3, the osculating circle has radius 3.

Motion in Space

3.4 Motion in Space

🧭 Overview

🧠 One-sentence thesis

The position function of an object in space determines its velocity and acceleration vectors, which decompose into tangential and normal components that describe how the object speeds up and changes direction along its curved path.

📌 Key points (3–5)

Position, velocity, and acceleration relationship: the first derivative of position gives velocity, the second derivative gives acceleration, and the magnitude of velocity is speed.
Acceleration geometry: the acceleration vector always points toward the concave side of the curve and can be split into tangential (speed change) and normal (direction change) components.
Tangential vs normal components: tangential acceleration measures how fast speed changes; normal acceleration measures how sharply the path curves.
Common confusion: speed vs velocity—speed is the magnitude (scalar) of the velocity vector, not the velocity vector itself.
Historical connection: Kepler's laws describe planetary motion empirically; Newton proved them using calculus, his second law, and universal gravitation.

📐 Position, velocity, and acceleration

📍 Position function r(t)

If r(t) represents the position of an object at time t, then r′(t) represents the velocity and r″(t) represents the acceleration of the object at time t.

Position r(t): a vector-valued function that gives the location of an object at each moment in time.
Velocity v(t): the derivative of position, v(t) = r′(t); it points in the direction of motion.
Acceleration a(t): the derivative of velocity, a(t) = v′(t) = r″(t); it describes how velocity changes over time.

🏃 Speed vs velocity

The magnitude of the velocity vector is speed.

Speed: the scalar quantity ‖v(t)‖ = ‖r′(t)‖ = ds/dt (the rate of change of arc length).
Speed is always non-negative; velocity is a vector with both magnitude and direction.
Don't confuse: speed is not the velocity vector divided by its magnitude (that would be the unit tangent vector T(t)); speed is simply the magnitude itself.

🎯 Components of acceleration

🔀 Geometric direction of acceleration

The acceleration vector always points toward the concave side of the curve defined by r(t).

The concave side is the "inside" of the curve where it bends.
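This means the acceleration never points toward the outside of the bend: its normal part is directed toward the center of curvature, while its tangential part lies along the direction of motion.
Example: an object moving along a circular path at constant speed has acceleration pointing exactly toward the center of the circle.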

📊 Tangential and normal components

The tangential and normal components of acceleration a_T and a_N are the projections of the acceleration vector onto the unit tangent and unit normal vectors to the curve.

Component	Formula	What it measures
Tangential a_T	a · T = (v · a) / ‖v‖	How fast the speed is changing
Normal a_N	a · N = ‖v × a‖ / ‖v‖ = √(‖a‖² - a_T²)	How sharply the path is curving

🧮 Understanding the split

Tangential component a_T: measures acceleration along the direction of motion (speeding up or slowing down).
- If a_T > 0, the object is speeding up.
- If a_T < 0, the object is slowing down.
- If a_T = 0, speed is constant (uniform motion along the curve).
Normal component a_N: measures acceleration perpendicular to the direction of motion (changing direction).
- a_N is always non-negative.
- Larger a_N means sharper turning.
- If a_N = 0 at every instant, the direction of motion never changes and the path is a straight line.
Don't confuse: both components together reconstruct the full acceleration vector; neither alone tells the complete story of motion.
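
To make the split concrete, here is a minimal sketch that evaluates both formulas from the table. It assumes the illustrative path r(t) = (t, t², 0) at t = 1, so that v = (1, 2, 0) and a = (0, 2, 0); the path and the function names are not from the excerpt:

```python
import math


def dot(u, v):
    """Dot product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    """Cross product of two 3D vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def norm(u):
    """Euclidean length of a vector."""
    return math.sqrt(dot(u, u))


def acceleration_components(v, a):
    """Return (a_T, a_N) with a_T = (v·a)/‖v‖ and a_N = ‖v × a‖/‖v‖."""
    speed = norm(v)
    return dot(v, a) / speed, norm(cross(v, a)) / speed


v = (1.0, 2.0, 0.0)  # r′(1) for r(t) = (t, t², 0)
a = (0.0, 2.0, 0.0)  # r″(1)
a_T, a_N = acceleration_components(v, a)
print(a_T, a_N)                         # 4/√5 and 2/√5
print(a_T ** 2 + a_N ** 2, dot(a, a))   # both approximately 4
```

The last line checks the identity a_N = √(‖a‖² - a_T²): the two components account for the whole acceleration vector.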

🪐 Kepler's laws and Newton's proof

🌌 Kepler's three laws

Kepler's three laws of planetary motion describe the motion of objects in orbit around the Sun. His third law can be modified to describe motion of objects in orbit around other celestial objects as well.

The excerpt does not detail the three laws themselves, only that they describe orbital motion.
Kepler's laws apply not just to the Sun but to any central gravitating body (e.g., moons orbiting planets).

🔬 Newton's contribution

Newton was able to use his law of universal gravitation in conjunction with his second law of motion and calculus to prove Kepler's three laws.

Newton provided the theoretical foundation: he showed that Kepler's empirical observations follow mathematically from fundamental physical principles.
The tools Newton used:
- Law of universal gravitation: the force between two masses.
- Second law of motion: force equals mass times acceleration (F = ma).
- Calculus: the mathematical machinery to handle derivatives and integrals of vector-valued functions.
This connection illustrates how the calculus of vector-valued functions (derivatives, acceleration) applies to real-world celestial mechanics.

🧰 Key formulas summary

🧰 Motion formulas

The excerpt provides these key relationships:

Velocity: v(t) = r′(t)
Acceleration: a(t) = v′(t) = r″(t)
Speed: ‖v(t)‖ = ‖r′(t)‖ = ds/dt
Tangential acceleration: a_T = a · T = (v · a) / ‖v‖
Normal acceleration: a_N = a · N = ‖v × a‖ / ‖v‖ = √(‖a‖² - a_T²)

🧭 Related vectors

Unit tangent vector: T(t) = r′(t) / ‖r′(t)‖
Principal unit normal vector: N(t) = T′(t) / ‖T′(t)‖
Binormal vector: B(t) = T(t) × N(t)

These vectors form the Frenet frame of reference, a moving coordinate system that travels along the curve with the object.

Functions of Several Variables

4.1 Functions of Several Variables

🧭 Overview

🧠 One-sentence thesis

Functions of several variables extend single-variable calculus by mapping ordered pairs or tuples of real numbers to output values, requiring new concepts like partial derivatives, gradients, and constrained optimization to analyze their behavior.

📌 Key points (3–5)

Domain and range generalization: A function of two variables maps ordered pairs (x, y) to real numbers z, with domain in ℝ² and range in ℝ
Visualization through surfaces and level curves: Graphs become surfaces in 3D space; level curves (contour lines) show points where the function equals a constant
Critical points and extrema: Critical points occur where both partial derivatives equal zero or don't exist; second derivative test determines if they're maxima, minima, or saddle points
Common confusion: Unlike single-variable functions, critical points in multivariable functions can be saddle points (neither max nor min), requiring discriminant D to classify
Practical applications: Optimization problems with constraints (like maximizing profit subject to budget limits) use Lagrange multipliers

📐 Basic Definitions and Structure

📐 Function of two variables

A function of two variables z = f(x, y) maps each ordered pair (x, y) in a subset D of the real plane ℝ² to a unique real number z. The set D is called the domain of the function.

The domain consists of all valid input pairs
The range is the set of all possible output values z
Example: f(x, y) = x² - xy + 3y² takes pairs and produces numbers

📐 Domain restrictions

Common restrictions arise from:

Square roots requiring non-negative radicands: √(9 - x² - y²) needs x² + y² ≤ 9
Denominators that cannot be zero
Logarithms requiring positive arguments

Example: For g(x, y) = √(9 - x² - y²), the domain is a disk of radius 3 centered at the origin (including the boundary circle).

📐 Three or more variables

Functions can extend to any number of variables:

f(x, y, z) maps points in ℝ³ to real numbers
Domain restrictions combine all variable constraints
Example: f(x, y, z) = (3x - 4y + 2z)/√(9 - x² - y² - z²) requires x² + y² + z² < 9

🗺️ Visualization Methods

🗺️ Graphs as surfaces

For z = f(x, y), the graph consists of ordered triples (x, y, z)
Creates a two-dimensional surface in three-dimensional space
Points above xy-plane when z > 0, below when z < 0
Example: f(x, y) = √(9 - x² - y²) graphs as a hemisphere of radius 3

🗺️ Level curves (contour maps)

A level curve of a function f(x, y) for value c is the set of points satisfying f(x, y) = c.

Analogous to contour lines on topographical maps
Each curve shows where function has constant value
Curves close together indicate steep changes
Don't confuse: Level curves are drawn in the xy-plane, not in 3D space

Example: For g(x, y) = √(9 - x² - y²), the level curves for c = 0, 1, 2 are concentric circles centered at the origin with radii √(9 - c²) (that is, 3, √8, and √5); for c = 3 the level curve shrinks to the single point (0, 0).

🗺️ Vertical traces

A vertical trace is the intersection of the surface with a vertical plane x = a or y = b.

Obtained by fixing one variable to a constant
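Results in a curve lying in a plane parallel to the yz-plane (for x = a) or to the xz-plane (for y = b)
Helps understand the surface's cross-sectional shape
Example: For f(x, y) = sin(x)cos(y), traces parallel to the xz-plane (y = b) are sine curves z = cos(b)·sin(x), and traces parallel to the yz-plane (x = a) are cosine curves z = sin(a)·cos(y)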

🔍 Partial Derivatives

🔍 Definition and computation

The partial derivative of f with respect to x is: ∂f/∂x = lim[h→0] (f(x+h, y) - f(x, y))/h

Treat all other variables as constants
Differentiate with respect to the variable of interest
Notation: ∂f/∂x, fx, or ∂z/∂x
All single-variable differentiation rules apply

Example: For f(x, y) = x² - 3xy + 2y²:

∂f/∂x = 2x - 3y (treating y as constant)
∂f/∂y = -3x + 4y (treating x as constant)
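
The limit definition suggests a quick numerical check of these two results. The sketch below compares central-difference estimates with the formulas 2x - 3y and -3x + 4y at an arbitrarily chosen point (1.5, -2); the point, step size, and helper names are assumptions made for illustration:

```python
def f(x, y):
    """The example function f(x, y) = x² - 3xy + 2y²."""
    return x ** 2 - 3 * x * y + 2 * y ** 2


def partial_x(func, x, y, h=1e-6):
    """Central-difference estimate of ∂f/∂x: vary x only, hold y fixed."""
    return (func(x + h, y) - func(x - h, y)) / (2 * h)


def partial_y(func, x, y, h=1e-6):
    """Central-difference estimate of ∂f/∂y: vary y only, hold x fixed."""
    return (func(x, y + h) - func(x, y - h)) / (2 * h)


x0, y0 = 1.5, -2.0
fx_numeric, fx_exact = partial_x(f, x0, y0), 2 * x0 - 3 * y0
fy_numeric, fy_exact = partial_y(f, x0, y0), -3 * x0 + 4 * y0
print(fx_numeric, fx_exact)  # both close to 9
print(fy_numeric, fy_exact)  # both close to -12.5
```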

🔍 Geometric interpretation

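∂f/∂x at (x₀, y₀) is the slope of the line tangent to the surface in the x-direction, that is, tangent to the trace obtained by fixing y = y₀
∂f/∂y at (x₀, y₀) is the slope of the line tangent to the surface in the y-direction, that is, tangent to the trace obtained by fixing x = x₀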
The partial derivative shows the instantaneous rate of change in one direction
Don't confuse: Partial derivatives are not the same as the total rate of change in an arbitrary direction (that requires directional derivatives)

🔍 Higher-order partial derivatives

Second-order partial derivatives include:

fxx: differentiate fx with respect to x
fyy: differentiate fy with respect to y
fxy: differentiate fx with respect to y (mixed partial)
fyx: differentiate fy with respect to x (mixed partial)

Clairaut's Theorem: If fxy and fyx are continuous, then fxy = fyx (order doesn't matter for mixed partials).

🎯 Critical Points and Extrema

🎯 Critical points definition

A point (x₀, y₀) is a critical point if:

fx(x₀, y₀) = fy(x₀, y₀) = 0, OR

At least one partial derivative doesn't exist

Critical points are candidates for local extrema
Not all critical points are extrema (saddle points exist)
Must check both partial derivatives simultaneously

🎯 Second derivative test

The discriminant D = fxx(x₀, y₀)·fyy(x₀, y₀) - [fxy(x₀, y₀)]² determines:

Condition	Classification
D > 0 and fxx > 0	Local minimum
D > 0 and fxx < 0	Local maximum
D < 0	Saddle point
D = 0	Test inconclusive

Example: For f(x, y) = x² + xy + y² - x - y + 1, find critical points by setting fx = fy = 0, then compute D to classify.
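
One way to carry this example through is sketched below. For this f, the conditions fx = 2x + y - 1 = 0 and fy = x + 2y - 1 = 0 form a linear system, and the second partials are the constants fxx = 2, fyy = 2, fxy = 1. The helper `classify` is a hypothetical name for the table above written as code:

```python
from fractions import Fraction


def classify(fxx, fyy, fxy):
    """Second derivative test using D = fxx·fyy - fxy²."""
    D = fxx * fyy - fxy ** 2
    if D > 0:
        return "local minimum" if fxx > 0 else "local maximum"
    if D < 0:
        return "saddle point"
    return "inconclusive"


# Solve 2x + y = 1 and x + 2y = 1 by Cramer's rule with exact fractions.
det = Fraction(2 * 2 - 1 * 1)
x0 = Fraction(1 * 2 - 1 * 1) / det
y0 = Fraction(2 * 1 - 1 * 1) / det
critical_point = (x0, y0)

kind = classify(fxx=2, fyy=2, fxy=1)
print(critical_point, kind)  # (1/3, 1/3), local minimum (D = 3 > 0, fxx > 0)
```

The same helper reports a saddle point for f(x, y) = x² - y² at the origin, where fxx = 2, fyy = -2, and fxy = 0.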

🎯 Saddle points

A saddle point occurs where both partial derivatives equal zero but the function has neither a local max nor min.

Named for resemblance to a horse saddle
Function increases in some directions, decreases in others
Example: f(x, y) = x² - y² has a saddle point at origin
Don't confuse: Zero gradient doesn't guarantee an extremum

∇ Gradient and Directional Derivatives

∇ Gradient vector

The gradient of f(x, y) is: ∇f(x, y) = (∂f/∂x)i + (∂f/∂y)j

Vector pointing in direction of steepest ascent
Magnitude ||∇f|| gives rate of steepest increase
Perpendicular to level curves at each point
Generalizes to any number of variables

Example: For f(x, y) = 3x² - 2xy + y², the gradient is ∇f = (6x - 2y)i + (-2x + 2y)j.

∇ Directional derivative

The directional derivative in direction u is: Dᵤf(x, y) = ∇f(x, y) · u

Measures rate of change in direction of unit vector u
u = (cos θ)i + (sin θ)j for angle θ
Maximum when u points in direction of ∇f
Minimum when u points opposite to ∇f

Example: To find rate of change of f at point (2, -1) in direction of vector v = -i + 2j, first normalize v to get unit vector u, then compute ∇f(2, -1) · u.
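
The example above does not name f, so the sketch below assumes it is the function from the gradient example, f(x, y) = 3x² - 2xy + y²; that pairing is an assumption, and the function names are hypothetical. The steps are the ones described: normalize v, evaluate the gradient, take the dot product:

```python
import math


def grad_f(x, y):
    """Gradient of the assumed function f(x, y) = 3x² - 2xy + y²."""
    return (6 * x - 2 * y, -2 * x + 2 * y)


def directional_derivative(grad, direction):
    """D_u f = ∇f · u, after normalizing `direction` to a unit vector u."""
    length = math.hypot(*direction)
    u = tuple(c / length for c in direction)
    return sum(g * c for g, c in zip(grad, u))


gradient_at_point = grad_f(2.0, -1.0)  # (14.0, -6.0)
rate = directional_derivative(gradient_at_point, (-1.0, 2.0))
print(gradient_at_point, rate)  # rate equals -26/√5, about -11.63
```

The negative value indicates that, under this assumption, f decreases when moving from (2, -1) in the direction of v.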

∇ Properties of the gradient

Three key properties:

If ∇f(x₀, y₀) = 0, then Dᵤf(x₀, y₀) = 0 for any direction u
Maximum directional derivative is ||∇f||, occurring when u points in direction of ∇f
Minimum directional derivative is -||∇f||, occurring when u points opposite to ∇f

🎪 Optimization with Constraints

🎪 Lagrange multipliers method

To optimize f(x, y) subject to constraint g(x, y) = 0, solve:

∇f(x₀, y₀) = λ∇g(x₀, y₀)

g(x₀, y₀) = 0

λ (lambda) is the Lagrange multiplier
At constrained extrema, gradients are parallel
System typically yields multiple equations to solve simultaneously
Works because, at a constrained extremum, ∇f must be normal to the constraint curve, and ∇g is normal to that curve as well

Example: To maximize f(x, y) = xy subject to x + 2y = 7:

Set up: ∇f = λ∇g where g(x, y) = x + 2y - 7
Get equations: y = λ, x = 2λ, x + 2y = 7
Solve to find critical point
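
The last step can be finished by substitution: putting x = 2λ and y = λ into x + 2y = 7 gives 4λ = 7. The sketch below does this with exact fractions and then compares nearby points on the constraint line as an informal sanity check (not a proof) that the point is a maximum. The helper names are hypothetical:

```python
from fractions import Fraction


def f(x, y):
    """Objective f(x, y) = xy."""
    return x * y


def on_constraint(y):
    """Point on the line x + 2y = 7 with the given y."""
    return (7 - 2 * y, y)


lam = Fraction(7, 4)   # from 2λ + 2λ = 7
x, y = 2 * lam, lam    # from x = 2λ, y = λ
best = f(x, y)

# Values of f at two nearby points that also satisfy the constraint.
step = Fraction(1, 10)
nearby = [f(*on_constraint(y + d)) for d in (-step, step)]
print(x, y, best)                     # 7/2 7/4 49/8
print(all(v < best for v in nearby))  # True
```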

🎪 Multiple constraints

For two constraints g(x, y, z) = 0 and h(x, y, z) = 0:

Use two multipliers: ∇f = λ₁∇g + λ₂∇h
Solve system with both constraint equations
More equations but same principle

🎪 Practical applications

Common optimization scenarios:

Maximizing profit subject to budget constraints
Minimizing cost subject to production requirements
Finding extreme distances subject to geometric constraints
Don't confuse: The multiplier λ itself isn't usually the answer; it's a tool to find the optimal point

Example: A company with profit function f(x, y) = 48x + 96y - x² - 2xy - 9y² and budget constraint 20x + 4y = 216 uses Lagrange multipliers to find x and y that maximize profit.
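
For this profit function the Lagrange conditions are linear, so they can be solved exactly. With g(x, y) = 20x + 4y - 216, the system ∇f = λ∇g plus the constraint reads 48 - 2x - 2y = 20λ, 96 - 2x - 18y = 4λ, and 20x + 4y = 216. The sketch below is one way to solve it; `solve_linear` is a hypothetical helper written for this illustration, and the code only locates the candidate point without proving it is a maximum:

```python
from fractions import Fraction


def solve_linear(A, b):
    """Gauss-Jordan elimination with exact rational arithmetic."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row with a nonzero pivot and move it into place.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[-1] for row in M]


def profit(x, y):
    """Profit function from the example."""
    return 48 * x + 96 * y - x ** 2 - 2 * x * y - 9 * y ** 2


# Unknowns ordered as (x, y, λ), with all unknowns moved to the left side.
x, y, lam = solve_linear(
    [[2, 2, 20],
     [2, 18, 4],
     [20, 4, 0]],
    [48, 96, 216],
)
print(x, y, lam, profit(x, y))  # 10 4 1 540
```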

Differentiation of Functions of Several Variables

This excerpt covers the foundational concepts of multivariable calculus, including how to define, visualize, differentiate, and optimize functions of two or more variables. The material progresses from basic definitions through partial derivatives, gradients, and culminates in constrained optimization using Lagrange multipliers—essential tools for real-world applications in economics, engineering, and physics.

Limits and Continuity

4.2 Limits and Continuity

🧭 Overview

🧠 One-sentence thesis

Limits and continuity for functions of two variables are studied using δ disks centered around a given point, where a limit exists if the function values stay arbitrarily close to a target value for all points within the disk.

📌 Key points (3–5)

Tool for studying limits: use a δ disk (an open disk of radius δ) centered around a given point.
What a limit means: for any point in a δ ball centered at point P, the function value at that point is arbitrarily close to the limit value.
Extension to multiple variables: the concept generalizes from single-variable calculus to functions of two or more variables.
Common confusion: a δ disk is a two-dimensional region (all points within distance δ), not just points on a circle; similarly, a δ ball extends this to three dimensions.

🎯 The δ disk approach

🎯 What a δ disk is

An open disk of radius δ centered at point (a, b): all points in the plane lying at a distance of less than δ from (a, b).

This is the two-dimensional analog of an interval in single-variable calculus.
"Open" means the boundary circle itself is not included—only points strictly inside.
Example: if δ = 0.1 and the center is (2, 3), the disk includes all points closer than 0.1 to (2, 3).

🌐 Extension to three dimensions

All points in ℝ³ lying at a distance of less than δ from (x₀, y₀, z₀): a δ ball.

The excerpt mentions "a δ ball centered at a point P" when discussing functions of several variables.
The same principle applies: consider all points within a small sphere around P.
Don't confuse: a disk is 2D (for functions of two variables), a ball is 3D (for functions of three variables).

📏 Definition of a limit

📏 How limits work for two variables

The excerpt states:

A function of several variables has a limit if for any point in a δ ball centered at point P, the value of the function at that point is arbitrarily close to the limit value.

Breaking this down:

"Arbitrarily close" means we can make the function value as close as we want to the limit by choosing δ small enough.
The key requirement: this must hold for all points inside the δ disk/ball, not just along a single path.
Example: if we claim the limit at P is L, then for every point within a tiny disk around P, f at that point should be nearly equal to L.
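
For reference, the informal statement above corresponds to the standard ε-δ definition for a function of two variables. In the block below, ε, (a, b), and L are notation introduced here for the tolerance, the point P, and the limit value; note that the center point itself is excluded by the condition 0 < distance:

```latex
\lim_{(x,y)\to(a,b)} f(x,y) = L
\quad\Longleftrightarrow\quad
\forall\, \varepsilon > 0 \;\; \exists\, \delta > 0 :\;
0 < \sqrt{(x-a)^2 + (y-b)^2} < \delta
\;\Longrightarrow\;
\lvert f(x,y) - L \rvert < \varepsilon
```

Here (x, y) ranges over points in the domain of f, and the inequality on the left says exactly that (x, y) lies in the δ disk around (a, b) but is not the center.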

🔍 Why this matters for continuity

Continuity at a point requires the limit to exist and equal the function value at that point.
The δ disk approach ensures we check behavior in all directions around P, not just along the x-axis or y-axis.
This prevents false conclusions from checking only one path (a common pitfall in multivariable calculus).
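
A standard illustration of this pitfall, not taken from the excerpt, is f(x, y) = xy / (x² + y²) near the origin. Along any line y = mx the function is constant at m / (1 + m²), so different paths give different values and the limit at (0, 0) does not exist. A short sketch with hypothetical helper names:

```python
def f(x, y):
    """Illustrative function f(x, y) = xy / (x² + y²), undefined at the origin."""
    return x * y / (x ** 2 + y ** 2)


def value_along_line(slope, x):
    """Evaluate f at the point (x, slope·x) on the line y = slope·x."""
    return f(x, slope * x)


# Approach the origin along two different lines.
along_x_axis = [value_along_line(0.0, 10.0 ** -k) for k in range(1, 6)]
along_diagonal = [value_along_line(1.0, 10.0 ** -k) for k in range(1, 6)]
print(along_x_axis)    # values stay at 0
print(along_diagonal)  # values stay at 1/2
```

Checking only the x-axis would wrongly suggest the limit is 0; the δ disk requirement rules this out because every disk around the origin also contains points on the diagonal.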

🗺️ Context: functions of several variables

🗺️ Connection to earlier concepts

The excerpt places this section after "Functions of Several Variables," which covered:

Graphs as surfaces in ℝ³.
Level curves and contour maps.
Vertical traces.

The δ disk method builds on these visualizations:

A δ disk on the domain corresponds to a small patch on the surface graph.
Checking the limit means ensuring the surface height stays close to a target value over that entire patch.

🧩 Terminology summary

Term	Dimension	Definition from excerpt
δ disk	2D	Open disk of radius δ centered at (a, b)
δ ball	3D	All points in ℝ³ at distance less than δ from (x₀, y₀, z₀)
Limit	Any	Function value arbitrarily close to limit value for all points in δ region

Partial Derivatives

4.3 Partial Derivatives

🧭 Overview

🧠 One-sentence thesis

Partial derivatives extend differentiation to functions of multiple independent variables by treating all but one variable as constants during differentiation.

📌 Key points (3–5)

What a partial derivative is: a derivative for functions with more than one independent variable.
How to calculate: treat all other variables as constants and apply usual differentiation rules to the variable of interest.
Higher-order partials: can be calculated the same way as higher-order derivatives in single-variable calculus.
Common confusion: don't forget that "partial" means you hold other variables fixed—it's not a total derivative with respect to all variables at once.

🔧 What partial derivatives are

🔧 Definition and purpose

A partial derivative is a derivative involving a function of more than one independent variable.

Unlike ordinary derivatives (one input, one output), partial derivatives handle functions like f(x, y) or f(x, y, z) where multiple inputs affect the output.
The "partial" aspect means you focus on how the function changes with respect to one variable at a time.
Example: if temperature depends on both position x and time t, the partial derivative with respect to x tells you how temperature changes as you move in space, holding time constant.

🧮 How to compute partial derivatives

🧮 The core rule

Treat all other variables as constants and use the usual differentiation rules for the variable you're differentiating with respect to.
This is the key technique: if you're finding the partial derivative with respect to x, then y, z, and any other variables are treated as fixed numbers.
Example: for f(x, y) = x²y, the partial derivative with respect to x treats y as a constant multiplier, so ∂f/∂x = 2xy; likewise, ∂f/∂y = x².

🔁 Higher-order partial derivatives

Higher-order partial derivatives can be calculated in the same way as higher-order derivatives in single-variable calculus.
You can take a partial derivative, then take another partial derivative of the result (with respect to the same or a different variable).
Don't confuse: "higher-order" means repeated differentiation, not a different technique—just apply the same "hold other variables constant" rule each time.

🔗 Related concepts in the chapter

🔗 Chain rule connection

The excerpt mentions the chain rule for functions of more than one variable:

The chain rule involves partial derivatives with respect to all the independent variables.
When each independent variable also depends on other variables, tree diagrams help derive the formulas.
The generalized chain rule formula given is: ∂w/∂t_j = Σ (∂w/∂x_i)(∂x_i/∂t_j), summed over i = 1, …, m.

🔗 Directional derivatives and gradient

The excerpt also references:

Directional derivative: represents a rate of change of a function in any given direction (not just along a coordinate axis).
Gradient: can be used in a formula to calculate the directional derivative; it indicates the direction of greatest change.
The gradient in two dimensions is: ∇f(x, y) = (∂f/∂x)i + (∂f/∂y)j.
The gradient in three dimensions adds a k component with the partial derivative with respect to z.

🔗 Discriminant for optimization

The excerpt lists a discriminant formula:

D = fxx(x₀, y₀)·fyy(x₀, y₀) - [fxy(x₀, y₀)]².
This discriminant is used in maxima/minima problems to classify critical points.

📐 Applications mentioned

📐 Tangent planes and approximations

Partial derivatives are used to construct tangent planes to surfaces (the analog of tangent lines for curves).
Tangent planes can approximate function values near known points.
The total differential uses partial derivatives to approximate the change in a function z = f(x, y) at a point (x₀, y₀) for given changes Δx and Δy.

📐 Lagrange multipliers

The excerpt lists formulas for the method of Lagrange multipliers:

One constraint: ∇f(x₀, y₀) = λ∇g(x₀, y₀), with the constraint g(x₀, y₀) = 0.
Two constraints: ∇f(x₀, y₀, z₀) = λ₁∇g(x₀, y₀, z₀) + λ₂∇h(x₀, y₀, z₀), with both constraints g(x₀, y₀, z₀) = 0 and h(x₀, y₀, z₀) = 0.
These methods use partial derivatives (via the gradient) to find extrema subject to constraints.

Tangent Planes and Linear Approximations

4.4 Tangent Planes and Linear Approximations

🧭 Overview

🧠 One-sentence thesis

Tangent planes extend the idea of tangent lines to surfaces, enabling approximation of function values near known points and requiring the function to be "smooth" (differentiable) at that point.

📌 Key points (3–5)

Core analogy: A tangent plane to a surface is the two-variable analog of a tangent line to a curve.
Practical use: Tangent planes approximate function values near known points.
Differentiability requirement: A function is differentiable at a point if it is "smooth" there—no corners or discontinuities.
Total differential tool: The total differential approximates the change in a function z = f(x, y) at the point (x₀, y₀) for given changes Δx and Δy.
Common confusion: Differentiability is not just about partial derivatives existing; it requires smoothness (no sharp edges or breaks).

📐 From tangent lines to tangent planes

📏 The geometric analogy

The analog of a tangent line to a curve is a tangent plane to a surface for functions of two variables.

In single-variable calculus, a tangent line touches a curve at one point and approximates the curve nearby.
For functions of two variables (surfaces in three-dimensional space), the tangent plane plays the same role.
The plane "touches" the surface at a point and lies flat against it, capturing the local behavior.

🔍 Why this matters

Just as tangent lines help estimate function values near a point in one dimension, tangent planes do the same in two dimensions.
The tangent plane is the best linear (flat) approximation to the surface at that point.

🧮 Using tangent planes for approximation

📊 Approximating function values

What it does: Tangent planes can be used to approximate values of functions near known values.
How it works: If you know the function value and behavior at one point, the tangent plane gives you a simple linear formula to estimate nearby values.
Example: If you know f at (x₀, y₀) and the tangent plane there, you can estimate f at a nearby point (x₀ + Δx, y₀ + Δy) without computing f directly.

🧰 The total differential

The total differential can be used to approximate the change in a function z = f(x, y) at the point (x₀, y₀) for given values of Δx and Δy.

The total differential is a formula that captures how much z changes when x and y change by small amounts Δx and Δy.
It uses the partial derivatives at (x₀, y₀) to build a linear approximation.
Key idea: Instead of computing the exact change in f, the total differential gives a quick linear estimate based on the tangent plane.
Don't confuse: The total differential is an approximation of the actual change, not the exact change; it works well only when Δx and Δy are small.
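
In symbols, the total differential is dz = fx(x₀, y₀)Δx + fy(x₀, y₀)Δy. The sketch below compares it with the actual change for the illustrative function f(x, y) = x²y at (1, 2) with Δx = 0.1 and Δy = -0.05; the function, point, and step sizes are assumptions chosen for illustration:

```python
def f(x, y):
    """Illustrative function f(x, y) = x²y."""
    return x ** 2 * y


def total_differential(fx, fy, dx, dy):
    """dz = fx·Δx + fy·Δy, using partial derivatives evaluated at the base point."""
    return fx * dx + fy * dy


x0, y0 = 1.0, 2.0
fx0 = 2 * x0 * y0  # ∂f/∂x = 2xy at (1, 2), equal to 4
fy0 = x0 ** 2      # ∂f/∂y = x² at (1, 2), equal to 1
dx, dy = 0.1, -0.05

dz = total_differential(fx0, fy0, dx, dy)
actual_change = f(x0 + dx, y0 + dy) - f(x0, y0)
print(dz, actual_change)  # about 0.35 versus about 0.3595
```

The gap between the two numbers is the approximation error, which shrinks as Δx and Δy get smaller.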

✅ Differentiability and smoothness

🌊 What differentiability means

A function is differentiable at a point if it is "smooth" at that point (i.e., no corners or discontinuities exist at that point).

Smooth means the surface has no sharp edges, corners, or breaks at that point.
If a function is differentiable, the tangent plane exists and provides a good local approximation.
Why it matters: Only differentiable functions can be reliably approximated by tangent planes; non-smooth points break the approximation.

⚠️ Common pitfall

Don't assume a function is differentiable just because partial derivatives exist.
The excerpt emphasizes no corners or discontinuities—the function must be smooth in all directions at that point.
Example: A function might have both partial derivatives at a point but still have a sharp ridge there; the function is not differentiable at such a point.

🔗 Connection to other concepts

🧩 Relationship to partial derivatives

Tangent planes are built using partial derivatives (the rates of change in the x and y directions).
The excerpt places tangent planes after the section on partial derivatives, showing that partial derivatives are the building blocks.

🧭 Relationship to the chain rule and directional derivatives

The excerpt lists tangent planes before the chain rule and directional derivatives, suggesting tangent planes are a foundational tool.
Tangent planes capture local linear behavior, which is essential for understanding how functions change along any direction (directional derivatives) and how changes propagate through compositions (chain rule).

The Chain Rule

4.5 The Chain Rule

🧭 Overview

🧠 One-sentence thesis

The chain rule for multivariable functions extends the single-variable chain rule by summing the products of partial derivatives along all paths connecting dependent and independent variables.

📌 Key points (3–5)

Core idea: when a function depends on variables that themselves depend on other variables, the chain rule tracks how changes propagate through all intermediate variables.
Two independent variables case: the partial derivative with respect to each independent variable is the sum of products of partial derivatives along each path.
Generalized form: for a function of m variables, each depending on n independent variables, the partial derivative with respect to any independent variable sums m terms (one per intermediate variable).
Tree diagrams as a tool: the excerpt emphasizes that tree diagrams help derive and visualize the chain rule formulas by showing all dependency paths.
Common confusion: don't confuse partial derivatives (holding some variables constant) with total derivatives (accounting for all dependencies through the chain rule).

🔗 Chain rule for two independent variables

🔗 The basic setup

When z depends on x and y, and both x and y depend on two independent variables u and v, we need to track how z changes with respect to u and v.

Chain rule for two independent variables:

∂z/∂u = (∂z/∂x) · (∂x/∂u) + (∂z/∂y) · (∂y/∂u)

∂z/∂v = (∂z/∂x) · (∂x/∂v) + (∂z/∂y) · (∂y/∂v)

🧩 How it works

Each formula sums two terms because there are two intermediate variables (x and y).
Each term is a product: the partial derivative of z with respect to an intermediate variable, multiplied by the partial derivative of that intermediate variable with respect to the independent variable.
Example: to find how z changes with u, we account for both "z changes through x" and "z changes through y."
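
A quick numerical check of the two-path sum, as a minimal sketch. The functions z = x²y, x = uv, y = u + v are hypothetical choices made for illustration; the chain-rule result is compared with a central finite difference of the composite function.

```python
def z(x, y):
    return x**2 * y          # hypothetical outer function

def x_of(u, v):
    return u * v             # hypothetical intermediate variable x(u, v)

def y_of(u, v):
    return u + v             # hypothetical intermediate variable y(u, v)

def chain_rule(u, v):
    """Rates of change of z with respect to u and v via the two-path sum."""
    x, y = x_of(u, v), y_of(u, v)
    z_x, z_y = 2 * x * y, x**2   # partials of z with respect to x and y
    x_u, x_v = v, u              # partials of x with respect to u and v
    y_u, y_v = 1.0, 1.0          # partials of y with respect to u and v
    return z_x * x_u + z_y * y_u, z_x * x_v + z_y * y_v

def finite_difference(u, v, h=1e-6):
    """Central differences of the composite function, for comparison."""
    comp = lambda a, b: z(x_of(a, b), y_of(a, b))
    return ((comp(u + h, v) - comp(u - h, v)) / (2 * h),
            (comp(u, v + h) - comp(u, v - h)) / (2 * h))

exact = chain_rule(1.0, 2.0)
approx = finite_difference(1.0, 2.0)
print(exact)   # (28.0, 16.0)
print(approx)  # close to (28, 16)
```

At (u, v) = (1, 2) the composite is z = u³v² + u²v³, whose derivatives with respect to u and v are 28 and 16, matching the sum over both paths.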

🌳 Tree diagram interpretation

The excerpt states that tree diagrams are useful for deriving these formulas:

Each path from z down to u (or v) through an intermediate variable (x or y) corresponds to one product term.
Sum the products along all paths to get the derivative of z with respect to u (or v).

🔢 Generalized chain rule

🔢 The formula

Generalized chain rule: ∂w/∂t_j = (∂w/∂x₁)(∂x₁/∂t_j) + (∂w/∂x₂)(∂x₂/∂t_j) + ⋯ + (∂w/∂x_m)(∂x_m/∂t_j)

w is a function of m intermediate variables x₁, x₂, …, x_m.
Each intermediate variable x_i depends on the n independent variables t₁, t₂, …, t_n; t_j denotes any one of them.
The partial derivative of w with respect to any t_j is the sum of m product terms.

🧮 Structure of the sum

Number of terms: m (one for each intermediate variable).
Each term: (partial of w with respect to one intermediate variable) × (partial of that intermediate variable with respect to t_j).
This generalizes the two-variable case: if m = 2, the formula reduces to the two-term sum shown earlier.

🌲 Using tree diagrams

The excerpt emphasizes that tree diagrams help derive the generalized formula:

Draw a tree with w at the top, intermediate variables x₁, …, x_m in the middle, and independent variables t_j at the bottom.
Each branch from w to t_j through an intermediate variable x_i gives one product term.
Sum all such products to get ∂w/∂t_j.
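
The tree-diagram sum translates directly into code: one product per branch, summed over the intermediate variables. The sketch below is illustrative only; the function name `generalized_chain_rule` and the example w = x₁x₂ + x₃ with x₁ = t₁ + t₂, x₂ = t₁t₂, x₃ = t₂² are assumptions, not taken from the excerpt.

```python
def generalized_chain_rule(grad_w, jacobian):
    """Return the list of partials of w with respect to t_1, ..., t_n.

    grad_w[i]      = partial of w with respect to the i-th intermediate variable
    jacobian[i][j] = partial of the i-th intermediate variable with respect to t_j
    """
    m = len(grad_w)          # number of intermediate variables
    n = len(jacobian[0])     # number of independent variables
    return [sum(grad_w[i] * jacobian[i][j] for i in range(m)) for j in range(n)]

# Hypothetical example: w = x1*x2 + x3, x1 = t1 + t2, x2 = t1*t2, x3 = t2**2
t1, t2 = 1.0, 2.0
x1, x2, x3 = t1 + t2, t1 * t2, t2**2
grad_w = [x2, x1, 1.0]       # partials of w with respect to x1, x2, x3
jacobian = [
    [1.0, 1.0],              # partials of x1 with respect to t1, t2
    [t2, t1],                # partials of x2 with respect to t1, t2
    [0.0, 2 * t2],           # partials of x3 with respect to t1, t2
]
partials = generalized_chain_rule(grad_w, jacobian)
print(partials)  # [8.0, 9.0]
```

Each entry of the result is a sum of m = 3 products, one per branch of the tree from w down to the chosen t_j.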

🧭 Directional derivatives and the gradient

🧭 Directional derivative in two dimensions

Directional derivative (two dimensions):

D_u f(a, b) = limit as h approaches 0 of [f(a + h cos θ, b + h sin θ) - f(a, b)] / h

Or equivalently: D_u f(x, y) = f_x(x, y) cos θ + f_y(x, y) sin θ

The directional derivative measures the rate of change of f in the direction given by angle θ.
The second formula shows it as a weighted sum of partial derivatives, where the weights are the direction cosines.

🧭 Gradient in two dimensions

Gradient (two dimensions): ∇f(x, y) = f_x(x, y) i + f_y(x, y) j

The gradient is a vector whose components are the partial derivatives of f.
It points in the direction of greatest increase of f.

🧭 Three-dimensional versions

Gradient (three dimensions): ∇f(x, y, z) = f_x(x, y, z) i + f_y(x, y, z) j + f_z(x, y, z) k

Directional derivative (three dimensions): D_u f(x, y, z) = ∇f(x, y, z) · u = f_x(x, y, z) cos α + f_y(x, y, z) cos β + f_z(x, y, z) cos γ

The three-dimensional gradient has three components (one for each coordinate).
The directional derivative is the dot product of the gradient and the unit direction vector u.
The direction cosines (cos α, cos β, cos γ) are the components of u.

🔍 Connection to the chain rule

The directional derivative formula is a special case of the chain rule: moving in direction u means the coordinates change according to the direction cosines, and the chain rule sums the contributions from each coordinate.
Don't confuse: the gradient is a vector field (defined at every point), while the directional derivative is a scalar (the rate of change in a specific direction).

🎯 Optimization tools

🎯 Discriminant for critical points

Discriminant: D = f_xx(x₀, y₀) f_yy(x₀, y₀) - [f_xy(x₀, y₀)]²

The discriminant is used to classify critical points of a function of two variables.
It combines second-order partial derivatives at a point (x₀, y₀).
The excerpt does not explain the classification rules, but the formula is part of the toolkit for maxima/minima problems.

🎯 Method of Lagrange multipliers (one constraint)

Lagrange multipliers, one constraint:

∇f(x₀, y₀) = λ ∇g(x₀, y₀)

g(x₀, y₀) = 0

Used to find extrema of f subject to the constraint g = 0.
At an extremum, the gradient of f is parallel to the gradient of g (scaled by λ, the Lagrange multiplier).
The constraint equation g = 0 must also be satisfied.

🎯 Method of Lagrange multipliers (two constraints)

Lagrange multipliers, two constraints:

∇f(x₀, y₀, z₀) = λ₁ ∇g(x₀, y₀, z₀) + λ₂ ∇h(x₀, y₀, z₀)

g(x₀, y₀, z₀) = 0

h(x₀, y₀, z₀) = 0

Extends the method to two constraints g = 0 and h = 0.
The gradient of f is now a linear combination of the gradients of g and h, with multipliers λ₁ and λ₂.
Both constraint equations must be satisfied.

🔍 Why Lagrange multipliers relate to the chain rule

The method implicitly uses the chain rule: moving along the constraint surface means the variables are not independent, and the chain rule describes how f changes when constrained.
Don't confuse: Lagrange multipliers find extrema subject to constraints, while the chain rule describes how derivatives propagate through dependencies.

Directional Derivatives and the Gradient

4.6 Directional Derivatives and the Gradient

🧭 Overview

🧠 One-sentence thesis

The directional derivative measures how fast a function changes in any chosen direction, and the gradient vector both computes that rate and points toward the direction of steepest increase.

📌 Key points (3–5)

What the directional derivative measures: the rate of change of a function in any given direction (not just along the x or y axes).
How the gradient helps: the gradient vector can be used in a formula to calculate the directional derivative.
What the gradient indicates: the direction of greatest change (steepest increase) of a function of more than one variable.
Common confusion: directional derivatives are not the same as partial derivatives—partial derivatives measure change along coordinate axes only, while directional derivatives work in any direction.
Connection to earlier concepts: the gradient is built from partial derivatives (the components are the partial derivatives with respect to each variable).

📐 Directional derivative concept

📐 What it measures

A directional derivative represents a rate of change of a function in any given direction.

Unlike partial derivatives (which measure change along the x-axis or y-axis), the directional derivative measures change along any direction you choose.
It answers: "If I move from point (a, b) in a specific direction, how quickly does the function value change?"
Example: imagine a temperature function over a region—moving northeast gives a different rate of temperature change than moving due east.

🧮 Formula in two dimensions

The excerpt provides two equivalent forms:

Limit definition: D_u f(a, b) = limit as h approaches 0 of [f(a + h cos θ, b + h sin θ) - f(a, b)] / h
- Here θ is the angle of the direction, and h is a small step size.
Computational formula: D_u f(x, y) = f_x(x, y) cos θ + f_y(x, y) sin θ
- This shows the directional derivative is a weighted combination of the partial derivatives, where the weights are the cosine and sine of the direction angle.
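
The two forms can be compared numerically. In this sketch the function f(x, y) = x² + 3xy, the point (1, 2), and the angle θ = π/3 are hypothetical choices; the limit is approximated by a difference quotient with a small fixed h.

```python
import math

def f(x, y):
    # Hypothetical example function chosen for illustration.
    return x**2 + 3 * x * y

def f_x(x, y):
    return 2 * x + 3 * y     # partial derivative with respect to x

def f_y(x, y):
    return 3 * x             # partial derivative with respect to y

def directional_limit(a, b, theta, h=1e-6):
    """Difference quotient from the limit definition, using a small fixed h."""
    return (f(a + h * math.cos(theta), b + h * math.sin(theta)) - f(a, b)) / h

def directional_formula(a, b, theta):
    """Computational formula: f_x cos(theta) + f_y sin(theta)."""
    return f_x(a, b) * math.cos(theta) + f_y(a, b) * math.sin(theta)

a, b, theta = 1.0, 2.0, math.pi / 3
approx = directional_limit(a, b, theta)
exact = directional_formula(a, b, theta)
print(approx, exact)  # both close to 4 + 3*sqrt(3)/2, about 6.598
```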

🧮 Formula in three dimensions

D_u f(x, y, z) = ∇f(x, y, z) · u = f_x(x, y, z) cos α + f_y(x, y, z) cos β + f_z(x, y, z) cos γ
- Here α, β, γ are the direction angles of the unit vector u; their cosines (the direction cosines) are the components of u.
- The dot product structure shows the directional derivative is the scalar projection of the gradient onto the direction vector.

🎯 The gradient vector

🎯 Definition and components

Gradient (two dimensions): ∇f(x, y) = f_x(x, y) i + f_y(x, y) j

Gradient (three dimensions): ∇f(x, y, z) = f_x(x, y, z) i + f_y(x, y, z) j + f_z(x, y, z) k

The gradient is a vector whose components are the partial derivatives of the function.
In two dimensions: the i-component is the partial derivative with respect to x, the j-component is the partial derivative with respect to y.
In three dimensions: add a k-component for the partial derivative with respect to z.

🧭 Role in computing directional derivatives

The excerpt states: "The gradient can be used in a formula to calculate the directional derivative."
Specifically, in three dimensions: D_u f = ∇f · u (the dot product of the gradient and the unit direction vector).
This means: once you know the gradient, you can find the rate of change in any direction by taking the dot product with that direction's unit vector.

🔺 Direction of greatest change

The excerpt emphasizes: "The gradient indicates the direction of greatest change of a function of more than one variable."
Why: the dot product ∇f · u is maximized when u points in the same direction as ∇f (because dot product = |∇f| |u| cos(angle), and cosine is largest when angle = 0).
Implication: if you want to increase the function value as quickly as possible, move in the direction of the gradient.
Example: on a hillside (height function), the gradient points uphill in the steepest direction; moving perpendicular to the gradient means moving along a level curve (no height change).
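
A brute-force scan over directions illustrates both claims. This is a sketch under stated assumptions: the gradient (8, 3) belongs to the hypothetical function f(x, y) = x² + 3xy at the point (1, 2), and 3600 evenly spaced angles stand in for "all directions".

```python
import math

def grad_f(x, y):
    # Gradient of the hypothetical example f(x, y) = x^2 + 3xy.
    return (2 * x + 3 * y, 3 * x)

def directional_derivative(grad, theta):
    """Dot product of the gradient with the unit vector (cos theta, sin theta)."""
    return grad[0] * math.cos(theta) + grad[1] * math.sin(theta)

g = grad_f(1.0, 2.0)                  # (8.0, 3.0)
grad_norm = math.hypot(g[0], g[1])    # length of the gradient
grad_angle = math.atan2(g[1], g[0])   # direction of the gradient

# Scan evenly spaced directions and keep the one with the largest rate.
n = 3600
angles = [2 * math.pi * k / n for k in range(n)]
best_theta = max(angles, key=lambda t: directional_derivative(g, t))
best_rate = directional_derivative(g, best_theta)

# Rate of change perpendicular to the gradient (along the level curve).
perp_rate = directional_derivative(g, grad_angle + math.pi / 2)

print(best_theta, grad_angle)  # nearly equal
print(best_rate, grad_norm)    # nearly equal
print(perp_rate)               # essentially 0
```

The largest rate found by the scan matches the length of the gradient, and the rate perpendicular to the gradient is zero up to rounding.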

🔗 Relationship to other concepts

🔗 Partial derivatives vs directional derivatives

Concept	What it measures	Direction
Partial derivative	Rate of change along a single coordinate axis	Fixed (x-axis, y-axis, etc.)
Directional derivative	Rate of change in any chosen direction	Arbitrary (specified by angle or unit vector)

Don't confuse: partial derivatives are special cases of directional derivatives (when the direction is along a coordinate axis).
The directional derivative formula shows how partial derivatives combine to give the rate in any direction.

🔗 Connection to the chain rule

The excerpt lists the chain rule formulas at the top, which involve sums of products of partial derivatives.
The directional derivative formula (D_u f = f_x cos θ + f_y sin θ) has a similar structure: it is a weighted sum of partial derivatives.
Both concepts rely on breaking down change into contributions from each variable.

🔗 Gradient and level curves

The gradient is perpendicular to level curves (curves where the function value is constant).
Why: along a level curve, the function does not change, so the directional derivative in that direction is zero; the gradient (direction of greatest increase) must be perpendicular to directions of zero change.
Example: contour lines on a topographic map are level curves; the gradient (steepest ascent) is perpendicular to those contours.

Maxima/Minima Problems

4.7 Maxima/Minima Problems

🧭 Overview

🧠 One-sentence thesis

Critical points of a function of two variables are the starting points for finding maxima and minima, and the discriminant helps classify whether each critical point is a maximum, minimum, or saddle point.

📌 Key points (3–5)

What a critical point is: a point in the domain where both first partial derivatives are zero, or where at least one of them does not exist; these points are the candidates for a maximum, minimum, or saddle point.
How to classify critical points: use the discriminant D, which combines second partial derivatives at the critical point.
The discriminant formula: D equals the product of the two unmixed second partials minus the square of the mixed partial.
Common confusion: not all critical points are maxima or minima—some are saddle points where the function increases in one direction and decreases in another.
Connection to Lagrange multipliers: when optimizing subject to constraints, the gradient of the objective function equals a weighted sum of the gradients of the constraint functions.

🔍 Finding critical points

🔍 What a critical point is

A critical point of the function f(x, y) is any point (x₀, y₀) in the domain of f where both first partial derivatives are zero, or where at least one of them does not exist.

Critical points are candidates for local maxima, local minima, or saddle points.
The discriminant test used in this section applies to critical points where both first partial derivatives vanish.
Example: if the partial derivative with respect to x is zero and the partial derivative with respect to y is zero at (x₀, y₀), then (x₀, y₀) is a critical point.

🧮 Classifying critical points with the discriminant

🧮 The discriminant formula

Discriminant: D = f_xx(x₀, y₀) · f_yy(x₀, y₀) − [f_xy(x₀, y₀)]²

The discriminant combines three second partial derivatives evaluated at the critical point (x₀, y₀):
- f_xx: the second partial derivative with respect to x twice.
- f_yy: the second partial derivative with respect to y twice.
- f_xy: the mixed partial derivative (first with respect to x, then y).
The formula multiplies the two unmixed second partials and subtracts the square of the mixed partial.

🔢 How the discriminant classifies

The sign of D, together with the sign of f_xx, determines the nature of the critical point:
- If D > 0 and f_xx > 0, the point is a local minimum.
- If D > 0 and f_xx < 0, the point is a local maximum.
- If D < 0, the point is a saddle point (neither a maximum nor a minimum).
- If D = 0, the test is inconclusive.
Don't confuse: a critical point is not automatically a maximum or minimum; the discriminant is needed to decide.
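
The classification rules can be written as a small decision function. This is a minimal sketch; the name `classify_critical_point` and the four test functions in the comments are hypothetical examples, each with a critical point at the origin.

```python
def classify_critical_point(fxx, fyy, fxy):
    """Classify a critical point from second partials evaluated at that point."""
    d = fxx * fyy - fxy**2   # the discriminant D
    if d > 0:
        # D > 0 forces fxx to be nonzero, so its sign decides the case.
        return "local minimum" if fxx > 0 else "local maximum"
    if d < 0:
        return "saddle point"
    return "inconclusive"

# Second partials at (0, 0) for four hypothetical functions:
print(classify_critical_point(2.0, 2.0, 0.0))    # f = x^2 + y^2  -> local minimum
print(classify_critical_point(-2.0, -2.0, 0.0))  # f = -x^2 - y^2 -> local maximum
print(classify_critical_point(2.0, -2.0, 0.0))   # f = x^2 - y^2  -> saddle point
print(classify_critical_point(0.0, 0.0, 0.0))    # f = x^4 + y^4  -> inconclusive
```

The last case shows why D = 0 is inconclusive: x⁴ + y⁴ does have a minimum at the origin, but the test cannot detect it.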

🎯 Optimization with constraints

🎯 Method of Lagrange multipliers (one constraint)

When finding maxima or minima subject to a constraint g(x, y) = 0, the method of Lagrange multipliers is used.
The key equations are:
- The gradient of f at (x₀, y₀) equals λ times the gradient of g at (x₀, y₀).
- The constraint g(x₀, y₀) = 0 must hold.
In words: at the optimal point, the direction of steepest ascent of the objective function f is parallel to the direction of steepest ascent of the constraint function g.
λ (lambda) is called the Lagrange multiplier; it is an unknown scalar that weights the constraint gradient.

🎯 Method of Lagrange multipliers (two constraints)

When there are two constraints g(x, y, z) = 0 and h(x, y, z) = 0, the method extends to three dimensions.
The key equations are:
- The gradient of f at (x₀, y₀, z₀) equals λ₁ times the gradient of g plus λ₂ times the gradient of h.
- Both constraints g(x₀, y₀, z₀) = 0 and h(x₀, y₀, z₀) = 0 must hold.
Two Lagrange multipliers (λ₁ and λ₂) are introduced, one for each constraint.
Example: to maximize a function of three variables subject to two simultaneous conditions, solve the system of gradient equations together with the two constraint equations.

🧩 Related concepts from the chapter

🧩 Directional derivatives and the gradient

The directional derivative measures the rate of change of a function in any given direction.
The gradient can be used in a formula to calculate the directional derivative: D_u f = ∇f · u (the dot product of the gradient and the unit direction vector).
The gradient indicates the direction of greatest increase of the function.
Connection to maxima/minima: at a critical point, the gradient is zero, so the directional derivative in every direction is zero.

🧩 Partial derivatives and higher-order derivatives

A partial derivative is a derivative of a function of more than one independent variable, taken with respect to one variable while treating the others as constants.
Higher-order partial derivatives (like f_xx, f_yy, f_xy) are calculated by differentiating partial derivatives again.
These higher-order partials appear in the discriminant formula used to classify critical points.

Lagrange Multipliers

4.8 Lagrange Multipliers

🧭 Overview

🧠 One-sentence thesis

The method of Lagrange multipliers provides a systematic four-step strategy for solving optimization problems where an objective function must be maximized or minimized subject to one or more constraints.

📌 Key points (3–5)

What the method solves: optimization problems that combine an objective function with one or more constraints.
How to apply it: use a four-step problem-solving strategy with Lagrange multipliers.
When to use it: when you need to find extrema (maximum or minimum values) of a function subject to constraint equations.
Common confusion: Lagrange multiplier problems differ from unconstrained maxima/minima problems (section 4.7) because they involve constraint equations that restrict the domain.

🎯 What is an optimization problem with constraints

🎯 Core definition

An objective function combined with one or more constraints is an example of an optimization problem.

The objective function is the function you want to maximize or minimize.
The constraint(s) are equations that restrict which points are allowed in the solution.
Example: maximize f(x, y) = x²y subject to the constraint x² + y² = 4.

🔗 Relationship to unconstrained problems

In section 4.7 (Maxima/Minima Problems), we found extrema by:
- Finding critical points where both partial derivatives equal zero
- Using the second derivative test with a discriminant
Don't confuse: Lagrange multiplier problems add constraint equations, so you cannot simply set partial derivatives to zero—you must account for the restrictions.

🛠️ The method of Lagrange multipliers

🛠️ Four-step problem-solving strategy

The excerpt states that we "apply the method of Lagrange multipliers using a four-step problem-solving strategy," but does not detail the four steps in this summary section.

The method is specifically designed for constrained optimization.
It provides a systematic approach rather than ad-hoc techniques.

📐 Example applications from exercises

The excerpt includes two worked examples at the exercise level:

Example 1: Find the maximum and minimum values of f(x, y) = x²y, subject to x² + y² = 4.

Example 2: Find the maximum and minimum values of f(x, y) = x² − y², subject to x + 6y = 4.

Both examples use equality constraints (equations).
The constraint can be a circle, line, or other curve/surface.
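
Example 1 can be checked numerically. The sketch below is one possible verification, not the excerpt's four-step strategy: the candidate points were obtained by solving ∇f = λ∇g by hand with g(x, y) = x² + y² − 4 (an assumed way of writing the constraint), and a brute-force scan around the circle serves as an independent check.

```python
import math

def f(x, y):
    return x**2 * y

def grad_f(x, y):
    return (2 * x * y, x**2)

def grad_g(x, y):
    # Assumed constraint function: g(x, y) = x^2 + y^2 - 4.
    return (2 * x, 2 * y)

def cross(a, b):
    # Zero exactly when the two plane vectors are parallel (or one is zero).
    return a[0] * b[1] - a[1] * b[0]

# Hand-solved candidates: 2xy = 2*lam*x and x^2 = 2*lam*y give x^2 = 2y^2
# when x != 0; the constraint then gives y = +-2/sqrt(3), x = +-sqrt(8/3).
x0, y0 = math.sqrt(8 / 3), 2 / math.sqrt(3)
candidates = [(sx * x0, sy * y0) for sx in (1, -1) for sy in (1, -1)]
candidates += [(0.0, 2.0), (0.0, -2.0)]   # the x = 0 branch (lam = 0)

# Largest violation of "gradients are parallel" over all candidates.
parallel_error = max(abs(cross(grad_f(x, y), grad_g(x, y))) for x, y in candidates)

constrained_max = max(f(x, y) for x, y in candidates)
constrained_min = min(f(x, y) for x, y in candidates)

# Independent check: scan the circle x = 2 cos t, y = 2 sin t.
n = 100_000
scan = [f(2 * math.cos(2 * math.pi * k / n), 2 * math.sin(2 * math.pi * k / n))
        for k in range(n)]

print(parallel_error)               # essentially 0
print(constrained_max, max(scan))   # both about 3.0792
print(constrained_min, min(scan))   # both about -3.0792
```

The extreme values are ±16/(3√3), attained where x² = 2y². The points (0, ±2) also satisfy the gradient condition but give f = 0, which is why every candidate has to be evaluated and compared.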

🔄 Connection to other optimization techniques

🔄 When Lagrange multipliers are needed

Unconstrained problems (section 4.7): use critical points and the second derivative test directly.
Constrained problems (section 4.8): use Lagrange multipliers when the feasible region is defined by constraint equations.

🔄 Why constraints matter

Constraints limit which points can be solutions.
The maximum or minimum on a constrained region may occur at points that are not critical points of the objective function alone.
Example: maximizing f(x, y) over a circle is different from maximizing it over the entire plane.

📊 Context in the chapter structure

📊 Position in differentiation sequence

The excerpt shows Lagrange multipliers as section 4.8, the final topic in Chapter 4 (Differentiation of Functions of Several Variables):

Section	Topic
4.3	Partial Derivatives
4.4	Tangent Planes and Linear Approximations
4.5	The Chain Rule
4.6	Directional Derivatives and the Gradient
4.7	Maxima/Minima Problems (unconstrained)
4.8	Lagrange Multipliers (constrained)

This placement shows Lagrange multipliers as the culmination of differentiation techniques.
It builds on gradient concepts from 4.6 and extrema techniques from 4.7.

📊 Transition to integration

Chapter 4 ends with Lagrange multipliers.
Chapter 5 begins with multiple integration topics.
This shows the method closes the differentiation portion of multivariable calculus.

Double Integrals over Rectangular Regions

5.1 Double Integrals over Rectangular Regions

🧭 Overview

🧠 One-sentence thesis

Double integrals over rectangular regions extend single-variable integration to functions of two variables, enabling calculation of volumes, areas, and average values by taking limits of double Riemann sums and evaluating them as iterated integrals using Fubini's theorem.

📌 Key points (3–5)

What double integrals represent: the limit of a double Riemann sum that approximates the volume of a solid bounded above by a function of two variables over a rectangular region.
How to evaluate: use Fubini's theorem to write a double integral as an iterated integral (integrate one variable at a time).
What they calculate: area of a region, volume under a surface, and average value of a function of two variables over a rectangular region.
Properties matter: properties of double integrals help simplify computation and find bounds on values.

📐 What double integrals are

📐 Definition and construction

A double integral represents the limit of a double Riemann sum that approximates the volume of a solid bounded above by a function of two variables over a rectangular region.

Start with a double Riemann sum: divide the rectangular region into small pieces, evaluate the function at sample points, multiply by area, and sum.
Take the limit as the number of subdivisions approaches infinity in both directions.
The result is the double integral, which gives the exact volume of the solid.
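
The construction can be sketched in a few lines of Python. The midpoint of each subrectangle is used as the sample point (one possible choice), and the integrand f(x, y) = x²y on [0, 1] × [0, 2] is a hypothetical example whose exact integral is 2/3.

```python
def double_riemann_sum(f, a, b, c, d, m, n):
    """Midpoint double Riemann sum of f over [a, b] x [c, d] on an m-by-n grid."""
    dx = (b - a) / m
    dy = (d - c) / n
    total = 0.0
    for i in range(m):
        x = a + (i + 0.5) * dx           # sample point: midpoint in x
        for j in range(n):
            y = c + (j + 0.5) * dy       # sample point: midpoint in y
            total += f(x, y) * dx * dy   # height times area of the subrectangle
    return total

f = lambda x, y: x**2 * y                # hypothetical integrand
coarse = double_riemann_sum(f, 0.0, 1.0, 0.0, 2.0, 4, 4)
fine = double_riemann_sum(f, 0.0, 1.0, 0.0, 2.0, 200, 200)
print(coarse)  # about 0.65625
print(fine)    # close to 2/3
```

Refining the grid from 4 × 4 to 200 × 200 moves the sum toward the exact volume, which is the limiting process the definition describes.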

🧱 From approximation to exact value

The double Riemann sum is an approximation.
By taking the limit (as the mesh gets finer), the approximation becomes exact.
This process mirrors how single integrals work, but now in two dimensions.

🔄 How to evaluate double integrals

🔄 Fubini's theorem

Fubini's theorem allows you to write a double integral as an iterated integral.
An iterated integral means you integrate one variable at a time, holding the other fixed.
This converts a two-dimensional problem into two one-dimensional problems.

📝 Iterated integral notation

The excerpt shows the form: integrate the inner variable first (with respect to one variable), then integrate the result with respect to the other variable.
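For a continuous function on a rectangle, Fubini's theorem guarantees that both orders of integration give the same value, so the order can be chosen for convenience.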
Example: to find the volume under a surface over a rectangle, set up the double integral and then evaluate it by integrating first with respect to y, then with respect to x (or vice versa).
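
Both orders of integration can be compared numerically. This is a sketch under stated assumptions: each one-dimensional integral is approximated by the midpoint rule, and the integrand f(x, y) = x·e^(xy) on [0, 1] × [0, 1] is a hypothetical example whose exact value is e − 2.

```python
import math

def midpoint_1d(g, a, b, n):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def integrate_dy_dx(f, a, b, c, d, n=200):
    """Inner integral over y in [c, d], outer integral over x in [a, b]."""
    inner = lambda x: midpoint_1d(lambda y: f(x, y), c, d, n)
    return midpoint_1d(inner, a, b, n)

def integrate_dx_dy(f, a, b, c, d, n=200):
    """Inner integral over x in [a, b], outer integral over y in [c, d]."""
    inner = lambda y: midpoint_1d(lambda x: f(x, y), a, b, n)
    return midpoint_1d(inner, c, d, n)

f = lambda x, y: x * math.exp(x * y)     # hypothetical integrand
dy_dx = integrate_dy_dx(f, 0.0, 1.0, 0.0, 1.0)
dx_dy = integrate_dx_dy(f, 0.0, 1.0, 0.0, 1.0)
print(dy_dx, dx_dy, math.e - 2)          # all close to 0.71828
```

By hand, integrating this f with respect to y first is the easier route: the inner integral is e^x − 1, while integrating with respect to x first requires integration by parts.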

🛠️ Properties and applications

🛠️ Properties of double integrals

Properties of double integrals are useful to simplify computation.
They also help find bounds on the values of integrals.
These properties work similarly to properties of single integrals (linearity, comparison, etc.).

📊 What double integrals calculate

Quantity	What it represents
Volume	Volume under a surface (the function) and above the rectangular region
Area	Area of the region itself (when the function is constant 1)
Average value	Average value of a function of two variables over the rectangular region

All three applications use the same double integral framework.
The interpretation depends on what the function represents and what you are measuring.

🌍 Extensions beyond rectangles

🌍 General regions

The excerpt mentions that double integrals can be extended to general bounded regions (not just rectangles).
A general bounded region can be enclosed inside a rectangular region.
To evaluate over a general region, sketch it and express it as a Type I or Type II region (or a union of such regions).

🔁 Type I and Type II regions

Type I region: for each x in an interval, y ranges between two functions of x.
Type II region: for each y in an interval, x ranges between two functions of y.
The iterated integral limits change to reflect these variable boundaries.
Don't confuse: the rectangular region has constant limits; general regions have limits that are functions.

🌀 Polar coordinates

When the region or function has circular symmetry, it is often more convenient to use polar coordinates.
The area element changes: dA becomes r dr dθ.
Conversion formulas: x = r cos θ, y = r sin θ.
This is a coordinate transformation that simplifies certain integrals.

Double Integrals over General Regions

5.2 Double Integrals over General Regions

🧭 Overview

🧠 One-sentence thesis

Double integrals can be extended from rectangular regions to general bounded regions by enclosing them in rectangles and expressing them as Type I or Type II regions, enabling the same volume, area, and average-value calculations.

📌 Key points (3–5)

General bounded regions: any region that can be enclosed inside a rectangular region, allowing double integrals to work beyond rectangles.
Type I vs Type II regions: two ways to describe a non-rectangular region using iterated integrals with variable limits.
How to evaluate: sketch the region, classify it as Type I or Type II (or a union of such regions that overlap only on boundaries), then set up the iterated integral.
Common confusion: Type I and Type II differ in which variable has constant limits and which has function limits—don't mix up the order of integration.
Applications remain the same: volumes, areas, and average values work just like they do for rectangular regions.

📐 What is a general bounded region

📐 Definition and key idea

A general bounded region D on the plane is a region that can be enclosed inside a rectangular region.

This definition extends double integrals beyond simple rectangles.
The excerpt emphasizes that we use the idea of enclosing the region in a rectangle to define the double integral over it.
Example: a circular disk, a triangle, or any irregular shape that fits inside some rectangle can be a general bounded region.

🔗 Connection to rectangular regions

The excerpt states we can "use this idea to define a double integral over a general bounded region."
In other words, the rectangular-region theory is the foundation; general regions build on it by restriction or extension.

🔀 Type I and Type II regions

🔀 Two ways to describe a region

The excerpt mentions expressing a region "as a Type I or as a Type II region."

Region type	Iterated integral form	What it means
Type I	integral from a to b, then from g₁(x) to g₂(x) of f(x,y) dy dx	Outer integral has constant x-limits; inner integral has y-limits that depend on x
Type II	integral from c to d, then from h₁(y) to h₂(y) of f(x,y) dx dy	Outer integral has constant y-limits; inner integral has x-limits that depend on y

The excerpt shows the formulas at the top: Type I integrates dy first (with limits depending on x), Type II integrates dx first (with limits depending on y).
Don't confuse: the order of integration and which variable has function limits are the key differences.

🧩 Unions of regions

The excerpt notes that a region can be "a union of several Type I or Type II regions that overlap only on their boundaries."
This means if a single region is too complicated, you can split it into simpler pieces, integrate over each, and add the results.
The boundary-overlap condition ensures no double-counting.

🛠️ How to evaluate the integral

🛠️ Step-by-step process

The excerpt gives a clear procedure:

Sketch the region: visualize the general bounded region D.
Classify the region: decide whether it is Type I, Type II, or a union of such regions.
Set up the iterated integral: use the appropriate limits (constant for one variable, functions for the other).
Evaluate: compute the iterated integral as usual.

Example: if the region is bounded below by the curve y = g₁(x) and above by the curve y = g₂(x) for x from a to b, it is Type I; integrate dy from g₁(x) to g₂(x), then dx from a to b.
Example: if the region is bounded on the left by the curve x = h₁(y) and on the right by the curve x = h₂(y) for y from c to d, it is Type II; integrate dx from h₁(y) to h₂(y), then dy from c to d.
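
A region that is both Type I and Type II makes a useful check, since the two setups must agree. The sketch below uses a hypothetical region, the area between y = x² and y = x for 0 ≤ x ≤ 1, with the hypothetical integrand f(x, y) = xy; the exact value is 1/24. The helper names are assumptions, and the midpoint rule stands in for exact integration.

```python
import math

def midpoint(g, lo, hi, n=400):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

def type_i(f, a, b, g1, g2):
    """Inner: y from g1(x) to g2(x). Outer: x from a to b (order dy dx)."""
    return midpoint(lambda x: midpoint(lambda y: f(x, y), g1(x), g2(x)), a, b)

def type_ii(f, c, d, h1, h2):
    """Inner: x from h1(y) to h2(y). Outer: y from c to d (order dx dy)."""
    return midpoint(lambda y: midpoint(lambda x: f(x, y), h1(y), h2(y)), c, d)

f = lambda x, y: x * y

# Type I description: x^2 <= y <= x for 0 <= x <= 1.
as_type_i = type_i(f, 0.0, 1.0, lambda x: x**2, lambda x: x)

# Type II description of the same region: y <= x <= sqrt(y) for 0 <= y <= 1.
as_type_ii = type_ii(f, 0.0, 1.0, lambda y: y, lambda y: math.sqrt(y))

print(as_type_i, as_type_ii, 1 / 24)  # all close to 0.041667
```

Note how the limits swap roles: the Type I call passes functions of x as inner limits, and the Type II call passes functions of y.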

⚠️ Common pitfall

Don't confuse the order: Type I means the inner integral is with respect to y (so y-limits depend on x), and Type II means the inner integral is with respect to x (so x-limits depend on y).
The excerpt's formulas make this explicit: Type I ends with "dy dx" and Type II ends with "dx dy."

📊 Applications over general regions

📊 Same uses as rectangular regions

The excerpt states:

"We can use double integrals to find volumes, areas, and average values of a function over general regions, similarly to calculations over rectangular regions."

Volume: the double integral of f(x,y) over D gives the volume under the surface z = f(x,y) above the region D.
Area: the double integral of 1 over D gives the area of D.
Average value: the double integral of f divided by the area of D gives the average value of f over D.
The key point: the techniques and interpretations are the same; only the region's shape and the integral's limits change.

🔧 Improper integrals

The excerpt mentions:

"We can use Fubini's theorem for improper integrals to evaluate some types of improper integrals."

This extends the iterated-integral approach to cases where the region or the function has unbounded behavior.
Fubini's theorem allows switching the order of integration, which can simplify evaluation even when limits go to infinity or the integrand has singularities.

Double Integrals in Polar Coordinates

5.3 Double Integrals in Polar Coordinates

🧭 Overview

🧠 One-sentence thesis

Polar coordinates provide a convenient alternative coordinate system for evaluating double integrals when the region or function exhibits circular symmetry, by transforming the area element and integration bounds accordingly.

📌 Key points (3–5)

When to use polar coordinates: situations with circular symmetry make polar coordinates more convenient than rectangular coordinates.
Key transformation for area: the area element dA in rectangular coordinates becomes r dr dθ in polar coordinates (the extra r factor is crucial).
Coordinate conversion formulas: x = r cos θ, y = r sin θ for converting functions, and r² = x² + y² for converting regions.
Integration regions: polar integrals can be evaluated over polar rectangular regions or general polar regions using iterated integrals.
Common confusion: don't forget the r factor in the area element—it's not just dr dθ but r dr dθ.

🔄 When and why to use polar coordinates

🎯 Circular symmetry advantage

Polar coordinates are often convenient when dealing with situations that have circular symmetry.
The excerpt emphasizes this is a choice of convenience, not necessity—you can still use rectangular coordinates, but polar may simplify the work.
Example: regions that are circles, sectors, or annuli are naturally described by bounds on r and θ rather than complex x and y inequalities.

🔁 Similar structure to rectangular integrals

Polar double integrals use an iterated integral structure similar to those used with rectangular double integrals.
You still integrate over a region, but the region is described in terms of r (radius) and θ (angle) instead of x and y.

📐 The polar area element transformation

📏 The crucial r factor

The area element dA in polar coordinates becomes r dr dθ.

This is the most important transformation rule for polar integrals.
In rectangular coordinates, a small rectangle has area dA = dx dy.
In polar coordinates, a small "polar rectangle" (bounded by two radii and two circular arcs) has area dA = r dr dθ.
Don't confuse: the area element is not simply dr dθ; the extra r factor accounts for the geometry of polar coordinates (arcs get longer as radius increases).

🧮 Double integral over a polar rectangular region

The excerpt defines the double integral over a polar rectangular region R as a limit of sums.
The sum involves f(r*, θ*) multiplied by the area element, which includes the r* factor: r* Δr Δθ.
This shows why the r appears: it's built into the geometry of how we partition the polar region.

🔀 Converting between coordinate systems

🔀 Rectangular to polar conversion

The excerpt provides three key formulas for converting an integral from rectangular to polar coordinates:

What to convert	Formula
x-coordinate	x = r cos θ
y-coordinate	y = r sin θ
Area element	dA = r dr dθ

To convert an integral in rectangular coordinates to polar coordinates, substitute these three transformations.
Example: if you have an integral of f(x, y) dx dy, replace x with r cos θ, y with r sin θ, and dx dy with r dr dθ.
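
The substitution can be checked numerically. The sketch below assumes a hypothetical integrand f(x, y) = x² + y² over the disk of radius 2 (neither comes from the excerpt), and the helper name polar_double_integral is illustrative. It uses a simple midpoint rule, which is only one of many ways to approximate the integral:

```python
import math

def polar_double_integral(f, r_max, n_r=200, n_theta=200):
    """Midpoint-rule approximation of the integral of f(x, y) over the disk r <= r_max."""
    dr = r_max / n_r
    dtheta = 2 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_theta):
            theta = (j + 0.5) * dtheta
            # Substitute x = r cos(theta), y = r sin(theta)
            x, y = r * math.cos(theta), r * math.sin(theta)
            # Replace dx dy with r dr dtheta (note the extra factor r)
            total += f(x, y) * r * dr * dtheta
    return total

# Hypothetical example: f(x, y) = x^2 + y^2 on the disk of radius 2
approx = polar_double_integral(lambda x, y: x**2 + y**2, r_max=2.0)
exact = 8 * math.pi  # integral of r^2 * r over 0 <= r <= 2, 0 <= theta <= 2*pi
print(approx, exact)
```

Dropping the factor r in the accumulation line would give a noticeably different (and wrong) value, which is a quick way to see why the area element matters.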

🔄 Polar to rectangular (for regions)

The excerpt also mentions r² = x² + y².
This formula is useful for converting region descriptions: a circle x² + y² = 4 becomes r² = 4, or r = 2 in polar coordinates.
This simplifies the bounds of integration when the region has circular boundaries.

🗺️ Integration over polar regions

📦 Polar rectangular regions

A polar rectangular region is bounded by constant values of r and θ.
The double integral becomes a limit of double sums as the partition gets finer (m, n → ∞).
The iterated integral structure allows you to integrate first with respect to one variable, then the other.

🌐 General polar regions

Double integral over a general polar region: the integral from θ = α to θ = β, and for each θ, r ranges from h₁(θ) to h₂(θ).

General polar regions allow the radius bounds to depend on the angle θ.
The iterated integral is written as: integrate f(r, θ) r dr dθ, where the inner integral is over r from h₁(θ) to h₂(θ), and the outer integral is over θ from α to β.
This is analogous to Type I and Type II regions in rectangular coordinates, but now the "height" of the region varies with angle instead of with x or y.
Example: a region bounded by two curves in polar form (like r = 1 and r = 2 + cos θ) can be integrated by letting r vary between these bounds for each fixed θ.
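
As a rough sketch of this setup, the code below approximates the area of the region between r = 1 and r = 2 + cos θ by integrating f = 1 with the r factor included. The function name and the midpoint rule are illustrative choices, not something the excerpt prescribes; the comparison value 7π/2 comes from working the same integral by hand:

```python
import math

def area_general_polar_region(h1, h2, alpha, beta, n_theta=400, n_r=200):
    """Midpoint approximation of the integral of 1 * r dr dtheta
    for alpha <= theta <= beta and h1(theta) <= r <= h2(theta)."""
    d_theta = (beta - alpha) / n_theta
    total = 0.0
    for j in range(n_theta):
        theta = alpha + (j + 0.5) * d_theta
        r_lo, r_hi = h1(theta), h2(theta)   # radius bounds depend on the angle
        d_r = (r_hi - r_lo) / n_r
        inner = sum((r_lo + (i + 0.5) * d_r) * d_r for i in range(n_r))
        total += inner * d_theta
    return total

area = area_general_polar_region(lambda t: 1.0, lambda t: 2.0 + math.cos(t),
                                 0.0, 2 * math.pi)
exact = 7 * math.pi / 2  # hand computation of the same iterated integral
print(area, exact)
```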

Triple Integrals

5.4 Triple Integrals

🧭 Overview

🧠 One-sentence thesis

Triple integrals extend double integrals to three dimensions by using Fubini's theorem to compute volumes, averages, and other properties of solids through iterated integration in any order.

📌 Key points (3–5)

Fubini's theorem for triple integrals: allows computing a triple integral over a rectangular box as an iterated integral in any of six possible orderings.
Volume computation: the volume of a general solid region E is found by integrating 1 over that region.
Order flexibility: interchanging the order of integration does not change the answer and can simplify computation.
Average value formula: the average of a function over a three-dimensional region uses the triple integral divided by the volume.
Common confusion: triple integrals work like double integrals but add a third variable—the same principles of iterated integration apply.

🧮 Fubini's theorem in three dimensions

🧮 The core theorem

Fubini's theorem for triple integrals: if f(x, y, z) is continuous on a rectangular box B = [a, b] × [c, d] × [e, f], then the triple integral over B equals the iterated integral in any order.

The theorem states that the triple integral of f(x, y, z) over box B can be written as an iterated integral.
One standard form: integrate first with respect to x from a to b, then y from c to d, then z from e to f.
The excerpt emphasizes "also equal to any of the other five possible orderings"—there are six total ways to order x, y, z.

🔄 Six possible orderings

Because there are three variables, there are 3! = 6 ways to arrange the integration order.
All six orderings give the same final answer for continuous functions.
Example: integrating dx dy dz, dx dz dy, dy dx dz, dy dz dx, dz dx dy, or dz dy dx all yield the same result.
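
A small numerical sketch can make this concrete. It assumes a hypothetical integrand f(x, y, z) = x y² z³ on the box [0, 1] × [0, 2] × [0, 3] (not from the excerpt) and approximates the iterated integral with a midpoint rule in each of the six orders. The helper names are illustrative:

```python
import itertools

def midpoints(lo, hi, n):
    """Return the midpoints of n equal subintervals of [lo, hi] and their width."""
    h = (hi - lo) / n
    return [lo + (i + 0.5) * h for i in range(n)], h

def iterated_integral(f, bounds, order, n=20):
    """Midpoint approximation; order[0] is integrated first (innermost)."""
    def integrate(remaining, fixed):
        if not remaining:
            return f(**fixed)
        var = remaining[-1]  # outermost variable still to be integrated
        nodes, h = midpoints(*bounds[var], n)
        return h * sum(integrate(remaining[:-1], {**fixed, var: t}) for t in nodes)
    return integrate(tuple(order), {})

def f(x, y, z):
    return x * y**2 * z**3  # hypothetical integrand

bounds = {"x": (0.0, 1.0), "y": (0.0, 2.0), "z": (0.0, 3.0)}
results = {order: iterated_integral(f, bounds, order)
           for order in itertools.permutations("xyz")}
for order, value in results.items():
    print("d" + " d".join(order), value)
```

All six approximations agree up to floating-point rounding, and they approach the exact value 27 as n grows.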

📦 Computing volumes and averages

📦 Volume of a solid region

To compute the volume of a general solid bounded region E, use the triple integral V(E) = triple integral over E of 1 dV.

The integrand is simply 1 (the constant function).
Integrating 1 over the region accumulates the "amount of space" inside E.
This is the three-dimensional analogue of using a double integral to find area.

📊 Average value of a function

The average value of a function over a general three-dimensional region is f_ave = (1 / V(E)) × triple integral over E of f(x, y, z) dV.

First compute the volume V(E) of the region.
Then compute the triple integral of f over E.
Divide the integral by the volume to get the average.
This formula mirrors the average value formula for functions of one or two variables.
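
The three steps can be sketched as follows, assuming a hypothetical function f(x, y, z) = x + y + z on the box [0, 1] × [0, 2] × [0, 3]. The region, the function, and the helper name triple_midpoint are all illustrative assumptions:

```python
def triple_midpoint(f, box, n=10):
    """Midpoint approximation of the triple integral of f over a rectangular box."""
    (a, b), (c, d), (e, g) = box
    hx, hy, hz = (b - a) / n, (d - c) / n, (g - e) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * hx
        for j in range(n):
            y = c + (j + 0.5) * hy
            for k in range(n):
                z = e + (k + 0.5) * hz
                total += f(x, y, z)
    return total * hx * hy * hz

box = ((0.0, 1.0), (0.0, 2.0), (0.0, 3.0))            # hypothetical region E
volume = triple_midpoint(lambda x, y, z: 1.0, box)     # step 1: V(E) = integral of 1
integral = triple_midpoint(lambda x, y, z: x + y + z, box)  # step 2: integral of f
average = integral / volume                            # step 3: divide
print(volume, average)
```

For this linear function the average equals the value of f at the center of the box, (0.5, 1, 1.5), which gives 3.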

🔧 Practical techniques

🔧 Interchanging order of integration

The excerpt states: "Interchanging the order of the iterated integrals does not change the answer."
Why it matters: "interchanging the order of integration can help simplify the computation."
Some orderings may lead to easier integrals (simpler antiderivatives or cleaner bounds).
Don't confuse: changing order is always valid for continuous functions on rectangular boxes, but for general regions you must adjust the limits of integration accordingly.

🔧 Setting up iterated integrals

For a rectangular box, the limits are constants for each variable.
For general regions, the limits for inner integrals may depend on the outer variables.
The excerpt does not detail general regions for triple integrals, but the principle is the same as for double integrals over non-rectangular regions.

🌐 Connection to other coordinate systems

🌐 Cylindrical and spherical coordinates

The excerpt briefly mentions:

Coordinate system	Iterated integral form	Volume element
Cylindrical	Integrate over θ, r, z with limits depending on the region	r dz dr dθ
Spherical	Integrate over θ, ρ, φ with limits depending on the region	ρ² sin φ dφ dρ dθ

These are alternative ways to set up triple integrals when the region has circular or spherical symmetry.
The volume element changes: in cylindrical it includes an extra r factor; in spherical it includes ρ² sin φ.
Use these coordinate systems to simplify integrals over regions that are naturally described by angles and radii.

Triple Integrals in Cylindrical and Spherical Coordinates

5.5 Triple Integrals in Cylindrical and Spherical Coordinates

🧭 Overview

🧠 One-sentence thesis

Triple integrals can be evaluated in cylindrical or spherical coordinates by using iterated integrals with appropriate volume elements (r for cylindrical, ρ² sin φ for spherical), which simplifies computation when the region or function has circular or spherical symmetry.

📌 Key points (3–5)

Cylindrical coordinates: use the iterated integral with volume element r dz dr dθ to evaluate triple integrals when the region has circular symmetry around an axis.
Spherical coordinates: use the iterated integral with volume element ρ² sin φ dφ dρ dθ when the region or function has spherical symmetry.
Volume elements differ: cylindrical uses r, spherical uses ρ² sin φ—these factors account for the geometry of the coordinate system.
Common confusion: the extra factors (r or ρ² sin φ) are not optional; they are required to correctly measure volume in non-Cartesian coordinates.
When to use which: cylindrical is convenient for circular symmetry (e.g., around the z-axis); spherical is convenient for symmetry around a point.

🔄 Cylindrical coordinate integration

🔄 The iterated integral form

To evaluate a triple integral in cylindrical coordinates, use the iterated integral: integral from θ = α to θ = β, integral from r = g₁(θ) to r = g₂(θ), integral from z = u₁(r, θ) to z = u₂(r, θ) of f(r, θ, z) r dz dr dθ.

The function f is expressed in terms of r, θ, and z instead of x, y, z.
The limits of integration describe the region in cylindrical coordinates: θ ranges over angles, r over radii, z over heights.

📐 The volume element r dz dr dθ

The factor r appears in the integrand as part of the volume element.
This r accounts for the "stretching" of volume as you move away from the z-axis in polar-like coordinates.
Don't confuse: this r is not part of the original function f; it is required by the coordinate system geometry.
Example: even if f(r, θ, z) = 1 (constant), you still integrate r dz dr dθ to find volume.
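
As an illustration of the iterated form with f = 1, the sketch below approximates the volume of a hypothetical cone of radius 1 and height 3 with its axis along the z-axis. The cone and the function name cylindrical_integral are assumptions for the example, and the midpoint rule is only one possible numerical method:

```python
import math

def cylindrical_integral(f, alpha, beta, g1, g2, u1, u2, n=60):
    """Midpoint approximation of the iterated integral of f(r, theta, z) * r dz dr dtheta."""
    d_theta = (beta - alpha) / n
    total = 0.0
    for a in range(n):
        theta = alpha + (a + 0.5) * d_theta
        r_lo, r_hi = g1(theta), g2(theta)
        d_r = (r_hi - r_lo) / n
        for b in range(n):
            r = r_lo + (b + 0.5) * d_r
            z_lo, z_hi = u1(r, theta), u2(r, theta)
            d_z = (z_hi - z_lo) / n
            for c in range(n):
                z = z_lo + (c + 0.5) * d_z
                total += f(r, theta, z) * r * d_z * d_r * d_theta  # r is the volume factor
    return total

R, H = 1.0, 3.0  # hypothetical cone: 0 <= z <= H * (1 - r / R)
volume = cylindrical_integral(
    lambda r, theta, z: 1.0,
    0.0, 2 * math.pi,
    lambda theta: 0.0, lambda theta: R,
    lambda r, theta: 0.0, lambda r, theta: H * (1 - r / R),
)
print(volume, math.pi * R**2 * H / 3)
```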

🎯 When to use cylindrical coordinates

Cylindrical coordinates are most useful when the region or function has circular symmetry around an axis (typically the z-axis).
The excerpt mentions this is "often convenient" for such symmetry, similar to how polar coordinates simplify double integrals with circular regions.

🌐 Spherical coordinate integration

🌐 The iterated integral form

To evaluate a triple integral in spherical coordinates, use the iterated integral: integral from θ = α to θ = β, integral from ρ = g₁(θ) to ρ = g₂(θ), integral from φ = u₁(ρ, θ) to φ = u₂(ρ, θ) of f(ρ, θ, φ) ρ² sin φ dφ dρ dθ.

The function f is expressed in terms of ρ (distance from origin), θ (azimuthal angle), and φ (polar angle).
The limits describe the region in spherical coordinates: θ ranges over azimuthal angles, ρ over radii from the origin, φ over polar angles.

📐 The volume element ρ² sin φ dφ dρ dθ

The factor ρ² sin φ is the scaling factor in the spherical volume element dV = ρ² sin φ dφ dρ dθ.
This factor accounts for the geometry of spherical "shells" and "wedges" in three-dimensional space.
Don't confuse: ρ² sin φ is not part of the original function; it is required by the spherical coordinate system.
Example: to find the volume of a sphere, you would integrate ρ² sin φ dφ dρ dθ (with f = 1).
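
A minimal numerical sketch of that example, assuming a sphere of radius 1 centered at the origin so that all limits are constants (0 ≤ θ ≤ 2π, 0 ≤ ρ ≤ 1, 0 ≤ φ ≤ π). The function name sphere_volume is illustrative:

```python
import math

def sphere_volume(radius, n=40):
    """Midpoint approximation of the integral of rho^2 sin(phi) dphi drho dtheta."""
    d_theta = 2 * math.pi / n
    d_rho = radius / n
    d_phi = math.pi / n
    total = 0.0
    for _ in range(n):                 # theta cells (the integrand does not depend on theta)
        for j in range(n):
            rho = (j + 0.5) * d_rho
            for k in range(n):
                phi = (k + 0.5) * d_phi
                total += rho**2 * math.sin(phi) * d_phi * d_rho * d_theta
    return total

volume = sphere_volume(1.0)
print(volume, 4 * math.pi / 3)
```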

🎯 When to use spherical coordinates

Spherical coordinates are most useful when the region or function has spherical symmetry around a point (typically the origin).
The excerpt implies this choice parallels the use of polar coordinates for circular symmetry in double integrals.

🔀 Comparison of coordinate systems

Coordinate system	Volume element	Best for	Key factor
Cylindrical	r dz dr dθ	Circular symmetry around an axis	r
Spherical	ρ² sin φ dφ dρ dθ	Spherical symmetry around a point	ρ² sin φ

⚠️ Don't forget the volume element

Both cylindrical and spherical coordinates require an extra factor (r or ρ² sin φ) that does not appear in rectangular coordinates.
These factors are not optional; omitting them will give incorrect results.
The excerpt emphasizes these factors by including them explicitly in the iterated integral formulas.

🔗 Connection to earlier concepts

🔗 Building on polar coordinates

The excerpt references section 5.3 (Double Integrals in Polar Coordinates), where the area element dA becomes r dr dθ.
Cylindrical coordinates extend this idea to three dimensions by adding a z-coordinate and using r dz dr dθ.
The same principle applies: the extra r (or ρ² sin φ) corrects for the coordinate system's geometry.

🔗 Fubini's theorem in new coordinates

The excerpt mentions Fubini's theorem in section 5.4 for rectangular coordinates.
The iterated integrals in cylindrical and spherical coordinates follow the same principle: integrate one variable at a time, respecting the limits.
The order of integration can be changed, but the volume element must always be included.

Calculating Centers of Mass and Moments of Inertia

5.6 Calculating Centers of Mass and Moments of Inertia

🧭 Overview

🧠 One-sentence thesis

Double integrals allow us to calculate the mass, center of mass, and moments of a flat object (lamina) by integrating its density function over the region it occupies.

📌 Key points (3–5)

What a lamina is: a flat, two-dimensional object with a density function that may vary from point to point.
How to find mass: integrate the density function over the entire region using a double integral.
What moments measure: moments about the x-axis and y-axis capture how mass is distributed relative to those axes.
How to find the center of mass: use the moments divided by the total mass to get the coordinates x-bar and y-bar.
Common confusion: moments are not the center of mass itself—they are intermediate quantities; you must divide each moment by the mass to get the center coordinates.

🧱 Mass of a lamina

🧱 What a lamina is

Lamina R: a flat, two-dimensional region in the plane.

The excerpt describes a lamina as region R with a density function ρ(x, y) at any point (x, y).
Density may vary across the lamina—it is not necessarily uniform.
Example: a thin metal plate where some parts are thicker or denser than others.

⚖️ How to calculate mass

Mass m: the double integral of the density function over the region R.

Formula (in words): mass equals the double integral over R of ρ(x, y) dA.
The area element dA represents a tiny piece of the region; multiplying by density gives the mass of that piece.
Summing (integrating) over all pieces gives the total mass.
Don't confuse: mass is the total quantity; it is a single number, not a function.

📐 Moments about the axes

📐 What moments measure

Moment about the x-axis (M_x): the double integral over R of y ρ(x, y) dA.

Moment about the y-axis (M_y): the double integral over R of x ρ(x, y) dA.

Moments capture how mass is distributed relative to an axis.
For M_x, each piece of mass is weighted by its y-coordinate (distance from the x-axis).
For M_y, each piece of mass is weighted by its x-coordinate (distance from the y-axis).
Example: if most of the mass is far from the x-axis (large y values), M_x will be large.

🔍 Why moments matter

Moments are intermediate quantities used to find the center of mass.
They are not the center of mass themselves; they must be divided by the total mass.
The excerpt shows the center of mass coordinates as y-bar equals M_x divided by m, and x-bar equals M_y divided by m.

🎯 Center of mass

🎯 What the center of mass represents

The center of mass (x-bar, y-bar) is the "balance point" of the lamina.
It is the weighted average position of all the mass.
The excerpt gives the formulas (in words):
- y-bar equals M_x divided by m
- x-bar equals M_y divided by m

🧮 How to compute it

Step	What to do	Formula (in words)
1. Find mass	Integrate density over R	m = double integral of ρ(x, y) dA
2. Find M_x	Integrate y times density over R	M_x = double integral of y ρ(x, y) dA
3. Find M_y	Integrate x times density over R	M_y = double integral of x ρ(x, y) dA
4. Divide	Divide each moment by mass	y-bar = M_x / m; x-bar = M_y / m

Don't confuse: the moment M_x involves y (not x), and M_y involves x (not y), because the distance from the x-axis is measured by the y-coordinate and the distance from the y-axis by the x-coordinate.
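
The four steps in the table can be sketched numerically. The example assumes a hypothetical lamina occupying the unit square with density ρ(x, y) = x (heavier toward the right edge); the region, the density, and the function name are illustrative only:

```python
def lamina_center_of_mass(density, x_range, y_range, n=100):
    """Midpoint approximation of mass and center of mass for a rectangular lamina."""
    (a, b), (c, d) = x_range, y_range
    dx, dy = (b - a) / n, (d - c) / n
    m = m_x = m_y = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        for j in range(n):
            y = c + (j + 0.5) * dy
            dm = density(x, y) * dx * dy   # mass of one small piece
            m += dm                        # step 1: total mass
            m_x += y * dm                  # step 2: moment about the x-axis weights by y
            m_y += x * dm                  # step 3: moment about the y-axis weights by x
    return m, (m_y / m, m_x / m)           # step 4: (x-bar, y-bar)

mass, (x_bar, y_bar) = lamina_center_of_mass(lambda x, y: x, (0.0, 1.0), (0.0, 1.0))
print(mass, x_bar, y_bar)
```

Working the same integrals by hand gives m = 1/2, M_x = 1/4, and M_y = 1/3, so the center of mass is (2/3, 1/2): shifted toward the denser side in x, and centered in y.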

🔗 Connection to other concepts

🔗 Relationship to double integrals over regions

The excerpt is part of a chapter on multiple integration.
Earlier sections covered double integrals over rectangular and general regions, and in polar coordinates.
The same techniques (Fubini's theorem, iterated integrals, choosing Type I or Type II regions) apply here.
Example: if the lamina is a circular disk, polar coordinates may simplify the calculation.

🔗 Moments of inertia

The section title mentions "moments of inertia," but the excerpt does not provide formulas or details for them.
It focuses on mass, moments about the axes, and center of mass.

Change of Variables in Multiple Integrals

5.7 Change of Variables in Multiple Integrals

🧭 Overview

🧠 One-sentence thesis

A transformation allows us to convert integrals from one coordinate system to another by mapping regions through a change of variables, with the Jacobian determinant accounting for how area or volume scales during the transformation.

📌 Key points (3–5)

What a transformation does: maps a region in one coordinate system (like u,v space) to a region in another coordinate system (like x,y space) through a function.
One-to-one requirement: a transformation is one-to-one if no two different points map to the same image point, ensuring the mapping doesn't overlap.
The Jacobian's role: the absolute value of the Jacobian determinant measures how much area (in 2D) or volume (in 3D) gets stretched or compressed during the transformation.
Common confusion: the Jacobian is not just a formula to memorize—it represents the scaling factor that corrects for distortion when changing variables.
Why it matters: change of variables can transform difficult integrals into simpler ones by choosing a more natural coordinate system for the problem.

🔄 What Transformations Do

🔄 Definition of a transformation

A transformation T is a function that transforms a region G in one plane (space) into a region R in another plane (space) by a change of variables.

In two dimensions: T(u,v) = (x,y) maps points from the uv-plane to the xy-plane.
In three dimensions: T(u,v,w) = (x,y,z) maps points from uvw-space to xyz-space.
Think of it as a "coordinate translator" that converts between different ways of describing the same geometric region.

Example: Polar coordinates use transformation T(r,θ) = (r cos θ, r sin θ) to convert from polar to rectangular coordinates.

🎯 One-to-one transformations

A transformation T: G → R defined as T(u,v) = (x,y) (or T(u,v,w) = (x,y,z)) is said to be a one-to-one transformation if no two points map to the same image point.

One-to-one means the transformation is reversible—each point in the target region came from exactly one point in the source region.
This prevents "folding" or "overlapping" in the mapping.
Without this property, we might count some parts of the region multiple times when integrating.

Don't confuse: A transformation can be defined everywhere but still not be one-to-one if it maps different points to the same location.

📐 The Change of Variables Formula

📐 Two-dimensional case

If f is continuous on R, then:

The integral over region R in xy-coordinates equals the integral over region S in uv-coordinates
Formula: double integral over R of f(x,y) dA equals double integral over S of f(g(u,v), h(u,v)) times absolute value of ∂(x,y)/∂(u,v) du dv
Here x = g(u,v) and y = h(u,v) describe how to convert from uv to xy

The term ∂(x,y)/∂(u,v) is the Jacobian determinant.

📦 Three-dimensional case

If F is continuous on R, then:

Triple integral over R of F(x,y,z) dV equals triple integral over G of F(g(u,v,w), h(u,v,w), k(u,v,w)) times absolute value of ∂(x,y,z)/∂(u,v,w) du dv dw
This can also be written using J(u,v,w) to denote the Jacobian
The formula becomes: triple integral over G of H(u,v,w) times absolute value of J(u,v,w) du dv dw, where H(u,v,w) = F(g(u,v,w), h(u,v,w), k(u,v,w))

🔍 Why the absolute value

The Jacobian can be negative, but area and volume are always positive
Taking the absolute value ensures we measure the magnitude of the scaling, not its sign
The sign of the Jacobian relates to orientation, but for computing integrals we only care about the size of the scaling factor

🧮 The Jacobian Determinant

🧮 What the Jacobian measures

The Jacobian determinant ∂(x,y)/∂(u,v) (in 2D) or ∂(x,y,z)/∂(u,v,w) (in 3D) quantifies how much the transformation stretches or compresses area or volume.

If the Jacobian has absolute value 2, then small regions double in area/volume under the transformation
If the Jacobian has absolute value 0.5, then small regions shrink to half their area/volume
The Jacobian varies from point to point, so different parts of the region may scale differently

🔢 Computing the Jacobian

In two dimensions, the Jacobian is a 2×2 determinant:

∂(x,y)/∂(u,v) is the determinant of the matrix with first row (∂x/∂u, ∂x/∂v) and second row (∂y/∂u, ∂y/∂v)

In three dimensions, the Jacobian is a 3×3 determinant:

∂(x,y,z)/∂(u,v,w) is the determinant of the matrix with rows containing the partial derivatives of x, y, and z with respect to u, v, and w

Example: For polar coordinates where x = r cos θ and y = r sin θ, the Jacobian is r (which is why we have the extra "r" factor in polar integrals).
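
The polar Jacobian can be checked with a quick finite-difference sketch. The function jacobian_det_2d is a hypothetical helper that estimates the four partial derivatives numerically and forms the 2×2 determinant; the sample point (r, θ) = (2, 0.7) is arbitrary:

```python
import math

def jacobian_det_2d(T, u, v, h=1e-6):
    """Central-difference estimate of det [[x_u, x_v], [y_u, y_v]] at (u, v)."""
    x_up, y_up = T(u + h, v)
    x_um, y_um = T(u - h, v)
    x_vp, y_vp = T(u, v + h)
    x_vm, y_vm = T(u, v - h)
    x_u, y_u = (x_up - x_um) / (2 * h), (y_up - y_um) / (2 * h)
    x_v, y_v = (x_vp - x_vm) / (2 * h), (y_vp - y_vm) / (2 * h)
    return x_u * y_v - x_v * y_u

def polar(r, theta):
    """T(r, theta) = (r cos(theta), r sin(theta))."""
    return r * math.cos(theta), r * math.sin(theta)

estimate = jacobian_det_2d(polar, 2.0, 0.7)
print(estimate)  # close to r = 2.0
```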

⚠️ When the Jacobian is zero

If the Jacobian equals zero at some point, the transformation is not regular at that point:

The transformation may collapse a region to a lower dimension
The transformation may not be one-to-one near that point
We typically require the Jacobian to be nonzero throughout the region of integration

🎯 Strategy for Using Change of Variables

🎯 When to use it

Change of variables is most useful when:

The region of integration has a complicated shape in one coordinate system but a simple shape in another
The integrand simplifies significantly in a different coordinate system
Standard transformations (polar, cylindrical, spherical) don't quite fit but suggest a related transformation

📝 Steps to apply the method

Identify the transformation: determine functions x = g(u,v), y = h(u,v) (and z = k(u,v,w) in 3D)
Find the new region: determine what region G in uv-space corresponds to region R in xy-space
Compute the Jacobian: calculate the determinant ∂(x,y)/∂(u,v) and take its absolute value
Substitute and integrate: replace x and y in the integrand with g(u,v) and h(u,v), multiply by the absolute Jacobian, and integrate over G
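
The steps above can be sketched on a small example. Everything here is a hypothetical illustration rather than a case from the excerpt: R is the parallelogram 0 ≤ x − y ≤ 2, 0 ≤ x + y ≤ 4, the integrand is f(x, y) = x² − y², and the transformation is x = (u + v)/2, y = (v − u)/2, so that u = x − y and v = x + y:

```python
def T(u, v):
    """Step 1: the transformation x = g(u, v), y = h(u, v)."""
    return (u + v) / 2, (v - u) / 2

# Step 2: R maps from the rectangle G = [0, 2] x [0, 4] in the uv-plane.
U_RANGE, V_RANGE = (0.0, 2.0), (0.0, 4.0)

# Step 3: Jacobian x_u * y_v - x_v * y_u = (1/2)(1/2) - (1/2)(-1/2) = 1/2.
ABS_JACOBIAN = 0.5

def f(x, y):
    return x**2 - y**2  # hypothetical integrand

def integrate_over_G(n=50):
    """Step 4: midpoint approximation of the integral of f(T(u, v)) * |J| du dv over G."""
    du = (U_RANGE[1] - U_RANGE[0]) / n
    dv = (V_RANGE[1] - V_RANGE[0]) / n
    total = 0.0
    for i in range(n):
        u = U_RANGE[0] + (i + 0.5) * du
        for j in range(n):
            v = V_RANGE[0] + (j + 0.5) * dv
            x, y = T(u, v)
            total += f(x, y) * ABS_JACOBIAN * du * dv
    return total

result = integrate_over_G()
print(result)
```

In the new variables the integrand becomes uv, so the hand computation is (1/2) · 2 · 8 = 8. The slanted parallelogram has become a rectangle, which is the point of choosing this transformation.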

🔗 Connection to familiar transformations

The general change of variables formula includes familiar cases as special instances:

Polar coordinates: Jacobian is r
Cylindrical coordinates: Jacobian is r
Spherical coordinates: Jacobian is ρ² sin φ

These are all applications of the general change of variables theorem with specific transformation functions.

Vector Fields

6.1 Vector Fields

🧭 Overview

🧠 One-sentence thesis

Vector fields assign vectors to points in space and are fundamental tools for describing distributed quantities like forces and velocities across regions, with conservative fields being those that can be expressed as gradients of scalar functions.

📌 Key points (3–5)

What a vector field is: an assignment of a vector to each point in a region of 2D or 3D space.
How to represent them: in 2D as F(x, y) = ⟨P(x, y), Q(x, y)⟩; in 3D as F(x, y, z) = ⟨P(x, y, z), Q(x, y, z), R(x, y, z)⟩.
Real-world applications: describing forces, velocities, and other vector quantities in physics, engineering, meteorology, and oceanography.
Conservative vs general fields: a vector field F is conservative if it equals the gradient of some scalar function f (i.e., ∇f = F); not all vector fields have this property.
How to visualize: sketch by examining the defining equation to determine relative magnitudes at different locations, then draw enough vectors to see a pattern.

📐 Definition and structure

📐 What a vector field assigns

Vector field: In ℝ², an assignment of a vector F(x, y) to each point (x, y) of a subset D of ℝ²; in ℝ³, an assignment of a vector F(x, y, z) to each point (x, y, z) of a subset D of ℝ³.

A vector field is not a single vector—it is a function that outputs a vector for every point in a region.
The domain D is a subset of the plane (2D) or space (3D).
Each point gets its own vector, which may vary in magnitude and direction.

🧮 Mathematical representation

In two dimensions:

F(x, y) = ⟨P(x, y), Q(x, y)⟩
Or equivalently: F(x, y) = P(x, y) i + Q(x, y) j
P and Q are scalar functions (component functions) that depend on position.

In three dimensions:

F(x, y, z) = ⟨P(x, y, z), Q(x, y, z), R(x, y, z)⟩
Or equivalently: F(x, y, z) = P(x, y, z) i + Q(x, y, z) j + R(x, y, z) k
P, Q, and R are the three component functions.

🌍 Applications and visualization

🌍 Where vector fields are used

The excerpt states that vector fields can describe the distribution of vector quantities such as:

Forces: how force varies across a region
Velocities: flow patterns in fluids or air

Common application areas include:

Physics
Engineering
Meteorology
Oceanography

Example: A velocity field in oceanography assigns a velocity vector (speed and direction) to each point in the ocean, showing how water flows at different locations.

🎨 How to sketch a vector field

The excerpt describes a practical approach:

Examine the defining equation to determine relative magnitudes in various locations.
Draw enough vectors to determine a pattern.

You don't need to draw a vector at every point—just enough to see the overall behavior.
Look at how the magnitude and direction change as you move through the region.
The pattern reveals the field's structure (e.g., circulation, sources, sinks).
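
Before drawing, it can help to tabulate a few vectors. The sketch below assumes a hypothetical field F(x, y) = ⟨−y, x⟩ (not from the excerpt) and lists the vector and its magnitude on a small grid:

```python
import math

def F(x, y):
    """Hypothetical example field F(x, y) = <-y, x>."""
    return (-y, x)

samples = []
for x in (-1, 0, 1):
    for y in (-1, 0, 1):
        p, q = F(x, y)
        samples.append(((x, y), (p, q), math.hypot(p, q)))

for point, vector, magnitude in samples:
    print(point, vector, round(magnitude, 3))
```

In this table every vector is perpendicular to the position vector of its point, and the magnitude grows with distance from the origin, which suggests a counterclockwise rotation pattern.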

🔄 Conservative vector fields

🔄 Definition and significance

Conservative vector field: A vector field F is called conservative if there exists a scalar function f such that ∇f = F.

"Conservative" means the field can be derived from a scalar potential function.
The gradient operator ∇ applied to f produces the vector field F.
Not all vector fields are conservative—this is a special property.

🔍 Why the distinction matters

The excerpt introduces conservative fields as a key concept but does not elaborate on consequences in this section. However, the distinction is important enough to be highlighted:

If F = ∇f, the field has a potential function.
This property affects line integrals and circulation (mentioned in later key equations).

Don't confuse: A general vector field vs. a conservative field. Conservative fields have the additional structure of being gradients; general fields may not have a scalar potential.

🧩 Related concepts

🧩 Unit vector fields

The excerpt mentions:

A vector field in which the magnitude of every vector is 1.

This is a special type of vector field where all vectors have been normalized.
Useful for showing direction without magnitude information.
Example: A unit tangent vector field along a curve shows direction of motion without speed.

🧩 Source-free vector fields

The excerpt briefly mentions:

If F = ⟨P, Q⟩ is a source-free vector field, then stream function g is a function such that P = g_y and Q = −g_x.

"Source-free" means the field has no points where vectors originate or terminate (no divergence).
Such fields can be described by a stream function g.
The components P and Q are related to partial derivatives of g.
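
A rough numerical sketch of this relationship, assuming a hypothetical stream function g(x, y) = xy. The helper partial uses central differences, so the values are approximations, and all names are illustrative:

```python
def partial(func, point, index, h=1e-5):
    """Central-difference estimate of the partial derivative of func in coordinate `index`."""
    lo, hi = list(point), list(point)
    lo[index] -= h
    hi[index] += h
    return (func(*hi) - func(*lo)) / (2 * h)

def g(x, y):
    return x * y  # hypothetical stream function

def F(x, y):
    """Field built from the stream function: P = g_y, Q = -g_x."""
    return (partial(g, (x, y), 1), -partial(g, (x, y), 0))

def divergence(x, y):
    """P_x + Q_y, estimated by differencing the components of F."""
    p_x = partial(lambda a, b: F(a, b)[0], (x, y), 0)
    q_y = partial(lambda a, b: F(a, b)[1], (x, y), 1)
    return p_x + q_y

P_val, Q_val = F(1.5, -0.5)
div_val = divergence(1.5, -0.5)
print(P_val, Q_val, div_val)
```

Here P = x and Q = −y, and the estimated divergence P_x + Q_y is approximately zero, consistent with the field being source-free.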

Don't confuse: Source-free fields (related to divergence) with conservative fields (related to curl and gradients)—these are different properties.

Line Integrals

6.2 Line Integrals

🧭 Overview

🧠 One-sentence thesis

Line integrals extend the concept of single-variable integration to curves in space, allowing us to integrate scalar functions or vector fields along paths.

📌 Key points (3–5)

What line integrals generalize: they extend the notion of single-variable integrals to curves in two or three dimensions.
Two types: scalar line integrals (integrating a scalar function along a curve) and vector line integrals (integrating a vector field along a curve).
How vector line integrals work: they compute the integral of the dot product of a vector field F with the unit tangent vector T of curve C with respect to arc length.
Common confusion: scalar vs vector line integrals—scalar integrals measure accumulation of a function along a path; vector integrals measure work or circulation of a field along a path.
Key formula structures: scalar integrals use arc length ds; vector integrals use dot products with tangent vectors or differential elements dr.

🔢 Types of line integrals

📏 Scalar line integral

A scalar line integral is an integral of a scalar function along a curve with respect to arc length.

Integrates a scalar function f(x, y, z) along a curve C.
The formula involves the arc length element ds.
Written in words: the integral from a to b of f evaluated at r(t) times the square root of the sum of squares of the derivatives of x, y, and z with respect to t, all multiplied by dt.
Example: computing the mass of a wire with varying density along its length.

⚡ Vector line integral

The vector line integral of vector field F along curve C is the integral of the dot product of F with unit tangent vector T of C with respect to arc length.

Integrates a vector field along a curve.
Measures how much the vector field "flows along" the curve.
The integral of F · dr along C equals the integral of F · T ds.
Can be computed as the integral from a to b of F(r(t)) · r'(t) dt.
Alternative form: the integral of P dx + Q dy + R dz along C.
Example: calculating work done by a force field on an object moving along a path.

🔄 Flux along a curve

Flux measures flow across (perpendicular to) a curve rather than along it.
Formula: the integral of F · n(t) divided by the norm of n(t), with respect to ds.
Computed as the integral from a to b of F(r(t)) · n(t) dt.
Don't confuse: circulation (vector line integral with tangent) vs flux (integral with normal vector).
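
A minimal sketch of the flux computation, assuming the common convention n(t) = ⟨y′(t), −x′(t)⟩ for a counterclockwise curve (the excerpt does not spell out n(t), so this is an assumption). The field F(x, y) = ⟨x, y⟩ and the unit circle are hypothetical examples:

```python
import math

def flux_across_curve(F, r, r_prime, a, b, n=1000):
    """Midpoint approximation of the integral of F(r(t)) . n(t) dt,
    with n(t) = <y'(t), -x'(t)> (assumed convention)."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        x, y = r(t)
        dx, dy = r_prime(t)
        p, q = F(x, y)
        total += (p * dy - q * dx) * h   # F . <dy, -dx>
    return total

def circle(t):
    return (math.cos(t), math.sin(t))

def circle_prime(t):
    return (-math.sin(t), math.cos(t))

flux = flux_across_curve(lambda x, y: (x, y), circle, circle_prime, 0.0, 2 * math.pi)
print(flux, 2 * math.pi)
```

This radial field points straight across the circle, so its flux is 2π, while its circulation around the same circle would be zero.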

🧮 Computing line integrals

🎯 Parametrization approach

Both scalar and vector line integrals require parametrizing the curve C as r(t) for t from a to b.
The curve is described by position vector r(t) = ⟨x(t), y(t), z(t)⟩.
Derivatives x'(t), y'(t), z'(t) appear in the formulas.

📐 Scalar line integral calculation

Evaluate the scalar function f at the parametrized curve points r(t).
Multiply by the magnitude of the velocity vector (arc length element).
The arc length element is the square root of the sum of squares of x'(t), y'(t), and z'(t).
Integrate with respect to the parameter t from a to b.
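
These steps can be sketched for the wire-mass example. The helix r(t) = ⟨cos t, sin t, t⟩ for 0 ≤ t ≤ 2π and the density f(x, y, z) = z are hypothetical choices, and the function name is illustrative:

```python
import math

def scalar_line_integral(f, r, r_prime, a, b, n=1000):
    """Midpoint approximation of the integral of f(r(t)) * |r'(t)| dt from a to b."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        speed = math.sqrt(sum(c * c for c in r_prime(t)))  # sqrt(x'^2 + y'^2 + z'^2)
        total += f(*r(t)) * speed * h
    return total

def helix(t):
    return (math.cos(t), math.sin(t), t)

def helix_prime(t):
    return (-math.sin(t), math.cos(t), 1.0)

mass = scalar_line_integral(lambda x, y, z: z, helix, helix_prime, 0.0, 2 * math.pi)
exact = math.sqrt(2) * 2 * math.pi**2  # integral of t * sqrt(2) from 0 to 2*pi
print(mass, exact)
```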

🎲 Vector line integral calculation

Two equivalent formulations exist:
1. Dot product form: integral of F(r(t)) · r'(t) dt from a to b
2. Component form: integral of P(r(t)) times dx/dt plus Q(r(t)) times dy/dt plus R(r(t)) times dz/dt, all integrated with respect to t from a to b
The derivative r'(t) serves as the tangent vector to the curve.
Example: if F represents a force field and C is a path, the vector line integral gives the work done.
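
A short sketch of the dot product form, assuming a hypothetical force field F(x, y, z) = ⟨−y, x, z⟩ and the helix r(t) = ⟨cos t, sin t, t⟩ for 0 ≤ t ≤ 2π. Neither is from the excerpt, and the function name is illustrative:

```python
import math

def vector_line_integral(F, r, r_prime, a, b, n=1000):
    """Midpoint approximation of the integral of F(r(t)) . r'(t) dt from a to b."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        field = F(*r(t))
        tangent = r_prime(t)
        total += sum(fc * tc for fc, tc in zip(field, tangent)) * h
    return total

def helix(t):
    return (math.cos(t), math.sin(t), t)

def helix_prime(t):
    return (-math.sin(t), math.cos(t), 1.0)

work = vector_line_integral(lambda x, y, z: (-y, x, z), helix, helix_prime,
                            0.0, 2 * math.pi)
exact = 2 * math.pi + 2 * math.pi**2  # integrand simplifies to 1 + t
print(work, exact)
```

The component form gives the same number, since P dx/dt + Q dy/dt + R dz/dt is exactly the dot product written out term by term.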

🔗 Special properties and theorems

🌟 Fundamental Theorem for Line Integrals

For gradient fields (conservative fields), the line integral depends only on endpoints.
Formula in words: the integral of the gradient of f dotted with dr equals f evaluated at r(b) minus f evaluated at r(a).
This is analogous to the Fundamental Theorem of Calculus for single-variable integrals.

⭕ Circulation in conservative fields

For a conservative field (where F = gradient of f) over a closed curve C that encloses a simply connected region:
The circulation (closed line integral) equals zero.
Written as: the closed integral of gradient of f · dr = 0.
Don't confuse: this only applies to conservative (gradient) fields, not all vector fields.

🌊 Connection to Green's theorem

Green's theorem relates line integrals around closed curves to double integrals over the enclosed region.
Circulation form: closed integral of P dx + Q dy equals the double integral of (Q_x - P_y) dA over region D.
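Flux form: closed integral of F · N ds equals the double integral of (P_x + Q_y) dA over region D, where N is the outward unit normal to C.
The boundary C must be a simple closed curve, oriented counterclockwise, that encloses the region D.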
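
The circulation form can be checked numerically on a small case. The sketch assumes a hypothetical field F = ⟨−x²y, xy²⟩ on the unit disk, with C the unit circle traversed counterclockwise; for this field Q_x − P_y = x² + y², worked out by hand:

```python
import math

def P(x, y):
    return -x**2 * y   # hypothetical component P

def Q(x, y):
    return x * y**2    # hypothetical component Q

def circulation(n=400):
    """Closed integral of P dx + Q dy around the unit circle (midpoint rule in t)."""
    h = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        x, y = math.cos(t), math.sin(t)
        dx, dy = -math.sin(t), math.cos(t)
        total += (P(x, y) * dx + Q(x, y) * dy) * h
    return total

def double_integral(n=200):
    """Integral of (Q_x - P_y) = x^2 + y^2 over the unit disk, in polar coordinates."""
    dr, dtheta = 1.0 / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            theta = (j + 0.5) * dtheta
            x, y = r * math.cos(theta), r * math.sin(theta)
            total += (x**2 + y**2) * r * dr * dtheta
    return total

line_side = circulation()
area_side = double_integral()
print(line_side, area_side, math.pi / 2)
```

Both sides come out close to π/2, as the theorem predicts for this example.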

📚 Context in vector calculus

🧩 Vector field definitions

Dimension	Vector field definition
Two-dimensional (ℝ²)	Assignment of vector F(x, y) to each point (x, y) in subset D; written as ⟨P(x, y), Q(x, y)⟩ or P(x, y)i + Q(x, y)j
Three-dimensional (ℝ³)	Assignment of vector F(x, y, z) to each point (x, y, z) in subset D; written as ⟨P(x, y, z), Q(x, y, z), R(x, y, z)⟩ or with i, j, k components

🔧 Related concepts

Conservative field: a vector field F for which there exists a scalar function f such that the gradient of f equals F.
Unit vector field: a vector field in which the magnitude of every vector is 1.
Riemann sum foundation: line integrals are defined similarly to single-variable integrals, using Riemann sums as the underlying concept.

Conservative Vector Fields

6.3 Conservative Vector Fields

🧭 Overview

🧠 One-sentence thesis

Conservative vector fields allow line integrals to be calculated using only endpoint values of a potential function, making them path-independent and causing their circulation over closed curves to vanish.

📌 Key points (3–5)

Fundamental Theorem for Line Integrals: generalizes the single-variable Fundamental Theorem of Calculus to higher dimensions, enabling easier calculation of line integrals for conservative fields.
Path independence: the line integral of a conservative field depends only on the potential function's values at the endpoints, not on the path taken.
Testing for conservativeness: use the cross-partial property; if a field has this property and the domain is simply connected, the field is conservative.
Common confusion: domain requirements matter—curves may need to be closed, simple, or both; regions may need to be connected or simply connected for theorems to apply.
Circulation property: circulation of a conservative vector field over a closed curve in a simply connected domain equals zero.

🔑 What makes a field conservative

🔑 Definition and structure

A vector field F is called conservative if there exists a scalar function f such that the gradient of f equals F.

In symbols: ∇f = F, where f is the potential function.
The entire field is determined by a single scalar function; the name "conservative" comes from physics, where the work done by such a force field conserves mechanical energy.
Example: if you know the potential function f, you can recover the entire vector field F by taking its gradient.

🧪 The cross-partial property test

The excerpt states: "we can test whether F is conservative by using the cross-partial property."
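How it works: F = ⟨P, Q⟩ has the cross-partial property when P_y = Q_x (in three dimensions, F = ⟨P, Q, R⟩ also needs P_z = R_x and Q_z = R_y); if F has this property and the domain is simply connected, then F is conservative.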
This test provides a practical way to determine conservativeness without first finding a potential function.
Don't confuse: having the cross-partial property alone is not enough—the domain must also be simply connected.
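
As a rough numerical illustration of the two-dimensional cross-partial check (P_y = Q_x), the sketch below compares central-difference estimates of the two partial derivatives at a few sample points. The field ⟨2xy, x² + 3y²⟩, the helper names, the sample points, and the tolerance are all assumptions chosen for illustration; agreement at sample points suggests the property but does not prove it.

```python
# Hypothetical example field F = <P, Q> = <2xy, x^2 + 3y^2>.
def P(x, y):
    return 2 * x * y

def Q(x, y):
    return x**2 + 3 * y**2

def partial_x(g, x, y, h=1e-5):
    """Central-difference estimate of the partial derivative of g in x."""
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def partial_y(g, x, y, h=1e-5):
    """Central-difference estimate of the partial derivative of g in y."""
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

def cross_partials_agree(P, Q, points, tol=1e-6):
    """True when P_y and Q_x agree (within tol) at every sample point."""
    return all(abs(partial_y(P, x, y) - partial_x(Q, x, y)) < tol
               for x, y in points)

samples = [(0.0, 0.0), (1.0, 2.0), (-1.5, 0.5), (3.0, -2.0)]

# Here P_y = 2x and Q_x = 2x, so the check should pass.
print(cross_partials_agree(P, Q, samples))

# Rotation field <-y, x>: P_y = -1 but Q_x = 1, so the check should fail.
print(cross_partials_agree(lambda x, y: -y, lambda x, y: x, samples))
```

Both example fields are defined on the whole plane, which is simply connected, so for them the cross-partial check settles the question.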

🧮 Fundamental Theorem for Line Integrals

🧮 The theorem

The line integral of the gradient of f over a curve C equals the difference in f evaluated at the endpoints:

∫_C ∇f · dr = f(r(b)) − f(r(a))

Here C is a smooth curve parameterized by r(t) for a ≤ t ≤ b, so r(a) and r(b) are the starting and ending points of the curve.
This is a direct generalization of the single-variable Fundamental Theorem of Calculus to higher dimensions.

⚡ Why it matters

Easier calculation: instead of integrating along the entire curve, you only need to evaluate the potential function at two points.
Path independence: the integral depends only on the endpoints, not on the specific path taken between them.
Example: if you know f at the start and end of a curve, you can immediately compute the line integral without parameterizing the curve.
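
The endpoint formula can be checked numerically. The sketch below approximates the line integral of ∇f along two different paths from (0, 0) to (1, 1) and compares each with f(1, 1) − f(0, 0). The potential f(x, y) = x²y + y³, the two paths, and the helper `line_integral` are assumptions made for illustration.

```python
def f(x, y):
    # Hypothetical potential function.
    return x**2 * y + y**3

def grad_f(x, y):
    # Gradient of f, computed by hand: (f_x, f_y).
    return (2 * x * y, x**2 + 3 * y**2)

def line_integral(field, path, a, b, n=2000):
    """Approximate the integral of field . dr along path(t), a <= t <= b.

    Each small piece contributes field(midpoint) . (r(t1) - r(t0)).
    """
    dt = (b - a) / n
    total = 0.0
    for i in range(n):
        t0 = a + i * dt
        t1 = t0 + dt
        x0, y0 = path(t0)
        x1, y1 = path(t1)
        fx, fy = field(*path((t0 + t1) / 2))
        total += fx * (x1 - x0) + fy * (y1 - y0)
    return total

def straight(t):
    return (t, t)        # straight segment from (0, 0) to (1, 1)

def parabola(t):
    return (t, t**2)     # parabolic arc between the same endpoints

endpoint_difference = f(1, 1) - f(0, 0)
along_line = line_integral(grad_f, straight, 0.0, 1.0)
along_parabola = line_integral(grad_f, parabola, 0.0, 1.0)
print(endpoint_difference, along_line, along_parabola)
```

Up to discretization error, both path integrals should agree with the endpoint difference, which is the path independence described above.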

🔄 Path independence and circulation

🔄 Path independence explained

The excerpt emphasizes: "Conservative fields are independent of path."
What this means: the line integral from point A to point B has the same value regardless of which curve you choose to connect them.
This property follows directly from the Fundamental Theorem for Line Integrals—only endpoint values matter.

🔁 Circulation over closed curves

Key result: "The circulation of a conservative vector field on a simply connected domain over a closed curve is zero."
In symbols: ∮_C ∇f · dr = 0, where C is a closed curve enclosing a simply connected region.
Why: if you start and end at the same point, f(r(b)) − f(r(a)) = 0 because r(a) = r(b).
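Don't confuse: a field that is already known to be a gradient has zero circulation around every closed curve in its domain; the simply connected hypothesis (no holes) is what lets you conclude that a field is conservative from the cross-partial property alone.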
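
A standard example shows what can go wrong when the domain has a hole. The field below is a common textbook choice rather than one taken from this section; its domain is the plane with the origin removed, which is not simply connected. It has the cross-partial property, yet its circulation around the unit circle r(t) = ⟨cos t, sin t⟩ is not zero, so it cannot be conservative.

```latex
\mathbf{F}(x,y) = \left\langle \frac{-y}{x^2+y^2},\ \frac{x}{x^2+y^2} \right\rangle,
\qquad
P_y = \frac{y^2 - x^2}{(x^2+y^2)^2} = Q_x ,
\\[6pt]
\oint_C \mathbf{F}\cdot d\mathbf{r}
  = \int_0^{2\pi} \langle -\sin t,\ \cos t\rangle \cdot \langle -\sin t,\ \cos t\rangle \, dt
  = \int_0^{2\pi} 1 \, dt
  = 2\pi \neq 0 .
```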

🛠️ Finding potential functions

🛠️ Problem-Solving Strategy

The excerpt mentions: "If F is conservative, we can find a potential function by using the Problem-Solving Strategy."
Process overview:
1. Verify that F has the cross-partial property and the domain is simply connected.
2. Use the strategy (details in the referenced section) to construct the potential function f.
Once you have f, you can use the Fundamental Theorem for Line Integrals to evaluate line integrals easily.
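
As a sketch of how a potential function can be constructed, the worked example below uses the usual integrate-and-match procedure, which may differ in detail from the referenced strategy. The field F = ⟨2xy, x² + 3y²⟩ is a hypothetical example; it is defined on the whole plane and satisfies P_y = 2x = Q_x.

```latex
\begin{aligned}
f_x = 2xy \quad &\Longrightarrow\quad f(x,y) = \int 2xy \, dx = x^2 y + g(y), \\
f_y = x^2 + g'(y) = x^2 + 3y^2 \quad &\Longrightarrow\quad g'(y) = 3y^2, \quad g(y) = y^3 + C, \\
&\Longrightarrow\quad f(x,y) = x^2 y + y^3 + C .
\end{aligned}
```

Checking: ∇f = ⟨2xy, x² + 3y²⟩ = F, so any line integral of F from (0, 0) to (1, 1) equals f(1, 1) − f(0, 0) = 2.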

🧩 Domain requirements

Requirement	When needed	Why it matters
Closed curve	For circulation theorems	Ensures the path returns to the starting point
Simple curve	For certain theorems	Avoids self-intersections that complicate analysis
Connected domain	For path independence	Ensures any two points can be joined by a curve
Simply connected domain	For conservative field tests and circulation	No holes means the cross-partial property guarantees conservativeness

The excerpt warns: "The theorems in this section require curves that are closed, simple, or both, and regions that are connected or simply connected."
Don't confuse: a connected domain may still have holes; simply connected means no holes.

Green's Theorem

6.4 Green’s Theorem

🧭 Overview

🧠 One-sentence thesis

Green's theorem connects line integrals around a closed curve to double integrals over the region the curve encloses, providing a powerful tool for converting between circulation/flux along boundaries and area integrals of partial derivatives.

📌 Key points (3–5)

What Green's theorem does: relates a line integral around a closed boundary curve to a double integral over the enclosed region.
Two forms: circulation form (for work integrals), which integrates Q_x − P_y over the region, and flux form (for flux integrals), which integrates P_x + Q_y.
Key requirement: the curve C must be the boundary of region D; the theorem links what happens on the boundary to what happens inside.
Common confusion: Green's theorem is a special case in two dimensions; it generalizes to Stokes' theorem (for surfaces in 3D) and the divergence theorem (for volumes).
Why it matters: simplifies calculations by converting difficult line integrals into area integrals (or vice versa), and reveals deep connections between boundary behavior and interior properties.

🔄 The two forms of Green's theorem

🔄 Circulation form

Green's theorem, circulation form: the line integral around C of P dx + Q dy equals the double integral over D of (Q_x − P_y) dA, where C is the boundary of D.

What it says in words: the circulation (line integral) of a vector field F = (P, Q) around a closed curve equals the double integral of the expression "Q_x minus P_y" over the region inside.
The left side is a line integral along the boundary; the right side is an area integral over the interior.
The partial derivatives Q_x and P_y measure how the field components change in perpendicular directions.
Example: to find the work done by a force field around a closed loop, you can instead integrate the difference of partial derivatives over the enclosed area.
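
To see the circulation form in action, the sketch below estimates both sides numerically for a hypothetical field F = ⟨−y³, x³⟩ on the unit disk, with the boundary circle traversed counterclockwise. The field, the region, and the midpoint-rule helpers are assumptions for illustration; Q_x − P_y = 3x² + 3y² was worked out by hand.

```python
import math

def P(x, y):
    return -y**3

def Q(x, y):
    return x**3

def qx_minus_py(x, y):
    # Q_x - P_y for the field above, computed by hand.
    return 3 * x**2 + 3 * y**2

def circulation(n=2000):
    """Midpoint-rule estimate of the line integral of P dx + Q dy on the unit circle."""
    dt = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        x, y = math.cos(t), math.sin(t)
        dx, dy = -math.sin(t) * dt, math.cos(t) * dt
        total += P(x, y) * dx + Q(x, y) * dy
    return total

def area_integral(n_r=200, n_theta=200):
    """Midpoint-rule estimate of the double integral in polar coordinates."""
    dr = 1.0 / n_r
    dtheta = 2 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_theta):
            theta = (j + 0.5) * dtheta
            x, y = r * math.cos(theta), r * math.sin(theta)
            total += qx_minus_py(x, y) * r * dr * dtheta   # dA = r dr dtheta
    return total

line_side = circulation()
area_side = area_integral()
print(line_side, area_side, 1.5 * math.pi)
```

Both estimates should be close to the exact value 3π/2, which can be confirmed by hand on either side.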

💨 Flux form

Green's theorem, flux form: the line integral around C of F · N ds equals the double integral over D of (P_x + Q_y) dA, where C is the boundary of D and N is the outward unit normal to C.

What it says in words: the flux of a vector field across the boundary curve equals the double integral of the divergence P_x + Q_y over the region.
The flux form measures how much of the field flows outward across the boundary.
The formula structure parallels the circulation form, but the integrands differ: flux measures crossing rate (F · N), circulation measures tendency to move along the curve (F · T).
Don't confuse: the circulation form integrates Q_x − P_y (the k-component of the curl), while the flux form integrates P_x + Q_y (the divergence).
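
A short worked example may help fix the flux form, taken here in its standard statement ∮_C F · N ds = ∬_D (P_x + Q_y) dA. The field F = ⟨x, y⟩ and the unit disk D are hypothetical choices; on the unit circle the outward unit normal is N = ⟨x, y⟩.

```latex
\oint_C \mathbf{F}\cdot\mathbf{N}\, ds
  = \oint_C (x^2 + y^2)\, ds
  = \oint_C 1 \, ds
  = 2\pi ,
\qquad
\iint_D (P_x + Q_y)\, dA
  = \iint_D (1 + 1)\, dA
  = 2 \cdot \pi
  = 2\pi .
```

For the same field, Q_x − P_y = 0, so the circulation around the circle is zero even though the outward flux is not.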

🔗 Extended version

Green's theorem, extended version: the line integral around the boundary ∂D of F · dr equals the double integral over D of (Q_x − P_y) dA.

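Uses the notation ∂D to emphasize "boundary of D."
The extension covers regions with finitely many holes: ∂D then consists of the outer curve together with the boundary of each hole, oriented so that D stays on the left as each piece is traversed.
Reinforces that the line integral must run over the complete boundary of the region D.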

🧮 Connection to curl and divergence

🌀 Curl in two dimensions

Curl: ∇ × F = (R_y − Q_z) i + (P_z − R_x) j + (Q_x − P_y) k

In two dimensions (where the field F = (P, Q) has no z-component), the curl simplifies.
The expression (Q_x − P_y) that appears in Green's theorem is the k-component of the curl.
Why this matters: Green's theorem can be viewed as saying "the circulation around a boundary equals the integral of the curl over the interior."
This connects Green's theorem to the concept of rotation or "spinning" of the field inside the region.

📐 Divergence

Divergence: ∇ · F = P_x + Q_y + R_z

Divergence measures how much a field spreads out or converges at a point.
In the flux form of Green's theorem, the integrand P_x + Q_y is exactly the two-dimensional divergence of F = (P, Q).
The divergence theorem (listed in the excerpt) generalizes this idea to three dimensions: the flux through a closed surface equals the integral of divergence over the enclosed volume.

🔗 Relationship to other theorems

🔗 Fundamental Theorem for Line Integrals

Fundamental Theorem for Line Integrals: the line integral of ∇f · dr equals f(r(b)) − f(r(a)).

This theorem says that for a conservative field (one that is the gradient of a potential function), the line integral depends only on the endpoints.
Connection to Green's theorem: if F is conservative (F = ∇f), then the circulation around any closed curve is zero, which is consistent with Green's theorem when the curl (Q_x − P_y) is zero.

🔄 Conservative fields and circulation

Circulation of a conservative field over curve C that encloses a simply connected region: the closed line integral of ∇f · dr equals 0.

A conservative field has zero circulation around any closed curve in a simply connected region.
Green's theorem explains why: if F = ∇f, then the curl (Q_x − P_y) is zero everywhere, so the double integral over D is zero.
Don't confuse: this is a special case of Green's theorem, not a separate result; it follows when the field is conservative.

🌐 Generalizations to higher dimensions

The excerpt lists:

Stokes' theorem: the line integral of F · dr around a curve C equals the surface integral of curl F · dS over a surface S bounded by C.
Divergence theorem: the triple integral of div F over a volume E equals the surface integral of F · dS over the boundary surface S.

Theorem	Dimension	What it connects
Green's theorem	2D	Line integral around boundary ↔ area integral of (Q_x − P_y)
Stokes' theorem	3D surfaces	Line integral around boundary ↔ surface integral of curl
Divergence theorem	3D volumes	Surface integral over boundary ↔ volume integral of divergence

Green's theorem is the 2D case; Stokes' and divergence theorems extend the idea to surfaces and volumes.
All three theorems share the same core idea: relate what happens on a boundary to what happens in the interior.

🧰 Practical use and calculation

🧰 When to use Green's theorem

Simplification: if a line integral around a closed curve is difficult to compute directly, convert it to a double integral over the enclosed region (or vice versa).
Requirement: the curve C must be closed and must be the boundary of region D.
Example: to find the circulation of a field around a complicated closed path, compute the area integral of (Q_x − P_y) instead.

🧰 Key identities involving curl and divergence

The excerpt lists two important identities:

Divergence of curl is zero: ∇ · (∇ × F) = 0.
Curl of a gradient is the zero vector: ∇ × (∇f) = 0.

These identities explain why:

Conservative fields (which are gradients) have zero curl, so their circulation around closed curves is zero.
Fields that are curls of other fields have zero divergence, which has implications for flux integrals.
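
The second identity can be verified directly from the component formula for curl, taking F = ∇f = ⟨f_x, f_y, f_z⟩. The sketch below assumes f has continuous second partial derivatives, so that mixed partials are equal (Clairaut's theorem).

```latex
\nabla \times (\nabla f)
  = (f_{zy} - f_{yz})\,\mathbf{i}
  + (f_{xz} - f_{zx})\,\mathbf{j}
  + (f_{yx} - f_{xy})\,\mathbf{k}
  = \mathbf{0} .
```

The k-component f_yx − f_xy is the quantity Q_x − P_y from Green's theorem with P = f_x and Q = f_y.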

🧰 Flux and circulation definitions

Flux: measures the rate at which a field crosses a given line (in 2D) or surface (in 3D).
Circulation: measures the tendency of a field to move in the same direction as a given closed curve.
Green's theorem provides formulas for both: the circulation form equates the line integral of P dx + Q dy (that is, F · T ds) with the area integral of Q_x − P_y, and the flux form equates the line integral of F · N ds with the area integral of P_x + Q_y.

Divergence and Curl

6.5 Divergence and Curl

🧭 Overview

🧠 One-sentence thesis

Divergence and curl are two fundamental operations on vector fields—divergence measures how much a field flows outward at a point, while curl measures the tendency of the field to rotate around that point.

📌 Key points (3–5)

Divergence is scalar: it measures "outflowing-ness" of a vector field at a point (outflow minus inflow).
Curl is a vector: it measures the tendency of particles to rotate about an axis pointing in the curl's direction.
Physical interpretation: if v is a fluid velocity field, divergence at a point is the net outflow of fluid; curl captures rotational tendency.
Conservative field test: a vector field with a simply connected domain is conservative if and only if its curl is zero.
Common confusion: divergence produces a number (scalar), curl produces a direction (vector)—they measure different aspects of field behavior.

🌊 Divergence: measuring outflow

🌊 What divergence is

Divergence of a vector field: a scalar function that measures the "outflowing-ness" of a vector field.

Divergence is not a vector; it is a single number at each point.
It captures how much the field is "spreading out" or "converging in" at that location.

💧 Fluid interpretation

If v is the velocity field of a fluid, the divergence of v at a point tells you:
- Outflow of fluid minus inflow at that point.
Positive divergence → more fluid leaving than entering (source behavior).
Negative divergence → more fluid entering than leaving (sink behavior).
Zero divergence → balanced flow (no net creation or destruction of fluid).

Example: Imagine a point in a flowing river—if water spreads out from that point in all directions, divergence is positive; if water converges toward it, divergence is negative.
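
The source/sink reading of the sign can be illustrated numerically. The sketch below estimates the two-dimensional divergence P_x + Q_y with central differences and labels the result; the three sample fields, the helper names, and the tolerance are assumptions for illustration.

```python
def divergence_2d(F, x, y, h=1e-5):
    """Central-difference estimate of P_x + Q_y for F(x, y) = (P, Q)."""
    p_x = (F(x + h, y)[0] - F(x - h, y)[0]) / (2 * h)
    q_y = (F(x, y + h)[1] - F(x, y - h)[1]) / (2 * h)
    return p_x + q_y

def classify(div, tol=1e-6):
    """Label a divergence value as source, sink, or balanced flow."""
    if div > tol:
        return "source"
    if div < -tol:
        return "sink"
    return "balanced"

def spreading(x, y):
    return (x, y)        # flows away from the origin, divergence 2

def converging(x, y):
    return (-x, -y)      # flows toward the origin, divergence -2

def rotating(x, y):
    return (-y, x)       # circles the origin, divergence 0

for field in (spreading, converging, rotating):
    d = divergence_2d(field, 1.0, 2.0)
    print(field.__name__, round(d, 6), classify(d))
```

The rotating field shows that a fluid can be in motion everywhere and still have zero divergence.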

🌀 Curl: measuring rotation

🌀 What curl is

Curl of a vector field: a vector field that measures the tendency of particles to rotate about an axis.

Curl is not a scalar; it is a vector at each point.
The direction of the curl vector indicates the axis of rotation.
The magnitude indicates the strength of the rotational tendency.

🔄 Rotational tendency at a point

The curl at point P measures how particles near P tend to rotate.
The axis of rotation points in the direction of the curl vector at P.
Example: If you place a tiny paddle wheel in a fluid at point P, the curl tells you which way the wheel's axis would point and how fast it would spin.

Don't confuse: curl measures local rotation (spinning around a point), not overall circulation around a large loop.
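
As a minimal sketch of the paddle-wheel picture, the code below estimates the curl (R_y − Q_z, P_z − R_x, Q_x − P_y) with central differences. The two sample fields and the helper `curl` are assumptions for illustration: one is a rigid rotation about the z-axis, the other is the gradient of x²y + z.

```python
def curl(F, point, h=1e-5):
    """Central-difference estimate of curl F at point = (x, y, z)."""
    def d(component, coordinate):
        # Partial derivative of F[component] with respect to the given coordinate.
        hi = list(point)
        lo = list(point)
        hi[coordinate] += h
        lo[coordinate] -= h
        return (F(*hi)[component] - F(*lo)[component]) / (2 * h)

    # Components: P = 0, Q = 1, R = 2; coordinates: x = 0, y = 1, z = 2.
    return (d(2, 1) - d(1, 2),   # R_y - Q_z
            d(0, 2) - d(2, 0),   # P_z - R_x
            d(1, 0) - d(0, 1))   # Q_x - P_y

def rotation(x, y, z):
    return (-y, x, 0.0)              # rigid rotation about the z-axis

def gradient_field(x, y, z):
    return (2 * x * y, x**2, 1.0)    # gradient of x^2 y + z

spin = curl(rotation, (1.0, 2.0, 3.0))
no_spin = curl(gradient_field, (1.0, 2.0, 3.0))
print(spin)
print(no_spin)
```

For the rotation field the estimate should be close to (0, 0, 2): the axis points along k. For the gradient field it should be close to the zero vector.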

🧭 Curl and conservative fields

🧭 The zero-curl test

A vector field F with a simply connected domain is conservative if and only if its curl is zero.
This is a key diagnostic: if you compute the curl and find it is zero everywhere, the field is conservative (path-independent).
Conversely, if the curl is nonzero anywhere, the field is not conservative.

🔗 Connection to earlier concepts

Recall from section 6.3: conservative fields are path-independent and have potential functions.
The zero-curl condition is the cross-partial property in vector form.
Example: If curl(F) = 0 in a simply connected region, then line integrals of F depend only on endpoints, not on the path taken.

📊 Divergence vs curl comparison

Property	Divergence	Curl
Output type	Scalar (a number)	Vector (has direction and magnitude)
What it measures	Outflowing-ness (net outflow minus inflow)	Rotational tendency (spinning around an axis)
Physical meaning (fluid)	Net rate of fluid leaving a point	Tendency of fluid to rotate at a point
Key application	Detecting sources/sinks	Testing if a field is conservative (curl = 0)

Don't confuse: both are properties of vector fields, but they capture completely different behaviors—one is about expansion/contraction, the other about rotation.

Surface Integrals

6.6 Surface Integrals

🧭 Overview

🧠 One-sentence thesis

Surface integrals extend line integrals to one higher dimension by integrating over two-parameter surfaces instead of one-parameter curves, allowing calculation over scalar functions or vector fields on oriented or non-oriented surfaces.

📌 Key points (3–5)

Dimension extension: Surface integrals are to line integrals what line integrals are to single-variable integrals—each step adds one dimension to the domain of integration.
Parameterization: Surfaces require two parameters to describe them, unlike curves which need only one parameter.
Orientation: Some surfaces can be oriented (given a consistent direction), but others like the Möbius strip cannot be oriented at all.
Common confusion: Don't confuse the number of parameters (surfaces need two) with the dimension of the ambient space (surfaces can live in 3D space).
Two types of integrands: Surface integrals can integrate scalar functions or vector fields, similar to how line integrals have scalar and vector versions.

📐 Parameterization and dimension

📐 How surfaces are parameterized

Surfaces can be parameterized, just as curves can be parameterized. In general, surfaces must be parameterized with two parameters.

A curve needs one parameter (like t) to trace out points along its length.
A surface needs two parameters (like u and v) to cover the entire two-dimensional surface.
This is the key structural difference: moving from 1D curves to 2D surfaces requires an additional parameter.
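
The one-parameter versus two-parameter distinction can be made concrete with a small sketch. The unit circle and unit sphere below are hypothetical examples, and the parameter names t, u, v follow the convention mentioned above.

```python
import math

def circle(t):
    """Curve: one parameter t traces the unit circle in the plane."""
    return (math.cos(t), math.sin(t))

def sphere(u, v):
    """Surface: two parameters cover the unit sphere.

    u is the angle around the z-axis (0 to 2*pi);
    v is the angle down from the positive z-axis (0 to pi).
    """
    return (math.cos(u) * math.sin(v),
            math.sin(u) * math.sin(v),
            math.cos(v))

# A curve is sampled along a 1D list of parameter values...
curve_points = [circle(2 * math.pi * i / 8) for i in range(8)]

# ...while a surface needs a 2D grid of parameter pairs.
surface_points = [sphere(2 * math.pi * i / 8, math.pi * j / 4)
                  for i in range(8) for j in range(5)]

on_circle = all(abs(x * x + y * y - 1.0) < 1e-12 for x, y in curve_points)
on_sphere = all(abs(x * x + y * y + z * z - 1.0) < 1e-12
                for x, y, z in surface_points)
print(len(curve_points), on_circle)
print(len(surface_points), on_sphere)
```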

🔢 The dimensional ladder

The excerpt builds a hierarchy of integration domains:

Integral type	Domain	Dimension of domain
Single-variable	Line segment on x-axis	1D (one parameter)
Line integral	Curve in plane or space	1D (one parameter)
Surface integral	Surface in plane or space	2D (two parameters)

Each step up adds one dimension to the domain of integration.
The ambient space (where the domain lives) can be higher-dimensional, but the domain itself has its own intrinsic dimension.

🧭 Orientation of surfaces

🧭 What orientation means

Surfaces can sometimes be oriented, just as curves can be oriented.

Orientation gives a surface a consistent "direction" or "side."
Just as a curve can be traversed in two directions, a surface can have two sides (like "inside" and "outside" of a sphere).
Not all surfaces can be oriented—the structure of the surface itself determines whether orientation is possible.

🎀 The Möbius strip exception

Some surfaces, such as a Möbius strip, cannot be oriented.

The Möbius strip is the key example of a non-orientable surface.
If you try to assign a consistent "side" to a Möbius strip, you find that following the surface continuously brings you back to the opposite side.
Don't confuse: A surface being "in 3D space" does not guarantee it can be oriented; orientation depends on the surface's topology, not where it lives.

🧮 Structure of surface integrals

🧮 What a surface integral integrates over

A surface integral is like a line integral in one higher dimension. The domain of integration of a surface integral is a surface in a plane or space, rather than a curve in a plane or space.

The domain is no longer a one-dimensional curve but a two-dimensional surface.
This is the defining characteristic: you are "adding up" contributions across an entire surface, not just along a curve.

📊 Two types of integrands

The integrand of a surface integral can be a scalar function or a vector field.

The excerpt distinguishes two cases:

Integrand type	What it represents	Calculation method
Scalar function	A value at each point on the surface	Use Equation 6.19 (mentioned)
Vector field	A vector at each point on the surface	(Method not detailed in this excerpt)

Scalar surface integrals: integrate a scalar function over the surface (like integrating temperature over a sheet of metal).
Vector surface integrals: integrate a vector field over the surface (like calculating flux through a surface).
Example: If you want to find the total mass of a curved membrane with varying density, you would use a scalar surface integral; if you want to measure how much fluid flows through a surface, you would use a vector surface integral.
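
For a surface S parameterized by r(u, v) over a parameter domain D, the two kinds of surface integral are usually computed with the standard formulas below. Whether the first of these is the numbered equation cited in the table is an assumption; the formulas themselves are the usual textbook ones.

```latex
\iint_S f \, dS
  = \iint_D f(\mathbf{r}(u,v)) \,\lVert \mathbf{r}_u \times \mathbf{r}_v \rVert \, dA ,
\qquad
\iint_S \mathbf{F}\cdot d\mathbf{S}
  = \iint_D \mathbf{F}(\mathbf{r}(u,v)) \cdot (\mathbf{r}_u \times \mathbf{r}_v) \, dA .
```

For instance, on the unit sphere r(u, v) = ⟨cos u sin v, sin u sin v, cos v⟩ with 0 ≤ u ≤ 2π and 0 ≤ v ≤ π, the length of r_u × r_v is sin v, so taking f = 1 gives the surface area ∫₀^{2π} ∫₀^{π} sin v dv du = 4π.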

🔗 Connection to line integrals

The excerpt emphasizes the parallel structure:

Line integrals also come in two types: scalar line integrals (for mass of a wire) and vector line integrals (for work done).
Surface integrals follow the same pattern but in one higher dimension.
The choice between scalar and vector depends on what physical or mathematical quantity you want to calculate, not on the surface itself.

Stokes' Theorem

6.7 Stokes’ Theorem

🧭 Overview

🧠 One-sentence thesis

Stokes' theorem connects a flux integral over a surface to a line integral around the surface's boundary, serving as a higher-dimensional version of Green's theorem and the Fundamental Theorem of Calculus.

📌 Key points (3–5)

What Stokes' theorem relates: a flux integral over a surface to a line integral around the boundary of that surface.
How it extends other theorems: it is a higher-dimensional version of Green's theorem, which itself is a higher-dimensional version of the Fundamental Theorem of Calculus.
Practical use: transforms difficult surface integrals into easier line integrals, or vice versa.
Simplification strategy: line integrals can be evaluated using the simplest surface that has the given boundary curve.
Real-world application: Stokes' theorem can be used to derive Faraday's law, which relates the curl of an electric field to the rate of change of the magnetic field.

🔗 The fundamental relationship

🔗 What Stokes' theorem connects

Stokes' theorem relates a flux integral over a surface to a line integral around the boundary of the surface.

The surface integral is the flux of curl F through the surface, not the flux of F itself.
The line integral measures circulation around the boundary curve of that surface.
This is analogous to how Green's theorem relates a double integral over a region to a line integral around its boundary, but now in one higher dimension.

🧱 How it fits into the hierarchy of theorems

Fundamental Theorem of Calculus: relates an integral over an interval to values at the boundary (endpoints).
Green's theorem: extends this to two dimensions—relates a double integral over a region to a line integral around the boundary.
Stokes' theorem: extends Green's theorem to three dimensions—relates a surface integral to a line integral around the surface's boundary.
The excerpt emphasizes that Stokes' theorem is "another version of the Fundamental Theorem of Calculus in higher dimensions."

🔄 Practical applications

🔄 Transforming integrals

Stokes' theorem allows you to convert between two types of integrals:
- A difficult surface integral → an easier line integral.
- A difficult line integral → an easier surface integral.
The choice depends on which form is simpler to compute in a given problem.

🎯 Choosing the simplest surface

When evaluating a line integral around a boundary curve C, you can use any surface that has C as its boundary.
Strategy: pick the simplest surface with that boundary to make the surface integral easier.
Example: if a curve C bounds multiple surfaces, choose the one with the easiest parameterization or simplest geometry.
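
The freedom to choose the surface can be checked numerically. The sketch below uses a hypothetical field F = ⟨−y, x, z⟩, whose curl is (0, 0, 2), and takes C to be the unit circle in the plane z = 0, traversed counterclockwise as seen from above. It compares the circulation around C with the flux of curl F through two surfaces that share C as boundary: the flat unit disk and the upper unit hemisphere. The normal vectors were worked out by hand, and all names are assumptions for illustration.

```python
import math

def F(x, y, z):
    return (-y, x, z)

# curl F = (R_y - Q_z, P_z - R_x, Q_x - P_y) = (0, 0, 2), worked out by hand.
CURL_F = (0.0, 0.0, 2.0)

def boundary_circulation(n=2000):
    """Midpoint-rule estimate of the line integral of F around the unit circle."""
    dt = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        fx, fy, fz = F(math.cos(t), math.sin(t), 0.0)
        # r'(t) = (-sin t, cos t, 0), so the z-component contributes nothing.
        total += (fx * -math.sin(t) + fy * math.cos(t)) * dt
    return total

def curl_flux_disk(n=200):
    """Flux of curl F through the unit disk with upward normal (0, 0, 1)."""
    dr, dtheta = 1.0 / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            total += CURL_F[2] * r * dr * dtheta   # dS = r dr dtheta
    return total

def curl_flux_hemisphere(n=200):
    """Flux of curl F through the upper unit hemisphere, outward normal."""
    du, dv = 2 * math.pi / n, (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        for j in range(n):
            v = (j + 0.5) * dv
            s = math.sin(v)
            # r_v x r_u = sin v * (cos u sin v, sin u sin v, cos v)
            normal = (s * math.cos(u) * s, s * math.sin(u) * s, s * math.cos(v))
            total += sum(c * m for c, m in zip(CURL_F, normal)) * du * dv
    return total

line_value = boundary_circulation()
disk_value = curl_flux_disk()
hemisphere_value = curl_flux_hemisphere()
print(line_value, disk_value, hemisphere_value)
```

All three values should be close to 2π. The disk is the easier surface here because its normal is constant.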

⚡ Physical application

⚡ Faraday's law

Faraday's law relates the curl of an electric field to the rate of change of the corresponding magnetic field.
Stokes' theorem can be used to derive this law.
This shows how the theorem bridges pure mathematics and physics, connecting field behavior (curl) to boundary effects (circulation).

🧩 Context from related sections

🧩 Green's theorem (Section 6.4)

Green's theorem comes in two forms: circulation form (integrand F · T) and flux form (integrand F · N).
It transforms line integrals into double integrals or vice versa.
Stokes' theorem is the three-dimensional generalization of Green's theorem.

🧩 Divergence and curl (Section 6.5)

Divergence: a scalar function measuring "outflowing-ness" of a vector field.
Curl: a vector field measuring the tendency of particles to rotate about an axis.
A vector field with a simply connected domain is conservative if and only if its curl is zero.
Stokes' theorem involves curl because the surface integral it uses is the flux of curl F, which it equates to the circulation of F around the boundary.

🧩 Surface integrals (Section 6.6)

Surfaces are parameterized with two parameters (unlike curves, which use one parameter).
Some surfaces can be oriented; others (like a Möbius strip) cannot.
Surface integrals can have scalar functions or vector fields as integrands.
The area of surface S is the integral over S of dS.
Stokes' theorem builds on surface integrals by connecting them to line integrals.

🧩 The divergence theorem (Section 6.8)

The divergence theorem relates a surface integral across a closed surface to a triple integral over the enclosed solid.
It is a higher-dimensional version of the flux form of Green's theorem.
Like Stokes' theorem, it can transform difficult integrals into easier ones and can be used to derive physical laws (Gauss' law in electrostatics).

Theorem	Relates	Dimension	Physical application
Green's theorem	Double integral ↔ line integral	2D	Source-free fields, conservative fields
Stokes' theorem	Surface integral ↔ line integral	3D	Faraday's law (electromagnetism)
Divergence theorem	Triple integral ↔ surface integral	3D	Gauss' law (electrostatics)

The Divergence Theorem

6.8 The Divergence Theorem

🧭 Overview

🧠 One-sentence thesis

The divergence theorem connects surface integrals over closed surfaces to triple integrals over the enclosed solid, extending the Fundamental Theorem of Calculus to higher dimensions and enabling easier computation of flux integrals and derivation of fundamental physical laws.

📌 Key points (3–5)

What it relates: A surface integral across a closed surface S to a triple integral over the solid region enclosed by S.
How it extends earlier theorems: The divergence theorem is a higher-dimensional version of the flux form of Green's theorem, which itself extends the Fundamental Theorem of Calculus.
Practical utility: It transforms difficult flux integrals into easier triple integrals (or vice versa), depending on which is simpler to compute.
Physical applications: The theorem can be used to derive Gauss's law, a fundamental principle in electrostatics.
Common confusion: Just as Green's theorem requires a closed curve, the divergence theorem requires a closed surface; it does not apply to open surfaces.

🔗 Connection to earlier theorems

🔗 The hierarchy of fundamental theorems

The excerpt places the divergence theorem in a progression:

1D: The Fundamental Theorem of Calculus relates an integral over an interval to values at the boundary (endpoints).
2D: Green's theorem (flux form) relates a double integral over a region to a line integral around its boundary.
3D: The divergence theorem relates a triple integral over a solid to a surface integral over its boundary.

Each step increases the dimension by one while preserving the core idea: an integral over a region equals an integral over its boundary.

🧩 Relationship to Green's theorem

The divergence theorem is a higher dimensional version of the flux form of Green's theorem.

Green's theorem (flux form) works in the plane with a closed curve as the boundary.
The divergence theorem works in space with a closed surface as the boundary.
Both involve divergence: Green's theorem uses 2D divergence; the divergence theorem uses 3D divergence.

🔄 Computational flexibility

🔄 Transforming integrals

The divergence theorem allows you to choose the easier computation:

If a flux integral over a closed surface S is difficult, convert it to a triple integral over the enclosed solid.
If a triple integral over a solid is difficult, convert it to a surface integral over the boundary.

Example scenario: Computing flux through a complicated closed surface might be hard, but if the divergence of the vector field is simple (e.g., a constant), the triple integral becomes straightforward.
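
A scenario of this kind can be sketched numerically. The code below uses a hypothetical field F = ⟨x, 2y, 3z⟩ on the unit ball, whose divergence is the constant 6, and compares a midpoint-rule estimate of the outward flux through the unit sphere with the triple integral, which reduces to 6 times the volume of the ball. The field, the region, and the helper names are assumptions for illustration.

```python
import math

def F(x, y, z):
    return (x, 2 * y, 3 * z)

DIV_F = 6.0   # P_x + Q_y + R_z = 1 + 2 + 3, computed by hand

def flux_through_unit_sphere(n=300):
    """Midpoint-rule estimate of the outward flux of F through the unit sphere."""
    du, dv = 2 * math.pi / n, math.pi / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        for j in range(n):
            v = (j + 0.5) * dv
            x = math.cos(u) * math.sin(v)
            y = math.sin(u) * math.sin(v)
            z = math.cos(v)
            fx, fy, fz = F(x, y, z)
            # On the unit sphere the outward unit normal is (x, y, z)
            # and dS = sin v du dv.
            total += (fx * x + fy * y + fz * z) * math.sin(v) * du * dv
    return total

# Constant divergence: the triple integral is DIV_F times the ball's volume.
volume_side = DIV_F * (4.0 / 3.0) * math.pi
surface_side = flux_through_unit_sphere()
print(volume_side, surface_side)
```

The volume side needs only one multiplication, while the surface side needs a full double integral, which is the simplification the scenario describes.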

⚠️ Closed surface requirement

Don't confuse: The divergence theorem applies only to closed surfaces (surfaces that completely enclose a solid region, like a sphere or cube). It does not work for open surfaces (like a hemisphere without its base).

🔬 Physical applications

⚡ Gauss's law

The excerpt states:

The divergence theorem can be used to derive Gauss' law, a fundamental law in electrostatics.

Gauss's law relates the electric flux through a closed surface to the charge enclosed.
The divergence theorem provides the mathematical machinery to connect the surface integral (flux) to a volume integral (total charge).
This demonstrates how abstract mathematical theorems underpin concrete physical laws.

🌊 Interpretation of divergence

Recall from section 6.5 (mentioned in the excerpt):

The divergence of a vector field is a scalar function. Divergence measures the "outflowing-ness" of a vector field.

If v is a velocity field of a fluid, divergence at a point measures net outflow minus inflow.
The divergence theorem then says: total divergence inside a region (sources minus sinks) equals net flux out through the boundary.
This makes physical sense: what flows out of a region must come from sources inside.

📐 Mathematical structure

📐 The theorem statement

Though the excerpt doesn't give the full formula, the context indicates:

The divergence theorem relates a surface integral across closed surface S to a triple integral over the solid enclosed by S.

In words: The flux of a vector field F through a closed surface S equals the triple integral of the divergence of F over the solid region enclosed by S.
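
For reference, the standard symbolic statement can be written as follows. The symbols E (the solid), S (its boundary surface, oriented outward), and the assumption that the components of F have continuous first partial derivatives are the usual textbook hypotheses rather than details quoted in this section.

```latex
\iint_S \mathbf{F}\cdot d\mathbf{S}
  = \iiint_E \operatorname{div}\mathbf{F} \, dV ,
\qquad
\operatorname{div}\mathbf{F} = P_x + Q_y + R_z .
```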

🧮 When to use it

Situation	Action
Flux integral over closed surface is hard	Convert to triple integral using divergence theorem
Triple integral over solid is hard	Convert to surface integral using divergence theorem
Need to derive physical laws	Use divergence theorem to connect local properties (divergence) to global properties (flux)

Key insight: The theorem is bidirectional—it works both ways, so choose the direction that simplifies your problem.

The Divergence Theorem

🧭 Overview

🧠 One-sentence thesis

The divergence theorem connects surface integrals over closed surfaces to triple integrals over enclosed solids, extending the Fundamental Theorem of Calculus to three dimensions and enabling easier computation of flux integrals and derivation of fundamental physical laws.

📌 Key points (3–5)

What it relates: A surface integral across a closed surface S to a triple integral over the solid region enclosed by S.
How it extends earlier theorems: The divergence theorem is a higher-dimensional version of the flux form of Green's theorem, which itself extends the Fundamental Theorem of Calculus.
Practical utility: It transforms difficult flux integrals into easier triple integrals (or vice versa), depending on which is simpler to compute.
Physical applications: The theorem can be used to derive Gauss's law, a fundamental principle in electrostatics.
Common confusion: Like Green's theorem, the divergence theorem requires a closed surface; it does not apply to open surfaces.

🔗 Connection to earlier theorems

🔗 The hierarchy of fundamental theorems

The excerpt places the divergence theorem in a progression:

1D: The Fundamental Theorem of Calculus relates an integral over an interval to values at the boundary (endpoints).
2D: Green's theorem (flux form) relates a double integral over a region to a line integral around its boundary.
3D: The divergence theorem relates a triple integral over a solid to a surface integral over its boundary.

Each step increases the dimension by one while preserving the core idea: an integral over a region equals an integral over its boundary.

🧩 Relationship to Green's theorem

The divergence theorem is a higher dimensional version of the flux form of Green's theorem.

Green's theorem (flux form) works in the plane with a closed curve as the boundary.
The divergence theorem works in space with a closed surface as the boundary.
Both involve divergence: Green's theorem uses 2D divergence; the divergence theorem uses 3D divergence.

🔄 Computational flexibility

🔄 Transforming integrals

The divergence theorem allows you to choose the easier computation:

If a flux integral over a closed surface S is difficult, convert it to a triple integral over the enclosed solid.
If a triple integral over a solid is difficult, convert it to a surface integral over the boundary.

⚠️ Closed surface requirement

🔬 Physical applications

⚡ Gauss's law

The excerpt states:

The divergence theorem can be used to derive Gauss' law, a fundamental law in electrostatics.

Gauss's law relates the electric flux through a closed surface to the charge enclosed.
The divergence theorem provides the mathematical machinery to connect the surface integral (flux) to a volume integral (total charge).
This demonstrates how abstract mathematical theorems underpin concrete physical laws.

🌊 Interpretation of divergence

Recall from section 6.5 (mentioned in the excerpt):

The divergence of a vector field is a scalar function. Divergence measures the "outflowing-ness" of a vector field.

If v is a velocity field of a fluid, divergence at a point measures net outflow minus inflow.
The divergence theorem then says: total divergence inside a region (sources minus sinks) equals net flux out through the boundary.
This makes physical sense: what flows out of a region must come from sources inside.

📐 Mathematical structure

📐 The theorem statement

The divergence theorem relates a surface integral across closed surface S to a triple integral over the solid enclosed by S.

In words: The flux of a vector field F through a closed surface S equals the triple integral of the divergence of F over the solid region enclosed by S.
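
In symbols, the statement above is usually written as follows. This is a sketch under the standard hypotheses, which these notes do not spell out: E denotes the solid enclosed by S, S is piecewise smooth and oriented by outward unit normals, and F has continuous first partial derivatives on a region containing E. The names E, P, Q, and R are introduced here only for illustration.

```latex
\iint_{S} \mathbf{F} \cdot d\mathbf{S}
  \;=\; \iiint_{E} \operatorname{div} \mathbf{F} \, dV,
\qquad
\operatorname{div} \mathbf{F}
  = \frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y} + \frac{\partial R}{\partial z}
\quad \text{for } \mathbf{F} = \langle P, Q, R \rangle .
```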

🧮 When to use it

Situation	Action
Flux integral over closed surface is hard	Convert to triple integral using divergence theorem
Triple integral over solid is hard	Convert to surface integral using divergence theorem
Need to derive physical laws	Use divergence theorem to connect local properties (divergence) to global properties (flux)

Key insight: The theorem works in both directions, so choose the direction that simplifies your problem.
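
To see both sides of the theorem agree on a concrete case, the sketch below compares the two integrals numerically. The field F = (x², y², z²) and the unit cube [0, 1]³ are hypothetical choices made only for illustration, and the midpoint rule stands in for exact integration.

```python
def midpoint_nodes(n):
    """Midpoints of n equal subintervals of [0, 1]."""
    return [(i + 0.5) / n for i in range(n)]


def F(x, y, z):
    # Hypothetical example field, chosen only for illustration.
    return (x * x, y * y, z * z)


def div_F(x, y, z):
    # Divergence computed by hand: d/dx(x^2) + d/dy(y^2) + d/dz(z^2).
    return 2 * x + 2 * y + 2 * z


def triple_integral(n=20):
    """Midpoint-rule approximation of the integral of div F over the unit cube."""
    pts = midpoint_nodes(n)
    h = 1.0 / n
    return sum(div_F(x, y, z) for x in pts for y in pts for z in pts) * h**3


def outward_flux(n=20):
    """Midpoint-rule approximation of the outward flux through the six faces."""
    pts = midpoint_nodes(n)
    h = 1.0 / n
    total = 0.0
    for u in pts:
        for v in pts:
            # Faces x = 1 and x = 0 (outward normals +i and -i).
            total += F(1.0, u, v)[0] - F(0.0, u, v)[0]
            # Faces y = 1 and y = 0 (outward normals +j and -j).
            total += F(u, 1.0, v)[1] - F(u, 0.0, v)[1]
            # Faces z = 1 and z = 0 (outward normals +k and -k).
            total += F(u, v, 1.0)[2] - F(u, v, 0.0)[2]
    return total * h**2


print(triple_integral(), outward_flux())
```

Here the triple integral is a single sum, while the flux needs six face integrals, which is the typical reason to convert a flux integral over a closed surface into a triple integral.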

Second-Order Linear Equations

7.1 Second-Order Linear Equations

🧭 Overview

🧠 One-sentence thesis

Second-order linear differential equations require finding two linearly independent solutions for the homogeneous case, and for nonhomogeneous equations, the general solution combines the complementary solution with a particular solution.

📌 Key points (3–5)

Classification: Second-order differential equations can be linear or nonlinear, homogeneous or nonhomogeneous.
Homogeneous solution structure: The general solution is built from two linearly independent solutions combined with arbitrary constants.
Solution method depends on characteristic roots: Distinct real roots, repeated real roots, or complex conjugate roots each lead to different forms of the general solution.
Nonhomogeneous equations: Require both the complementary (homogeneous) solution and a particular solution.
Common confusion: Homogeneous vs nonhomogeneous—homogeneous means the right-hand side equals zero for all x; nonhomogeneous means it is nonzero for some x.

🔍 Equation types and classification

🔍 Linear second-order differential equation form

A second-order differential equation that can be written in the form: a₂(x) y″ + a₁(x) y′ + a₀(x) y = r(x).

The equation involves the second derivative y″, first derivative y′, and the function y itself.
Coefficients a₂(x), a₁(x), a₀(x) can depend on x.
The right-hand side r(x) determines whether the equation is homogeneous or nonhomogeneous.

🔀 Homogeneous vs nonhomogeneous

Type	Definition	Key characteristic
Homogeneous	r(x) = 0 for all x	Right-hand side is zero everywhere
Nonhomogeneous	r(x) ≠ 0 for some x	Right-hand side is nonzero at some points

The complementary equation is the homogeneous version: a₂(x) y″ + a₁(x) y′ + a₀(x) y = 0.
Don't confuse: A nonhomogeneous equation becomes homogeneous only when r(x) is identically zero, not just zero at isolated points.

⚙️ Constant coefficient equations

Special case: ay″ + by′ + cy = 0, where a, b, c are constants (not functions of x).
This form allows solution methods based on characteristic equations.

🧩 Solving homogeneous equations

🧩 Linear independence requirement

To find a general solution for a homogeneous second-order differential equation, we must find two linearly independent solutions.

Linear independence means: c₁ f₁(x) + c₂ f₂(x) + ⋯ + cₙ fₙ(x) = 0 for all x only when all constants are zero.
If y₁(x) and y₂(x) are linearly independent solutions, the general solution is: y(x) = c₁ y₁(x) + c₂ y₂(x).
The constants c₁ and c₂ are arbitrary and determined by initial or boundary conditions.

🔑 Characteristic equation method

For constant-coefficient equations ay″ + by′ + cy = 0, find the roots of the characteristic equation ar² + br + c = 0.
The form of the general solution depends on the nature of these roots:
- Distinct real roots r₁ ≠ r₂: y(x) = c₁ e^(r₁x) + c₂ e^(r₂x).
- Single repeated real root r: y(x) = c₁ e^(rx) + c₂ x e^(rx).
- Complex conjugate roots α ± βi: y(x) = e^(αx) (c₁ cos(βx) + c₂ sin(βx)).
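
The three cases above can be told apart by the sign of the discriminant b² − 4ac of the characteristic equation ar² + br + c = 0. The sketch below is a minimal illustration; the function name and return format are hypothetical, and the exact test for a zero discriminant assumes coefficients for which floating-point rounding is not an issue.

```python
import math


def classify_roots(a, b, c):
    """Classify the roots of a*r**2 + b*r + c = 0, assuming a != 0.

    Returns (kind, values): the two roots for distinct real roots, the single
    root for a repeated root, or (alpha, beta) for roots alpha +/- beta*i.
    """
    disc = b * b - 4 * a * c
    if disc > 0:
        root = math.sqrt(disc)
        return "distinct real", ((-b - root) / (2 * a), (-b + root) / (2 * a))
    if disc == 0:
        return "repeated real", (-b / (2 * a),)
    alpha = -b / (2 * a)
    beta = abs(math.sqrt(-disc) / (2 * a))
    return "complex conjugate", (alpha, beta)


print(classify_roots(1, -3, 2))  # y'' - 3y' + 2y = 0
print(classify_roots(1, 2, 1))   # y'' + 2y' + y = 0
print(classify_roots(1, 0, 4))   # y'' + 4y = 0
```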

📐 Initial and boundary conditions

Initial conditions or boundary conditions are used to determine c₁ and c₂, which typically singles out one solution.
Exception: With boundary conditions, there may be no solution or infinitely many solutions, depending on the conditions.
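
As a small illustration of how conditions fix the constants, the sketch below handles one case only: distinct real characteristic roots r₁ ≠ r₂, so that y(x) = c₁ e^(r₁x) + c₂ e^(r₂x), with initial conditions y(0) = y₀ and y′(0) = v₀. The function name and the symbols y₀ and v₀ are assumptions introduced here.

```python
def constants_from_initial_conditions(r1, r2, y0, v0):
    """Solve c1 + c2 = y0 and r1*c1 + r2*c2 = v0 for c1 and c2.

    These two equations come from evaluating y and y' at x = 0 for
    y(x) = c1*exp(r1*x) + c2*exp(r2*x).
    """
    if r1 == r2:
        raise ValueError("this solution form requires distinct roots")
    c2 = (v0 - r1 * y0) / (r2 - r1)
    c1 = y0 - c2
    return c1, c2


# y'' - 3y' + 2y = 0 has roots 1 and 2; impose y(0) = 1, y'(0) = 0.
print(constants_from_initial_conditions(1, 2, 1, 0))
```

With these values the solution is y(x) = 2e^x − e^(2x), which can be checked by substituting x = 0 into y and y′.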

🎯 Solving nonhomogeneous equations

🎯 Two-part solution structure

The general solution to a nonhomogeneous equation is: y(x) = c₁ y₁(x) + c₂ y₂(x) + yₚ(x).

First part: c₁ y₁(x) + c₂ y₂(x) is the general solution to the complementary (homogeneous) equation.
Second part: yₚ(x) is any particular solution to the nonhomogeneous equation.
The particular solution contains no arbitrary constants.

🔧 Method of undetermined coefficients

A method that involves making a guess about the form of the particular solution, then solving for the coefficients in the guess.

Used when r(x) is a combination of polynomials, exponential functions, sines, and cosines.
Assume a solution in the same form as r(x).
Multiply by x as necessary until the assumed solution is linearly independent of the complementary solution.
Substitute the assumed solution into the differential equation to find the coefficients.

🔄 Variation of parameters

A method that involves looking for particular solutions in the form yₚ(x) = u(x) y₁(x) + v(x) y₂(x), where y₁ and y₂ are linearly independent solutions to the complementary equation.

Solve a system of equations to find u(x) and v(x).
This method works more generally than undetermined coefficients.

🌊 Physical applications

🌊 Simple harmonic motion

Motion described by the equation x(t) = c₁ cos(ωt) + c₂ sin(ωt), as exhibited by an undamped spring-mass system in which the mass continues to oscillate indefinitely.

Governed by the equation: x″ + ω² x = 0.
Alternative form: x(t) = A sin(ωt + φ).
The system oscillates without damping (no energy loss).
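
The two forms of the solution are related by A sin(φ) = c₁ and A cos(φ) = c₂, which follows from expanding sin(ωt + φ) with the angle-addition formula. A minimal sketch of the conversion, with a hypothetical function name:

```python
import math


def amplitude_phase(c1, c2):
    """Return (A, phi) such that c1*cos(w*t) + c2*sin(w*t) = A*sin(w*t + phi)."""
    amplitude = math.hypot(c1, c2)   # A = sqrt(c1^2 + c2^2)
    phi = math.atan2(c1, c2)         # A*sin(phi) = c1, A*cos(phi) = c2
    return amplitude, phi


A, phi = amplitude_phase(3.0, 4.0)
w, t = 2.0, 0.7
print(A)
print(3.0 * math.cos(w * t) + 4.0 * math.sin(w * t), A * math.sin(w * t + phi))
```

The amplitude-phase form makes the size of the oscillation visible at a glance, which the c₁, c₂ form does not.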

⚡ RLC series circuit

A complete electrical path consisting of a resistor, an inductor, and a capacitor; a second-order, constant-coefficient differential equation can be used to model the charge on the capacitor.

Circuit equation: L (d²q/dt²) + R (dq/dt) + (1/C) q = E(t).
L is inductance, R is resistance, C is capacitance, E(t) is voltage source.
This is a nonhomogeneous equation when E(t) ≠ 0.

🔨 Forced harmonic motion

Equation: mx″ + bx′ + kx = f(t).
Includes damping (bx′ term) and external forcing function f(t).
The solution approaches a steady-state solution in the long term, related to the forcing function.
Don't confuse: Simple harmonic motion (no damping, no forcing) vs forced harmonic motion (includes external force).

7.2 Nonhomogeneous Linear Equations

🧭 Overview

🧠 One-sentence thesis

To solve a nonhomogeneous linear second-order differential equation, you combine the general solution of the complementary (homogeneous) equation with any particular solution to the nonhomogeneous equation.

📌 Key points (3–5)

Two-part solution structure: the general solution to a nonhomogeneous equation equals the general solution to the complementary equation plus a particular solution.
What makes an equation nonhomogeneous: the equation has the form a₂(x)y″ + a₁(x)y′ + a₀(x)y = r(x), where r(x) is not zero for some x.
Method of undetermined coefficients: when r(x) is a combination of polynomials, exponentials, sines, and cosines, assume a solution in the same form as r(x) and solve for the coefficients.
Common confusion: the assumed solution must be linearly independent of the complementary solution—multiply by x as necessary until this is achieved.
Long-term behavior: the solution approaches the steady-state solution related to the forcing function.

🔍 Nonhomogeneous vs complementary equations

🔍 What makes an equation nonhomogeneous

A nonhomogeneous linear second-order differential equation: an equation that can be written in the form a₂(x)y″ + a₁(x)y′ + a₀(x)y = r(x), where r(x) ≠ 0 for some value of x.

The key difference is the right-hand side r(x).
When r(x) = 0 for all x, the equation is homogeneous (called the complementary equation).
The nonzero r(x) is often called the "forcing function."

🧩 The complementary equation

Complementary equation: a₂(x)y″ + a₁(x)y′ + a₀(x)y = 0

This is the homogeneous version of the nonhomogeneous equation.
You solve this first to get the general solution c₁y₁(x) + c₂y₂(x), where y₁ and y₂ are linearly independent solutions.
Don't confuse: the solution of the complementary equation is not the final answer; it is only one part of the full solution.

🧱 Structure of the general solution

🧱 Two-part formula

The general solution to the nonhomogeneous equation is:

y(x) = c₁y₁(x) + c₂y₂(x) + yₚ(x)

Where:

c₁y₁(x) + c₂y₂(x) is the general solution to the complementary equation.
yₚ(x) is any particular solution to the nonhomogeneous equation.

🔑 What is a particular solution

Particular solution yₚ(x): a solution of a differential equation that contains no arbitrary constants.

It is a single, specific function that satisfies the nonhomogeneous equation.
You only need to find one particular solution; any one will work.
Example: if the nonhomogeneous equation has r(x) = 3x², you look for a specific yₚ(x) that makes the equation true, without any c₁ or c₂.

🧷 Why this structure works

The complementary solution handles the homogeneous part (the "natural" behavior).
The particular solution handles the forcing function r(x).
Together, they form the complete solution to the nonhomogeneous problem.

🛠️ Method of undetermined coefficients

🛠️ When to use it

Use this method when r(x) is a combination of:
- Polynomials
- Exponential functions
- Sines
- Cosines

🎯 How the method works

Assume a solution form: guess that yₚ(x) has the same form as r(x).
Check linear independence: the assumed solution must be linearly independent of the general solution to the complementary equation.
Multiply by x if needed: if your guess is not linearly independent, multiply by x (or x², etc.) until it is.
Substitute and solve: plug the assumed yₚ(x) into the differential equation and solve for the unknown coefficients.

Example: if r(x) is a polynomial like 3x², assume yₚ(x) = Ax² + Bx + C and solve for A, B, and C.
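
A worked instance of this guess, using the hypothetical equation y″ + y = 3x² (chosen here only for illustration). Its complementary solution is c₁ cos x + c₂ sin x, so the polynomial guess is already linearly independent of it and no extra factor of x is needed.

```latex
\begin{aligned}
y_p &= Ax^2 + Bx + C, \qquad y_p'' = 2A,\\
y_p'' + y_p &= Ax^2 + Bx + (2A + C) = 3x^2
  \;\Longrightarrow\; A = 3,\quad B = 0,\quad 2A + C = 0 \;\Rightarrow\; C = -6,\\
y_p(x) &= 3x^2 - 6,
\qquad
y(x) = c_1 \cos x + c_2 \sin x + 3x^2 - 6 .
\end{aligned}
```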

⚠️ Linear independence requirement

Don't confuse: you cannot use a form that is already part of the complementary solution.
If your initial guess matches a term in c₁y₁(x) + c₂y₂(x), multiply by x until the guess is independent.
This ensures the particular solution adds new information, not redundant terms.
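
A short worked case of the multiply-by-x rule, using the hypothetical equation y″ − y = eˣ (an illustration, not an example from the text). The complementary solution is c₁eˣ + c₂e⁻ˣ, so the natural guess Aeˣ is already a complementary solution and cannot produce the right-hand side.

```latex
\begin{aligned}
\text{First guess: } & y_p = Ae^{x} \;\Rightarrow\; y_p'' - y_p = Ae^{x} - Ae^{x} = 0 \neq e^{x}.\\
\text{Revised guess: } & y_p = Axe^{x}, \qquad
  y_p' = Ae^{x} + Axe^{x}, \qquad
  y_p'' = 2Ae^{x} + Axe^{x},\\
& y_p'' - y_p = 2Ae^{x} = e^{x}
  \;\Longrightarrow\; A = \tfrac{1}{2},
\qquad y_p(x) = \tfrac{1}{2}\,x e^{x}.
\end{aligned}
```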

🔄 Variation of parameters (alternative method)

🔄 What it is

Variation of parameters: a method that involves looking for particular solutions in the form yₚ(x) = u(x)y₁(x) + v(x)y₂(x), where y₁ and y₂ are linearly independent solutions to the complementary equation, and then solving a system of equations to find u(x) and v(x).

This method works even when r(x) is not a simple combination of polynomials, exponentials, sines, and cosines.
You use the complementary solutions y₁ and y₂ as building blocks, but allow the "coefficients" u and v to be functions, not constants.
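
The system of equations is commonly written as below. This sketch assumes the equation has first been divided through by its leading coefficient, so that it reads y″ + p(x)y′ + q(x)y = r(x); the symbols p, q, and W are introduced here for illustration.

```latex
\begin{aligned}
u'\,y_1 + v'\,y_2 &= 0,\\
u'\,y_1' + v'\,y_2' &= r(x),
\end{aligned}
\qquad\Longrightarrow\qquad
u' = -\frac{y_2\, r(x)}{W}, \quad
v' = \frac{y_1\, r(x)}{W}, \quad
W = y_1 y_2' - y_2 y_1' .
```

Integrating u′ and v′ then gives u(x) and v(x). The denominator W is nonzero because y₁ and y₂ are linearly independent solutions of the complementary equation.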

📊 Applications and long-term behavior

📊 Physical systems modeled by nonhomogeneous equations

System	Equation form	Meaning
Forced harmonic motion	mx″ + bx′ + kx = f(t)	Mass-spring system with external force f(t)
RLC series circuit	L(d²q/dt²) + R(dq/dt) + (1/C)q = E(t)	Charge q on a capacitor with voltage source E(t)

Both involve a forcing function on the right-hand side (f(t) or E(t)).
The solution describes how the system responds to external input.

🌊 Steady-state solution

Steady-state solution: the part of the solution, determined by the forcing function, that the full solution approaches in the long term.

Over time, transient effects from initial conditions fade.
The system settles into a pattern driven by the forcing function.
Example: in forced harmonic motion, after initial oscillations die down, the motion follows the frequency and form of the external force.

🔁 Simple harmonic motion (related concept)

Simple harmonic motion: motion described by the equation x(t) = c₁cos(ωt) + c₂sin(ωt), as exhibited by an undamped spring-mass system in which the mass continues to oscillate indefinitely.

This is a special case when there is no forcing function and no damping (homogeneous equation).
The equation is x″ + ω²x = 0.
Don't confuse: simple harmonic motion is homogeneous; forced harmonic motion is nonhomogeneous.

Applications of Second-Order Differential Equations

7.3 Applications

🧭 Overview

🧠 One-sentence thesis

Second-order constant-coefficient differential equations model physical systems like spring-mass oscillators and RLC circuits, where the behavior depends on whether damping forces cause overdamping, critical damping, or underdamping.

📌 Key points (3–5)

Spring-mass systems: modeled by mx″ + bx′ + kx = f(t), where m is mass, b is damping coefficient, k is spring constant, and f(t) is external force.
Three damping regimes: the discriminant b² − 4mk determines whether the system is overdamped (no oscillation), critically damped (boundary case), or underdamped (decaying oscillation).
Common confusion: all three damping cases involve the same differential equation; only the relative size of b² versus 4mk changes the qualitative behavior.
Forced vs unforced motion: when f(t) ≠ 0, the solution splits into a transient part (dies out) and a steady-state part (governs long-term behavior).
RLC circuits: charge on a capacitor obeys the same mathematical form, with inductance L, resistance R, and capacitance C playing analogous roles.

🔧 Spring-mass system model

🔧 The governing equation

Spring-mass differential equation: mx″ + bx′ + kx = f(t)

m: mass of the object.
b: coefficient of the damping force (e.g., friction or air resistance).
k: spring constant (stiffness).
f(t): any net external forces acting on the system.

The equation comes from examining all forces on the spring-mass system.

🎯 Physical meaning of each term

mx″: inertia (mass times acceleration).
bx′: damping force proportional to velocity.
kx: restoring force from the spring (Hooke's law).
f(t): driving or external force.

Example: A mass hanging from a spring in air experiences gravity (external force), spring pull (restoring), and air drag (damping).

🌊 Damping regimes

🌊 Simple harmonic motion (b = 0)

When b = 0, there is no damping force.
The system exhibits simple harmonic motion: pure oscillation with constant amplitude.
The motion continues indefinitely without decay.

🔍 The discriminant b² − 4mk

The behavior when b ≠ 0 depends on the sign of b² − 4mk:

Condition	Regime	Behavior
b² − 4mk > 0	Overdamped	No oscillation; system returns to equilibrium slowly
b² − 4mk = 0	Critically damped	No oscillation; fastest return to equilibrium without overshooting
b² − 4mk < 0	Underdamped	Oscillatory behavior with amplitude decreasing over time
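
The table translates directly into a sign check on b² − 4mk. A minimal sketch with a hypothetical function name; the exact equality test for the critically damped case is only reliable for inputs without rounding error, so measured parameters would need a tolerance.

```python
def damping_regime(m, b, k):
    """Classify m*x'' + b*x' + k*x = 0 (m > 0, k > 0, b >= 0) by b**2 - 4*m*k."""
    if b == 0:
        return "simple harmonic motion"   # no damping at all
    disc = b * b - 4 * m * k
    if disc > 0:
        return "overdamped"
    if disc == 0:
        return "critically damped"
    return "underdamped"


for params in [(1, 0, 4), (1, 5, 4), (1, 4, 4), (1, 2, 5)]:
    print(params, damping_regime(*params))
```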

🛑 Overdamped (b² − 4mk > 0)

The system does not exhibit oscillatory behavior.
It returns slowly to equilibrium, crossing the equilibrium position at most once.
High damping dominates; the system is "sluggish."

⚖️ Critically damped (b² − 4mk = 0)

The system does not oscillate.
It returns to equilibrium as quickly as possible without overshooting.
Boundary case: any slight reduction in damping would cause oscillation.

Don't confuse: critically damped is the threshold—just enough damping to prevent oscillation, but no more.

🌀 Underdamped (b² − 4mk < 0)

The system exhibits oscillatory behavior.
The amplitude of oscillations decreases over time due to damping.
The system eventually settles to equilibrium, but crosses it multiple times.

Example: A car's shock absorber is designed to be underdamped so it absorbs bumps with a few small bounces rather than one large overshoot or a slow sag.

🚀 Forced motion and steady-state

🚀 When f(t) ≠ 0

The external force f(t) drives the system continuously.
The solution to the differential equation is the sum of two parts:
- Transient solution: dies out over time.
- Steady-state solution: persists and governs the long-term behavior.

📈 Steady-state solution

The steady-state solution describes what the system does after initial transients have decayed.
It depends on the form and frequency of the external force f(t).
Example: A mass on a spring driven by a periodic force will eventually oscillate at the driving frequency, with amplitude and phase determined by the system parameters.
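
For one common forcing function the steady-state solution has a closed form. The sketch below assumes f(t) = F₀ cos(ωt) with b > 0 and ω > 0, an illustrative choice not taken from the text, for which the steady state is xₚ(t) = A cos(ωt − δ) with A = F₀ / √((k − mω²)² + (bω)²) and tan δ = bω / (k − mω²). The residual function substitutes xₚ back into the equation as a check.

```python
import math


def steady_state(m, b, k, F0, omega):
    """Amplitude A and phase lag delta of x_p(t) = A*cos(omega*t - delta)."""
    stiffness_term = k - m * omega**2
    damping_term = b * omega
    amplitude = F0 / math.hypot(stiffness_term, damping_term)
    delta = math.atan2(damping_term, stiffness_term)
    return amplitude, delta


def residual(m, b, k, F0, omega, t):
    """m*x'' + b*x' + k*x - F0*cos(omega*t) evaluated for x = x_p at time t."""
    A, delta = steady_state(m, b, k, F0, omega)
    x = A * math.cos(omega * t - delta)
    dx = -A * omega * math.sin(omega * t - delta)
    d2x = -A * omega**2 * math.cos(omega * t - delta)
    return m * d2x + b * dx + k * x - F0 * math.cos(omega * t)


print(steady_state(1.0, 1.0, 1.0, 1.0, 1.0))
print(residual(2.0, 0.5, 3.0, 1.5, 1.2, 0.4))
```

The response oscillates at the driving frequency ω, while the system parameters m, b, and k enter only through the amplitude and the phase lag.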

Don't confuse: the transient solution is part of the general solution but becomes negligible over time; the steady-state solution dominates in the long run.

⚡ RLC series circuit analogy

⚡ The circuit equation

RLC circuit differential equation: L(d²q/dt²) + R(dq/dt) + (1/C)q = E(t)

L: inductance.
R: resistance.
C: capacitance.
q: charge on the capacitor.
E(t): applied voltage (external electromotive force).

🔗 Correspondence with spring-mass systems

The RLC circuit equation has the same mathematical form as the spring-mass equation:

Spring-mass	RLC circuit	Role
m (mass)	L (inductance)	Inertia / energy storage
b (damping)	R (resistance)	Dissipation
k (spring constant)	1/C (inverse capacitance)	Restoring force / energy storage
f(t) (external force)	E(t) (voltage)	Driving term

The same analysis of damping regimes applies: overdamped, critically damped, and underdamped circuits.
The discriminant becomes R² − 4L/C (analogous to b² − 4mk).

Example: An underdamped RLC circuit will show decaying oscillations in charge and current, just as an underdamped spring-mass system shows decaying oscillations in position.
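
For the unforced underdamped circuit (E(t) = 0 and R² − 4L/C < 0), the characteristic roots are −R/(2L) ± iω_d, so the charge has the form q(t) = e^(−Rt/(2L)) (c₁ cos(ω_d t) + c₂ sin(ω_d t)). The sketch below computes the decay rate and oscillation frequency; the function name and the symbol ω_d are assumptions made for illustration.

```python
import math


def underdamped_rlc(L, R, C):
    """Return (decay_rate, angular_frequency) for an underdamped series RLC circuit.

    Based on the roots of L*r**2 + R*r + 1/C = 0 when R**2 - 4*L/C < 0.
    """
    disc = R * R - 4 * L / C
    if disc >= 0:
        raise ValueError("circuit is not underdamped: R^2 - 4L/C >= 0")
    decay_rate = R / (2 * L)
    angular_frequency = math.sqrt(-disc) / (2 * L)
    return decay_rate, angular_frequency


# Hypothetical component values: L = 1, R = 2, C = 0.2 (consistent units assumed).
print(underdamped_rlc(1.0, 2.0, 0.2))
```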

7.4 Series Solutions of Differential Equations

🧭 Overview

🧠 One-sentence thesis

Power series representations can be used to find solutions to differential equations by differentiating term by term and matching coefficients to establish relationships between the series coefficients.

📌 Key points (3–5)

Core method: Represent the unknown function as a power series, differentiate term by term, substitute into the differential equation, and find relationships between coefficients
When to use: This approach is valuable when the forcing term r(x) is not a simple combination of polynomials, exponentials, or trigonometric functions
Key technique: After substitution, collect like powers of the variable and equate coefficients to zero to determine the series coefficients
Relationship to other methods: This complements the method of undetermined coefficients (for simple forcing terms) and variation of parameters (for more complex cases)
Common confusion: The power series method is not about guessing a finite formula; it constructs an infinite series representation term by term

🔧 The power series method

🔧 What power series solutions are

Power series representations of functions can sometimes be used to find solutions to differential equations.

Instead of finding a closed-form solution (like y = e^x or y = sin(x)), we express the solution as an infinite series
The solution takes the form y(x) = Σ aₙ xⁿ, with the sum running over n = 0, 1, 2, …
This is particularly useful when standard methods fail or when the equation has variable coefficients

🔧 How the method works

The method in one sentence: differentiate the power series term by term and substitute into the differential equation to find relationships between the power series coefficients.

Step-by-step process:

Assume the solution has the form of a power series, y(x) = Σ aₙ xⁿ
Compute derivatives of this series by differentiating each term
Substitute both the series and its derivatives into the original differential equation
Collect terms with the same power of x
Set the coefficient of each power of x equal to zero (since the equation must hold for all x)
Solve the resulting relationships to find the coefficients
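
The steps above can be sketched on a deliberately simple equation, y″ + y = 0, chosen here because its solutions are known in closed form (an illustrative choice, not an example from the text). Substituting y = Σ aₙ xⁿ and collecting the coefficient of xⁿ gives (n + 2)(n + 1) aₙ₊₂ + aₙ = 0, so aₙ₊₂ = −aₙ / ((n + 2)(n + 1)), with a₀ = y(0) and a₁ = y′(0) left free.

```python
import math


def series_coefficients(a0, a1, count):
    """First `count` coefficients of a power series solution of y'' + y = 0."""
    coeffs = [a0, a1]
    for n in range(count - 2):
        # Recurrence from matching the coefficient of x**n.
        coeffs.append(-coeffs[n] / ((n + 2) * (n + 1)))
    return coeffs[:count]


def evaluate(coeffs, x):
    """Evaluate the partial sum sum(a_n * x**n) with Horner's rule."""
    total = 0.0
    for a in reversed(coeffs):
        total = total * x + a
    return total


cos_like = series_coefficients(1.0, 0.0, 20)   # y(0) = 1, y'(0) = 0
sin_like = series_coefficients(0.0, 1.0, 20)   # y(0) = 0, y'(0) = 1
print(evaluate(cos_like, 1.0), math.cos(1.0))
print(evaluate(sin_like, 0.5), math.sin(0.5))
```

The two free coefficients play the role of the arbitrary constants c₁ and c₂, and the partial sums reproduce the Maclaurin series of cos x and sin x.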

Example scenario: If you have a differential equation with a complicated coefficient function (not constant), standard exponential or trigonometric solutions won't work. The power series method lets you build the solution piece by piece.

🔧 Connection to other solution methods

This method sits alongside two other approaches:

Method	When to use	Key feature
Undetermined coefficients	r(x) is polynomial, exponential, sine, or cosine	Guess the form of the particular solution
Variation of parameters	r(x) is more complex	Uses Cramer's rule or similar techniques to find u'(x) and v'(x)
Power series	r(x) is not a simple combination, or equation has variable coefficients	Build solution as infinite series by matching coefficients

Don't confuse: The power series method is not about guessing a finite formula—it constructs an infinite series representation term by term.

📚 Context and applications

📚 Where this appears in the course

This section (7.4) is part of Chapter 7 on Second-Order Differential Equations, following:

7.1–7.2: Basic solution methods for second-order linear equations
7.3: Applications to spring-mass systems and RLC circuits
7.4: Series solutions (this section)

📚 Why series solutions matter

Many important differential equations in physics and engineering (like Bessel's equation, Legendre's equation) require series solutions
Bessel functions, which appear in the review exercises, are typically defined as series solutions
When closed-form solutions don't exist, series solutions provide a practical way to compute approximate values

📚 What a complete treatment includes

The description above is only a brief outline of the method. A complete treatment would include:

Detailed worked examples showing the coefficient-matching process
Discussion of convergence (where does the series solution actually work?)
Specific named equations (Bessel, Legendre, etc.)
Techniques for finding recurrence relations between coefficients

Summary: Series solutions extend the toolkit for solving differential equations beyond exponential and trigonometric functions. By representing the solution as a power series and matching coefficients after substitution, we can solve equations that resist other methods.