Legendre transformation

In mathematics and physics, the Legendre transformation, named after Adrien-Marie Legendre, is an involutive transformation on the real-valued convex functions of one real variable. In physical problems, it is used to convert functions of one quantity (such as position, pressure, or temperature) into functions of the conjugate quantity (momentum, volume, and entropy, respectively). In this way, it is commonly used in classical mechanics to derive the Hamiltonian formalism out of the Lagrangian formalism and in thermodynamics to derive the thermodynamic potentials, as well as in the solution of differential equations of several variables.

Figure: a function $f$ defined on an interval. For a given slope $p$, the difference $px - f(x)$ takes its maximum at some point $x'$. Thus, $f^*(p) = px' - f(x')$.

For sufficiently smooth functions on the real line, the Legendre transform $f^*$ of a function $f$ can be specified, up to an additive constant, by the condition that the functions' first derivatives are inverse functions of each other. This can be expressed in Euler's derivative notation as

$$Df(\cdot) = \left(Df^*\right)^{-1}(\cdot),$$

where $(\phi)^{-1}(\cdot)$ means a function such that

$$(\phi)^{-1}(\phi(x)) = x,$$

or, equivalently, as $f'(f^{*\prime}(x^*)) = x^*$ and $f^{*\prime}(f'(x)) = x$ in Lagrange's notation.

The generalization of the Legendre transformation to affine spaces and non-convex functions is known as the convex conjugate (also called the Legendre–Fenchel transformation), which can be used to construct a function's convex hull.


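Let $I \subset \mathbb{R}$ be an interval, and $f : I \to \mathbb{R}$ a convex function; then its Legendre transform is the function $f^* : I^* \to \mathbb{R}$ defined by

$$f^*(x^*) = \sup_{x \in I} \bigl(x^* x - f(x)\bigr), \qquad x^* \in I^*,$$

where $\sup$ denotes the supremum, and the domain $I^*$ is

$$I^* = \left\{ x^* \in \mathbb{R} : \sup_{x \in I} \bigl(x^* x - f(x)\bigr) < \infty \right\}.$$

The transform is always well-defined when $f$ is convex.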
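As a rough numerical illustration of this definition, the supremum can be approximated by a maximum over a finite grid of sample points. The sketch below is only an illustration under stated assumptions: the helper name `legendre_transform`, the grid on $[-5, 5]$, and the test function $f(x) = x^2$ are choices made here, not part of the definition.

```python
def legendre_transform(f, xs, p):
    """Approximate f*(p) = sup_x (p*x - f(x)) by a maximum over the sample points xs."""
    return max(p * x - f(x) for x in xs)

# Sample grid on the interval I = [-5, 5] (an arbitrary choice for illustration).
xs = [-5 + 10 * k / 10000 for k in range(10001)]

def f(x):
    return x * x  # convex test function

approx = legendre_transform(f, xs, 3.0)
exact = 3.0 ** 2 / 4  # for this particular f the supremum is attained at x = p/2
print(approx, exact)
```

A grid maximum can only underestimate the true supremum, and it is meaningful only when the maximizer lies inside the sampled interval.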

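The generalization to convex functions $f : X \to \mathbb{R}$ on a convex set $X \subset \mathbb{R}^n$ is straightforward: $f^* : X^* \to \mathbb{R}$ has domain

$$X^* = \left\{ x^* \in \mathbb{R}^n : \sup_{x \in X} \bigl(\langle x^*, x \rangle - f(x)\bigr) < \infty \right\}$$

and is defined by

$$f^*(x^*) = \sup_{x \in X} \bigl(\langle x^*, x \rangle - f(x)\bigr), \qquad x^* \in X^*,$$

where $\langle x^*, x \rangle$ denotes the dot product of $x^*$ and $x$.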

The function $f^*$ is called the convex conjugate function of $f$. For historical reasons (rooted in analytic mechanics), the conjugate variable is often denoted $p$, instead of $x^*$. If the convex function $f$ is defined on the whole line and is everywhere differentiable, then

$$f^*(p) = \sup_{x \in \mathbb{R}} \bigl(px - f(x)\bigr) = \bigl(px - f(x)\bigr)\Big|_{x = (f')^{-1}(p)}$$

can be interpreted as the negative of the $y$-intercept of the tangent line to the graph of $f$ that has slope $p$.

The Legendre transformation is an application of the duality relationship between points and lines. The functional relationship specified by $f$ can be represented equally well as a set of $(x, y)$ points, or as a set of tangent lines specified by their slope and intercept values.

Understanding the transform in terms of derivatives

For a differentiable convex function $f$ on the real line whose first derivative $f'$ is invertible, the Legendre transform $f^*$ can be specified, up to an additive constant, by the condition that the first derivatives $f'$ and $(f^*)'$ are inverse functions of each other, i.e., $f' = ((f^*)')^{-1}$ and $(f^*)' = (f')^{-1}$.

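To see this, first note that if $f$ is differentiable and $\overline{x}$ is a critical point of the function $x \mapsto p \cdot x - f(x)$, then the supremum is achieved at $\overline{x}$ (by convexity). Therefore, $f^*(p) = p \cdot \overline{x} - f(\overline{x})$.

Suppose that $f'$ is invertible and let $g$ denote its inverse. Then for each $p$, the point $g(p)$ is the unique critical point of $x \mapsto px - f(x)$. Indeed, $f'(g(p)) = p$ and so $p - f'(g(p)) = 0$. Hence we have $f^*(p) = p \cdot g(p) - f(g(p))$ for each $p$. By differentiating with respect to $p$ we find

$$(f^*)'(p) = g(p) + p \cdot g'(p) - f'(g(p)) \cdot g'(p).$$

Since $f'(g(p)) = p$ this simplifies to $(f^*)'(p) = g(p)$. In other words, $(f^*)'$ and $f'$ are inverses.

In general, if $h'$ is an inverse of $f'$, then $h' = (f^*)'$ and so integration provides a constant $c$ so that $f^* = h + c$.

In practical terms, given $f(x)$, the parametric plot of $x f'(x) - f(x)$ versus $f'(x)$ amounts to the graph of $f^*(p)$ versus $p$.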

In some cases (e.g. thermodynamic potentials, below), a non-standard requirement is used, amounting to an alternative definition of $f^*$ with a minus sign,

$$f(x) - f^*(p) = xp.$$



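  • The Legendre transform of a convex function is convex.
Let us show this for the case of a twice-differentiable $f$ with a nonzero (and hence positive, due to convexity) second derivative.
For a fixed $p$, let $x$ maximize $px - f(x)$. Then $f^*(p) = px - f(x)$, noting that $x$ depends on $p$. Thus,
$$f'(x) = p.$$
The derivative of $f$ is itself differentiable with a positive derivative and hence strictly monotonic and invertible.
Thus $x = g(p)$ where $g \equiv (f')^{-1}$, meaning that $g$ is defined so that $f'(g(p)) = p$.
Note that $g$ is also differentiable with the following derivative,
$$\frac{dg(p)}{dp} = \frac{1}{f''(g(p))}.$$
Thus $f^*(p) = p\,g(p) - f(g(p))$ is the composition of differentiable functions, hence differentiable.
Applying the product rule and the chain rule yields
$$\frac{d(f^*)}{dp} = g(p) + \bigl(p - f'(g(p))\bigr) \cdot \frac{dg(p)}{dp} = g(p),$$
giving
$$\frac{d^2(f^*)}{dp^2} = \frac{dg(p)}{dp} = \frac{1}{f''(g(p))} > 0,$$
so $f^*$ is convex.
  • It follows that the Legendre transformation is an involution, i.e., $f^{**} = f$:
By using the above equalities for $g(p)$, $f^*(p)$ and its derivative,
$$f^{**}(x) = \bigl(x\,p_s - f^*(p_s)\bigr)\Big|_{\frac{d}{dp} f^*(p = p_s) = x} = g(p_s)\,p_s - f^*(p_s) = f(g(p_s)) = f(x).$$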


Example 1

Figure: $e^x$ is plotted in red and its Legendre transform in dashed blue.

The exponential function $f(x) = e^x$ has $f^*(p) = p(\ln p - 1)$ as a Legendre transform, since their respective first derivatives $e^x$ and $\ln p$ are inverse functions of each other.

This example illustrates that the respective domains of a function and its Legendre transform need not agree.
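The closed form for the transform of the exponential can be checked numerically. The sketch below maximizes $px - e^x$ by ternary search, which is valid here because the objective is concave in $x$; the helper names and the search bracket $[-40, 40]$ are assumptions made for this illustration.

```python
import math

def maximize_concave(h, lo, hi, iterations=200):
    """Ternary search for the maximum value of a concave function h on [lo, hi]."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if h(m1) < h(m2):
            lo = m1  # the maximum lies to the right of m1
        else:
            hi = m2  # the maximum lies to the left of m2
    return h((lo + hi) / 2)

def exp_conjugate_numeric(p):
    # sup over x of p*x - e^x, searched on the assumed bracket [-40, 40]
    return maximize_concave(lambda x: p * x - math.exp(x), -40.0, 40.0)

def exp_conjugate_closed_form(p):
    # p (ln p - 1), defined only for p > 0
    return p * (math.log(p) - 1)

for p in (0.5, 1.0, 2.0, 5.0):
    print(p, exp_conjugate_numeric(p), exp_conjugate_closed_form(p))
```

The closed form exists only for $p > 0$, whereas $e^x$ is defined for all real $x$, which is the domain mismatch the example points out.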

Example 2

Let $f(x) = cx^2$ defined on $\mathbb{R}$, where $c > 0$ is a fixed constant.

For $x^*$ fixed, the function of $x$, $x^* x - f(x) = x^* x - cx^2$, has the first derivative $x^* - 2cx$ and second derivative $-2c$; there is one stationary point at $x = x^*/(2c)$, which is always a maximum.

Thus, $I^* = \mathbb{R}$ and

$$f^*(x^*) = \frac{{x^*}^2}{4c}.$$

The first derivatives of $f$, $2cx$, and of $f^*$, $x^*/(2c)$, are inverse functions to each other. Clearly, furthermore,

$$f^{**}(x) = \frac{1}{4 \cdot \frac{1}{4c}}\,x^2 = cx^2,$$

namely $f^{**} = f$.
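The same involution can be observed numerically for a non-quadratic function by applying a discrete transform twice. This is a sketch under stated assumptions: $f(x) = \cosh x$, the grids on $[-3, 3]$ and $[-10, 10]$, and the helper name `conjugate_on_grid` are all choices made for this illustration.

```python
import math

def conjugate_on_grid(values, xs, ps):
    """Discrete Legendre transform: max over x of (p*x - values[x]) for each p in ps."""
    return [max(p * x - v for x, v in zip(xs, values)) for p in ps]

n = 600
xs = [-3 + 6 * k / n for k in range(n + 1)]    # grid for x in [-3, 3]
ps = [-10 + 20 * k / n for k in range(n + 1)]  # grid for p in [-10, 10]

f_vals = [math.cosh(x) for x in xs]              # convex, non-quadratic test function
f_star = conjugate_on_grid(f_vals, xs, ps)       # f* sampled on ps
f_star_star = conjugate_on_grid(f_star, ps, xs)  # f** sampled on xs

# Compare f** with f where the slope sinh(x) stays inside the p grid.
worst = max(abs(a - b) for x, a, b in zip(xs, f_star_star, f_vals) if abs(x) <= 2.5)
print(worst)
```

The comparison is restricted to $|x| \le 2.5$ because the double transform is only accurate where the slope $f'(x)$ falls inside the sampled range of $p$.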

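Example 3

Let $f(x) = x^2$ for $x \in I = [2, 3]$.

For $x^*$ fixed, $x^* x - f(x)$ is continuous on the compact interval $I$, hence it always takes a finite maximum on it; it follows that $I^* = \mathbb{R}$.

The stationary point at $x = x^*/2$ is in the domain $[2, 3]$ if and only if $4 \le x^* \le 6$, otherwise the maximum is taken either at $x = 2$, or $x = 3$. It follows that

$$f^*(x^*) = \begin{cases} 2x^* - 4, & x^* < 4, \\ \dfrac{{x^*}^2}{4}, & 4 \le x^* \le 6, \\ 3x^* - 9, & x^* > 6. \end{cases}$$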
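Evaluating $x^* x - x^2$ at the three candidate maximizers $x = 2$, $x = x^*/2$ and $x = 3$ gives the values $2x^* - 4$, ${x^*}^2/4$ and $3x^* - 9$. A small sketch can compare this case analysis with a brute-force maximum over the interval; the function names and the grid size are assumptions made for this illustration.

```python
def f_star_piecewise(p):
    """Closed form: maximum at x = 2, at the interior point x = p/2, or at x = 3."""
    if p < 4:
        return 2 * p - 4
    if p <= 6:
        return p * p / 4
    return 3 * p - 9

def f_star_grid(p, n=20000):
    """Brute-force maximum of p*x - x**2 over n + 1 sample points in [2, 3]."""
    return max(p * x - x * x for x in (2 + k / n for k in range(n + 1)))

slopes = [-2.0, 0.0, 3.0, 4.0, 5.0, 6.0, 8.0]
max_gap = max(abs(f_star_piecewise(p) - f_star_grid(p)) for p in slopes)
print(max_gap)
```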


Example 4

The function $f(x) = cx$ is convex, for every $x$ (strict convexity is not required for the Legendre transformation to be well defined). Clearly $x^* x - f(x) = (x^* - c)x$ is never bounded from above as a function of $x$, unless $x^* - c = 0$. Hence $f^*$ is defined on $I^* = \{c\}$ and $f^*(c) = 0$.

One may check involutivity: of course $x^* x - f^*(x^*)$ is always bounded as a function of $x^* \in \{c\}$, hence $I^{**} = \mathbb{R}$. Then, for all $x$ one has

$$\sup_{x^* \in \{c\}} \bigl(x x^* - f^*(x^*)\bigr) = xc,$$

and hence $f^{**}(x) = cx = f(x)$.

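Example 5: several variables

Let

$$f(x) = \langle x, Ax \rangle + c$$

be defined on $X = \mathbb{R}^n$, where $A$ is a real, positive definite matrix.

Then $f$ is convex, and

$$\langle p, x \rangle - f(x) = \langle p, x \rangle - \langle x, Ax \rangle - c$$

has gradient $p - 2Ax$ and Hessian $-2A$, which is negative definite; hence the stationary point $x = A^{-1}p/2$ is a maximum.

We have $X^* = \mathbb{R}^n$, and

$$f^*(p) = \frac{1}{4} \langle p, A^{-1} p \rangle - c.$$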
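For a concrete check in two variables, substituting the stationary point $x = A^{-1}p/2$ into $\langle p, x \rangle - f(x)$, with $f(x) = \langle x, Ax \rangle + c$ and $A$ symmetric, gives $\tfrac{1}{4}\langle p, A^{-1}p \rangle - c$. The sketch below verifies this for one hypothetical matrix; the values of `A`, `c` and `p` are arbitrary illustrative choices.

```python
import random

A = [[2.0, 1.0], [1.0, 3.0]]  # symmetric positive definite (illustrative choice)
c = 0.5

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(M, v):
    return [dot(row, v) for row in M]

def inverse_2x2(M):
    (m11, m12), (m21, m22) = M
    det = m11 * m22 - m12 * m21
    return [[m22 / det, -m12 / det], [-m21 / det, m11 / det]]

def f(x):
    return dot(x, matvec(A, x)) + c

def objective(p, x):
    return dot(p, x) - f(x)

A_inv = inverse_2x2(A)
p = [1.0, -2.0]
x_star = [0.5 * t for t in matvec(A_inv, p)]       # stationary point A^{-1} p / 2
value_at_x_star = objective(p, x_star)
closed_form = 0.25 * dot(p, matvec(A_inv, p)) - c  # (1/4) <p, A^{-1} p> - c

# Random perturbations of x_star should never give a larger objective value.
random.seed(0)
perturbed_max = max(
    objective(p, [x_star[0] + random.uniform(-1, 1), x_star[1] + random.uniform(-1, 1)])
    for _ in range(1000)
)
print(value_at_x_star, closed_form, perturbed_max)
```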


Behavior of differentials under Legendre transforms

The Legendre transform is linked to integration by parts, $p\,dx = d(px) - x\,dp$.

Let $f$ be a function of two independent variables $x$ and $y$, with the differential

$$df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy = p\,dx + v\,dy.$$

Assume that it is convex in $x$ for all $y$, so that one may perform the Legendre transform in $x$, with $p$ the variable conjugate to $x$. Since the new independent variable is $p$, the differentials $dx$ and $dy$ devolve to $dp$ and $dy$, i.e., we build another function with its differential expressed in terms of the new basis $dp$ and $dy$.

We thus consider the function $g(p, y) = f - px$ so that

$$dg = df - p\,dx - x\,dp = -x\,dp + v\,dy,$$

$$x = -\frac{\partial g}{\partial p}, \qquad v = \frac{\partial g}{\partial y}.$$

The function $-g(p, y)$ is the Legendre transform of $f(x, y)$, where only the independent variable $x$ has been supplanted by $p$. This is widely used in thermodynamics, as illustrated below.


Analytical mechanics

A Legendre transform is used in classical mechanics to derive the Hamiltonian formulation from the Lagrangian formulation, and conversely. A typical Lagrangian has the form

$$L(v, q) = \frac{1}{2} \langle v, Mv \rangle - V(q),$$

where $(v, q)$ are coordinates on $\mathbb{R}^n \times \mathbb{R}^n$, $M$ is a positive definite real matrix, and

$$\langle x, y \rangle = \sum_j x_j y_j.$$

For every $q$ fixed, $L(v, q)$ is a convex function of $v$, while $V(q)$ plays the role of a constant.

Hence the Legendre transform of $L(v, q)$ as a function of $v$ is the Hamiltonian function,

$$H(p, q) = \frac{1}{2} \langle p, M^{-1} p \rangle + V(q).$$

In a more general setting, $(v, q)$ are local coordinates on the tangent bundle $T\mathcal{M}$ of a manifold $\mathcal{M}$. For each $q$, $L(v, q)$ is a convex function of the tangent space $V_q$. The Legendre transform gives the Hamiltonian $H(p, q)$ as a function of the coordinates $(p, q)$ of the cotangent bundle $T^*\mathcal{M}$; the inner product used to define the Legendre transform is inherited from the pertinent canonical symplectic structure. In this abstract setting, the Legendre transformation corresponds to the tautological one-form.
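In one dimension, with $L(v, q) = \tfrac{1}{2} m v^2 - V(q)$, the transform in $v$ can be carried out explicitly: the conjugate momentum is $p = \partial L / \partial v = mv$, and $H = pv - L$ evaluated at $v = p/m$. The sketch below checks this against $p^2/(2m) + V(q)$; the spring potential and all numerical values are hypothetical choices for illustration.

```python
def lagrangian(v, q, m, V):
    """L(v, q) = m v^2 / 2 - V(q) for a single particle of mass m."""
    return 0.5 * m * v * v - V(q)

def hamiltonian_by_legendre(p, q, m, V):
    v = p / m  # solves p = dL/dv = m * v
    return p * v - lagrangian(v, q, m, V)

def hamiltonian_closed_form(p, q, m, V):
    return p * p / (2 * m) + V(q)

def spring(q):
    return 0.5 * 4.0 * q * q  # hypothetical potential V(q) = k q^2 / 2 with k = 4

H1 = hamiltonian_by_legendre(3.0, 0.5, 2.0, spring)
H2 = hamiltonian_closed_form(3.0, 0.5, 2.0, spring)
print(H1, H2)
```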


Thermodynamics

The strategy behind the use of Legendre transforms in thermodynamics is to shift from a function that depends on a variable to a new (conjugate) function that depends on a new variable, the conjugate of the original one. The new variable is the partial derivative of the original function with respect to the original variable. The new function is the difference between the original function and the product of the old and new variables. Typically, this transformation is useful because it shifts the dependence of, e.g., the energy from an extensive variable to its conjugate intensive variable, which can usually be controlled more easily in a physical experiment.

For example, the internal energy is an explicit function of the extensive variables entropy, volume, and chemical composition

$$U = U(S, V, \{N_i\}),$$

which has a total differential

$$dU = T\,dS - P\,dV + \sum_i \mu_i\,dN_i.$$

Stipulating some common reference state, by using the (non-standard) Legendre transform of the internal energy, $U$, with respect to volume, $V$, the enthalpy may be defined by writing

$$H = U + PV = H(S, P, \{N_i\}),$$

which is now explicitly a function of the pressure $P$, since

$$dH = T\,dS + V\,dP + \sum_i \mu_i\,dN_i.$$

The enthalpy is suitable for description of processes in which the pressure is controlled from the surroundings.

It is likewise possible to shift the dependence of the energy from the extensive variable of entropy, $S$, to the (often more convenient) intensive variable $T$, resulting in the Helmholtz and Gibbs free energies. The Helmholtz free energy, $A$, and Gibbs energy, $G$, are obtained by performing Legendre transforms of the internal energy and enthalpy, respectively,

$$A = U - TS, \qquad G = H - TS = U + PV - TS.$$


The Helmholtz free energy is often the most useful thermodynamic potential when temperature and volume are controlled from the surroundings, while the Gibbs energy is often the most useful when temperature and pressure are controlled from the surroundings.
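As a short worked check, and assuming the fundamental relation $dU = T\,dS - P\,dV + \sum_i \mu_i\,dN_i$ for the internal energy, the differentials of the two free energies follow from the product rule. The derivation below is a sketch under that assumption:

```latex
\begin{align*}
A &= U - TS:      & dA &= dU - T\,dS - S\,dT
                        = -S\,dT - P\,dV + \sum_i \mu_i\,dN_i, \\
G &= U + PV - TS: & dG &= dU + P\,dV + V\,dP - T\,dS - S\,dT
                        = -S\,dT + V\,dP + \sum_i \mu_i\,dN_i.
\end{align*}
```

The independent differentials are therefore $(dT, dV, dN_i)$ for $A$ and $(dT, dP, dN_i)$ for $G$, matching the controlled variables named above.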

An example – variable capacitor

As another example from physics, consider a parallel-plate capacitor, in which the plates can move relative to one another. Such a capacitor would allow transfer of the electric energy which is stored in the capacitor into external mechanical work, done by the force acting on the plates. One may think of the electric charge as analogous to the "charge" of a gas in a cylinder, with the resulting mechanical force exerted on a piston.

Compute the force on the plates as a function of x, the distance which separates them. To find the force, compute the potential energy, and then apply the definition of force as the gradient of the potential energy function.

The energy stored in a capacitor of capacitance $C(x)$ and charge $Q$ is

$$U(Q, x) = \frac{1}{2} \frac{Q^2}{C(x)},$$

where the dependence on the area of the plates, the dielectric constant of the material between the plates, and the separation $x$ are abstracted away as the capacitance $C(x)$. (For a parallel plate capacitor, this is proportional to the area of the plates and inversely proportional to the separation.)

The force $F$ between the plates due to the electric field is then

$$F(x) = -\frac{dU}{dx}.$$

If the capacitor is not connected to any circuit, then the charges on the plates remain constant as they move, and the force is the negative gradient of the electrostatic energy

$$F(x) = \frac{1}{2} \frac{dC}{dx} \frac{Q^2}{C^2}.$$

However, suppose, instead, that the voltage between the plates $V$ is maintained constant by connection to a battery, which is a reservoir for charge at constant potential difference; now the charge is variable instead of the voltage, its Legendre conjugate. To find the force, first compute the non-standard Legendre transform,

$$U^* = U - QV = \frac{1}{2} QV - QV = -\frac{1}{2} QV = -\frac{1}{2} V^2 C(x).$$

The force now becomes the negative gradient of this Legendre transform, still pointing in the same direction,

$$F(x) = -\frac{dU^*}{dx} = \frac{1}{2} \frac{dC}{dx} V^2.$$

The two conjugate energies happen to stand opposite to each other, only because of the linearity of the capacitance, except that now $Q$ is no longer a constant. They reflect the two different pathways of storing energy into the capacitor, resulting in, for instance, the same "pull" between a capacitor's plates.
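The agreement of the two force computations can be checked numerically for an ideal parallel-plate capacitor. The sketch assumes $C(x) = \varepsilon_0 A / x$, uses the energies $U = Q^2 / (2C(x))$ at fixed charge and $U^* = -\tfrac{1}{2} V^2 C(x)$ at fixed voltage, and differentiates by central differences; the plate area, separation and voltage are hypothetical values.

```python
EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m

def capacitance(x, area=1e-2):
    """Ideal parallel-plate capacitance for plate separation x (assumed model)."""
    return EPS0 * area / x

def energy_fixed_charge(x, Q):
    return 0.5 * Q * Q / capacitance(x)

def transformed_energy_fixed_voltage(x, V):
    return -0.5 * V * V * capacitance(x)

def negative_derivative(g, x, h=1e-7):
    """Central-difference estimate of -dg/dx."""
    return -(g(x + h) - g(x - h)) / (2 * h)

x0 = 1e-3                   # separation in metres
V0 = 10.0                   # voltage in volts
Q0 = capacitance(x0) * V0   # matching charge, Q = C V

F_Q = negative_derivative(lambda x: energy_fixed_charge(x, Q0), x0)
F_V = negative_derivative(lambda x: transformed_energy_fixed_voltage(x, V0), x0)
print(F_Q, F_V)
```

Both estimates come out negative, that is, directed toward smaller separation, which corresponds to the attractive pull between the plates.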

Probability theory

In large deviations theory, the rate function is defined as the Legendre transformation of the logarithm of the moment generating function of a random variable. An important application of the rate function is in the calculation of tail probabilities of sums of i.i.d. random variables.
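As a small illustration, take a fair coin $X \in \{0, 1\}$. Its logarithmic moment generating function is $\Lambda(t) = \ln\bigl((1 + e^t)/2\bigr)$, and the supremum of $tx - \Lambda(t)$ over $t$ works out to $x \ln x + (1 - x)\ln(1 - x) + \ln 2$ for $0 < x < 1$. The sketch below compares the two; the ternary-search helper and its bracket $[-40, 40]$ are assumptions of this example.

```python
import math

def log_mgf_fair_coin(t):
    """log E[exp(t X)] for X equal to 0 or 1 with probability 1/2 each."""
    return math.log((1 + math.exp(t)) / 2)

def maximize_concave(h, lo, hi, iterations=200):
    """Ternary search for the maximum value of a concave function h on [lo, hi]."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if h(m1) < h(m2):
            lo = m1
        else:
            hi = m2
    return h((lo + hi) / 2)

def rate_function(x):
    # Legendre transform of the log-MGF, evaluated at the point x
    return maximize_concave(lambda t: t * x - log_mgf_fair_coin(t), -40.0, 40.0)

def rate_closed_form(x):
    return x * math.log(x) + (1 - x) * math.log(1 - x) + math.log(2)

for x in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(x, rate_function(x), rate_closed_form(x))
```

The rate vanishes at the mean $x = 1/2$ and grows toward the extremes, which is what makes it useful for bounding tail probabilities of sample averages.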


Microeconomics

The Legendre transformation arises naturally in microeconomics in the process of finding the supply $S(P)$ of some product given a fixed price $P$ on the market, knowing the cost function $C(Q)$, i.e. the cost for the producer to make (or mine, etc.) $Q$ units of the given product.

A simple theory explains the shape of the supply curve based solely on the cost function. Let us suppose the market price for one unit of our product is $P$. For a company selling this good, the best strategy is to adjust the production $Q$ so that its profit is maximized. We can maximize the profit

$$\text{profit} = \text{revenue} - \text{costs} = PQ - C(Q)$$

by differentiating with respect to $Q$ and solving

$$P - C'(Q_\text{opt}) = 0.$$

$Q_\text{opt}$ represents the optimal quantity $Q$ of goods that the producer is willing to supply, which is indeed the supply itself:

$$S(P) = Q_\text{opt}(P) = (C')^{-1}(P).$$

If we consider the maximal profit as a function of price, $\text{profit}_\text{max}(P)$, we see that it is the Legendre transform of the cost function $C(Q)$.
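A minimal sketch of this supply calculation follows, assuming a hypothetical cost function $C(Q) = \tfrac{1}{2}Q^2 + 2Q$ and non-negative production. The first-order condition $P - C'(Q) = 0$ then gives $Q = P - 2$ when $P > 2$ and zero supply otherwise; the result is compared with a brute-force maximum of $PQ - C(Q)$.

```python
def cost(q):
    return 0.5 * q * q + 2.0 * q  # hypothetical convex cost function

def supply(price):
    # Solve P - C'(Q) = P - (Q + 2) = 0, keeping production non-negative.
    return max(price - 2.0, 0.0)

def max_profit(price):
    q = supply(price)
    return price * q - cost(q)

def max_profit_grid(price, q_max=20.0, n=20000):
    """Brute-force maximum of P*Q - C(Q) over sampled quantities in [0, q_max]."""
    return max(price * q - cost(q) for q in (q_max * k / n for k in range(n + 1)))

prices = [1.0, 2.0, 3.0, 5.0, 10.0]
max_gap = max(abs(max_profit(p) - max_profit_grid(p)) for p in prices)
print([supply(p) for p in prices], max_gap)
```

Below the threshold price of 2 in this toy model, every unit costs more than it earns, so the optimal supply and the maximal profit are both zero.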

Geometric interpretation

For a strictly convex function, the Legendre transformation can be interpreted as a mapping between the graph of the function and the family of tangents of the graph. (For a function of one variable, the tangents are well-defined at all but at most countably many points, since a convex function is differentiable at all but at most countably many points.)

The equation of a line with slope $p$ and $y$-intercept $b$ is given by $y = px + b$. For this line to be tangent to the graph of a function $f$ at the point $(x_0, f(x_0))$ requires

$$f(x_0) = p x_0 + b$$

and

$$p = f'(x_0).$$

Being the derivative of a strictly convex function, the function $f'$ is strictly monotone and thus injective. The second equation can be solved for $x_0 = (f')^{-1}(p)$, allowing elimination of $x_0$ from the first, and solving for the $y$-intercept $b$ of the tangent as a function of its slope $p$:

$$b = f\bigl((f')^{-1}(p)\bigr) - p \cdot (f')^{-1}(p) = -f^*(p),$$

where $f^*$ denotes the Legendre transform of $f$.

The family of tangent lines of the graph of $f$ parameterized by the slope $p$ is therefore given by

$$y = px - f^*(p),$$

or, written implicitly, by the solutions of the equation

$$F(x, y, p) = y + f^*(p) - px = 0.$$

The graph of the original function can be reconstructed from this family of lines as the envelope of this family by demanding

$$\frac{\partial F(x, y, p)}{\partial p} = (f^*)'(p) - x = 0.$$

Eliminating $p$ from these two equations gives

$$y = x \cdot \bigl((f^*)'\bigr)^{-1}(x) - f^*\Bigl(\bigl((f^*)'\bigr)^{-1}(x)\Bigr).$$

Identifying $y$ with $f(x)$ and recognizing the right side of the preceding equation as the Legendre transform of $f^*$ yields

$$f(x) = f^{**}(x).$$


Legendre transformation in more than one dimension

For a differentiable real-valued function $f$ on an open subset $U$ of $\mathbb{R}^n$ the Legendre conjugate of the pair $(U, f)$ is defined to be the pair $(V, g)$, where $V$ is the image of $U$ under the gradient mapping $Df$, and $g$ is the function on $V$ given by the formula

$$g(y) = \langle y, x \rangle - f(x), \qquad x = (Df)^{-1}(y),$$

where

$$\langle u, v \rangle = \sum_{k=1}^{n} u_k v_k$$

is the scalar product on $\mathbb{R}^n$. The multidimensional transform can be interpreted as an encoding of the convex hull of the function's epigraph in terms of its supporting hyperplanes.[1]

Alternatively, if $X$ is a vector space and $Y$ is its dual vector space, then for each point $x$ of $X$ and $y$ of $Y$, there is a natural identification of the cotangent spaces $T^*X_x$ with $Y$ and $T^*Y_y$ with $X$. If $f$ is a real differentiable function over $X$, then its exterior derivative, $df$, is a section of the cotangent bundle $T^*X$ and as such, we can construct a map from $X$ to $Y$. Similarly, if $g$ is a real differentiable function over $Y$, then $dg$ defines a map from $Y$ to $X$. If both maps happen to be inverses of each other, we say we have a Legendre transform. The notion of the tautological one-form is commonly used in this setting.

When the function is not differentiable, the Legendre transform can still be extended, and is known as the Legendre-Fenchel transformation. In this more general setting, a few properties are lost: for example, the Legendre transform is no longer its own inverse (unless there are extra assumptions, like convexity).

Legendre transformation on manifolds

Let $M$ be a smooth manifold, and let $TM$ denote its tangent bundle. Let $L : TM \to \mathbb{R}$ be a smooth function, which we will refer to as the Lagrangian. The Legendre transformation of $L$ is a morphism of vector bundles $\mathbf{F}L : TM \to T^*M$ defined as follows. Suppose that $n = \dim M$ and that $U \subseteq M$ is a chart. Then $U \times \mathbb{R}^n$ is a chart on $TM$, and for any point $(x, v)$ in this chart, the Legendre transformation of $L$ is defined by

$$\mathbf{F}L(x, v) = \left(x, \frac{\partial L}{\partial v_1}(x, v), \ldots, \frac{\partial L}{\partial v_n}(x, v)\right).$$

The associated energy function is the function $E : TM \to \mathbb{R}$ defined by

$$E(v) = \langle \mathbf{F}L(v), v \rangle - L(v),$$

where the angle brackets denote the natural pairing of a tangent and cotangent vector. The Legendre transform can be further generalized to a function from a vector bundle over $M$ to its dual bundle.[2]

Further properties

Scaling properties

The Legendre transformation has the following scaling properties: For $a > 0$,

$$f(x) = a \cdot g(x) \;\Rightarrow\; f^*(p) = a \cdot g^*\!\left(\frac{p}{a}\right),$$

$$f(x) = g(a \cdot x) \;\Rightarrow\; f^*(p) = g^*\!\left(\frac{p}{a}\right).$$

It follows that if a function is homogeneous of degree $r$ then its image under the Legendre transformation is a homogeneous function of degree $s$, where $1/r + 1/s = 1$. (Since $f(x) = x^r/r$, with $r > 1$, implies $f^*(p) = p^s/s$.) Thus, the only monomial whose degree is invariant under Legendre transform is the quadratic.

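Behavior under translation

$$f(x) = g(x) + b \;\Rightarrow\; f^*(p) = g^*(p) - b,$$

$$f(x) = g(x + y) \;\Rightarrow\; f^*(p) = g^*(p) - p \cdot y.$$

Behavior under inversion

$$f(x) = g^{-1}(x) \;\Rightarrow\; f^*(p) = -p \cdot g^*\!\left(\frac{1}{p}\right).$$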

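Behavior under linear transformations

Let $A : \mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation. For any convex function $f$ on $\mathbb{R}^n$, one has

$$(Af)^* = f^* A^*,$$

where $A^*$ is the adjoint operator of $A$ defined by

$$\langle Ax, y^* \rangle = \langle x, A^* y^* \rangle,$$

and $Af$ is the push-forward of $f$ along $A$

$$(Af)(y) = \inf\{ f(x) : x \in \mathbb{R}^n,\ Ax = y \}.$$

A closed convex function $f$ is symmetric with respect to a given set $G$ of orthogonal linear transformations,

$$f(Ax) = f(x), \quad \forall x,\ \forall A \in G,$$

if and only if $f^*$ is symmetric with respect to $G$.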

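Infimal convolution

The infimal convolution of two functions $f$ and $g$ is defined as

$$(f \star_{\inf} g)(x) = \inf\{ f(x - y) + g(y) : y \in \mathbb{R}^n \}.$$

Let $f_1, \ldots, f_m$ be proper convex functions on $\mathbb{R}^n$. Then

$$(f_1 \star_{\inf} \cdots \star_{\inf} f_m)^* = f_1^* + \cdots + f_m^*.$$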


Fenchel's inequality

For any function $f$ and its convex conjugate $f^*$, Fenchel's inequality (also known as the Fenchel–Young inequality) holds for every $x \in X$ and $p \in X^*$, i.e., independent $x$, $p$ pairs,

$$\langle p, x \rangle \le f(x) + f^*(p).$$
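The inequality $px \le f(x) + f^*(p)$ can be spot-checked for the pair $f(x) = e^x$, $f^*(p) = p(\ln p - 1)$, with equality expected when $p = f'(x) = e^x$. The sampling ranges and the random seed in this sketch are arbitrary choices made for illustration.

```python
import math
import random

def f(x):
    return math.exp(x)

def f_star(p):
    return p * (math.log(p) - 1)  # conjugate of exp, valid for p > 0

# Independent (x, p) pairs: the gap f(x) + f*(p) - p*x should never be negative.
random.seed(1)
samples = [(random.uniform(-3, 3), random.uniform(0.01, 20)) for _ in range(10000)]
smallest_gap = min(f(x) + f_star(p) - p * x for x, p in samples)

# Equality case: p equal to the slope of f at x.
x0 = 1.0
gap_at_equality = f(x0) + f_star(math.exp(x0)) - math.exp(x0) * x0
print(smallest_gap, gap_at_equality)
```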




  1. ^ "Archived copy". Archived from the original on 2015-03-12. Retrieved 2011-01-26.
  2. ^ Marsden, Jerrod E., and Ratiu, Tudor, Introduction to Mechanics and Symmetry: A Basic Exposition of Classical Mechanical Systems, Springer-Verlag, 1999, ISBN 978-0-387-98643-2, doi 10.1007/978-0-387-21792-5.
