Legendre transformation, its geometric intuition, and applications in machine learning.
Legendre conjugate
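For a strictly convex function $f: \mathbb{R}^n \to \mathbb{R}$ with $\nabla^2 f(x) \succ 0$, its Legendre conjugate is defined as
$$f^*(p) = \max_{x} \; \langle p, x\rangle - f(x).$$
As $\langle p, x\rangle - f(x)$ is a strictly concave function of $x$ (linear function + concave function), it has a unique point $x^*$ that attains the maximum value, characterized by the first-order condition
$$\nabla f(x^*) = p.$$
Hence, we can read the conjugate as answering the question: which point $x^*$ has a gradient equal to the given $p$?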
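The definition can be checked numerically. The following is a minimal sketch, assuming the illustrative choice $f(x) = e^x$ (not taken from the text), for which the maximizer is $x^* = \log p$ and the conjugate has the closed form $p \log p - p$. The helper name `conjugate` and the search interval are hypothetical choices made for this example.

```python
import math

def conjugate(f, p, lo=-20.0, hi=20.0, iters=200):
    """Approximate f*(p) = max_x p*x - f(x) by ternary search on [lo, hi].

    Assumes f is strictly convex, so g(x) = p*x - f(x) is strictly concave,
    and that its single maximizer lies inside [lo, hi].
    """
    g = lambda x: p * x - f(x)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if g(m1) < g(m2):
            lo = m1  # the maximizer cannot lie to the left of m1
        else:
            hi = m2  # the maximizer cannot lie to the right of m2
    x_star = 0.5 * (lo + hi)
    return g(x_star), x_star

f = math.exp                      # illustrative strictly convex f, f'(x) = exp(x)
p = 2.0
value, x_star = conjugate(f, p)
closed_form = p * math.log(p) - p  # known conjugate of exp for p > 0

print(value, closed_form)          # the two numbers should agree closely
print(math.exp(x_star), p)         # gradient at the maximizer matches p
```

The second print illustrates the reading given above: the maximizer is the point whose gradient equals the given slope.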
Geometric explanation
Let us approach the Legendre conjugate from a geometric perspective to gain more intuition.
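Recall that the epigraph of a convex function $f$ is
$$\operatorname{epi}(f) = \{(x, y) : y \ge f(x)\}.$$
How can we describe its boundary $\partial \operatorname{epi}(f)$?
Of course, given an $x$ we have the point $(x, f(x))$, which represents $\partial \operatorname{epi}(f)$ from the $x$-coordinate.
Let us consider the tangent hyperplane passing through a fixed point $(x_0, f(x_0))$:
$$y = f(x_0) + \langle \nabla f(x_0), x - x_0\rangle.$$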
The key point here is that there is only one point of $\partial \operatorname{epi}(f)$ that admits a tangent hyperplane with slope $p$, since $\nabla f(x)$ is a monotonically increasing function w.r.t. $x$ ($\nabla^2 f(x) \succ 0$). Therefore, we may also represent the boundary with the parameter $p$, that is, represent $\partial \operatorname{epi}(f)$ from the $p$-coordinate. In summary, we can identify a point $(x, f(x))$ either by its $x$-coordinate or by its $p$-coordinate ($p = \nabla f(x)$). Namely, we have exhibited a dual coordinate system.
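How can we write the boundary in terms of the $p$-coordinate? Note that any hyperplane with a given slope $p$ passing through a point of $\partial \operatorname{epi}(f)$ meets the $y$-axis at a unique point. These lines admit one equation:
$$y = \langle p, x - x_0\rangle + f(x_0),$$
where $(x_0, f(x_0))$ is the intersection point between the hyperplane and $\partial \operatorname{epi}(f)$ (there may be multiple intersection points, and any of them is valid). Setting $x = 0$ gives the $y$-intersection value $f(x_0) - \langle p, x_0\rangle$. Among these parallel lines, the tangent hyperplane $y = \langle p, x - x^*\rangle + f(x^*)$ with $\nabla f(x^*) = p$ is the unique one that attains the minimized $y$-intersection value
$$f(x^*) - \langle p, x^*\rangle = \min_{x_0} \; f(x_0) - \langle p, x_0\rangle.$$
Remark: This is because $f(x_0) - \langle p, x_0\rangle$ is a convex function of $x_0$ and $\nabla f(x^*) - p = 0$, which concludes
$$x^* = \arg\min_{x_0} \; f(x_0) - \langle p, x_0\rangle.$$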
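We can also imagine this visually. The tangent hyperplane $y = \langle p, x - x^*\rangle + f(x^*)$, where $\nabla f(x^*) = p$, has only one intersection with $\partial \operatorname{epi}(f)$, whereas the other lines with $x_0 \neq x^*$ all have multiple intersections with $\partial \operatorname{epi}(f)$; we can move any of the latter lines to get the former by decreasing the $y$-axis intersection.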
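The picture of sliding parallel lines can be checked with a few lines of code. This is a minimal sketch, assuming the illustrative function $f(x) = x^2$ (not from the text): every line with slope $p$ through a boundary point $(x_0, f(x_0))$ has $y$-intercept $f(x_0) - p x_0$, and the smallest intercept should occur at the tangent point $x^* = p/2$, where it equals $-p^2/4$. The names `intercept` and `grid` are hypothetical.

```python
def f(x):
    return x * x  # illustrative strictly convex function, f'(x) = 2x

def intercept(x0, p):
    """y-axis intercept of the line with slope p through (x0, f(x0))."""
    return f(x0) - p * x0

p = 3.0
grid = [i / 1000.0 for i in range(-5000, 5001)]  # candidate values of x0

# Scan all parallel lines and keep the one with the lowest intercept.
best_x0 = min(grid, key=lambda x0: intercept(x0, p))
min_intercept = intercept(best_x0, p)

x_star = p / 2.0  # solves f'(x*) = p for this particular f
print(best_x0, x_star)             # the lowest line touches at the tangent point
print(min_intercept, -p * p / 4.0)  # and its intercept equals -p^2 / 4
```

For this $f$, the minimized intercept $-p^2/4$ is the negative of the conjugate value $p^2/4$.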
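The tangent hyperplane equation at the point $(x^*, f(x^*))$ with $\nabla f(x^*) = p$ can also be written as
$$\big\langle (p, -1), \; (x, y) - (x^*, f(x^*)) \big\rangle = 0.$$
This form reveals that $(x, y)$ is any point in the hyperplane and $(p, -1)$ is the normal vector perpendicular to the hyperplane at the point $(x^*, f(x^*))$.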
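So we can represent the boundary point $(x^*, f(x^*))$ with $p$ as
$$x^*(p) = \arg\min_{x} \; f(x) - \langle p, x\rangle.$$
In other words, we are finding the point $x^*$ whose gradient equals the given $p$, which is equivalent to solving
$$\max_{x} \; \langle p, x\rangle - f(x).$$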
Remark: $\langle p, x_0\rangle$ describes the change in value of the hyperplane when $x$ goes from $0$ to $x_0$. Given the slope $p$, the hyperplane passing through the point $(x^*, f(x^*))$ with $\nabla f(x^*) = p$ achieves the minimum value on the $y$-axis.
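Interpreting the Legendre transformation in one sentence:
The tangent hyperplane $y = \langle p, x\rangle - f^*(p)$ is a linear function of $x$ with slope $p$, and thus the conjugate amounts to
$$-f^*(p) = \min_{x_0} \; f(x_0) - \langle p, x_0\rangle,$$
which is the minimized $y$-intersection value that a hyperplane
$$y = \langle p, x - x_0\rangle + f(x_0)$$
could attain.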
Properties
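- $f^*$ is a convex function of $p$, since it is the pointwise maximum over $x$ of the functions $\langle p, x\rangle - f(x)$, each of which is linear in $p$.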
- $\nabla f^*(p) = x^*$, where $x^*$ is the maximizer in the definition of $f^*(p)$.
- $\nabla f^*(\nabla f(x)) = x$, or $\nabla f^* = (\nabla f)^{-1}$ if $\nabla f$ is invertible.
- $f^{**} = f$.
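Three standard facts about the conjugate of a strictly convex, differentiable function can be checked numerically: $f^*$ is convex, $\nabla f^*$ inverts $\nabla f$, and conjugating twice returns $f$. The sketch below assumes the illustrative choice $f(x) = e^x$ with the known closed form $f^*(p) = p \log p - p$ for $p > 0$; the function names and the search interval for $p$ are hypothetical.

```python
import math

def f(x):
    return math.exp(x)            # illustrative strictly convex function

def grad_f(x):
    return math.exp(x)

def f_conj(p):
    return p * math.log(p) - p    # closed-form conjugate of exp, valid for p > 0

def grad_f_conj(p):
    return math.log(p)

def maximize_concave(g, lo, hi, iters=300):
    """Maximum value of a concave g on [lo, hi], found by ternary search."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if g(m1) < g(m2):
            lo = m1
        else:
            hi = m2
    return g(0.5 * (lo + hi))

def biconjugate(x, p_lo=1e-9, p_hi=1e3):
    """f**(x) = max_p x*p - f*(p), assuming the maximizer lies in [p_lo, p_hi]."""
    return maximize_concave(lambda p: x * p - f_conj(p), p_lo, p_hi)

points = [-1.0, 0.0, 0.5, 2.0]
for x in points:
    # Gradient of the conjugate undoes the gradient of f; f** matches f.
    print(x, grad_f_conj(grad_f(x)), f(x), biconjugate(x))
```

Each printed row should show the input recovered in the second column and matching values in the last two columns.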
Applications
Smoothed max operator.
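As one common instance (an assumption here, since the regularizer is not specified above), take the conjugate of the scaled negative entropy $\gamma \sum_i q_i \log q_i$ restricted to the probability simplex. The resulting smoothed max is the log-sum-exp function, $\gamma \log \sum_i \exp(x_i / \gamma)$, and the maximizing $q$ is the softmax of $x / \gamma$. The sketch below checks this identity numerically; the function names and the parameter `gamma` are hypothetical.

```python
import math

def smoothed_max(x, gamma=1.0):
    """gamma * log(sum_i exp(x_i / gamma)), computed in a numerically stable way."""
    m = max(x)
    return m + gamma * math.log(sum(math.exp((xi - m) / gamma) for xi in x))

def softmax(x, gamma=1.0):
    """Candidate maximizer q* on the probability simplex."""
    m = max(x)
    w = [math.exp((xi - m) / gamma) for xi in x]
    s = sum(w)
    return [wi / s for wi in w]

def objective(q, x, gamma=1.0):
    """<q, x> - gamma * sum_i q_i log q_i, the quantity maximized over the simplex."""
    linear = sum(qi * xi for qi, xi in zip(q, x))
    entropy_term = sum(qi * math.log(qi) for qi in q if qi > 0.0)
    return linear - gamma * entropy_term

x = [1.0, 2.0, 3.0]
q_star = softmax(x)
uniform = [1.0 / len(x)] * len(x)

print(objective(q_star, x), smoothed_max(x))   # equal at the maximizer
print(objective(uniform, x))                   # any other q gives a smaller value
print(smoothed_max(x, gamma=0.01), max(x))     # small gamma approaches the hard max
```

As `gamma` shrinks, the smoothed value approaches the hard max from above, staying within `gamma * log(len(x))` of it.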