Change of variables and necessary conditions for optimality

12 Aug 2017

In this post, we consider how removing or adding degrees of freedom through a change of variables affects the search for a minimum of a function.

Removing degrees of freedom

Consider a function $f(x,y) = xy$. The necessary condition for optimality says that its differential should be equal to zero,

$df = y dx + x dy \equiv 0 \quad \Leftrightarrow \quad (x, y) = (0, 0).$

However, we can introduce a new variable $z = xy$ and consider a function $g(z) = z$, which obviously has no critical point. What happened here is that we lost one degree of freedom. Initially, we could change $x$ and $y$ independently, but after introducing $z$, we can only change their product $xy$.

Sometimes, though, we don’t need to use all degrees of freedom to find a critical point of a function. For example, if $f(x,y) = (xy)^2$ and $g(z) = z^2$, then

$dg = 2zdz \equiv 0 \quad \Leftrightarrow \quad z = 0$

describes the same set of solutions as the system of equations $(xy^2, x^2y) = (0,0)$ obtained from the partial derivatives of $f$. So, we incur no loss of information by removing some degrees of freedom in this case.
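The equivalence of the two solution sets can be checked by brute force. The following is a minimal sketch, not part of the original argument: the helper names `grad_f` and `dg_dz` and the small integer grid are assumptions chosen for illustration, and integer arithmetic is used so that comparisons with zero are exact.

```python
# Illustrative check (hypothetical helper names): the gradient of
# f(x, y) = (x*y)**2 vanishes exactly where z = x*y solves dg/dz = 0.

def grad_f(x, y):
    """Analytic partial derivatives of f(x, y) = (x*y)**2."""
    return (2 * x * y**2, 2 * x**2 * y)

def dg_dz(z):
    """Derivative of g(z) = z**2."""
    return 2 * z

# Small integer grid, so every comparison with zero is exact.
grid = range(-3, 4)

# Points where both partial derivatives of f vanish.
critical_f = {(x, y) for x in grid for y in grid if grad_f(x, y) == (0, 0)}

# Points whose image z = x*y is a critical point of g.
critical_g = {(x, y) for x in grid for y in grid if dg_dz(x * y) == 0}

# On this grid both sets consist of the points on the two coordinate axes.
print(critical_f == critical_g)
print(sorted(critical_f))
```

A grid check is of course no proof; it only illustrates that both conditions single out the set $xy = 0$.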

One should be careful, nevertheless, to ensure that the solution $z^*$ of the equation $g_z = 0$ lies in the range of the function $z$. For example, if $f(x) = (\frac{1}{x})^2$, then we could write $g(z) = z^2$ with $z = \frac{1}{x}$. Since $dg = 2z \, dz$, we might hastily declare $z^* = 0$ a critical point of $f$, even though it lies outside the image of $z = \frac{1}{x}$.
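To see the pitfall explicitly, one can expand $dz$ in terms of $dx$ by the chain rule:

$df = 2z \, dz = 2z \cdot \left( -\frac{1}{x^2} \right) dx = -\frac{2}{x^3} \, dx.$

This expression does not vanish for any finite $x$, so $f$ has no critical point; the value $z^* = 0$ is approached only in the limit $|x| \to \infty$.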

To summarize,

removing degrees of freedom can help in finding critical points;
not all critical points are guaranteed to be found in this way;
found points may be unreachable through original variables.

Adding degrees of freedom

Adding degrees of freedom can only hurt. Consider the function

$f(z) = z^2 - (2z - 1),$

which has a minimum at $z = 1$. Assuming we study its restriction to $z \geq 0.5$, we can introduce a function $g$ of two variables

$g(x(z), y(z)) = x^2 - y^2$

with $x^2 = z^2$ and $y^2 = 2z - 1$. Equating its differential to zero,

$dg = \frac{\partial g}{\partial x} dx + \frac{\partial g}{\partial y} dy = 2x dx - 2y dy \equiv 0 \quad \Leftrightarrow \quad (x, y) = (0, 0),$

leads to a contradiction, since it implies $z = 0$ and $z = 0.5$ at the same time. We run into this trouble because treating $x$ and $y$ as independent gives more freedom than the single variable $z$ allows. Expanding $dx$ and $dy$ in the differential, we see that

$dg = \left( \frac{\partial g}{\partial x}\frac{\partial x}{\partial z} + \frac{\partial g}{\partial y}\frac{\partial y}{\partial z} \right) dz$

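and even though the equation $g_y = -2y = 0$ suggests setting $y = 0$, multiplying $g_y = -2y$ by $y_z = \frac{1}{y}$ gives the constant $-2$, which does not vanish. Likewise, $g_x x_z = 2z$, so $dg = (2z - 2) \, dz$, which is zero exactly at the true minimum $z = 1$.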
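The difference between the partial derivatives of $g$ and its derivative along the curve $(x(z), y(z))$ can also be seen numerically. This is a minimal sketch under stated assumptions: the post only fixes $x^2 = z^2$ and $y^2 = 2z - 1$, so the non-negative roots $x = z$ and $y = \sqrt{2z - 1}$ are chosen here, and all function names are hypothetical.

```python
import math

def x_of(z):
    # Assumption: take the non-negative root of x**2 = z**2 on z >= 0.5.
    return z

def y_of(z):
    # Assumption: take the non-negative root of y**2 = 2z - 1.
    return math.sqrt(2 * z - 1)

def partials_g(x, y):
    """Partial derivatives of g(x, y) = x**2 - y**2."""
    return (2 * x, -2 * y)

def total_derivative(z):
    """dg/dz = g_x * x_z + g_y * y_z, valid for z > 0.5 where y != 0."""
    x, y = x_of(z), y_of(z)
    g_x, g_y = partials_g(x, y)
    x_z = 1.0      # derivative of x = z
    y_z = 1.0 / y  # from differentiating y**2 = 2z - 1
    return g_x * x_z + g_y * y_z

z_star = 1.0  # minimizer of f(z) = z**2 - (2z - 1)

# The partial derivatives of g are nonzero at the minimizer of f ...
print(partials_g(x_of(z_star), y_of(z_star)))
# ... while the derivative along the curve vanishes there.
print(total_derivative(z_star))
```

At the minimizer of $f$ the gradient of $g$ is nonzero, so looking for points where both partial derivatives of $g$ vanish would miss it; only the combination weighted by $x_z$ and $y_z$ carries information about $f$.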

Introducing extra variables just obscures the problem. Most importantly, the critical points of $g$, taken as a function of independent $x$ and $y$, tell us nothing about the critical points of $f$. Therefore, adding degrees of freedom on its own is not helpful for optimization.

Boris Belousov
