# All smooth $f$-divergences are locally the same

22 Jul 2017

For discrete distributions $p$ and $q$, the $f$-divergence is defined as

$$
D_f(p \,\|\, q) = \sum_i q_i \, f\!\left(\frac{p_i}{q_i}\right),
$$

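where $f: (0, \infty) \to \mathbb{R}$ is a convex function satisfying $f(1) = 0$. If $p$ is a variation of $q$, that is, $p = q + dq$ with $\sum_i dq_i = 0$, then

$$
D_f(q + dq \,\|\, q) = \sum_i q_i \, f\!\left(1 + \frac{dq_i}{q_i}\right).
$$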

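Provided $f$ is **twice differentiable**, we can expand it in a Taylor series around $1$, using $f(1) = 0$,

$$
f(1 + x) = f^{\prime}(1)\, x + \frac{1}{2} f^{\prime\prime}(1)\, x^2 + o(x^2),
$$

and thus approximate the $f$-divergence by a quadratic function of $dq$. The linear term drops out because $\sum_i dq_i = 0$, leaving

$$
D_f(q + dq \,\|\, q) \approx \frac{f^{\prime\prime}(1)}{2} \sum_i \frac{dq_i^2}{q_i}.
$$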

Comparing it with the Fisher metric, we see that it is the same quadratic form scaled by a constant factor $f^{\prime\prime}(1)$.
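As a quick numerical sanity check, the sketch below compares a few smooth $f$-divergences with the quadratic form $\frac{f^{\prime\prime}(1)}{2} \sum_i dq_i^2 / q_i$ for a small perturbation. It assumes the usual definition $D_f(p \,\|\, q) = \sum_i q_i f(p_i / q_i)$; the distribution `q`, the perturbation `direction`, the step `eps`, and the helper names are illustrative choices, not part of the derivation.

```python
import math

def f_divergence(f, p, q):
    """D_f(p || q) = sum_i q_i * f(p_i / q_i), for strictly positive p and q."""
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))

def half_fisher_form(dq, q):
    """(1/2) * sum_i dq_i^2 / q_i."""
    return 0.5 * sum(d * d / qi for d, qi in zip(dq, q))

# Each entry: (generator f with f(1) = 0, value of f''(1)).
GENERATORS = {
    "KL": (lambda x: x * math.log(x), 1.0),
    "reverse KL": (lambda x: -math.log(x), 1.0),
    "squared Hellinger": (lambda x: (math.sqrt(x) - 1.0) ** 2, 0.5),
}

q = [0.5, 0.3, 0.2]              # illustrative reference distribution
direction = [1.0, -2.0, 1.0]     # sums to zero, so p stays normalized
eps = 1e-3                       # illustrative small step
dq = [eps * d for d in direction]
p = [qi + di for qi, di in zip(q, dq)]

ratios = {}
for name, (f, f2_at_1) in GENERATORS.items():
    exact = f_divergence(f, p, q)
    approx = f2_at_1 * half_fisher_form(dq, q)
    ratios[name] = exact / approx
    print(f"{name:>18}: exact={exact:.6e}  approx={approx:.6e}  ratio={ratios[name]:.5f}")
```

For a step this small, each ratio should be close to 1, and it should approach 1 further as `eps` shrinks.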

Note that **not all $f$-divergences are locally the same**,
only the smooth ones.
For example, the total variation distance corresponds to

$$
f(x) = \frac{1}{2} |x - 1|,
$$

which is not quadratic around $x = 1$.
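One way to see the difference numerically is to halve the size of a perturbation and watch how each divergence shrinks. The sketch below assumes the definition $D_f(p \,\|\, q) = \sum_i q_i f(p_i / q_i)$ with $f(x) = \frac{1}{2}|x - 1|$ for total variation and $f(x) = x \log x$ for KL; the distribution, the direction, and the helper names are illustrative.

```python
import math

def f_divergence(f, p, q):
    """D_f(p || q) = sum_i q_i * f(p_i / q_i), for strictly positive p and q."""
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))

def perturbed(q, direction, eps):
    """Return q + eps * direction (direction must sum to zero)."""
    return [qi + eps * di for qi, di in zip(q, direction)]

def tv(x):
    return 0.5 * abs(x - 1.0)      # total variation generator

def kl(x):
    return x * math.log(x)         # KL generator

q = [0.5, 0.3, 0.2]
direction = [1.0, -2.0, 1.0]
eps = 1e-3

# Ratio of the divergence at step eps to the divergence at step eps / 2.
tv_ratio = (f_divergence(tv, perturbed(q, direction, eps), q)
            / f_divergence(tv, perturbed(q, direction, eps / 2), q))
kl_ratio = (f_divergence(kl, perturbed(q, direction, eps), q)
            / f_divergence(kl, perturbed(q, direction, eps / 2), q))

print(f"TV ratio: {tv_ratio:.4f}")   # close to 2: linear in eps
print(f"KL ratio: {kl_ratio:.4f}")   # close to 4: quadratic in eps
```

Total variation scales linearly with the step, so no quadratic form can describe it near $q$, whereas KL scales quadratically.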

## Special case: $\alpha$-divergence

The $f$-divergence with $f$ having the form

$$
f_\alpha(x) = \frac{x^\alpha - \alpha x + \alpha - 1}{\alpha (\alpha - 1)}
$$

is known as the $\alpha$-divergence. Noting that $f_\alpha^{\prime\prime}(x) = x^{\alpha-2}$ and hence $f_\alpha^{\prime\prime}(1) = 1$, we obtain the approximation of the $\alpha$-divergence

$$
D_{f_\alpha}(q + dq \,\|\, q) \approx \frac{1}{2} \sum_i \frac{dq_i^2}{q_i},
$$

which directly generalizes the result of Kullback for the KL divergence and its reverse, corresponding to the limits $\alpha \to 1$ and $\alpha \to 0$ respectively.
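The claim can be probed numerically across several values of $\alpha$. The sketch below assumes one common parametrization, $f_\alpha(x) = \frac{x^\alpha - \alpha x + \alpha - 1}{\alpha(\alpha - 1)}$, with the limiting generators $x \log x - x + 1$ at $\alpha = 1$ and $-\log x + x - 1$ at $\alpha = 0$; all of these satisfy $f_\alpha^{\prime\prime}(x) = x^{\alpha - 2}$. The chosen `alphas`, distribution, and helper names are illustrative.

```python
import math

def f_alpha(alpha):
    """Generator of the alpha-divergence (assumed parametrization, see text)."""
    if alpha == 1:
        return lambda x: x * math.log(x) - x + 1.0      # limit alpha -> 1 (KL)
    if alpha == 0:
        return lambda x: -math.log(x) + x - 1.0         # limit alpha -> 0 (reverse KL)
    return lambda x: (x ** alpha - alpha * x + alpha - 1.0) / (alpha * (alpha - 1.0))

def f_divergence(f, p, q):
    """D_f(p || q) = sum_i q_i * f(p_i / q_i), for strictly positive p and q."""
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))

q = [0.5, 0.3, 0.2]
direction = [1.0, -2.0, 1.0]     # sums to zero
eps = 1e-3
dq = [eps * d for d in direction]
p = [qi + di for qi, di in zip(q, dq)]

# Common quadratic approximation: (1/2) * sum_i dq_i^2 / q_i.
quadratic = 0.5 * sum(d * d / qi for d, qi in zip(dq, q))

alphas = [-1.0, 0.0, 0.5, 1.0, 2.0, 3.0]
alpha_ratios = {a: f_divergence(f_alpha(a), p, q) / quadratic for a in alphas}

for a, r in alpha_ratios.items():
    print(f"alpha={a:+.1f}: D_alpha / quadratic = {r:.5f}")
```

Every ratio should sit near 1 regardless of $\alpha$, since $f_\alpha^{\prime\prime}(1) = 1$ for the whole family.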

## Local approximation is exact for Pearson $\chi^2$ divergence

The Pearson $\chi^2$ divergence is the $\alpha$-divergence with $\alpha = 2$, corresponding to the generating function

$$
f_2(x) = \frac{1}{2} (x - 1)^2.
$$

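We established the following quadratic approximation of the $f$-divergence

$$
D_f(q + dq \,\|\, q) \approx \frac{f^{\prime\prime}(1)}{2} \sum_i \frac{dq_i^2}{q_i}
$$

that is valid for small $dq$. However, if we allow large deviations $dq = p - q$, then the right-hand side becomes the Pearson $\chi^2$ divergence (scaled by $f^{\prime\prime}(1)$),

$$
\frac{f^{\prime\prime}(1)}{2} \sum_i \frac{(p_i - q_i)^2}{q_i} = f^{\prime\prime}(1) \, D_{f_2}(p \,\|\, q).
$$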

Thus, the Pearson $\chi^2$ divergence is the linear extension of the Fisher metric from a local neighborhood to the whole space. Consequently, the local quadratic approximation is exact for the Pearson $\chi^2$ divergence itself: $f_2$ is exactly quadratic and $f_2^{\prime\prime}(1) = 1$.
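A small numerical illustration of this exactness, assuming the definition $D_f(p \,\|\, q) = \sum_i q_i f(p_i / q_i)$ and the generator $f_2(x) = \frac{1}{2}(x - 1)^2$. The two distributions are arbitrary illustrative choices that are deliberately far apart; KL is included only for contrast.

```python
import math

def f_divergence(f, p, q):
    """D_f(p || q) = sum_i q_i * f(p_i / q_i), for strictly positive p and q."""
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))

q = [0.5, 0.3, 0.2]
p = [0.1, 0.6, 0.3]    # far from q: not a small perturbation

# Quadratic form with dq = p - q, i.e. (1/2) * sum_i (p_i - q_i)^2 / q_i.
quadratic = 0.5 * sum((pi - qi) ** 2 / qi for pi, qi in zip(p, q))

pearson = f_divergence(lambda x: 0.5 * (x - 1.0) ** 2, p, q)
kl = f_divergence(lambda x: x * math.log(x), p, q)

print(f"quadratic form: {quadratic:.6f}")
print(f"Pearson (f_2):  {pearson:.6f}")   # agrees with the quadratic form up to rounding
print(f"KL:             {kl:.6f}")        # visibly different at this distance
```

The Pearson value matches the quadratic form at any distance, while KL matches it only in the limit of small $dq$.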