# Many faces of entropy

16 Apr 2017

With this definition, the *KL divergence* $D(\pi \,\|\, \mu)$ from $\mu$ to $\pi$
equals minus the entropy of $\pi$ under $\mu$,

$$D(\pi \,\|\, \mu) = \int \log \frac{d\pi}{d\mu} \, d\pi.$$

The *differential entropy* $h(\pi)$ is the entropy of $\pi$ under the
Lebesgue measure $\lambda$,

$$h(\pi) = -\int \log \frac{d\pi}{d\lambda} \, d\pi = -\int \pi(x) \log \pi(x) \, dx.$$
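As a concrete check, the integral can be evaluated numerically; here is a minimal Python sketch (the Gaussian example, grid bounds, and step size are my own choices, not from the post) comparing it with the closed form $\tfrac{1}{2}\log(2\pi e \sigma^2)$ for $\mathcal{N}(0, \sigma^2)$:

```python
import numpy as np

# Differential entropy of N(0, sigma^2): approximate -∫ pi(x) log pi(x) dx
# by a Riemann sum on a fine grid, then compare with the closed form.
sigma = 1.5
xs = np.linspace(-12.0, 12.0, 200001)
dx = xs[1] - xs[0]
pdf = np.exp(-xs**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

h_numeric = -np.sum(pdf * np.log(pdf)) * dx
h_exact = 0.5 * np.log(2 * np.pi * np.e * sigma**2)

assert np.isclose(h_numeric, h_exact, atol=1e-6)
```

The grid is truncated at $\pm 12$, where the Gaussian tail contribution is far below the tolerance of the comparison.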

The *joint entropy* $h(X, Y)$ equals the entropy of the joint distribution
$\mu_{XY}$ of $X$ and $Y$ under the Lebesgue measure on the product space,

$$h(X, Y) = -\int \log \frac{d\mu_{XY}}{d\lambda} \, d\mu_{XY}.$$

The *mutual information* $I(X; Y)$ equals minus the entropy of the joint
distribution $\mu_{XY}$ under the product measure $\mu_X \times \mu_Y$,

$$I(X; Y) = \int \log \frac{d\mu_{XY}}{d(\mu_X \times \mu_Y)} \, d\mu_{XY} = D(\mu_{XY} \,\|\, \mu_X \times \mu_Y).$$

The *conditional entropy* $h(Y \mid X)$ equals the entropy of $\mu_{XY}$
under the product measure $\mu_X \times \lambda$,

$$h(Y \mid X) = -\int \log \frac{d\mu_{XY}}{d(\mu_X \times \lambda)} \, d\mu_{XY}.$$
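All of these quantities can be checked at once in the discrete analogue, where the Lebesgue measure is replaced by the counting measure and the integrals become sums over the joint pmf. A short Python sketch (the particular joint distribution is an arbitrary example of mine):

```python
import numpy as np

# Discrete analogue: counting measure in place of Lebesgue measure,
# so each integral becomes a sum over the joint pmf p(x, y).
p = np.array([[0.20, 0.10],
              [0.05, 0.65]])   # joint distribution mu_XY

px = p.sum(axis=1)             # marginal mu_X
py = p.sum(axis=0)             # marginal mu_Y
prod = np.outer(px, py)        # product measure mu_X x mu_Y

# Mutual information: minus the entropy of mu_XY under mu_X x mu_Y,
# i.e. the KL divergence D(mu_XY || mu_X x mu_Y).
I = np.sum(p * np.log(p / prod))

# Joint entropy: entropy of mu_XY under the counting measure.
H_joint = -np.sum(p * np.log(p))

# Conditional entropy: entropy of mu_XY under mu_X x (counting measure).
H_cond = -np.sum(p * np.log(p / px[:, None]))

# Consistency check: the standard identity I(X; Y) = H(Y) - H(Y|X).
H_y = -np.sum(py * np.log(py))
assert np.isclose(I, H_y - H_cond)
```

Every quantity above is one sum against $\mu_{XY}$ of a log density ratio; only the reference measure changes, which is the whole point of the unification.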

Such unification was in fact the motivation behind Kullback and Leibler's seminal 1951 paper on information and sufficiency.