Table of Contents
  1. Variational lower bounds of mutual information
    1.1. Energy-based bound (EB)
    1.2. MINE
    1.3. NWJ
    1.4. JS
    1.5. TCPC

Variational lower bounds of mutual information

On variational lower bounds of mutual information

In brief: a summary of current methods for estimating mutual information.

Energy-based bound (EB)

This method gives the lower bound (for any $a > 0$, which may also depend on $y$):

$$
I(x,y) \geq \mathbb E_{p(x,y)}[\log f(x,y)] - \mathbb E_{p(y)}[\frac{\mathbb E_{p(x)}[f(x,y)]}{a}+\log a -1]
$$

Derivation

For any variational distribution $q(x|y)$, the non-negativity of the KL divergence gives

$$
\mathbb E_{p(y)}[\mathrm{KL}(p(x|y)\,\|\,q(x|y))] \geq 0
$$

Writing the mutual information as

$$
I(x,y) = \mathbb E_{p(x,y)}\left[\log \frac{p(x|y)}{p(x)}\right] = \mathbb E_{p(x,y)}\left[\log \frac{q(x|y)}{p(x)}\right] + \mathbb E_{p(y)}[\mathrm{KL}(p(x|y)\,\|\,q(x|y))]
$$

and dropping the KL term yields

$$
I(x,y) \geq \mathbb E_{p(x,y)} [\log q(x|y) - \log p(x)]
$$

The "energy-based" choice enters here: take the variational distribution to be $q(x|y)=p(x)\frac{f(x,y)}{Z(y)}$ with partition function $Z(y)=\mathbb E_{p(x)}[f(x,y)]$, which gives

$$
I(x,y) \geq \mathbb E_{p(x,y)} [\log f(x,y)] - \mathbb E_{p(y)}[\log \mathbb E_{p(x)}[f(x,y)]]
$$

Next, use the tangent-line bound

$$
\log x \leq \frac{x}{a} + \log a - 1
$$

which holds for any $a > 0$, with equality when $x = a$.
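A quick numeric check of this inequality (the grid and the value of $a$ below are arbitrary choices):

```python
import numpy as np

a = 2.0
x = np.linspace(0.1, 10.0, 1000)
gap = (x / a + np.log(a) - 1.0) - np.log(x)
print(gap.min() >= -1e-12)   # True: the bound holds everywhere on the grid
print(x[np.argmin(gap)])     # the gap is smallest near x = a = 2.0
```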

Applying it to $\log \mathbb E_{p(x)}[f(x,y)]$ inside the previous bound gives

$$
I(x,y) \geq I_{EB}(x,y) := \mathbb E_{p(x,y)} [\log f(x,y)] - \mathbb E_{p(y)}\left[\frac{\mathbb E_{p(x)}[f(x,y)]}{a}+\log a -1\right]
$$
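As an illustration, here is a minimal NumPy sketch (not code from the paper) that evaluates $I_{EB}$ by Monte Carlo on correlated Gaussians, where the true MI is known in closed form. The critic is assumed to be the optimal density ratio $f(x,y)=p(x,y)/(p(x)p(y))$ with $a=1$; in practice $f$ is a learned network.

```python
import numpy as np

rng = np.random.default_rng(0)
rho, n = 0.8, 100_000
true_mi = -0.5 * np.log(1 - rho**2)  # closed-form MI of a bivariate Gaussian

# Joint samples (x, y), plus independent x' samples to estimate E_{p(x)}[.]
x = rng.standard_normal(n)
y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(n)
x_marg = rng.standard_normal(n)

def log_f(x, y):
    # Assumed critic: the true log density ratio log p(x,y)/(p(x)p(y)).
    return (-0.5 * np.log(1 - rho**2)
            + (2 * rho * x * y - rho**2 * (x**2 + y**2)) / (2 * (1 - rho**2)))

a = 1.0  # free constant of the tangent bound; optimal here since Z(y) ~ 1
i_eb = (np.mean(log_f(x, y))
        - np.mean(np.exp(log_f(x_marg, y)) / a + np.log(a) - 1.0))
print(f"true MI = {true_mi:.3f}, I_EB estimate = {i_eb:.3f}")
```

With the optimal critic the two estimates should agree closely; with a learned critic the estimate can only fall below the true MI.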

MINE

Take $a=\mathbb E_{p(x)}[f(x,y)]$ (which makes the tangent bound tight) and write $T=\log f(x,y)$; this recovers

$$
I(x,y) \geq \mathbb E_{p(x,y)} [T] - \mathbb E_{p(y)}[\log \mathbb E_{p(x)}[\exp(T)]]
$$

The MINE estimator uses the closely related Donsker–Varadhan form, which moves the log outside the outer expectation; since $\mathbb E_{p(y)}[\log Z(y)] \leq \log \mathbb E_{p(y)}[Z(y)]$ by Jensen's inequality, this remains a valid (slightly looser) lower bound:

$$
I(x,y) \geq I_{MINE}(x,y) := \mathbb E_{p(x,y)} [T] - \log \mathbb E_{p(x)p(y)}[\exp(T)]
$$
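The same toy problem, now with the Donsker–Varadhan form; the critic $T$ is again assumed to be the true log ratio (in MINE proper, $T$ is a neural network, and the partition term is tracked with a moving average during training):

```python
import numpy as np

rng = np.random.default_rng(0)
rho, n = 0.8, 100_000
x = rng.standard_normal(n)
y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(n)
x_marg = rng.standard_normal(n)

def T(x, y):
    # Assumed critic: the true log density ratio (optimal for the DV bound).
    return (-0.5 * np.log(1 - rho**2)
            + (2 * rho * x * y - rho**2 * (x**2 + y**2)) / (2 * (1 - rho**2)))

# log E_{p(x)p(y)}[exp T], computed as a log-mean-exp for numerical stability
log_mean_exp = np.logaddexp.reduce(T(x_marg, y)) - np.log(n)
i_mine = np.mean(T(x, y)) - log_mean_exp
print(f"I_MINE estimate = {i_mine:.3f}")
```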

NWJ

Taking $a=e$ instead gives

$$
I(x,y) \geq I_{NWJ}(x,y) := \mathbb E_{p(x,y)} [\log f(x,y)] - \frac{1}{e}\mathbb E_{p(x)p(y)}[f(x,y)]
$$
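Same sketch for NWJ; note this bound is tight at $f = e \cdot p(x,y)/(p(x)p(y))$, so that critic is assumed here:

```python
import numpy as np

rng = np.random.default_rng(0)
rho, n = 0.8, 100_000
x = rng.standard_normal(n)
y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(n)
x_marg = rng.standard_normal(n)

def log_ratio(x, y):
    # True log density ratio log p(x,y)/(p(x)p(y)) for the Gaussian toy.
    return (-0.5 * np.log(1 - rho**2)
            + (2 * rho * x * y - rho**2 * (x**2 + y**2)) / (2 * (1 - rho**2)))

log_f = lambda x, y: 1.0 + log_ratio(x, y)  # f = e * ratio is optimal for NWJ
i_nwj = np.mean(log_f(x, y)) - np.mean(np.exp(log_f(x_marg, y))) / np.e
print(f"I_NWJ estimate = {i_nwj:.3f}")
```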

JS

Building on NWJ, parameterize the critic as $f(x,y)=\exp(V(x,y)+1)$:

$$
I(x,y) \geq I_{JS}(x,y) := 1 + \mathbb E_{p(x,y)} [V(x,y)] - \mathbb E_{p(x)p(y)}[\exp(V(x,y))]
$$
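Same sketch for the JS form, with $V$ assumed to be the true log ratio; in practice $V$ is a network trained with a GAN-style discriminator loss:

```python
import numpy as np

rng = np.random.default_rng(0)
rho, n = 0.8, 100_000
x = rng.standard_normal(n)
y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(n)
x_marg = rng.standard_normal(n)

def V(x, y):
    # Assumed critic: the true log density ratio, which makes the bound tight.
    return (-0.5 * np.log(1 - rho**2)
            + (2 * rho * x * y - rho**2 * (x**2 + y**2)) / (2 * (1 - rho**2)))

i_js = 1.0 + np.mean(V(x, y)) - np.mean(np.exp(V(x_marg, y)))
print(f"I_JS estimate = {i_js:.3f}")
```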

TCPC

This method takes the size $K$ of the minibatch into account and assumes the conditional $p(y|x)$ is known. The mutual information can be written as an average over minibatches $x_{1:K}$ drawn from the dataset $\mathcal D$:

$$
I(x,y) = \mathbb E_{x_{1:K}\sim \mathcal D}\left[\frac{1}{K} \sum_{i=1}^K \mathrm{KL}(p(y|x_i)\,\|\,p(y))\right]
$$

The marginal $p(y)$ is estimated by the in-batch mixture:

$$
m(y) = \frac{1}{K} \sum_{i=1}^K p(y|x_i)
$$

Because $m(y)$ is exactly the average of the batch conditionals, the in-batch average KL decomposes as

$$
\frac{1}{K} \sum_{i=1}^K \mathrm{KL}(p(y|x_i)\,\|\,p(y)) = \frac{1}{K} \sum_{i=1}^K \mathrm{KL}(p(y|x_i)\,\|\,m(y)) + \mathrm{KL}(m(y)\,\|\,p(y))
$$

Dropping the non-negative $\mathrm{KL}(m(y)\,\|\,p(y))$ term gives the lower bound

$$
I(x,y) \geq I_{TCPC}(x,y) := \mathbb E_{x_{1:K}\sim \mathcal D}\left[\frac{1}{K} \sum_{i=1}^K \mathrm{KL}(p(y|x_i)\,\|\,m(y))\right]
$$
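A minimal discrete sketch of the TCPC bound (all distributions below are invented for illustration): $p(y|x)$ is a known categorical table, so each KL term is exact, and a Monte Carlo average over minibatches shows $I_{TCPC}$ sitting below the true MI:

```python
import numpy as np

rng = np.random.default_rng(0)
n_x, n_y, K = 50, 8, 4
p_x = np.full(n_x, 1.0 / n_x)                        # uniform p(x)
p_y_given_x = rng.dirichlet(np.ones(n_y), size=n_x)  # rows: p(y|x_i)
p_y = p_x @ p_y_given_x                              # true marginal p(y)

def kl(p, q):
    return np.sum(p * (np.log(p) - np.log(q)), axis=-1)

true_mi = np.sum(p_x * kl(p_y_given_x, p_y))

# Monte Carlo over minibatches x_{1:K}: average in-batch KL to the mixture m(y)
est = []
for _ in range(20_000):
    idx = rng.integers(n_x, size=K)
    cond = p_y_given_x[idx]   # p(y|x_i) for the batch
    m = cond.mean(axis=0)     # mixture estimate of p(y)
    est.append(kl(cond, m).mean())
print(f"I = {true_mi:.3f}, I_TCPC ≈ {np.mean(est):.3f} (lower, as expected)")
```

As $K$ grows, $m(y)$ approaches $p(y)$ and the residual $\mathrm{KL}(m(y)\,\|\,p(y))$ shrinks, so the bound tightens.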
