Introduction

This section verifies the multiplicative property of expectation with code.

My question

Among the properties of mathematical expectation there is a multiplicative property for random variables:
when the random variables $\mathcal X,\mathcal Y$ are mutually independent,
$$\mathbb E(\mathcal X \cdot \mathcal Y) = \mathbb E(\mathcal X) \cdot \mathbb E(\mathcal Y)$$
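For reference, the property can be derived in a few lines for continuous variables. This is a sketch under the assumption that the densities $p_{\mathcal X}$, $p_{\mathcal Y}$ and the joint density $p_{\mathcal X,\mathcal Y}$ exist and that all expectations are finite; these symbols are introduced here and are not in the original post. Independence is used only in the second step, where the joint density factorizes.

```latex
\[
\begin{aligned}
\mathbb E(\mathcal X \cdot \mathcal Y)
  &= \iint x\,y\;p_{\mathcal X,\mathcal Y}(x,y)\,\mathrm dx\,\mathrm dy \\
  &= \iint x\,y\;p_{\mathcal X}(x)\,p_{\mathcal Y}(y)\,\mathrm dx\,\mathrm dy \\
  &= \left(\int x\,p_{\mathcal X}(x)\,\mathrm dx\right)
     \left(\int y\,p_{\mathcal Y}(y)\,\mathrm dy\right)
   = \mathbb E(\mathcal X)\cdot\mathbb E(\mathcal Y)
\end{aligned}
\]
```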

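My misconception: I had assumed that $\mathcal X,\mathcal Y$ must belong to different distributions, each sampled independently from its own distribution, for their expectations to satisfy the form above.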

Question: if the two random variables $\mathcal X,\mathcal Y$ are independent and identically distributed (IID), does the result above still hold?

Verification

Sample generation

We use two random variables $\mathcal X,\mathcal Y$ based on a one-dimensional Gaussian distribution for the verification. The probability density function of the Gaussian distribution is:
$$\mathcal F(x) = \frac{1}{\sigma \sqrt{2\pi}} \exp \left\{- \frac{(x - \mu)^2}{2 \sigma^2} \right\}$$
The corresponding code is:

import math
import random
import numpy as np
import matplotlib.pyplot as plt

def pdf(x,mu,sigma):
    return (1 / (sigma * math.sqrt(2 * math.pi))) * math.exp(-1 * (((x - mu) ** 2) / (2 * (sigma ** 2))))
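As a quick sanity check on the formula, the hand-written density can be compared with the standard library's `statistics.NormalDist` (available in Python 3.8+). This is only a sketch: the test points and the names `reference`, `max_diff` and `peak` are my own choices, and `pdf` is repeated so the snippet runs on its own.

```python
import math
from statistics import NormalDist

def pdf(x, mu, sigma):
    # Same formula as the pdf() above, repeated so this snippet is self-contained.
    return (1 / (sigma * math.sqrt(2 * math.pi))) * math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))

# Reference Gaussian from the standard library.
reference = NormalDist(mu=1.0, sigma=1.0)

# Largest absolute difference over a few arbitrary test points.
max_diff = max(abs(pdf(x, 1.0, 1.0) - reference.pdf(x)) for x in [-2.0, 0.0, 1.0, 2.5, 4.0])

# The density peaks at x = mu with height 1 / (sigma * sqrt(2 * pi)).
peak = pdf(1.0, 1.0, 1.0)
print(max_diff, peak)
```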

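As covered in the earlier post in the Nowcoder (牛客网) interview-question series on probability density functions, the value of a PDF only describes how likely the input is relative to other inputs; it is not itself a probability. Here the probability is approximated by numerical integration:
The integration interval is divided using $1000$ points (the left boundary is set to $-5$; the right boundary is the value whose probability is being computed).
Note: this is for experimental use only. The left boundary must not be set too high (it should stay several $\sigma$ below $\mu$), otherwise the probability mass to its left is dropped and the integration error becomes large.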

    def GetIntegral(x,mu,sigma,DivideNum=1000):

        dx = list()
        y = list()
        x = list(np.linspace(-5,x,DivideNum))
        for k in range(0,DivideNum - 1):
            y.append(pdf(x[k],mu,sigma))
            dx.append(x[k+1] - x[k])

        return np.sum(np.multiply(y,dx))
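To get a feel for how accurate this approximation is, the left-endpoint sum can be compared with the closed-form Gaussian CDF written with `math.erf`. This is a sketch under my own assumptions: `left_riemann_cdf` mirrors `GetIntegral` but is a renamed stand-alone copy, and `exact_cdf` and the test points are not from the original post.

```python
import math
import numpy as np

def pdf(x, mu, sigma):
    # Same Gaussian density as in the article.
    return (1 / (sigma * math.sqrt(2 * math.pi))) * math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))

def left_riemann_cdf(x, mu, sigma, lower=-5.0, divide_num=1000):
    # Left-endpoint rule on [lower, x], the same scheme GetIntegral uses.
    grid = np.linspace(lower, x, divide_num)
    heights = np.array([pdf(g, mu, sigma) for g in grid[:-1]])
    return float(np.sum(heights * np.diff(grid)))

def exact_cdf(x, mu, sigma):
    # Closed form of the Gaussian CDF via the error function.
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# Absolute error of the approximation at a few right boundaries (mu = 1, sigma = 1).
errors = {x: abs(left_riemann_cdf(x, 1.0, 1.0) - exact_cdf(x, 1.0, 1.0)) for x in (0.0, 1.0, 2.0, 4.0)}
for x, err in errors.items():
    print(x, err)
```

With these settings the error is dominated by the left-endpoint rule itself rather than by the cut-off at $-5$, since the mass below $-5$ is negligible when $\mu = 1$ and $\sigma = 1$.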

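Based on this probability, we draw a set of samples from a Gaussian distribution with mean $\mu$ and standard deviation $\sigma$:
Because the PDF is symmetric about $x = \mu$, the integral up to a candidate point is folded into an acceptance probability: if the integral is at least $0.5$, the acceptance probability is $2 \times (1 - \text{integral})$; if it is below $0.5$, it is $2 \times \text{integral}$. The acceptance probability is therefore largest (equal to $1$) at $x = \mu$ and falls off symmetrically on both sides.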

    def GetSample(mu,sigma,SampleNum=500):

        SampleList = list()
        count = 0

        while True:
            n = random.uniform(0,1)
            # Range from which candidate x values are drawn
            Samplex = random.uniform(-10,10)
            SampleIntegral = GetIntegral(Samplex,mu,sigma)
            if SampleIntegral >= 0.5:
                Prob = 2 * (1 - SampleIntegral)
            else:
                Prob = 2 * SampleIntegral
            if n < Prob:
                SampleList.append(Samplex)
                count += 1
            if count == SampleNum:
                break
        return SampleList

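With this sampler we can obtain sample sets of equal size for different parameters $(\mu,\sigma)$. Strictly speaking, the accepted samples follow a density proportional to the acceptance probability above rather than to the Gaussian PDF itself, so they are symmetric about $\mu$ but not exactly Gaussian. Their mean is still $\mu$, which is all this experiment relies on.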
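The acceptance rule in `GetSample` can be examined on a grid without drawing any samples. The sketch below is a side check of my own, not part of the original experiment: it uses the exact CDF via `math.erf` instead of `GetIntegral`, normalizes the acceptance probability $2\min(F, 1 - F)$ over $[-10, 10]$, and computes the mean and variance of the resulting density. The names `accept`, `weights`, `mean` and `var` are illustrative.

```python
import math
import numpy as np

mu, sigma = 1.0, 1.0

# Candidate range used by GetSample, discretized finely.
xs = np.linspace(-10.0, 10.0, 20001)

# Exact Gaussian CDF at every grid point (assumption: replaces the numerical GetIntegral).
cdf = np.array([0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0)))) for x in xs])

# Acceptance probability from GetSample: 2 * F below the median, 2 * (1 - F) above it.
accept = 2.0 * np.minimum(cdf, 1.0 - cdf)

# With uniform candidates, accepted samples have density proportional to `accept`.
weights = accept / accept.sum()
mean = float(np.sum(weights * xs))
var = float(np.sum(weights * (xs - mean) ** 2))
print(mean, var)
```

Under these assumptions the mean comes out at $\mu$ while the variance is close to $2\sigma^2/3$ rather than $\sigma^2$. If exact Gaussian samples are needed, `random.gauss(mu, sigma)` from the standard library is a simpler alternative.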

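Experiment

First we take two sample sets drawn from the same distribution, SampleListx and SampleListy. Multiplying their corresponding elements gives a new sample set:
The pairing here is effectively random, because the two sets are drawn separately and we do not know the specific values in either one in advance.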

    def CheckExpectationValue(Inputx,Inputy):
        """
        Inputx,Inputy -> IID
        From the same Gaussian Distribution.
        :return:
        """
        # RandomProduct
        ProductSample = [i * j for i, j in zip(Inputx, Inputy)]
        return sum(ProductSample) / len(ProductSample)

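Now we build two sample sets from Gaussian distributions with identical parameters and run the operation above:
The full code is attached at the end.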

    SampleListx = GetSample(mu=1.0, sigma=1.0)
    SampleListy = GetSample(mu=1.0, sigma=1.0)
    MixExpect = CheckExpectationValue(SampleListx,SampleListy)

    print(sum(SampleListx) / len(SampleListx))
    print(sum(SampleListy) / len(SampleListy))
    print(MixExpect)

The sample means of the two original sets and of the new set are:

# 1.0 * 1.0 = 1.0
0.9918332790661574
0.9996555919557066
1.0031321613765627

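This shows that the idea in my misconception is wrong: as long as the random variables are mutually independent (each sampled separately), the multiplicative property of expectation still holds even when they belong to the same distribution.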
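The distinction that matters is independence, not whether the distributions differ. A minimal NumPy sketch makes this visible by contrasting two separate draws from the same Gaussian with the product of one draw with itself. It assumes NumPy's built-in normal generator in place of `GetSample`; the seed, the sample size and the variable names are my own choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Two separate draws from the same Gaussian: independent and identically distributed.
x = rng.normal(loc=1.0, scale=1.0, size=n)
y = rng.normal(loc=1.0, scale=1.0, size=n)

# Independent copies: the mean of the product estimates E(X) * E(Y) = 1.0.
independent_product = float(np.mean(x * y))

# The same variable multiplied by itself is fully dependent:
# the mean estimates E(X^2) = mu^2 + sigma^2 = 2.0 instead.
self_product = float(np.mean(x * x))

print(independent_product, self_product)
```

So "same distribution" is harmless, while "same variable" is not: $\mathbb E(\mathcal X \cdot \mathcal X)$ exceeds $\mathbb E(\mathcal X)^2$ by the variance of $\mathcal X$.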

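Generalizing this example:

choose two Gaussian distributions with different parameters;
every $10$ samples, record the running means of the original sets and of the new set, and observe how they converge: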

    def CheckExpectAstringency(SampleListx,SampleListy,mu):

        ContainerX = list()
        ContainerY = list()
        ExpectXList = list()
        ExpectYList = list()
        ExpectList = list()

        for idx,(i,j) in enumerate(zip(SampleListx,SampleListy)):

            ContainerX.append(i)
            ContainerY.append(j)
            if len(ContainerX) % 10 == 0:
                ExpectXList.append(sum(ContainerX) / len(ContainerX))
                ExpectYList.append(sum(ContainerY) / len(ContainerY))
                ExpectList.append(CheckExpectationValue(ContainerX,ContainerY))

        plt.plot([i for i in range(len(ExpectList))],[mu for _ in range(len(ExpectList))])
        plt.plot([i for i in range(len(ExpectList))],ExpectList,c="tab:orange")
        plt.plot([i for i in range(len(ExpectList))],ExpectXList,c="tab:red")
        plt.plot([i for i in range(len(ExpectList))],ExpectYList,c="tab:green")
        plt.show()

    SampleListx = GetSample(mu=1.5, sigma=1.0)
    SampleListy = GetSample(mu=2.0, sigma=1.0)
    # mu:1.5 * 2.0 = 3.0
    CheckExpectAstringency(SampleListx,SampleListy,mu=3.0)

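The resulting plot is shown below:
The green and red lines show how the sample means of the two original sets converge; the orange line shows how the sample mean of the new set converges; the blue line marks the theoretical expectation of the new set.
[Figure: running sample means of the two original sets and of the product set, plotted against the checkpoint index (one checkpoint every 10 samples)]
As the number of samples grows, the sample means describe the distributions more accurately and gradually converge to the theoretical expectation. Moreover, with both source distributions Gaussian, a difference in their parameters does not affect the multiplicative property of expectation.
The purpose of this post is to verify the expectation question raised in the post "Deep Learning Notes: Numerical Stability, Model Initialization and Activation Functions". That post contains a misunderstanding, which will be corrected later.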
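The convergence curves described above can also be computed without re-summing the growing lists at every checkpoint. The following sketch uses cumulative sums so that each running mean costs one pass over the data. It assumes both sample lists have the same length and uses NumPy's normal generator in place of `GetSample`; `running_means` and its `step` parameter are hypothetical names, not part of the original code.

```python
import numpy as np

def running_means(samples_x, samples_y, step=10):
    # Running means of x, y and x*y, taken after every `step` samples.
    x = np.asarray(samples_x, dtype=float)
    y = np.asarray(samples_y, dtype=float)
    counts = np.arange(1, len(x) + 1)        # number of samples seen so far
    idx = np.arange(step - 1, len(x), step)  # positions of the 10th, 20th, ... sample
    mean_x = np.cumsum(x)[idx] / counts[idx]
    mean_y = np.cumsum(y)[idx] / counts[idx]
    mean_xy = np.cumsum(x * y)[idx] / counts[idx]
    return mean_x, mean_y, mean_xy

# Usage with the parameters of the second experiment (means 1.5 and 2.0).
rng = np.random.default_rng(1)
sx = rng.normal(1.5, 1.0, size=5000)
sy = rng.normal(2.0, 1.0, size=5000)
mean_x, mean_y, mean_xy = running_means(sx, sy)
print(mean_x[-1], mean_y[-1], mean_xy[-1])
```

The three returned arrays correspond to the red, green and orange curves and can be passed straight to `plt.plot`.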

Appendix: full code

import math
import random
import numpy as np
import matplotlib.pyplot as plt

def pdf(x,mu,sigma):
    return (1 / (sigma * math.sqrt(2 * math.pi))) * math.exp(-1 * (((x - mu) ** 2) / (2 * (sigma ** 2))))

def Console():
    def GetIntegral(x,mu,sigma,DivideNum=1000):
    
        dx = list()
        y = list()
        x = list(np.linspace(-5,x,DivideNum))
        for k in range(0,DivideNum - 1):
            y.append(pdf(x[k],mu,sigma))
            dx.append(x[k+1] - x[k])
        return np.sum(np.multiply(y,dx))

    def GetSample(mu,sigma,SampleNum=5000):

        SampleList = list()
        count = 0
        while True:
            n = random.uniform(0,1)
            Samplex = random.uniform(-10,10)
            SampleIntegral = GetIntegral(Samplex,mu,sigma)
            if SampleIntegral >= 0.5:
                Prob = 2 * (1 - SampleIntegral)
            else:
                Prob = 2 * SampleIntegral
            if n < Prob:
                SampleList.append(Samplex)
                count += 1
            if count == SampleNum:
                break
        return SampleList

    def CheckExpectationValue(Inputx,Inputy):
        """
        Inputx,Inputy -> IID
        :return:
        """
        # RandomProduct
        ProductSample = [i * j for i, j in zip(Inputx, Inputy)]
        return sum(ProductSample) / len(ProductSample)

    def CheckExpectAstringency(SampleListx,SampleListy,mu):

        ContainerX = list()
        ContainerY = list()
        ExpectXList = list()
        ExpectYList = list()
        ExpectList = list()

        for idx,(i,j) in enumerate(zip(SampleListx,SampleListy)):
            ContainerX.append(i)
            ContainerY.append(j)
            if len(ContainerX) % 10 == 0:
                ExpectXList.append(sum(ContainerX) / len(ContainerX))
                ExpectYList.append(sum(ContainerY) / len(ContainerY))
                ExpectList.append(CheckExpectationValue(ContainerX,ContainerY))

        plt.plot([i for i in range(len(ExpectList))],[mu for _ in range(len(ExpectList))])
        plt.plot([i for i in range(len(ExpectList))],ExpectList,c="tab:orange")
        plt.plot([i for i in range(len(ExpectList))],ExpectXList,c="tab:red")
        plt.plot([i for i in range(len(ExpectList))],ExpectYList,c="tab:green")
        plt.show()

    SampleListx = GetSample(mu=1.0, sigma=1.0)
    SampleListy = GetSample(mu=1.0, sigma=1.0)
    MixExpect = CheckExpectationValue(SampleListx,SampleListy)
    # mu:1.0 * 1.0 = 1.0
    CheckExpectAstringency(SampleListx,SampleListy,mu=1.0)

if __name__ == '__main__':
    Console()

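Result:
[Figure: running sample means for the two sample sets with identical parameters, converging toward the theoretical value 1.0]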

References:
Mathematical expectation, Baidu Baike

静静的喝酒