本文介绍了贝叶斯推断的Toy R代码表示正态分布的平均值[降雪量的数据]的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有很多降雪观测:

x <- c(98.044, 107.696, 146.050, 102.870, 131.318, 170.434, 84.836, 154.686,
       162.814, 101.854, 103.378, 16.256)

,我被告知它遵循正态分布,标准偏差为25.4,但均值mu未知.我必须使用贝叶斯公式对mu进行推断.

and I was told that it follows normal distribution with known standard deviation at 25.4 but unknown mean mu. I have to make inference on mu using Bayesian Formula.

这是有关mu

mean of snow |  50.8  | 76.2  | 101.6 | 127.0 |  152.4 | 177.8  
---------------------------------------------------------------
probability  |   0.1  | 0.15  | 0.25  |0.25   |  0.15  |  0.1 
---------------------------------------------------------------

以下是我到目前为止已经尝试过的方法,但是关于post的最后一行无法正常工作.生成的图正好给出一条水平线.

The following is what I have tried so far, but the final line about post does not work correctly. The resulting plot justs give a horizontal line.

library(LearnBayes)
midpts <- c(seq(50.8, 177.8, 30))
prob <- c(0.1, 0.15, 0.25, 0.25, 0.15, 0.1)
p <- seq(50, 180, length = 40000)
histp <- histprior(p, midpts, prob)
plot(p, histp, type = "l")

# posterior density
post <- round(histp * dnorm(x, 115, 42) / sum(histp * dnorm(x, 115, 42)), 3)
plot(p, post, type = "l")

推荐答案

我的第一个建议是,确保您了解其背后的统计信息.当我看到你的

My first suggestion is, make sure you understand the statistics behind this. When I saw your

post <- round(histp * dnorm(x, 115, 42) / sum(histp * dnorm(x, 115, 42)), 3)

我认为您已经弄乱了几个概念.这似乎是贝叶斯公式,但是您可能使用的代码错误.正确的似然函数是

I reckoned you have messed up several concepts. This appears to be Bayes Formula, but you have wrong code for the likelihood. The correct likelihood function is

## likelihood function: `L(obs | mu)`
## standard error is known (to make problem easy) at 25.4
Lik <- function (obs, mu) prod(dnorm(obs, mu, 25.4))

注意,mu是未知的,因此它应该是此函数的变量;同样,似然度是观察时所有个体概率密度的乘积.现在,我们可以评估可能性,例如在mu = 100

Note, mu is a unknown, so it should be a variable of this function; also, likelihood is the product of all individual probability density at observations. Now, we can evaluate likelihood for example, at mu = 100 by

Lik(x, 100)
# [1] 6.884842e-30

要成功实现R,我们需要函数Lik的向量化版本.也就是说,一个函数可以对向量输入mu求值,而不仅仅是标量输入.我将只使用sapply进行矢量化:

For successful R implementation, we need a vectorized version for function Lik. That is, a function that can evaluate on a vector input for mu, rather than just a scalar input. I will just use sapply for vectorization:

vecLik <- function (obs, mu) sapply(mu, Lik, obs = obs)

让我们尝试

vecLik(x, c(80, 90, 100))
# [1] 6.248416e-34 1.662366e-31 6.884842e-30

现在是时候获取mu的先前分配了.原则上,这是一个连续函数,但看起来我们需要使用R包LearnBayes中的histprior对其进行离散近似.

Now it is time to obtain prior distribution for mu. In principle this is a continuous function, but looks like we want a discrete approximation to it, using histprior from R package LearnBayes.

## prior distribution for `mu`: `prior(mu)`
midpts <- c(seq(50.8, 177.8, 30))
prob <- c(0.1, 0.15, 0.25, 0.25, 0.15, 0.1)
mu_grid <- seq(50, 180, length = 40000)  ## a grid of `mu` for discretization
library(LearnBayes)
prior_mu_grid <- histprior(mu_grid, midpts, prob)  ## discrete prior density
plot(mu_grid, prior_mu_grid, type = "l")

在应用贝叶公式之前,我们首先计算分母上的归一化常数NC.这将是Lik(obs | mu) * prior(mu)的整数.但是,由于我们对prior(mu)具有离散近似,因此我们使用黎曼和来对该积分进行近似.

Before applying Baye's Formula, we first work out the normalizing constant NC on the denominator. This would be an integral of Lik(obs | mu) * prior(mu). But as we have discrete approximation for prior(mu), we use Riemann sum to approximate this integral.

delta <- mu_grid[2] - mu_grid[1]    ## division size
NC <- sum(vecLik(x, mu_grid) * prior_mu_grid * delta)    ## Riemann sum
# [1] 2.573673e-28

太好了,一切就绪,我们可以使用贝叶斯公式:

Great, all being ready, we can use Bayes Formula:

posterior(mu | obs) = Lik(obs | mu) * prior(mu) / NC

同样,随着prior(mu)被离散化,posterior(mu)也被离散化.

Again, as prior(mu) is discretized, posterior(mu) is discretized, too.

post_mu <- vecLik(x, mu_grid) * prior_mu_grid / NC

哈哈,让我们画出mu的后验以查看推理结果:

Haha, let's sketch posterior of mu to see the inference result:

plot(mu_grid, post_mu, type = "l")

哇,这太漂亮了!

这篇关于贝叶斯推断的Toy R代码表示正态分布的平均值[降雪量的数据]的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

09-18 04:39