How do I get standard errors for maximum-likelihood estimates in Stan?

Problem description

I am using maximum-likelihood optimization in Stan, but unfortunately the optimizing() function doesn't report standard errors:

> MLb4c <- optimizing(get_stanmodel(fitb4c), data = win.data, init = inits)
STAN OPTIMIZATION COMMAND (LBFGS)
init = user
save_iterations = 1
init_alpha = 0.001
tol_obj = 1e-012
tol_grad = 1e-008
tol_param = 1e-008
tol_rel_obj = 10000
tol_rel_grad = 1e+007
history_size = 5
seed = 292156286
initial log joint probability = -4038.66
    Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes
      13      -2772.49  9.21091e-005     0.0135987     0.07606      0.9845       15
Optimization terminated normally:
  Convergence detected: relative gradient magnitude is below tolerance
> t2 <- proc.time()
> print(t2 - t1)
   user  system elapsed
   0.11    0.19    0.74
>
> MLb4c
$par
       psi      alpha       beta
 0.9495000  0.4350983 -0.2016895

$value
[1] -2772.489

> summary(MLb4c)
      Length Class  Mode
par   3      -none- numeric
value 1      -none- numeric

How do I get the standard errors of the estimates (or confidence intervals / quantiles), and possibly p-values?

Edit: I did as @Ben Goodrich kindly advised:

> MLb4cH <- optimizing(get_stanmodel(fitb4c), data = win.data, init = inits, hessian = TRUE)

> sqrt(diag(solve(-MLb4cH$hessian)))
       psi      alpha       beta
0.21138314 0.03251696 0.03270493

But these "unconstrained" standard errors seem to be very different from the real ones. Here is the output from Bayesian fitting using stan():

> print(outb4c, dig = 5)
Inference for Stan model: tmp_stan_model.
3 chains, each with iter=500; warmup=250; thin=1;
post-warmup draws per chain=250, total post-warmup draws=750.

             mean se_mean      sd        2.5%         25%         50%         75%       97.5% n_eff    Rhat
alpha     0.43594 0.00127 0.03103     0.37426     0.41578     0.43592     0.45633     0.49915   594 1.00176
beta     -0.20262 0.00170 0.03167    -0.26640    -0.22290    -0.20242    -0.18290    -0.13501   345 1.00402
psi       0.94905 0.00047 0.01005     0.92821     0.94308     0.94991     0.95656     0.96632   448 1.00083
lp__  -2776.94451 0.06594 1.15674 -2780.07437 -2777.50643 -2776.67139 -2776.09064 -2775.61263   308 1.01220

Recommended answer

You can pass hessian = TRUE to the optimizing function, which returns the Hessian as part of the output list. You can then obtain estimated standard errors via sqrt(diag(solve(-MLb4c$hessian))); however, those standard errors pertain to the estimates in the unconstrained space. To obtain estimated standard errors for the parameters in the constrained space, you could either use the delta method or draw many times from a multivariate normal distribution whose mean vector is the unconstrained counterpart of MLb4c$par (see unconstrain_pars) and whose variance-covariance matrix is solve(-MLb4c$hessian), convert those draws to the constrained space with the constrain_pars function, and estimate the standard deviation of each column.
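
As a rough illustration of the delta-method route, the sketch below rescales the unconstrained standard error of psi from the question. It assumes (this is not stated in the question) that psi is declared as real<lower=0, upper=1> in the Stan program, so that its unconstrained counterpart is logit(psi); the variable names are made up for the example.

```r
# Assumption: psi is declared as real<lower=0, upper=1>, so Stan works with
# u = logit(psi) internally and psi = inv_logit(u).
psi_hat <- 0.9495000    # constrained estimate, copied from $par in the question
se_u    <- 0.21138314   # unconstrained SE, copied from sqrt(diag(solve(-hessian)))

# Derivative of the inverse-logit map at the estimate: d psi / d u = psi * (1 - psi)
jacobian <- psi_hat * (1 - psi_hat)

# First-order delta method: SE on the constrained scale
se_psi <- se_u * abs(jacobian)
print(se_psi)
```

Under that assumption the result is about 0.0101, which is close to the posterior sd of 0.01005 that stan() reports for psi. If alpha and beta are declared without bounds, their constrained and unconstrained standard errors coincide, which would explain why they already agree with the stan() output.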

Here is some R code you could adapt to your case:

# 1: Compile and save a model (make sure to pass the data here)
model <- stan(file="model.stan", data=c("N","K","X","y"), chains = 0, iter = 0)

# 2: Fit that model
fit <- optimizing(object=get_stanmodel(model), as_vector = FALSE,
                   data=c("N","K","X","y"), hessian = TRUE)

# 3: Extract the vector theta_hat and the Hessian for the unconstrained parameters
theta_hat <- unlist(fit$par)
upars <- unconstrain_pars(model, relist(theta_hat, fit$par))
Hessian <- fit$hessian

# 4: Extract the Cholesky decomposition of the (negative) Hessian and invert
R <- chol(-Hessian)
V <- chol2inv(R)
rownames(V) <- colnames(V) <- colnames(Hessian)

# 5: Produce a matrix with some specified number of simulation draws from a multinormal
SIMS <- 1000
len <- length(theta_hat)
unconstrained <- upars + t(chol(V)) %*%
  matrix(rnorm(SIMS * len), nrow = len, ncol = SIMS)
theta_sims <- t(apply(unconstrained, 2, FUN = function(upars) {
  unlist(constrain_pars(model, upars))
}))

# 6: Produce estimated standard errors for the constrained parameters
se <- apply(theta_sims, 2, sd)
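
To see what steps 5 and 6 do without needing a compiled model, here is a one-parameter analogue in base R. It is only a sketch: it assumes psi from the question is bounded on (0, 1) and therefore logit-transformed by Stan, it uses qlogis() and plogis() in place of unconstrain_pars() and constrain_pars(), and the seed and number of draws are arbitrary.

```r
set.seed(1)             # arbitrary seed so the sketch is reproducible
psi_hat <- 0.9495000    # constrained estimate reported in the question
se_u    <- 0.21138314   # unconstrained SE reported in the question

# Assumed unconstraining transform for a (0, 1)-bounded parameter: logit
u_hat <- qlogis(psi_hat)

# Draw from the normal approximation in the unconstrained space
u_sims <- rnorm(10000, mean = u_hat, sd = se_u)

# Map every draw back to the constrained space
psi_sims <- plogis(u_sims)

# Simulation-based SE and a 95% percentile interval on the constrained scale
se_psi <- sd(psi_sims)
ci_psi <- quantile(psi_sims, c(0.025, 0.975))
print(se_psi)
print(ci_psi)
```

The same draws give approximate interval estimates for free: quantile() applied to each column of theta_sims plays the role that ci_psi plays here.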


08-28 22:20