问题描述
我在Stan中使用最大似然优化,但不幸的是optimizing()
函数没有报告标准错误:
I am using maximum-likelihood optimization in Stan, but unfortunately the optimizing()
function doesn't report standard errors:
> MLb4c <- optimizing(get_stanmodel(fitb4c), data = win.data, init = inits)
STAN OPTIMIZATION COMMAND (LBFGS)
init = user
save_iterations = 1
init_alpha = 0.001
tol_obj = 1e-012
tol_grad = 1e-008
tol_param = 1e-008
tol_rel_obj = 10000
tol_rel_grad = 1e+007
history_size = 5
seed = 292156286
initial log joint probability = -4038.66
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
13 -2772.49 9.21091e-005 0.0135987 0.07606 0.9845 15
Optimization terminated normally:
Convergence detected: relative gradient magnitude is below tolerance
> t2 <- proc.time()
> print(t2 - t1)
user system elapsed
0.11 0.19 0.74
>
> MLb4c
$par
psi alpha beta
0.9495000 0.4350983 -0.2016895
$value
[1] -2772.489
> summary(MLb4c)
Length Class Mode
par 3 -none- numeric
value 1 -none- numeric
如何获得估算值(或置信区间-分位数)以及p值的标准误差?
How do I get the standard errors of the estimates (or confidence interval - quantiles), and possibly p-values?
编辑:我按照@Ben Goodrich的建议进行了操作:
I did as kindly advised by @Ben Goodrich:
> MLb4cH <- optimizing(get_stanmodel(fitb4c), data = win.data, init = inits, hessian = TRUE)
> sqrt(diag(solve(-MLb4cH$hessian)))
psi alpha beta
0.21138314 0.03251696 0.03270493
但是这些不受约束的"标准误差似乎与真实误差大不相同-这就是使用stan()
的贝叶斯拟合输出:
But these "unconstrained" standard errors seem to be very different from the real ones - here as is the output from bayesian fitting using stan()
:
> print(outb4c, dig = 5)
Inference for Stan model: tmp_stan_model.
3 chains, each with iter=500; warmup=250; thin=1;
post-warmup draws per chain=250, total post-warmup draws=750.
mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
alpha 0.43594 0.00127 0.03103 0.37426 0.41578 0.43592 0.45633 0.49915 594 1.00176
beta -0.20262 0.00170 0.03167 -0.26640 -0.22290 -0.20242 -0.18290 -0.13501 345 1.00402
psi 0.94905 0.00047 0.01005 0.92821 0.94308 0.94991 0.95656 0.96632 448 1.00083
lp__ -2776.94451 0.06594 1.15674 -2780.07437 -2777.50643 -2776.67139 -2776.09064 -2775.61263 308 1.01220
推荐答案
您可以为optimizing
函数指定hessian = TRUE
参数,该参数将返回Hessian作为输出列表的一部分.这样,您就可以通过sqrt(diag(solve(-MLb4c$hessian)))
获得估计的标准误差.但是这些标准误差与无约束空间中的估计有关.要获得 constrained 空间中参数的估计标准误差,您可以使用delta方法或从均值向量为MLb4c$par
且方差-协方差为solve(-MLb4c$hessian)
,使用constrain_pars
函数将这些图形转换为约束空间,并估计每列的标准偏差.
You can specify the hessian = TRUE
argument to the optimizing
function, which will return the Hessian as part of the list of output. Thus, you can obtain estimated standard errors via sqrt(diag(solve(-MLb4c$hessian)))
; however those standard errors pertain to the estimates in the unconstrained space. To obtain estimated standard errors for the parameters in the constrained space, you could either use the delta method or draw many times from a multivariate normal distribution whose mean vector is MLb4c$par
and whose variance-covariance is solve(-MLb4c$hessian)
, convert those draws to the constrained space with the constrain_pars
function, and estimate the standard deviation of each column.
这里有一些您可以适应情况的R代码
Here is some R code you could adapt to your case
# 1: Compile and save a model (make sure to pass the data here)
model <- stan(file="model.stan", data=c("N","K","X","y"), chains = 0, iter = 0)
# 2: Fit that model
fit <- optimizing(object=get_stanmodel(model), as_vector = FALSE,
data=c("N","K","X","y"), hessian = TRUE)
# 3: Extract the vector theta_hat and the Hessian for the unconstrained parameters
theta_hat <- unlist(fit$par)
upars <- unconstrain_pars(linear, relist(theta_hat, fit$par))
Hessian <- fit$hessian
# 4: Extract the Cholesky decomposition of the (negative) Hessian and invert
R <- chol(-Hessian)
V <- chol2inv(R)
rownames(V) <- colnames(V) <- colnames(Hessian)
# 5: Produce a matrix with some specified number of simulation draws from a multinormal
SIMS <- 1000
len <- length(theta_hat)
unconstrained <- upars + t(chol(V)) %*%
matrix(rnorm(SIMS * len), nrow = len, ncol = SIMS)
theta_sims <- t(apply(unconstrained, 2, FUN = function(upars) {
unlist(constrain_pars(linear, upars))
}))
# 6: Produce estimated standard errors for the constrained parameters
se <- apply(theta_sims, 2, sd)
这篇关于如何获得STAN中最大似然估计的标准误差?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!