从多元正态分布中有效地随机抽取

从多元正态分布中有效地随机抽取

本文介绍了从多元正态分布中有效地随机抽取的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

只是想知道是否有人遇到过他/她需要从非常高维的多元正态分布(比如维数 = 10,000)中随机抽取的问题,作为 rmvnorm 函数>mvtnorm 包是不切实际的.

Just wondering if anyone has ever encountered the problem where he/she needs to randomly draw from a very high dimensional multivariate normal distribution (say dimension = 10,000), as the rmvnorm function of the mvtnorm package is impractical for that.

我知道这篇文章有一个Rcpp实现mvnorm 包的 dmvnorm 函数,所以我想知道 rmvnorm 是否存在等效的东西?

I know this article has an Rcpp implementation for the dmvnorm function of the mvtnorm package, so I was wondering if something equivalent exists for rmvnorm?

推荐答案

这里是 mvtnorm::rmvnormRcpp 实现的快速比较 此处 作者:Ahmadou Dicko.显示的时间是从多元正态分布中抽取 100 次的时间,维度范围从 500 到 2500.从下图中您可以推断出维度为 10000 所需的时间.时间包括生成随机 mu 向量和 diag 矩阵,但这些在不同方法中是一致的,并且对于所讨论的维度来说是微不足道的(例如 diag(10000) 为 0.2 秒).

Here's a quick comparison of mvtnorm::rmvnorm and an Rcpp implementation given here by Ahmadou Dicko. The times presented are for 100 draws from a multivariate normal distribution with dimension ranging from 500 to 2500. From the graph below you can probably infer the time required for dimension of 10000. Times include the overhead of generating the random mu vector and the diag matrix, but these are consistent across approaches and are trivial for the dimensions in question (e.g. 0.2 sec for diag(10000)).

library(Rcpp)
library(RcppArmadillo)
library(inline)
library(mvtnorm)

code <- '
using namespace Rcpp;
int n = as<int>(n_);
arma::vec mu = as<arma::vec>(mu_);
arma::mat sigma = as<arma::mat>(sigma_);
int ncols = sigma.n_cols;
arma::mat Y = arma::randn(n, ncols);
return wrap(arma::repmat(mu, 1, n).t() + Y * arma::chol(sigma));
'

rmvnorm.rcpp <-
  cxxfunction(signature(n_="integer", mu_="numeric", sigma_="matrix"), code,
              plugin="RcppArmadillo", verbose=TRUE)

rcpp.time <- sapply(seq(500, 5000, 500), function(x) {
  system.time(rmvnorm.rcpp(100, rnorm(x), diag(x)))[3]
})

mvtnorm.time <- sapply(seq(500, 2500, 500), function(x) {
  system.time(rmvnorm(100, rnorm(x), diag(x)))[3]
})


plot(seq(500, 5000, 500), rcpp.time, type='o', xlim=c(0, 5000),
     ylim=c(0, max(mvtnorm.time)), xlab='dimension', ylab='time (s)')

points(seq(500, 2500, 500), mvtnorm.time, type='o', col=2)

legend('topleft', legend=c('rcpp', 'mvtnorm'), lty=1, col=1:2, bty='n')

这篇关于从多元正态分布中有效地随机抽取的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

08-23 14:33