Fitting a lognormal distribution to truncated data in R

Question

For a brief background, I am interested in describing a distribution of fire sizes, which is presumed to follow a lognormal distribution (many small fires and few large fires). For my specific application I am only interested in the fires that fall within a certain range of sizes (> min, < max). So, I am attempting to fit a lognormal distribution to a data set that has been censored on both ends. In essence, I want to find the parameters of the lognormal distribution (mu and sigma) that best fit the full distribution prior to censoring. Can I fit the distribution while taking into account that I am only looking at a portion of it?

I have done some experimentation, but have become stumped. Here's an example:

# Generate data #
D <- rlnorm(1000,meanlog = -0.75, sdlog = 1.5)
# Censor data #
min <- 0.10
max <- 20
Dt <- D[D > min]
Dt <- Dt[Dt <= max]

If I fit the non-censored data (D) using either fitdistr (MASS) or fitdist (fitdistrplus) I obviously get approximately the same parameter values as I entered. But if I fit the censored data (Dt) then the parameter values do not match, as expected. The question is how to incorporate the known censoring. I have seen some references elsewhere to using upper and lower within fitdistr, but I encounter an error that I'm not sure how to resolve:

> fitt <- fitdist(Dt, "lognormal", lower = min, upper = max)
Error in fitdist(Dt, "lognormal", lower = min, upper = max) :
The  dlognormal  function must be defined

I will appreciate any advice, first on whether this is the appropriate way to fit a censored distribution, and if so, how to go about defining the dlognormal function so that I can make this work. Thanks!

Recommended answer

Your data is not censored (that would mean that observations outside the interval are there, but you do not know their exact value) but truncated (those observations have been discarded).

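You just have to provide fitdist with the density and the cumulative distribution function of your truncated distribution. fitdist looks these up by name, as d<name> and p<name>, which is why the call in the question complained that dlognormal was not defined.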
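For reference, the truncated density and distribution function can be written out explicitly. This is the standard textbook form, not something stated in the answer; here f and F denote the untruncated lognormal density and CDF, and a and b are the truncation bounds (0.10 and 20 in the question).

```latex
f_T(x \mid \mu, \sigma) = \frac{f(x \mid \mu, \sigma)}{F(b \mid \mu, \sigma) - F(a \mid \mu, \sigma)},
\qquad
F_T(x \mid \mu, \sigma) = \frac{F(x \mid \mu, \sigma) - F(a \mid \mu, \sigma)}{F(b \mid \mu, \sigma) - F(a \mid \mu, \sigma)},
\qquad a < x \le b

\ell(\mu, \sigma) = \sum_{i=1}^{n} \Bigl[ \log f(x_i \mid \mu, \sigma) - \log\bigl( F(b \mid \mu, \sigma) - F(a \mid \mu, \sigma) \bigr) \Bigr]
```

Maximizing the log-likelihood over mu and sigma recovers the parameters of the full distribution, because the denominator accounts for the probability mass that was discarded.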

library(truncdist)
dtruncated_log_normal <- function(x, meanlog, sdlog)
  dtrunc(x, "lnorm", a=.10, b=20, meanlog=meanlog, sdlog=sdlog)
ptruncated_log_normal <- function(q, meanlog, sdlog)
  ptrunc(q, "lnorm", a=.10, b=20, meanlog=meanlog, sdlog=sdlog)

library(fitdistrplus)
fitdist(Dt, "truncated_log_normal", start = list(meanlog=0, sdlog=1))
# Fitting of the distribution ' truncated_log_normal ' by maximum likelihood
# Parameters:
#           estimate Std. Error
# meanlog -0.7482085 0.08390333
# sdlog    1.4232373 0.0668787
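
As a cross-check that does not depend on truncdist or fitdistrplus, the same truncated likelihood can be maximized directly with base R's optim. This is only a minimal sketch and not the answer's method: the names negloglik, a, b, xt and est, the seed, and the log(sdlog) parameterization are illustrative assumptions.

```r
# Simulate lognormal data and keep only values inside (a, b],
# mirroring the setup in the question. The seed is an arbitrary choice.
set.seed(42)
a <- 0.10
b <- 20
x <- rlnorm(1000, meanlog = -0.75, sdlog = 1.5)
xt <- x[x > a & x <= b]

# Negative log-likelihood of a lognormal truncated to (a, b].
# par[2] holds log(sdlog), so the optimizer can never propose sdlog <= 0.
negloglik <- function(par, x, a, b) {
  meanlog <- par[1]
  sdlog <- exp(par[2])
  # Log of the probability mass that survives truncation
  log_mass <- log(plnorm(b, meanlog, sdlog) - plnorm(a, meanlog, sdlog))
  -sum(dlnorm(x, meanlog, sdlog, log = TRUE) - log_mass)
}

# Start from meanlog = 0, sdlog = 1 (the same start values as above).
# The closure avoids passing a and b through optim's own argument list.
fit <- optim(c(0, 0), function(par) negloglik(par, xt, a, b))

# Back-transform to the usual parameterization
est <- c(meanlog = fit$par[1], sdlog = exp(fit$par[2]))
print(est)
```

The estimates should land close to the values used to simulate the data, in line with the fitdist result above. The bounds are deliberately called a and b rather than lower and upper, since optim has its own lower and upper arguments for box constraints.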
