Statsmodels PACF plot confidence interval does not match the PACF function

Problem description

I have a time series that appears to have a significant lag in its partial autocorrelation (PACF) plot, i.e. the PACF value lies outside the blue confidence band. I wanted to verify this programmatically, but it doesn't seem to work.

I plotted the PACF with the statsmodels time series API, which showed that the first lag is significant. I then used the PACF estimation function to get the PACF values along with the confidence interval at each lag, but the confidence intervals from the two don't match. What's odder still is that, in the source code, the plot function calls the underlying estimation function, so the two should agree.

Example:

import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

x = np.arange(1000)
sm.graphics.tsa.plot_pacf(x)
plt.show()

The plot shows that the first lag is highly significant, with a PACF of about 0.98, and that the confidence band (the blue rectangle) is about (-0.06, 0.06) throughout.

When I instead try to get these exact plot values (only the first 10 lags, for brevity):

sm.tsa.stattools.pacf(x, nlags=10, alpha=0.05)

The resulting PACF values, which match the plot above, are:

array([ 1.        ,  0.997998  , -0.00200201, -0.00200402, -0.00200605,
        -0.0020081 , -0.00201015, -0.00201222, -0.0020143 , -0.00201639,
        -0.00201849])

But the confidence interval (shown in blue in the plot above) seems off for the first lag:

 array([[ 1.        ,  1.        ],
        [ 0.93601849,  1.0599775 ],
        [-0.06398151,  0.0599775 ],
        [-0.06398353,  0.05997548],
        [-0.06398556,  0.05997345],
        [-0.0639876 ,  0.05997141],
        [-0.06398965,  0.05996935],
        [-0.06399172,  0.05996729],
        [-0.0639938 ,  0.05996521],
        [-0.06399589,  0.05996312],
        [-0.06399799,  0.05996101]]))

What's going on?

Solution

According to the code:

  • stattools.pacf computes the confidence interval around the estimated pacf, i.e. the interval is centered at the estimated value.
  • graphics.tsa.plot_pacf takes that confidence interval and subtracts the estimated pacf, so the plotted confidence band is centered at zero.

I don't know or remember why it was done this way.

In the example, the PACF at every lag greater than or equal to 2 is close to zero, so there is no visible difference between the plot and the results from stattools.pacf at those lags.


08-20 10:30