Revolutionizing Time Series Forecasting with Attention-Based LSTMs

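1. Introduction

In time series forecasting, the search for more accurate and more efficient models never stops. Deep learning has paved the way for major advances in this field, and the integration of long short-term memory (LSTM) networks with attention mechanisms has been especially transformative. This article walks through a practical case study: using this architecture to predict the stock price of Apple Inc. (AAPL).

It is important to acknowledge where these ideas come from: Wayne Gray. Wayne is a financial analyst with expertise in artificial intelligence. Without giving too much away, Wayne is brilliant, and his ideas go well beyond what is covered here. This article tries to capture our conversations and is only the beginning of his research!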

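2. An Introduction to LSTM and Attention Mechanisms

An LSTM network is a special kind of recurrent neural network (RNN) that can learn long-term dependencies in sequential data. LSTMs are widely used on sequence data and underpin many successful predictive models in time series analysis. The attention mechanism, originally developed for natural language processing tasks, enhances an LSTM by letting the model focus on specific parts of the input sequence when making a prediction, much as human attention does.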

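3. Why Combine LSTM with Attention for Time Series?

The dynamic nature of financial markets makes stock price prediction a challenging task. A conventional LSTM can capture temporal dependencies, but it may struggle with the noise and volatility of stock price movements. An attention mechanism addresses this by assigning a weighted importance to each time step of the input, so the model can prioritize the more relevant information and improve its predictive performance.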
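To make "weighted importance per time step" concrete, here is a minimal NumPy sketch of softmax-weighted pooling. The function name `attention_pool` and the toy numbers are hypothetical and chosen only for illustration; in a trained model the scores would be learned rather than hand-picked.

```python
import numpy as np

def attention_pool(hidden_states, scores):
    """Softmax-weighted average of per-step vectors.

    hidden_states: array of shape (seq_len, hidden_dim)
    scores: array of shape (seq_len,), one raw relevance score per time step
    """
    # Subtract the max before exponentiating for numerical stability
    exp_scores = np.exp(scores - np.max(scores))
    weights = exp_scores / exp_scores.sum()  # non-negative, sums to 1
    context = weights @ hidden_states        # shape: (hidden_dim,)
    return weights, context

# Hypothetical toy values: 4 time steps, 3 hidden features
hidden_states = np.array([[0.1, 0.2, 0.3],
                          [0.0, 0.5, 0.1],
                          [0.9, 0.4, 0.2],
                          [0.3, 0.3, 0.3]])
scores = np.array([0.1, 0.2, 2.0, 0.3])  # the third step is scored highest

weights, context = attention_pool(hidden_states, scores)
print(weights)  # one weight per time step
print(context)  # weighted summary of all time steps
```

The step with the highest score receives the largest weight, so the context vector is pulled toward that step's hidden state.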

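Case Study: Predicting AAPL Stock Price
In our experiment, we used roughly four years of historical AAPL prices and predicted on the "Close" column. The data is first normalized to help training and is then fed into our LSTM with Attention model.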
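As a rough illustration of the normalization step, the sketch below reproduces min-max scaling to the [0, 1] range by hand, which is the formula `MinMaxScaler(feature_range=(0, 1))` applies: `(x - min) / (max - min)`. The helper names and the price values are made up for this example.

```python
import numpy as np

def minmax_fit(values):
    """Return the (min, max) pair learned from the training values."""
    return float(values.min()), float(values.max())

def minmax_transform(values, lo, hi):
    """Scale values so that lo maps to 0 and hi maps to 1."""
    return (values - lo) / (hi - lo)

def minmax_inverse(scaled, lo, hi):
    """Map scaled values back to the original price units."""
    return scaled * (hi - lo) + lo

prices = np.array([150.0, 155.0, 160.0, 170.0])  # hypothetical closing prices
lo, hi = minmax_fit(prices)
scaled = minmax_transform(prices, lo, hi)
restored = minmax_inverse(scaled, lo, hi)

print(scaled)    # [0.   0.25 0.5  1.  ]
print(restored)  # back to the original prices
```

The inverse transform matters later: predictions come out in scaled units and have to be mapped back to dollars before plotting or computing errors.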

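4. Model Architecture

Our model consists of an LSTM layer followed by an attention layer and a fully connected layer that produces the output. The attention layer computes attention weights and applies them to the LSTM outputs, producing a context vector that serves as the input to the final prediction.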
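Written out as equations, the architecture described above can be summarized as follows. The notation is my own rather than the author's: $\mathbf{h}_t$ is the LSTM output at time step $t$, $(\mathbf{w}, b)$ are the parameters of the linear scoring layer, and $(\mathbf{W}_o, b_o)$ are the parameters of the final fully connected layer.

```latex
\begin{aligned}
e_t &= \mathbf{w}^\top \mathbf{h}_t + b, & t &= 1, \dots, T \\
\alpha_t &= \frac{\exp(e_t)}{\sum_{s=1}^{T} \exp(e_s)} \\
\mathbf{c} &= \sum_{t=1}^{T} \alpha_t \, \mathbf{h}_t \\
\hat{y} &= \mathbf{W}_o \, \mathbf{c} + b_o
\end{aligned}
```

Here $T$ is the sequence length, $\alpha_t$ are the attention weights (non-negative and summing to 1), and $\mathbf{c}$ is the context vector passed to the output layer.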

# Import necessary libraries
import yfinance as yf
import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import MinMaxScaler

# Download Apple Inc. stock data
aapl_data = yf.download('AAPL', start='2020-01-01', end='2024-03-01')

# Use the 'Close' price for prediction
close_prices = aapl_data['Close'].values

# Normalize the data
scaler = MinMaxScaler(feature_range=(0, 1))
close_prices_scaled = scaler.fit_transform(close_prices.reshape(-1, 1))

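# Prepare the dataset: each input is one day's scaled close,
# and the target is the next day's scaled close
X = close_prices_scaled[:-1]
y = close_prices_scaled[1:]

# Reshape for LSTM: (samples, sequence_length=1, features=1)
# Note: with one time step per sample, the attention softmax has a single element to weigh
X = X.reshape(-1, 1, 1)
y = y.reshape(-1, 1)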

# Train-test split
# Note: train_test_split shuffles rows by default, so test days are interleaved with training days
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Convert to PyTorch tensors
X_train_tensor = torch.tensor(X_train, dtype=torch.float32)
y_train_tensor = torch.tensor(y_train, dtype=torch.float32)
X_test_tensor = torch.tensor(X_test, dtype=torch.float32)
y_test_tensor = torch.tensor(y_test, dtype=torch.float32)

# LSTM with Attention Mechanism
class LSTMAttention(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim=1, num_layers=1):
        super(LSTMAttention, self).__init__()
        self.hidden_dim = hidden_dim
        self.num_layers = num_layers
        self.lstm = nn.LSTM(input_dim, hidden_dim, num_layers, batch_first=True)
        self.attention = nn.Linear(hidden_dim, 1)
        self.fc = nn.Linear(hidden_dim, output_dim)

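    def forward(self, x):
        # lstm_out: (batch, seq_len, hidden_dim)
        lstm_out, _ = self.lstm(x)
        # One score per time step, normalized over the time dimension: (batch, seq_len)
        attention_weights = torch.softmax(self.attention(lstm_out).squeeze(-1), dim=-1)
        # Weighted sum of LSTM outputs over time: (batch, hidden_dim)
        context_vector = torch.sum(lstm_out * attention_weights.unsqueeze(-1), dim=1)
        out = self.fc(context_vector)
        return out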

# Instantiate and train the model
model = LSTMAttention(input_dim=1, hidden_dim=50)
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

# Training loop
epochs = 100
for epoch in range(epochs):
    model.train()
    optimizer.zero_grad()
    output = model(X_train_tensor)
    loss = criterion(output, y_train_tensor)
    loss.backward()
    optimizer.step()

    if epoch % 10 == 0:
        model.eval()
        with torch.no_grad():  # gradients are not needed for evaluation
            test_pred = model(X_test_tensor)
            test_loss = criterion(test_pred, y_test_tensor)
        print(f'Epoch {epoch}, Loss: {loss.item()}, Test Loss: {test_loss.item()}')

# Predictions
model.eval()
predictions = model(X_test_tensor).detach().numpy()
predictions_actual = scaler.inverse_transform(predictions)

# Plotting
plt.figure(figsize=(15, 5))
plt.plot(scaler.inverse_transform(y_test), label='Actual')
plt.plot(predictions_actual, label='Predicted')
plt.title('AAPL Stock Price Prediction')
plt.legend()
plt.show()

# Calculate MSE
mse = mean_squared_error(scaler.inverse_transform(y_test), predictions_actual)
print(f'Mean Squared Error: {mse}')
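One thing worth noting about the listing above: `X` is reshaped to `(-1, 1, 1)`, so every sample is a sequence of length 1 and the softmax in the attention layer always returns a weight of 1. For attention to have anything to choose between, each sample needs several time steps. The sketch below shows one common way to build such windows; the helper `make_windows` and the window length of 3 are assumptions for illustration and are not part of the original experiment.

```python
import numpy as np

def make_windows(series, window):
    """Turn a 1-D series into (samples, window, 1) inputs and (samples, 1) targets.

    Each input holds `window` consecutive values; the target is the value
    that immediately follows the window.
    """
    series = np.asarray(series, dtype=np.float32).reshape(-1)
    n_samples = len(series) - window
    if n_samples <= 0:
        raise ValueError("series must be longer than the window")
    X = np.stack([series[i:i + window] for i in range(n_samples)])
    y = series[window:]
    return X.reshape(n_samples, window, 1), y.reshape(n_samples, 1)

# Hypothetical scaled series standing in for close_prices_scaled
toy = np.arange(10, dtype=np.float32) / 10.0
X_win, y_win = make_windows(toy, window=3)
print(X_win.shape, y_win.shape)  # (7, 3, 1) (7, 1)
```

Arrays shaped this way can be passed to the same `LSTMAttention` class without changes, since it already uses `batch_first=True` and pools over the time dimension. Results would differ from the numbers reported in this article.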

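Training and Evaluation

The model was trained for 100 epochs, and the mean squared error (MSE) on both the training set and the test set decreased steadily. When plotted against the actual values, the final predictions show that the model tracks the actual price movements closely.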
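When reading test-set numbers for a time series, it helps to know how the split was made. A shuffled split mixes test days in between training days, whereas a chronological split keeps every test day after the last training day. The sketch below shows the chronological variant under my own assumptions; `chronological_split` is a hypothetical helper, and passing `shuffle=False` to scikit-learn's `train_test_split` gives the same ordering.

```python
import numpy as np

def chronological_split(X, y, test_fraction=0.2):
    """Split time-ordered arrays so the test set is the most recent slice."""
    n_test = int(round(len(X) * test_fraction))
    split = len(X) - n_test
    return X[:split], X[split:], y[:split], y[split:]

# Hypothetical time-ordered samples: values 0..9 stand in for consecutive days
X_demo = np.arange(10, dtype=np.float32).reshape(-1, 1, 1)
y_demo = np.arange(1, 11, dtype=np.float32).reshape(-1, 1)

X_tr, X_te, y_tr, y_te = chronological_split(X_demo, y_demo, test_fraction=0.2)
print(len(X_tr), len(X_te))  # 8 2
```

With this split, the test error measures how the model does on days that come strictly after everything it saw during training.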

5. Validation

# Step 1: Fetch additional data for the following month
# Note: these dates fall inside the 2020-01-01 to 2024-03-01 range downloaded above
additional_data = yf.download('AAPL', start='2023-03-02', end='2023-03-29')

# Step 2: Preprocess the new data
new_close_prices = additional_data['Close'].values
new_close_prices_scaled = scaler.transform(new_close_prices.reshape(-1, 1))

# Prepare the new dataset for prediction
X_new = new_close_prices_scaled[:-1]
y_new_actual = new_close_prices_scaled[1:]

X_new = X_new.reshape(-1, 1, 1)
y_new_actual = y_new_actual.reshape(-1, 1)

# Convert to PyTorch tensors
X_new_tensor = torch.tensor(X_new, dtype=torch.float32)
y_new_actual_tensor = torch.tensor(y_new_actual, dtype=torch.float32)

# Step 3: Make predictions
model.eval()
new_predictions = model(X_new_tensor).detach().numpy()
new_predictions_actual = scaler.inverse_transform(new_predictions)

# Step 4: Evaluate the model
plt.figure(figsize=(15, 5))
plt.plot(scaler.inverse_transform(y_new_actual), label='Actual')
plt.plot(new_predictions_actual, label='Predicted')
plt.title('AAPL Stock Price Prediction for the New Month')
plt.legend()
plt.show()

6. Computing Metrics

new_mse = mean_squared_error(scaler.inverse_transform(y_new_actual), new_predictions_actual)
print(f'New Mean Squared Error: {new_mse}')

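Results Analysis
The first plot shows predicted versus actual prices on the test set. The mean squared error for this period is about 9.87, measured on prices after the inverse transform, which indicates a close fit on data the model was not trained on.

For a more robust validation, we also predicted prices for the following month:

The MSE for this later period is about 5.63, which is an encouraging result and speaks to the model's ability to generalize.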
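Because MSE is expressed in squared price units, it can be easier to read these numbers as root mean squared error (RMSE), which is in the same units as the price itself. The short sketch below converts the two MSE values quoted in this article; the variable names are my own.

```python
import math

mse_test = 9.87  # MSE reported for the test set
mse_next = 5.63  # MSE reported for the following period

# RMSE is the square root of MSE and is in the same units as the price
rmse_test = math.sqrt(mse_test)
rmse_next = math.sqrt(mse_next)

print(f"{rmse_test:.2f} {rmse_next:.2f}")  # 3.14 2.37
```

In other words, the reported errors correspond to a typical deviation of roughly 3.14 and 2.37 in price units, respectively.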

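7. Conclusion

The combination of LSTM and an attention mechanism proves to be a powerful architecture for time series data such as stock prices. It captures temporal patterns and also picks out the key moments that most influence future values. Although this model is a significant step forward, the volatility of financial markets means the search for more refined models will continue. Still, the results of our AAPL case study are promising and lay a solid foundation for future exploration.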

无水先生