This article looks at how to lay out training data for a stateful LSTM in Keras when batch_size > 1. It should be a useful reference for anyone facing the same problem.

Problem description

Background

I would like to do mini-batch training of "stateful" LSTMs in Keras. My input training data is in a large matrix "X" whose dimensions are m x n where

m = number-of-subsequences
n = number-of-time-steps-per-sequence

Each row of X contains a subsequence which picks up where the subsequence on the preceding row leaves off. So given a long sequence of data,

Data = ( t01, t02, t03, ... )

其中"tK"表示原始数据中位置K的令牌,该序列在X中的布局如下:

where "tK" means the token at position K in the original data, the sequence is layed out in X like so:

X = [
  t01 t02 t03 t04
  t05 t06 t07 t08
  t09 t10 t11 t12
  t13 t14 t15 t16
  t17 t18 t19 t20
  t21 t22 t23 t24
]
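As a concrete sketch (assuming NumPy, with integers standing in for the tokens t01..t24), such an X can be built from the flat sequence by reshaping:

import numpy as np

data = np.arange(1, 25)     # stands in for the tokens t01 .. t24
n = 4                       # time steps per subsequence
X = data.reshape(-1, n)     # each row picks up where the previous one left off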

Question

My question is about what happens when I do mini-batch training on data laid out this way with stateful LSTMs. Specifically, mini-batch training typically trains on "contiguous" groups of rows at a time. So if I use a mini-batch size of 2, then X would be split into three mini-batches X1, X2 and X3 where

X1 = [
  t01 t02 t03 t04
  t05 t06 t07 t08
]

X2 = [
  t09 t10 t11 t12
  t13 t14 t15 t16
]

X3 = [
  t17 t18 t19 t20
  t21 t22 t23 t24
]
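For concreteness, with the NumPy stand-in for X sketched above, this contiguous split is just:

X1, X2, X3 = X[0:2], X[2:4], X[4:6]   # rows taken in order, two at a time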

Notice that this type of mini-batching does not agree with training stateful LSTMs since the hidden states produced by processing the last column of the previous batch are not the hidden states that correspond to the time-step before the first column of the subsequent batch.

To see this, notice that the mini-batches will be processed as though laid out left to right, like this:

------ X1 ------+------- X2 ------+------- X3 -----
t01 t02 t03 t04 | t09 t10 t11 t12 | t17 t18 t19 t20
t05 t06 t07 t08 | t13 t14 t15 t16 | t21 t22 t23 t24

implying that:

- Token t04 comes immediately before t09
- Token t08 comes immediately before t13
- Token t12 comes immediately before t17
- Token t16 comes immediately before t21

But I want mini-batching to group rows so that we get this kind of temporal alignment across mini-batches:

------ X1 ------+------- X2 ------+------- X3 -----
t01 t02 t03 t04 | t05 t06 t07 t08 | t09 t10 t11 t12
t13 t14 t15 t16 | t17 t18 t19 t20 | t21 t22 t23 t24

What is the standard way to accomplish this when training LSTMs in Keras?

Any pointers here would be appreciated.

Solution

Thanks. It seems Daniel implies I can use a batch size of 2 if I reorganize X like this:

X = [
  t01 t02 t03 t04 
  t13 t14 t15 t16
  t05 t06 t07 t08
  t17 t18 t19 t20 
  t09 t10 t11 t12
  t21 t22 t23 t24
]

Is that the correct interpretation?
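If so, a minimal sketch of that reordering with the NumPy stand-in from earlier (the idea: split the rows into batch_size parallel streams, then interleave the streams):

import numpy as np

m, n, batch_size = 6, 4, 2
X = np.arange(1, m * n + 1).reshape(m, n)   # t01 .. t24, one subsequence per row

streams = X.reshape(batch_size, m // batch_size, n)    # stream 0 = rows 0-2, stream 1 = rows 3-5
X_reordered = streams.transpose(1, 0, 2).reshape(m, n)
# Row order is now t01.., t13.., t05.., t17.., t09.., t21.. -- exactly the layout above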

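For completeness, a hedged sketch of what the stateful training itself might then look like in Keras. The feature count, the layer width of 32, and the all-zero targets y are illustrative assumptions, not part of the original question; the essential points are stateful=True, a fixed batch_input_shape, shuffle=False so the reordered rows stay in contiguous batches, and resetting state between epochs:

import numpy as np
from tensorflow import keras

m, n, batch_size, features = 6, 4, 2, 1

# Rebuild the reordered X from the sketch above and add a feature axis.
X = np.arange(1, m * n + 1, dtype="float32").reshape(m, n)
X = X.reshape(batch_size, m // batch_size, n).transpose(1, 0, 2).reshape(m, n)
X3d = X[..., None]                        # (samples, time steps, features)
y = np.zeros((m, 1), dtype="float32")     # placeholder targets

model = keras.Sequential([
    keras.layers.LSTM(32, stateful=True,
                      batch_input_shape=(batch_size, n, features)),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

for _ in range(5):
    # shuffle=False keeps the batch order the reordering set up.
    model.fit(X3d, y, batch_size=batch_size, shuffle=False, epochs=1, verbose=0)
    model.reset_states()    # the long sequence starts over each epoch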