TensorFlow Study Notes - Comparing Several LSTM Implementations

* tf.nn.rnn_cell.BasicLSTMCell
* tf.nn.static_rnn
* tf.nn.dynamic_rnn
* tf.contrib.cudnn_rnn
* tf.contrib.rnn.LSTMBlockCell
* tf.contrib.rnn.LSTMBlockFusedCell
* tf.contrib.rnn.BasicLSTMCell variants

1. BasicLSTMCell

tf.nn.rnn_cell.BasicLSTMCell is a reference (baseline) implementation. In general, it should not be your first choice.

The tf.nn.rnn_cell.BasicLSTMCell should be considered a reference
implementation and used only as a last resort when no other options will work.
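To make concrete what a basic LSTM cell actually computes, here is a minimal NumPy sketch of a single LSTM time step. It follows the standard LSTM equations with the gate ordering and default forget-gate bias (1.0) used by TF's BasicLSTMCell, but the function and variable names are illustrative, not the TF API:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b, forget_bias=1.0):
    """One LSTM time step (the math a basic LSTM cell evaluates).

    x: (batch, input_size); h_prev, c_prev: (batch, num_units)
    W: (input_size + num_units, 4 * num_units); b: (4 * num_units,)
    """
    z = np.concatenate([x, h_prev], axis=1) @ W + b
    i, j, f, o = np.split(z, 4, axis=1)  # input gate, candidate, forget gate, output gate
    c = sigmoid(f + forget_bias) * c_prev + sigmoid(i) * np.tanh(j)
    h = sigmoid(o) * np.tanh(c)
    return h, c

# Toy sizes, random weights
rng = np.random.default_rng(0)
batch, input_size, num_units = 2, 3, 4
W = rng.standard_normal((input_size + num_units, 4 * num_units)) * 0.1
b = np.zeros(4 * num_units)
h = np.zeros((batch, num_units))
c = np.zeros((batch, num_units))
x = rng.standard_normal((batch, input_size))
h, c = lstm_step(x, h, c, W, b)
print(h.shape, c.shape)  # (2, 4) (2, 4)
```

A "reference implementation" like this is easy to read but does one small matmul per gate group per step, which is exactly why the fused and cuDNN variants discussed below are so much faster.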

2. tf.nn.static_rnn vs tf.nn.dynamic_rnn

When using a single RNN cell rather than a fully fused RNN layer, tf.nn.dynamic_rnn should be the default choice. Its advantages:

1. With large inputs, tf.nn.static_rnn inflates the graph and can lengthen compile times.
2. tf.nn.dynamic_rnn handles long sequences well: it can optionally swap memory between the GPU and the CPU.

When using one of the cells, rather than the fully fused RNN layers, you have
a choice of whether to use tf.nn.static_rnn or tf.nn.dynamic_rnn. There
shouldn’t generally be a performance difference at runtime, but large unroll
amounts can increase the graph size of the tf.nn.static_rnn and cause long
compile times. An additional advantage of tf.nn.dynamic_rnn is that it can
optionally swap memory from the GPU to the CPU to enable training of very long
sequences. Depending on the model and hardware configuration, this can come at
a performance cost. It is also possible to run multiple iterations of
tf.nn.dynamic_rnn and the underlying tf.while_loop construct in parallel,
although this is rarely useful with RNN models as they are inherently
sequential.
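To make the static/dynamic distinction concrete, here is a framework-free NumPy sketch of dynamic-style unrolling: one set of cell weights, a runtime loop over time, and masking so that examples whose sequences have ended keep their final state (this mirrors what tf.nn.dynamic_rnn does with its sequence_length argument; the identifiers here are illustrative, not the TF API):

```python
import numpy as np

def run_dynamic(cell_fn, inputs, seq_len, state):
    """Loop over time at runtime, like tf.nn.dynamic_rnn, instead of
    unrolling one copy of the cell per step into the graph.

    inputs: (time, batch, input_size); seq_len: (batch,) ints.
    """
    time_steps = inputs.shape[0]
    outputs = []
    for t in range(time_steps):
        new_state = cell_fn(inputs[t], state)
        mask = (t < seq_len)[:, None].astype(inputs.dtype)
        state = mask * new_state + (1 - mask) * state  # freeze finished rows
        outputs.append(state)
    return np.stack(outputs), state

# Toy "cell": a single tanh RNN step stands in for an LSTM cell
rng = np.random.default_rng(1)
input_size, num_units = 3, 4
Wx = rng.standard_normal((input_size, num_units)) * 0.1
Wh = rng.standard_normal((num_units, num_units)) * 0.1
cell = lambda x, h: np.tanh(x @ Wx + h @ Wh)

inputs = rng.standard_normal((5, 2, input_size))  # time=5, batch=2
seq_len = np.array([5, 3])                        # second sequence ends early
outputs, final = run_dynamic(cell, inputs, seq_len, np.zeros((2, num_units)))
# The shorter sequence's state stops changing after its 3rd step:
assert np.allclose(outputs[2, 1], outputs[4, 1])
```

A static unroll would instead emit time_steps copies of the cell into the graph, which is where the graph-size and compile-time costs mentioned above come from.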

3. tf.contrib.cudnn_rnn

1. If the network will only ever run on NVIDIA GPUs, consider tf.contrib.cudnn_rnn. It is usually at least an order of magnitude faster than tf.contrib.rnn.BasicLSTMCell and tf.contrib.rnn.LSTMBlockCell, and it uses 3-4x less memory than tf.contrib.rnn.BasicLSTMCell.
2. If the network needs layer normalization, do not use tf.contrib.cudnn_rnn, which does not support it.

On NVIDIA GPUs, the use of tf.contrib.cudnn_rnn should always be preferred
unless you want layer normalization, which it doesn’t support. It is often at
least an order of magnitude faster than tf.contrib.rnn.BasicLSTMCell and
tf.contrib.rnn.LSTMBlockCell and uses 3-4x less memory than
tf.contrib.rnn.BasicLSTMCell.

4. tf.contrib.rnn.LSTMBlockCell

tf.contrib.rnn.LSTMBlockCell is commonly used in reinforcement learning, where the RNN is run one time step at a time. It is typically combined with tf.while_loop to interact with the environment.

If you need to run one step of the RNN at a time, as might be the case in
reinforcement learning with a recurrent policy, then you should use the
tf.contrib.rnn.LSTMBlockCell with your own environment interaction loop inside
a tf.while_loop construct. Running one step of the RNN at a time and returning
to Python is possible, but it will be slower.
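The one-step-at-a-time pattern can be sketched without any framework: the recurrent state is carried across environment interactions, and the cell advances exactly once per environment step. The weights, policy, and toy environment below are all hypothetical stand-ins (in TF 1.x this loop body would live inside a tf.while_loop with an LSTMBlockCell):

```python
import numpy as np

rng = np.random.default_rng(2)
num_units, obs_size, n_actions = 4, 3, 2

# Stand-ins for a recurrent policy's weights (illustrative only)
Wx = rng.standard_normal((obs_size, num_units)) * 0.1
Wh = rng.standard_normal((num_units, num_units)) * 0.1
Wa = rng.standard_normal((num_units, n_actions)) * 0.1

def policy_step(obs, h):
    """One recurrent-policy step: advance the RNN state, pick an action."""
    h = np.tanh(obs @ Wx + h @ Wh)
    action = int(np.argmax(h @ Wa))
    return action, h

def env_step(action):
    """Toy environment: returns a new observation (placeholder dynamics)."""
    return rng.standard_normal(obs_size)

h = np.zeros(num_units)           # RNN state carried across environment steps
obs = rng.standard_normal(obs_size)
actions = []
for t in range(10):               # the interaction loop (tf.while_loop in TF 1.x)
    action, h = policy_step(obs, h)
    actions.append(action)
    obs = env_step(action)
print(len(actions))  # 10
```

Keeping this loop inside the graph (via tf.while_loop) avoids a round trip to Python on every environment step, which is the performance point the quoted text makes.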

5. tf.contrib.rnn.LSTMBlockFusedCell

On CPUs, on mobile devices, and on machines where tf.contrib.cudnn_rnn is unavailable for the GPU, use tf.contrib.rnn.LSTMBlockFusedCell.

On CPUs, mobile devices, and if tf.contrib.cudnn_rnn is not available on your
GPU, the fastest and most memory efficient option is
tf.contrib.rnn.LSTMBlockFusedCell.

6. tf.contrib.rnn.BasicLSTMCell variants

The variants of tf.contrib.rnn.BasicLSTMCell, such as tf.contrib.rnn.NASCell,
tf.contrib.rnn.PhasedLSTMCell, tf.contrib.rnn.UGRNNCell,
tf.contrib.rnn.GLSTMCell, tf.contrib.rnn.Conv1DLSTMCell,
tf.contrib.rnn.Conv2DLSTMCell, tf.contrib.rnn.LayerNormBasicLSTMCell, etc.,
all share the drawbacks of tf.contrib.rnn.BasicLSTMCell: poor performance and high memory usage.

For all of the less common cell types like tf.contrib.rnn.NASCell,
tf.contrib.rnn.PhasedLSTMCell, tf.contrib.rnn.UGRNNCell,
tf.contrib.rnn.GLSTMCell, tf.contrib.rnn.Conv1DLSTMCell,
tf.contrib.rnn.Conv2DLSTMCell, tf.contrib.rnn.LayerNormBasicLSTMCell, etc., one
should be aware that they are implemented in the graph like
tf.contrib.rnn.BasicLSTMCell and as such will suffer from the same poor
performance and high memory usage. One should consider whether or not those
trade-offs are worth it before using these cells. For example, while layer
normalization can speed up convergence, because cuDNN is 20x faster, the fastest
wall clock time to convergence is usually obtained without it.
