Citations: 1651 (2020/03/01), 4827 (2022/03/26)
Publication year: 2015

Contents

Abstract
1 Introduction
2 Preliminaries
- 2.1 Formulation of Precipitation Nowcasting Problem
- 2.2 Long Short-Term Memory for Sequence Modeling
3 The Model
- 3.1 Convolutional LSTM
- 3.2 Encoding-Forecasting Structure
4 Experiments
- 4.1 Moving-MNIST Dataset
- 4.2 Radar Echo Dataset
5 Conclusion and Future Work
References

Abstract

The goal of precipitation nowcasting is to predict the future rainfall intensity in a local region over a relatively short period of time. Very few previous studies have examined this crucial and challenging weather forecasting problem from the machine learning perspective. In this paper, we formulate precipitation nowcasting as a spatiotemporal sequence forecasting problem in which both the input and the prediction target are spatiotemporal sequences. By extending the fully connected LSTM (FC-LSTM) to have convolutional structures in both the input-to-state and state-to-state transitions, we propose the convolutional LSTM (ConvLSTM) and use it to build an end-to-end trainable model for the precipitation nowcasting problem. Experiments show that our ConvLSTM network captures spatiotemporal correlations better and consistently outperforms FC-LSTM and the state-of-the-art operational ROVER algorithm for precipitation nowcasting.

1 Introduction

Nowcasting convective precipitation has long been an important problem in the field of weather forecasting. The goal of this task is to give precise and timely prediction of rainfall intensity in a local region over a relatively short period of time (e.g., 0-6 hours). It is essential for taking such timely actions as generating society-level emergency rainfall alerts, producing weather guidance for airports, and seamless integration with a longer-term numerical weather prediction (NWP) model. Since the forecasting resolution and time accuracy required are much higher than other traditional forecasting tasks like weekly average temperature prediction, the precipitation nowcasting problem is quite challenging and has emerged as a hot research topic in the meteorology community [22].

Existing methods for precipitation nowcasting can roughly be categorized into two classes [22], namely, NWP based methods and radar echo extrapolation based methods. For the NWP approach, making predictions at the nowcasting timescale requires a complex and meticulous simulation of the physical equations in the atmosphere model. Thus the current state-of-the-art operational precipitation nowcasting systems [19, 6] often adopt the faster and more accurate extrapolation based methods. Specifically, some computer vision techniques, especially optical flow based methods, have proven useful for making accurate extrapolation of radar maps [10, 6, 20]. One recent progress along this path is the Real-time Optical flow by Variational methods for Echoes of Radar (ROVER) algorithm [25] proposed by the Hong Kong Observatory (HKO) for its Short-range Warning of Intense Rainstorms in Localized System (SWIRLS) [15]. ROVER calculates the optical flow of consecutive radar maps using the algorithm in [5] and performs semi-Lagrangian advection [4] on the flow field, which is assumed to be still, to accomplish the prediction. However, the success of these optical flow based methods is limited because the flow estimation step and the radar echo extrapolation step are separated and it is challenging to determine the model parameters to give good prediction performance.

These technical issues may be addressed by viewing the problem from the machine learning perspective. In essence, precipitation nowcasting is a spatiotemporal sequence forecasting problem with the sequence of past radar maps as input and the sequence of a fixed number (usually larger than 1) of future radar maps as output. However, such learning problems, regardless of their exact applications, are nontrivial in the first place due to the high dimensionality of the spatiotemporal sequences especially when multi-step predictions have to be made, unless the spatiotemporal structure of the data is captured well by the prediction model. Moreover, building an effective prediction model for the radar echo data is even more challenging due to the chaotic nature of the atmosphere.

Recent advances in deep learning, especially recurrent neural network (RNN) and long short-term memory (LSTM) models [12, 11, 7, 8, 23, 13, 18, 21, 26], provide some useful insights on how to tackle this problem. According to the philosophy underlying the deep learning approach, if we have a reasonable end-to-end model and sufficient data for training it, we are close to solving the problem. The precipitation nowcasting problem satisfies the data requirement because it is easy to collect a huge amount of radar echo data continuously. What is needed is a suitable model for end-to-end learning. The pioneering LSTM encoder-decoder framework proposed in [23] provides a general framework for sequence-to-sequence learning problems by training temporally concatenated LSTMs, one for the input sequence and another for the output sequence. In [18], it is shown that prediction of the next video frame and interpolation of intermediate frames can be done by building an RNN based language model on the visual words obtained by quantizing the image patches. They propose a recurrent convolutional neural network to model the spatial relationships but the model only predicts one frame ahead and the size of the convolutional kernel used for state-to-state transition is restricted to 1. Their work is followed up later in [21] which points out the importance of multi-step prediction in learning useful representations. They build an LSTM encoder-decoder-predictor model which reconstructs the input sequence and predicts the future sequence simultaneously. Although their method can also be used to solve our spatiotemporal sequence forecasting problem, the fully connected LSTM (FC-LSTM) layer adopted by their model does not take spatial correlation into consideration.

In this paper, we propose a novel convolutional LSTM (ConvLSTM) network for precipitation nowcasting. We formulate precipitation nowcasting as a spatiotemporal sequence forecasting problem that can be solved under the general sequence-to-sequence learning framework proposed in [23]. In order to model well the spatiotemporal relationships, we extend the idea of FC-LSTM to ConvLSTM which has convolutional structures in both the input-to-state and state-to-state transitions. By stacking multiple ConvLSTM layers and forming an encoding-forecasting structure, we can build an end-to-end trainable model for precipitation nowcasting. For evaluation, we have created a new real-life radar echo dataset which can facilitate further research especially on devising machine learning algorithms for the problem. When evaluated on a synthetic Moving-MNIST dataset [21] and the radar echo dataset, our ConvLSTM model consistently outperforms both the FC-LSTM and the state-of-the-art operational ROVER algorithm.

2 Preliminaries

2.1 Formulation of Precipitation Nowcasting Problem

The goal of precipitation nowcasting is to use the previously observed radar echo sequence to forecast a fixed length of the future radar maps in a local region (e.g., Hong Kong, New York, or Tokyo). In real applications, the radar maps are usually taken from the weather radar every 6-10 minutes and nowcasting is done for the following 1-6 hours, i.e., to predict the 6-60 frames ahead. From the machine learning perspective, this problem can be regarded as a spatiotemporal sequence forecasting problem.

Suppose we observe a dynamical system over a spatial region represented by an $M \times N$ grid which consists of $M$ rows and $N$ columns. Inside each cell in the grid, there are $P$ measurements which vary over time. Thus, the observation at any time can be represented by a tensor $\chi \in \mathbf{R}^{P \times M \times N}$, where $\mathbf{R}$ denotes the domain of the observed features. If we record the observations periodically, we will get a sequence of tensors $\hat{\chi}_1, \hat{\chi}_2, \ldots, \hat{\chi}_t$. The spatiotemporal sequence forecasting problem is to predict the most likely length-$K$ sequence in the future given the previous $J$ observations which include the current one:
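
Written out, the objective described in the sentence above can be stated as follows. This is a reconstruction from the prose (the most likely length-$K$ future sequence given the last $J$ observations), with tildes marking predictions and hats marking observations; the numbering (1) is inferred from the later references to equations (2) and (3).

```latex
\tilde{\chi}_{t+1}, \ldots, \tilde{\chi}_{t+K}
  = \underset{\chi_{t+1}, \ldots, \chi_{t+K}}{\arg\max}\;
    p\left(\chi_{t+1}, \ldots, \chi_{t+K} \,\middle|\,
           \hat{\chi}_{t-J+1}, \hat{\chi}_{t-J+2}, \ldots, \hat{\chi}_{t}\right)
  \tag{1}
```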

For precipitation nowcasting, the observation at every timestamp is a 2D radar echo map. If we divide the map into tiled non-overlapping patches and view the pixels inside a patch as its measurements (see Fig. 1), the nowcasting problem naturally becomes a spatiotemporal sequence forecasting problem.
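
The patch view can be illustrated with a short NumPy sketch. It assumes a single-channel frame whose height and width are divisible by the patch size; the helper names `to_patches` and `from_patches` and the ordering of pixels inside a patch are choices made for this illustration, not something the text specifies. With a 64 × 64 frame and 4 × 4 patches (the Moving-MNIST setting used later), the result is a 16 × 16 × 16 tensor.

```python
import numpy as np

def to_patches(frame, patch):
    """Rearrange an (H, W) frame into a (patch*patch, H//patch, W//patch) tensor.

    Each grid cell corresponds to one non-overlapping patch, and the pixels
    inside that patch become the P = patch*patch measurements of the cell.
    """
    h, w = frame.shape
    assert h % patch == 0 and w % patch == 0, "frame must tile evenly"
    m, n = h // patch, w // patch
    # Split rows and columns into (cell index, offset inside the cell).
    x = frame.reshape(m, patch, n, patch)
    # Bring the two within-patch offsets to the front and flatten them.
    return x.transpose(1, 3, 0, 2).reshape(patch * patch, m, n)

def from_patches(tensor, patch):
    """Inverse of to_patches: rebuild the (H, W) frame."""
    _, m, n = tensor.shape
    x = tensor.reshape(patch, patch, m, n).transpose(2, 0, 3, 1)
    return x.reshape(m * patch, n * patch)

frame = np.arange(64 * 64, dtype=np.float32).reshape(64, 64)
tensor = to_patches(frame, 4)
print(tensor.shape)  # (16, 16, 16)
```

Because the rearrangement is lossless, a prediction made in the patch representation can be mapped back to pixel space with `from_patches`.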

We note that our spatiotemporal sequence forecasting problem is different from the one-step time series forecasting problem because the prediction target of our problem is a sequence which contains both spatial and temporal structures. Although the number of free variables in a length-$K$ sequence can be up to $O(M^K N^K P^K)$, in practice we may exploit the structure of the space of possible predictions to reduce the dimensionality and hence make the problem tractable.

2.2 Long Short-Term Memory for Sequence Modeling

For general-purpose sequence modeling, LSTM as a special RNN structure has proven stable and powerful for modeling long-range dependencies in various previous studies [12, 11, 17, 23]. The major innovation of LSTM is its memory cell $c_t$ which essentially acts as an accumulator of the state information. The cell is accessed, written and cleared by several self-parameterized controlling gates. Every time a new input comes, its information will be accumulated to the cell if the input gate $i_t$ is activated. Also, the past cell status $c_{t-1}$ could be “forgotten” in this process if the forget gate $f_t$ is on. Whether the latest cell output $c_t$ will be propagated to the final state $h_t$ is further controlled by the output gate $o_t$. One advantage of using the memory cell and gates to control information flow is that the gradient will be trapped in the cell (also known as constant error carousels [12]) and be prevented from vanishing too quickly, which is a critical problem for the vanilla RNN model [12, 17, 2]. FC-LSTM may be seen as a multivariate version of LSTM where the input, cell output and states are all 1D vectors. In this paper, we follow the formulation of FC-LSTM as in [11]. The key equations are shown in (2) below, where ‘◦’ denotes the Hadamard product:
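
For reference, the gate updates described above can be written as follows. This is a reconstruction in the standard peephole LSTM form (gates that also see the cell state through Hadamard products), which is assumed here to be the formulation of [11]; $\sigma$ is the logistic sigmoid, and the weight and bias names $W_{\cdot\cdot}$, $b_{\cdot}$ are notation introduced for this sketch.

```latex
\begin{aligned}
i_t &= \sigma\left(W_{xi} x_t + W_{hi} h_{t-1} + W_{ci} \circ c_{t-1} + b_i\right) \\
f_t &= \sigma\left(W_{xf} x_t + W_{hf} h_{t-1} + W_{cf} \circ c_{t-1} + b_f\right) \\
c_t &= f_t \circ c_{t-1} + i_t \circ \tanh\left(W_{xc} x_t + W_{hc} h_{t-1} + b_c\right) \\
o_t &= \sigma\left(W_{xo} x_t + W_{ho} h_{t-1} + W_{co} \circ c_t + b_o\right) \\
h_t &= o_t \circ \tanh(c_t)
\end{aligned}
\tag{2}
```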

Multiple LSTMs can be stacked and temporally concatenated to form more complex structures. Such models have been applied to solve many real-life sequence modeling problems [23, 26].

3 The Model

We now present our ConvLSTM network. Although the FC-LSTM layer has proven powerful for handling temporal correlation, it contains too much redundancy for spatial data. To address this problem, we propose an extension of FC-LSTM which has convolutional structures in both the input-to-state and state-to-state transitions. By stacking multiple ConvLSTM layers and forming an encoding-forecasting structure, we are able to build a network model not only for the precipitation nowcasting problem but also for more general spatiotemporal sequence forecasting problems.

3.1 Convolutional LSTM

The major drawback of FC-LSTM in handling spatiotemporal data is its usage of full connections in input-to-state and state-to-state transitions in which no spatial information is encoded. To overcome this problem, a distinguishing feature of our design is that all the inputs $\chi_1, \ldots, \chi_t$, cell outputs $C_1, \ldots, C_t$, hidden states $H_1, \ldots, H_t$, and gates $i_t, f_t, o_t$ of the ConvLSTM are 3D tensors whose last two dimensions are spatial dimensions (rows and columns). To get a better picture of the inputs and states, we may imagine them as vectors standing on a spatial grid. The ConvLSTM determines the future state of a certain cell in the grid by the inputs and past states of its local neighbors. This can easily be achieved by using a convolution operator in the state-to-state and input-to-state transitions (see Fig. 2). The key equations of ConvLSTM are shown in (3) below, where ‘∗’ denotes the convolution operator and ‘◦’ denotes the Hadamard product:
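
The ConvLSTM updates can be written out as below. This is a reconstruction that takes the peephole LSTM gate structure and replaces every matrix product in the input-to-state and state-to-state transitions with a convolution, as the paragraph above describes; the weight and bias names are notation introduced for this sketch.

```latex
\begin{aligned}
i_t &= \sigma\left(W_{xi} * \chi_t + W_{hi} * H_{t-1} + W_{ci} \circ C_{t-1} + b_i\right) \\
f_t &= \sigma\left(W_{xf} * \chi_t + W_{hf} * H_{t-1} + W_{cf} \circ C_{t-1} + b_f\right) \\
C_t &= f_t \circ C_{t-1} + i_t \circ \tanh\left(W_{xc} * \chi_t + W_{hc} * H_{t-1} + b_c\right) \\
o_t &= \sigma\left(W_{xo} * \chi_t + W_{ho} * H_{t-1} + W_{co} \circ C_t + b_o\right) \\
H_t &= o_t \circ \tanh(C_t)
\end{aligned}
\tag{3}
```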

If we view the states as the hidden representations of moving objects, a ConvLSTM with a larger transitional kernel should be able to capture faster motions while one with a smaller kernel can capture slower motions. Also, if we adopt a similar view as [16], the inputs, cell outputs and hidden states of the traditional FC-LSTM represented by (2) may also be seen as 3D tensors with the last two dimensions being 1. In this sense, FC-LSTM is actually a special case of ConvLSTM with all features standing on a single cell.

To ensure that the states have the same number of rows and same number of columns as the inputs, padding is needed before applying the convolution operation. Here, padding of the hidden states on the boundary points can be viewed as using the state of the outside world for calculation. Usually, before the first input comes, we initialize all the states of the LSTM to zero which corresponds to “total ignorance” of the future. Similarly, if we perform zero-padding (which is used in this paper) on the hidden states, we are actually setting the state of the outside world to zero and assume no prior knowledge about the outside. By padding on the states, we can treat the boundary points differently, which is helpful in many cases. For example, imagine that the system we are observing is a moving ball surrounded by walls. Although we cannot see these walls, we can infer their existence by finding the ball bouncing over them again and again, which can hardly be done if the boundary points have the same state transition dynamics as the inner points.
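
A single ConvLSTM step with zero padding and zero-initialized states can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the authors' Theano implementation: the gates are assumed to follow the peephole LSTM form with convolutions in place of matrix products, the "convolution" is computed as cross-correlation (as most deep learning libraries do), and the helper names (`conv_same`, `init_params`, `convlstm_step`) and the random initialization are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def conv_same(x, w):
    """Zero-padded 2D cross-correlation that preserves the spatial size.

    x: (C_in, M, N) input; w: (C_out, C_in, k, k) kernel with odd k.
    Zero padding corresponds to treating the state outside the grid as zero.
    """
    c_out, _, k, _ = w.shape
    p = k // 2
    m, n = x.shape[1:]
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    out = np.zeros((c_out, m, n))
    for di in range(k):
        for dj in range(k):
            # Contribution of kernel offset (di, dj) at every grid position.
            out += np.einsum('oc,cmn->omn', w[:, :, di, dj],
                             xp[:, di:di + m, dj:dj + n])
    return out

def init_params(rng, c_in, c_hid, m, n, k=5, scale=0.1):
    """Random parameters for the gates i, f, o and the cell candidate c."""
    return {
        'Wx': {g: scale * rng.standard_normal((c_hid, c_in, k, k)) for g in 'ifco'},
        'Wh': {g: scale * rng.standard_normal((c_hid, c_hid, k, k)) for g in 'ifco'},
        'Wc': {g: scale * rng.standard_normal((c_hid, m, n)) for g in 'ifo'},
        'b': {g: np.zeros((c_hid, 1, 1)) for g in 'ifco'},
    }

def convlstm_step(x, h_prev, c_prev, p):
    """Advance the hidden state and cell output by one time step."""
    Wx, Wh, Wc, b = p['Wx'], p['Wh'], p['Wc'], p['b']
    i = sigmoid(conv_same(x, Wx['i']) + conv_same(h_prev, Wh['i'])
                + Wc['i'] * c_prev + b['i'])
    f = sigmoid(conv_same(x, Wx['f']) + conv_same(h_prev, Wh['f'])
                + Wc['f'] * c_prev + b['f'])
    c = f * c_prev + i * np.tanh(conv_same(x, Wx['c'])
                                 + conv_same(h_prev, Wh['c']) + b['c'])
    o = sigmoid(conv_same(x, Wx['o']) + conv_same(h_prev, Wh['o'])
                + Wc['o'] * c + b['o'])
    h = o * np.tanh(c)
    return h, c

rng = np.random.default_rng(0)
params = init_params(rng, c_in=16, c_hid=8, m=16, n=16)
h = np.zeros((8, 16, 16))  # all states start at zero before the first input
c = np.zeros((8, 16, 16))
for _ in range(3):
    x = rng.random((16, 16, 16))
    h, c = convlstm_step(x, h, c, params)
print(h.shape)  # (8, 16, 16)
```

The states keep the 16 × 16 spatial size of the input at every step because of the padding in `conv_same`; only the number of channels changes.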

3.2 Encoding-Forecasting Structure

Like FC-LSTM, ConvLSTM can also be adopted as a building block for more complex structures. For our spatiotemporal sequence forecasting problem, we use the structure shown in Fig. 3 which consists of two networks, an encoding network and a forecasting network. Like in [21], the initial states and cell outputs of the forecasting network are copied from the last state of the encoding network. Both networks are formed by stacking several ConvLSTM layers. As our prediction target has the same dimensionality as the input, we concatenate all the states in the forecasting network and feed them into a 1 × 1 convolutional layer to generate the final prediction.

We can interpret this structure using a similar viewpoint as [23]. The encoding LSTM compresses the whole input sequence into a hidden state tensor and the forecasting LSTM unfolds this hidden state to give the final prediction:
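
In symbols, this interpretation can be sketched as follows. The expression is a reconstruction from the sentence above: the forecasting objective is approximated by first encoding the observed sequence and then unfolding the encoding. The labels $f_{\text{encoding}}$ and $g_{\text{forecasting}}$ for the two networks, and the numbering (4), are assumptions made for this sketch.

```latex
\begin{aligned}
\tilde{\chi}_{t+1}, \ldots, \tilde{\chi}_{t+K}
 &= \underset{\chi_{t+1}, \ldots, \chi_{t+K}}{\arg\max}\;
    p\left(\chi_{t+1}, \ldots, \chi_{t+K} \,\middle|\,
           \hat{\chi}_{t-J+1}, \ldots, \hat{\chi}_{t}\right) \\
 &\approx \underset{\chi_{t+1}, \ldots, \chi_{t+K}}{\arg\max}\;
    p\left(\chi_{t+1}, \ldots, \chi_{t+K} \,\middle|\,
           f_{\text{encoding}}(\hat{\chi}_{t-J+1}, \ldots, \hat{\chi}_{t})\right) \\
 &\approx g_{\text{forecasting}}\left(
           f_{\text{encoding}}(\hat{\chi}_{t-J+1}, \ldots, \hat{\chi}_{t})\right)
\end{aligned}
\tag{4}
```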

This structure is also similar to the LSTM future predictor model in [21] except that our input and output elements are all 3D tensors which preserve all the spatial information. Since the network has multiple stacked ConvLSTM layers, it has strong representational power which makes it suitable for giving predictions in complex dynamical systems like the precipitation nowcasting problem we study here.

4 Experiments

We first compare our ConvLSTM network with the FC-LSTM network on a synthetic Moving-MNIST dataset to gain some basic understanding of the behavior of our model. We run our model with different numbers of layers and kernel sizes and also study some “out-of-domain” cases as in [21]. To verify the effectiveness of our model on the more challenging precipitation nowcasting problem, we build a new radar echo dataset and compare our model with the state-of-the-art ROVER algorithm based on several commonly used precipitation nowcasting metrics. The results of the experiments conducted on these two datasets lead to the following findings:

- ConvLSTM is better than FC-LSTM in handling spatiotemporal correlations.
- Making the size of state-to-state convolutional kernel bigger than 1 is essential for capturing the spatiotemporal motion patterns.
- Deeper models can produce better results with fewer parameters.
- ConvLSTM performs better than ROVER for precipitation nowcasting.

Our implementations of the models are in Python with the help of Theano [3, 1]. We run all the experiments on a computer with a single NVIDIA K20 GPU. Also, more illustrative “gif” examples are included in the appendix.

4.1 Moving-MNIST Dataset

For this synthetic dataset, we use a generation process similar to that described in [21]. All data instances in the dataset are 20 frames long (10 frames for the input and 10 frames for the prediction) and contain two handwritten digits bouncing inside a 64 × 64 patch. The moving digits are chosen randomly from a subset of 500 digits in the MNIST dataset. The starting position and velocity direction are chosen uniformly at random and the velocity amplitude is chosen randomly in $[3, 5)$. This generation process is repeated 15000 times, resulting in a dataset with 10000 training sequences, 2000 validation sequences, and 3000 testing sequences. We train all the LSTM models by minimizing the cross-entropy loss using back-propagation through time (BPTT) [2] and RMSProp [24] with a learning rate of $10^{-3}$ and a decay rate of $0.9$. Also, we perform early-stopping on the validation set.
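
The generation process can be sketched as follows. This is a simplified illustration rather than the generator of [21]: random 28 × 28 sprites stand in for MNIST digits, the bounce is modeled as a mirror reflection at the frame border, overlapping sprites are merged with a pixel-wise maximum, and the function names are hypothetical. Only the frame size, sequence length, number of moving objects, and the speed range $[3, 5)$ come from the description above.

```python
import numpy as np

def bouncing_track(rng, n_frames=20, canvas=64, size=28):
    """Top-left corner positions of one sprite bouncing inside the canvas."""
    limit = canvas - size                      # largest valid top-left coordinate
    pos = rng.uniform(0, limit, size=2)        # uniform starting position
    angle = rng.uniform(0, 2 * np.pi)          # uniform velocity direction
    speed = rng.uniform(3, 5)                  # velocity amplitude in [3, 5)
    vel = speed * np.array([np.cos(angle), np.sin(angle)])
    track = np.empty((n_frames, 2))
    for t in range(n_frames):
        track[t] = pos
        pos = pos + vel
        for d in range(2):
            # Mirror the position and flip the velocity when a wall is crossed.
            if pos[d] < 0:
                pos[d], vel[d] = -pos[d], -vel[d]
            elif pos[d] > limit:
                pos[d], vel[d] = 2 * limit - pos[d], -vel[d]
    return track

def make_sequence(rng, sprites, n_frames=20, canvas=64):
    """Render all sprites along independent bouncing tracks."""
    frames = np.zeros((n_frames, canvas, canvas))
    for sprite in sprites:
        size = sprite.shape[0]
        track = bouncing_track(rng, n_frames, canvas, size)
        for t in range(n_frames):
            r, c = np.rint(track[t]).astype(int)
            window = frames[t, r:r + size, c:c + size]
            # Overlapping sprites occlude each other via a pixel-wise maximum.
            frames[t, r:r + size, c:c + size] = np.maximum(window, sprite)
    return frames

rng = np.random.default_rng(0)
sprites = [rng.random((28, 28)) for _ in range(2)]  # stand-ins for two digits
frames = make_sequence(rng, sprites)
inputs, targets = frames[:10], frames[10:]          # 10 input + 10 target frames
print(frames.shape)  # (20, 64, 64)
```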

Despite the simple generation process, there exist strong nonlinearities in the resulting dataset because the moving digits can exhibit complicated appearance and will occlude and bounce during their movement. It is hard for a model to give accurate predictions on the test set without learning the inner dynamics of the system.

For the FC-LSTM network, we use the same structure as the unconditional future predictor model in [21] with two 2048-node LSTM layers. For our ConvLSTM network, we set the patch size to 4 × 4 so that each 64 × 64 frame is represented by a 16 × 16 × 16 tensor. We test three variants of our model with different numbers of layers. The 1-layer network contains one ConvLSTM layer with 256 hidden states, the 2-layer network has two ConvLSTM layers with 128 hidden states each, and the 3-layer network has 128, 64, and 64 hidden states respectively in the three ConvLSTM layers. All the input-to-state and state-to-state kernels are of size 5 × 5. Our experiments show that the ConvLSTM networks perform consistently better than the FC-LSTM network. Also, deeper models can give better results although the improvement is not so significant between the 2-layer and 3-layer networks. Moreover, we also try other network configurations with the state-to-state and input-to-state kernels of the 2-layer and 3-layer networks changed to 1 × 1 and 9 × 9, respectively. Although the number of parameters of the new 2-layer network is close to the original one, the result becomes much worse because it is hard to capture the spatiotemporal motion patterns with only 1 × 1 state-to-state transition. Meanwhile, the new 3-layer network performs better than the new 2-layer network since the higher layer can see a wider scope of the input. Nevertheless, its performance is inferior to networks with larger state-to-state kernel size. This provides evidence that larger state-to-state kernels are more suitable for capturing spatiotemporal correlations. In fact, for 1 × 1 kernel, the receptive field of the states will not grow as time advances. But for larger kernels, later states have larger receptive fields and are related to a wider range of the input. The average cross-entropy loss (cross-entropy loss per sequence) of each algorithm on the test set is shown in Table 1. We need to point out that our experiment setting is different from [21] where an infinite number of training data is assumed to be available. The current offline setting is chosen in order to understand how different models perform on occasions where not so much data is available. Comparison of the 3-layer ConvLSTM and FC-LSTM in the online setting is included in the appendix.
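
The receptive-field argument can be made concrete with a small calculation. The sketch below assumes a single ConvLSTM layer with stride-1 square kernels and ignores the grid boundary; the function name is hypothetical. A state at the current step sees the current input through the input-to-state kernel, and each step back in time adds one state-to-state convolution, which widens the window by the kernel size minus one.

```python
def receptive_field(k_input, k_state, steps_back):
    """Side length of the input window, `steps_back` steps in the past,
    that can influence one position of the current hidden state."""
    return k_input + steps_back * (k_state - 1)

for steps in range(4):
    small = receptive_field(k_input=9, k_state=1, steps_back=steps)
    large = receptive_field(k_input=5, k_state=5, steps_back=steps)
    print(steps, small, large)
```

With a 1 × 1 state-to-state kernel the window stays at the input-to-state kernel size no matter how far back the input lies, whereas with a 5 × 5 state-to-state kernel it grows by 4 per step until it covers the whole grid.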

Next, we test our model on some “out-of-domain” inputs. We generate another 3000 sequences of three moving digits, with the digits drawn randomly from a different subset of 500 MNIST digits that does not overlap with the training set. Since the model has never seen any system with three digits, such an “out-of-domain” run is a good test of the generalization ability of the model [21]. The average cross-entropy error of the 3-layer model on this dataset is 6379.42. By observing some of the prediction results, we find that the model can separate the overlapping digits successfully and predict the overall motion although the predicted digits are quite blurred. One “out-of-domain” prediction example is shown in Fig. 4.

4.2 Radar Echo Dataset

The radar echo dataset used in this paper is a subset of the three-year weather radar intensities collected in Hong Kong from 2011 to 2013. Since not every day is rainy and our nowcasting target is precipitation, we select the top 97 rainy days to form our dataset. For preprocessing, we first transform the intensity values $Z$ to gray-level pixels $P$ by setting $P = \frac{Z - \min\{Z\}}{\max\{Z\} - \min\{Z\}}$ and crop the radar maps in the central 330 × 330 region. After that, we apply the disk filter with radius 10 and resize the radar maps to 100 × 100. To reduce the noise caused by measuring instruments, we further remove the pixel values of some noisy regions which are determined by applying K-means clustering to the monthly pixel average. The weather radar data is recorded every 6 minutes, so there are 240 frames per day. To get disjoint subsets for training, testing and validation, we partition each daily sequence into non-overlapping blocks of 40 frames (6 blocks per day) and randomly assign 4 blocks for training, 1 block for testing and 1 block for validation. The data instances are sliced from these blocks using a 20-frame-wide sliding window. Thus our radar echo dataset contains 8148 training sequences, 2037 testing sequences and 2037 validation sequences and all the sequences are 20 frames long (5 for the input and 15 for the prediction). Although the training and testing instances sliced from the same day may have some dependencies, this splitting strategy is still reasonable because in real-life nowcasting, we do have access to all previous data, including data from the same day, which allows us to apply online fine-tuning of the model. Such data splitting may be viewed as an approximation of the real-life “fine-tuning-enabled” setting for this application.
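
The sequence counts follow directly from the splitting scheme, which the short check below reproduces. It assumes that the sliding window advances one frame at a time and never crosses a block boundary; the variable and function names are illustrative.

```python
import numpy as np

days = 97
frames_per_day = 24 * 60 // 6                    # one frame every 6 minutes -> 240
block_len = 40                                   # frames per non-overlapping block
window = 20                                      # frames per data instance
blocks_per_day = frames_per_day // block_len     # 6 blocks: 4 train, 1 test, 1 validation
windows_per_block = block_len - window + 1       # 21 sliding-window positions

n_train = days * 4 * windows_per_block
n_test = days * 1 * windows_per_block
n_valid = days * 1 * windows_per_block
print(n_train, n_test, n_valid)  # 8148 2037 2037

def slice_block(block, window=20, n_input=5):
    """Cut one block of frames into (input, target) pairs with a stride-1 window."""
    pairs = []
    for start in range(len(block) - window + 1):
        seq = block[start:start + window]
        pairs.append((seq[:n_input], seq[n_input:]))
    return pairs

block = np.zeros((block_len, 100, 100))          # placeholder for 40 radar maps
pairs = slice_block(block)
print(len(pairs), pairs[0][0].shape, pairs[0][1].shape)
```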

We set the patch size to 2 and train a 2-layer ConvLSTM network with each layer containing 64 hidden states and 3 × 3 kernels. For the ROVER algorithm, we tune the parameters of the optical flow estimator on the validation set and use the best parameters (shown in the appendix) to report the test results. Also, we try three different initialization schemes for ROVER: ROVER1 computes the optical flow of the last two observed frames and performs semi-Lagrangian advection afterwards; ROVER2 initializes the velocity by the mean of the last two flow fields; and ROVER3 gives the initialization by a weighted average (with weights 0.7, 0.2 and 0.1) of the last three flow fields. In addition, we train an FC-LSTM network with two 2000-node LSTM layers. Both the ConvLSTM network and the FC-LSTM network optimize the cross-entropy error of 15 predictions.

We evaluate these methods using several commonly used precipitation nowcasting metrics, namely, rainfall mean squared error (Rainfall-MSE), critical success index (CSI), false alarm rate (FAR), probability of detection (POD), and correlation. The Rainfall-MSE metric is defined as the average squared error between the predicted rainfall and the ground truth. Since our predictions are done at the pixel level, we project them back to radar echo intensities and calculate the rainfall at every cell of the grid using the Z-R relationship [15]: $Z = 10 \log a + 10 b \log R$, where $Z$ is the radar echo intensity in dB, $R$ is the rainfall rate in mm/h, and $a$, $b$ are two constants with $a = 118.239$, $b = 1.5241$. The CSI, FAR and POD are skill scores similar to precision and recall commonly used by machine learning researchers. We convert the prediction and ground truth to a 0/1 matrix using a threshold of 0.5 mm/h rainfall rate (indicating raining or not) and calculate the hits (prediction = 1, truth = 1), misses (prediction = 0, truth = 1) and false alarms (prediction = 1, truth = 0). The three skill scores are defined as $\text{CSI} = \frac{\text{hits}}{\text{hits} + \text{misses} + \text{false alarms}}$, $\text{FAR} = \frac{\text{false alarms}}{\text{hits} + \text{false alarms}}$, $\text{POD} = \frac{\text{hits}}{\text{hits} + \text{misses}}$. The correlation of a predicted frame $P$ and a ground-truth frame $T$ is defined as $\frac{\sum_{i,j} P_{i,j} T_{i,j}}{\sqrt{\left(\sum_{i,j} P_{i,j}^2\right)\left(\sum_{i,j} T_{i,j}^2\right)} + \varepsilon}$ where $\varepsilon = 10^{-9}$.
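
These definitions translate into a few lines of NumPy. The sketch below makes several assumptions that the text does not settle: the logarithm in the Z-R relationship is taken as base 10, a cell counts as rainy when its rate is at least the 0.5 mm/h threshold, and no guard is included for frames with no rain events (where the ratios are undefined). The function names are illustrative.

```python
import numpy as np

A, B = 118.239, 1.5241  # Z-R constants a and b

def rain_to_dbz(r):
    """Z = 10 log10(a) + 10 b log10(R), with R in mm/h."""
    return 10.0 * np.log10(A) + 10.0 * B * np.log10(r)

def dbz_to_rain(z):
    """Invert the Z-R relationship to obtain the rainfall rate in mm/h."""
    return 10.0 ** ((z - 10.0 * np.log10(A)) / (10.0 * B))

def skill_scores(pred_rain, true_rain, threshold=0.5):
    """CSI, FAR and POD from thresholded rainfall maps."""
    p = pred_rain >= threshold
    t = true_rain >= threshold
    hits = np.sum(p & t)
    misses = np.sum(~p & t)
    false_alarms = np.sum(p & ~t)
    csi = hits / (hits + misses + false_alarms)
    far = false_alarms / (hits + false_alarms)
    pod = hits / (hits + misses)
    return csi, far, pod

def rainfall_mse(pred_rain, true_rain):
    """Average squared error between predicted and true rainfall."""
    return np.mean((pred_rain - true_rain) ** 2)

def correlation(p, t, eps=1e-9):
    """Correlation of a predicted frame P and a ground-truth frame T."""
    return np.sum(p * t) / (np.sqrt(np.sum(p ** 2) * np.sum(t ** 2)) + eps)

# Toy 2 x 2 example: one hit, one false alarm, one miss, one correct negative.
pred = np.array([[1.0, 1.0], [0.0, 0.0]])
true = np.array([[1.0, 0.0], [1.0, 0.0]])
csi, far, pod = skill_scores(pred, true)
print(csi, far, pod, correlation(pred, true), rainfall_mse(pred, true))
```

In the toy example there is one hit, one miss and one false alarm, so CSI is 1/3 while FAR and POD are both 1/2.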

我们使用几个常用的降水临近预报指标来评估这些方法，即降雨均方误差(Rainfall-MSE)、临界成功指数(CSI)、虚警率(FAR)、探测概率(POD)和相关性。降雨均方误差指标被定义为预测降雨量和真实情况之间的平均平方误差。由于我们的预测是在像素级完成的，我们将它们投影回雷达回波强度，并使用Z-R关系计算网格每个单元的降雨量[15]: $Z = 10 \log a + 10 b \log R$，其中 $Z$ 是雷达回波强度，单位为分贝，$R$ 是降雨率，单位为毫米/小时，$a$、$b$ 是两个常数，$a = 118.239$，$b = 1.5241$。CSI、FAR和POD是类似于机器学习研究人员常用的精度和召回率的技能分数。我们使用0.5毫米/小时的降雨率阈值(指示是否下雨)将预测和真实情况转换为0/1矩阵，并计算 hits (prediction = 1, truth = 1)、misses (prediction = 0, truth = 1) 和 false alarms (prediction = 1, truth = 0)。这三个技能分数被定义为 $\text{CSI} = \frac{\text{hits}}{\text{hits} + \text{misses} + \text{false alarms}}$，$\text{FAR} = \frac{\text{false alarms}}{\text{hits} + \text{false alarms}}$，$\text{POD} = \frac{\text{hits}}{\text{hits} + \text{misses}}$。预测帧 $P$ 和真实帧 $T$ 的相关性定义为 $\frac{\sum_{i,j} P_{i,j} T_{i,j}}{\sqrt{\left(\sum_{i,j} P_{i,j}^2\right)\left(\sum_{i,j} T_{i,j}^2\right)} + \varepsilon}$，其中 $\varepsilon = 10^{-9}$。
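The metrics defined above can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the authors' evaluation code. The function names are hypothetical, the logarithm is assumed to be base 10 (the usual convention for echo intensity in dB), and treating a rate exactly equal to the threshold as "raining" (`>=`) is also an assumption.

```python
import numpy as np

A, B = 118.239, 1.5241  # Z-R constants quoted in the text


def echo_to_rainfall(z_db):
    """Invert Z = 10*log10(a) + 10*b*log10(R) to get the rainfall rate R (mm/h)."""
    z_db = np.asarray(z_db, dtype=float)
    return 10.0 ** ((z_db - 10.0 * np.log10(A)) / (10.0 * B))


def rainfall_mse(pred_rain, true_rain):
    """Average squared error between predicted and true rainfall."""
    return float(np.mean((pred_rain - true_rain) ** 2))


def skill_scores(pred_rain, true_rain, threshold=0.5):
    """CSI, FAR and POD after binarizing both fields at `threshold` mm/h.

    Raises ZeroDivisionError when a denominator is zero (for example,
    when neither field contains rain); callers should handle that case.
    """
    pred = pred_rain >= threshold
    truth = true_rain >= threshold
    hits = int(np.sum(pred & truth))
    misses = int(np.sum(~pred & truth))
    false_alarms = int(np.sum(pred & ~truth))
    return {
        "CSI": hits / (hits + misses + false_alarms),
        "FAR": false_alarms / (hits + false_alarms),
        "POD": hits / (hits + misses),
    }


def correlation(pred, truth, eps=1e-9):
    """Uncentered correlation between a predicted and a ground-truth frame."""
    num = np.sum(pred * truth)
    den = np.sqrt(np.sum(pred ** 2) * np.sum(truth ** 2)) + eps
    return float(num / den)


# Toy 2x2 rainfall fields (mm/h): one hit, one miss, one false alarm.
pred = np.array([[1.0, 0.0], [1.0, 0.0]])
truth = np.array([[1.0, 1.0], [0.0, 0.0]])
scores = skill_scores(pred, truth)
```

Note that the correlation above is not mean-centered, matching the formula in the text, so it differs from the Pearson correlation coefficient.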

All results are shown in Table 2 and Fig. 5. We can find that the performance of the FC-LSTM network is not so good for this task, which is mainly caused by the strong spatial correlation in the radar maps, i.e., the motion of clouds is highly consistent in a local region. The fully-connected structure has too many redundant connections and makes the optimization very unlikely to capture these local consistencies. Also, it can be seen that ConvLSTM outperforms the optical flow based ROVER algorithm, which is mainly due to two reasons. First, ConvLSTM is able to handle the boundary conditions well. In real-life nowcasting, there are many cases when a sudden agglomeration of clouds appears at the boundary, which indicates that some clouds are coming from the outside. If the ConvLSTM network has seen similar patterns during training, it can discover this type of sudden change in the encoding network and give reasonable predictions in the forecasting network. This, however, can hardly be achieved by optical flow and semi-Lagrangian advection based methods. Another reason is that ConvLSTM is trained end-to-end for this task and some complex spatiotemporal patterns in the dataset can be learned by the nonlinear and convolutional structure of the network. For the optical flow based approach, it is hard to find a reasonable way to update the future flow fields and train everything end-to-end. Some prediction results of ROVER2 and ConvLSTM are shown in Fig. 6. We can find that ConvLSTM can predict the future rainfall contour more accurately, especially at the boundary. Although ROVER2 can give sharper predictions than ConvLSTM, it triggers more false alarms and is less precise than ConvLSTM in general. Also, the blurring effect of ConvLSTM may be caused by the inherent uncertainties of the task, i.e., it is almost impossible to give sharp and accurate predictions of the whole radar maps in longer-term predictions. We can only blur the predictions to alleviate the error caused by this type of uncertainty.

所有结果如表2和图5所示。我们可以发现，FC-LSTM网络在这项任务上的表现不太好，这主要是由于雷达图中的强空间相关性，即云的运动在局部区域高度一致。全连接结构有太多的冗余连接，使得优化过程很难捕获这些局部一致性。此外，可以看出，ConvLSTM优于基于光流的ROVER算法，这主要是由于两个原因。首先，ConvLSTM能够很好地处理边界条件。在现实生活中的临近预报中，有许多情况是云突然聚集在边界上，这表明一些云是从外面来的。如果ConvLSTM网络在训练期间看到了类似的模式，它可以在编码网络中发现这种类型的突然变化，并在预测网络中给出合理的预测。然而，这很难通过光流和基于半拉格朗日平流的方法来实现。另一个原因是，ConvLSTM是为此任务进行端到端训练的，并且数据集中的一些复杂时空模式可以通过网络的非线性和卷积结构来学习。对于基于光流的方法，很难找到一种合理的方法来更新未来的流场并端到端地训练整个系统。ROVER2和ConvLSTM的一些预测结果如图6所示。我们可以发现，ConvLSTM可以更准确地预测未来的降雨等值线，尤其是在边界上。虽然ROVER2可以给出比ConvLSTM更清晰(更锐利)的预测，但它会触发更多的虚警，并且总体上不如ConvLSTM精确。此外，ConvLSTM的模糊效应可能是由任务固有的不确定性造成的，即在较长期的预测中几乎不可能给出整个雷达图的清晰而准确的预测。我们只能模糊预测，以减轻这种不确定性造成的误差。

5 Conclusion and Future Work

In this paper, we have successfully applied the machine learning approach, especially deep learning, to the challenging precipitation nowcasting problem which so far has not benefited from sophisticated machine learning techniques. We formulate precipitation nowcasting as a spatiotemporal sequence forecasting problem and propose a new extension of LSTM called ConvLSTM to tackle the problem. The ConvLSTM layer not only preserves the advantages of FC-LSTM but is also suitable for spatiotemporal data due to its inherent convolutional structure. By incorporating ConvLSTM into the encoding-forecasting structure, we build an end-to-end trainable model for precipitation nowcasting. For future work, we will investigate how to apply ConvLSTM to video-based action recognition. One idea is to add ConvLSTM on top of the spatial feature maps generated by a convolutional neural network and use the hidden states of ConvLSTM for the final classification.

在这篇论文中，我们成功地将机器学习方法，特别是深度学习，应用于到目前为止还没有从复杂的机器学习技术中获益的具有挑战性的降水临近预报问题。我们将降水临近预报表述为一个时空序列预测问题，并提出了LSTM的一个新的扩展ConvLSTM来解决这个问题。ConvLSTM层不仅保留了FC-LSTM的优点，而且由于其固有的卷积结构，也适用于时空数据。通过将ConvLSTM纳入编码-预测结构，我们为降水临近预报建立了一个端到端的可训练模型。对于未来的工作，我们将研究如何将ConvLSTM应用于基于视频的动作识别。一个想法是在卷积神经网络生成的空间特征图之上添加ConvLSTM，并使用ConvLSTM的隐藏状态进行最终分类。

References

[1] F. Bastien, P. Lamblin, R. Pascanu, J. Bergstra, I. Goodfellow, A. Bergeron, N. Bouchard, D. Warde-Farley, and Y. Bengio. Theano: New features and speed improvements. Deep Learning and Unsupervised Feature Learning NIPS 2012 Workshop, 2012.
[2] Y. Bengio, I. Goodfellow, and A. Courville. Deep Learning. Book in preparation for MIT Press, 2015.
[3] J. Bergstra, O. Breuleux, F. Bastien, P. Lamblin, R. Pascanu, G. Desjardins, J. Turian, D. Warde-Farley, and Y. Bengio. Theano: a CPU and GPU math expression compiler. In Scipy, volume 4, page 3. Austin, TX, 2010.
[4] R. Bridson. Fluid Simulation for Computer Graphics. Ak Peters Series. Taylor & Francis, 2008.
[5] T. Brox, A. Bruhn, N. Papenberg, and J. Weickert. High accuracy optical flow estimation based on a theory for warping. In ECCV, pages 25–36. 2004.
[6] P. Cheung and H.Y. Yeung. Application of optical-flow technique to significant convection nowcast for terminal areas in Hong Kong. In the 3rd WMO International Symposium on Nowcasting and Very Short-Range Forecasting (WSN12), pages 6–10, 2012.
[7] K. Cho, B. van Merrienboer, C. Gulcehre, F. Bougares, H. Schwenk, and Y. Bengio. Learning phrase representations using RNN encoder-decoder for statistical machine translation. In EMNLP, pages 1724–1734, 2014.
[8] J. Donahue, L. A. Hendricks, S. Guadarrama, M. Rohrbach, S. Venugopalan, K. Saenko, and T. Darrell. Long-term recurrent convolutional networks for visual recognition and description. In CVPR, 2015.
[9] R. H. Douglas. The stormy weather group (Canada). In Radar in Meteorology, pages 61–68. 1990.
[10] Urs Germann and Isztar Zawadzki. Scale-dependence of the predictability of precipitation from continental radar images. Part I: Description of the methodology. Monthly Weather Review, 130(12):2859–2873, 2002.
[11] A. Graves. Generating sequences with recurrent neural networks. arXiv preprint arXiv:1308.0850, 2013.
[12] S. Hochreiter and J. Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735–1780, 1997.
[13] A. Karpathy and L. Fei-Fei. Deep visual-semantic alignments for generating image descriptions. In CVPR, 2015.
[14] B. Klein, L. Wolf, and Y. Afek. A dynamic convolutional layer for short range weather prediction. In CVPR, 2015.
[15] P.W. Li, W.K. Wong, K.Y. Chan, and E. S.T. Lai. SWIRLS-An Evolving Nowcasting System. Hong Kong Special Administrative Region Government, 2000.
[16] J. Long, E. Shelhamer, and T. Darrell. Fully convolutional networks for semantic segmentation. In CVPR, 2015.
[17] R. Pascanu, T. Mikolov, and Y. Bengio. On the difficulty of training recurrent neural networks. In ICML, pages 1310–1318, 2013.
[18] M. Ranzato, A. Szlam, J. Bruna, M. Mathieu, R. Collobert, and S. Chopra. Video (language) modeling: a baseline for generative models of natural videos. arXiv preprint arXiv:1412.6604, 2014.
[19] M. Reyniers. Quantitative Precipitation Forecasts Based on Radar Observations: Principles, Algorithms and Operational Systems. Institut Royal Météorologique de Belgique, 2008.
[20] H. Sakaino. Spatio-temporal image pattern prediction method based on a physical model with time-varying optical flow. IEEE Transactions on Geoscience and Remote Sensing, 51(5-2):3023–3036, 2013.
[21] N. Srivastava, E. Mansimov, and R. Salakhutdinov. Unsupervised learning of video representations using LSTMs. In ICML, 2015.
[22] J. Sun, M. Xue, J. W. Wilson, I. Zawadzki, S. P. Ballard, J. Onvlee-Hooimeyer, P. Joe, D. M. Barker, P. W. Li, B. Golding, M. Xu, and J. Pinto. Use of NWP for nowcasting convective precipitation: Recent progress and challenges. Bulletin of the American Meteorological Society, 95(3):409–426, 2014.
[23] I. Sutskever, O. Vinyals, and Q. V. Le. Sequence to sequence learning with neural networks. In NIPS, pages 3104–3112, 2014.
[24] T. Tieleman and G. Hinton. Lecture 6.5 - RMSProp: Divide the gradient by a running average of its recent magnitude. Coursera Course: Neural Networks for Machine Learning, 4, 2012.
[25] W.C. Woo and W.K. Wong. Application of optical flow techniques to rainfall nowcasting. In the 27th Conference on Severe Local Storms, 2014.
[26] K. Xu, J. Ba, R. Kiros, A. Courville, R. Salakhutdinov, R. Zemel, and Y. Bengio. Show, attend and tell: Neural image caption generation with visual attention. In ICML, 2015.
