Ground radar precipitation estimation with deep learning approaches in meteorological private cloud

Accurate precipitation estimation matters to social and economic activity and is essential for monitoring and forecasting disasters. The traditional method relies on an empirical power-law relation between the radar reflectivity factor and precipitation, called the Z-R relationship, which has low estimation accuracy. With the rapid growth of computing power in cloud computing, recent research shows that artificial intelligence, and deep learning in particular, is a promising approach: it learns accurate patterns from data and appears well suited to precipitation estimation, given an ample amount of radar data. In this study, we apply these approaches to precipitation estimation and propose two models, based on back propagation neural networks (BPNN) and convolutional neural networks (CNN) respectively, which we compare with the traditional method in meteorological service systems. The results show that the deep learning models outperform the traditional method, with 75.84% and 82.30% lower mean square errors respectively. Moreover, the CNN-based model performs better than the BPNN-based one because it preserves spatial information by maintaining the interconnection between pixels, an improvement of 26.75% over the BPNN-based model.


Introduction
In recent years, climate change has attracted attention all over the world. As one of the most significant factors in the water resource ecosystem, precipitation plays an important role in meteorology and has a strong impact on daily life as well as on businesses such as agriculture and construction [1][2][3]. Variations in the timing and quantity of rainfall can affect agricultural yield and disaster management [4][5][6]. Prior knowledge of rainfall behavior can help farmers and policy makers minimize crop damage, and it also plays an important role in disaster warning and relief [7][8][9]. The rain gauge is a simple and effective way to measure precipitation. However, gauge networks are limited by factors such as low station density, and the complexity of precipitation phenomena can lead to large errors [10][11][12]. With its wide measurement range, high spatial and temporal resolution, and real-time data transmission, ground radar has been widely applied in the meteorological industry, including for precipitation estimation [13][14][15]. The traditional method for precipitation estimation is the Z-R relationship model, which relates the radar echo intensity to the rainfall intensity through an equation used to calculate the precipitation [16][17][18]. In practice, the Z-R relationship is determined by the droplet size distribution, which is itself restricted by many factors, so a constant Z-R relationship fitted for one region can produce a large deviation when applied to another region. Therefore, a more appropriate method is needed to ensure the performance of the estimation [19][20][21].
Recognizing this defect of the estimation method, many meteorologists have made great efforts to explore new methods. With the development of deep learning algorithms in recent years, studies that use these methods to improve estimation performance have been drawing attention [22,23]. Deep learning is distinguished by its ability to learn accurate relations from large and complex datasets, which makes it well suited to the precipitation estimation task given the elastic computing resources available in the cloud [24][25][26]. Among the deep learning algorithms, the artificial neural network is a method that simulates human thinking and memory, based on research into biological neural networks. With its strong capacity for nonlinear mapping as well as its fault tolerance, adaptability and self-learning, the neural network has become a popular choice for solving problems in precipitation estimation [27][28][29].
To address the accuracy problem described above, we apply back propagation neural networks and convolutional neural networks to precipitation estimation. In particular, convolutional neural networks have not previously been applied in such research using data from a Doppler radar system. Extensive experimental evaluations are then carried out to choose the more efficient and effective of the proposed methods. The specific objectives of this study are: 1) to introduce deep learning methods based on Doppler radar data for estimating precipitation in a meteorological private cloud; 2) to compare the performance of the proposed methods with the baseline model (Z-R relationship) and identify the better method; and 3) to verify whether using the integral radar data improves performance compared with using the discrete data.
The remainder of this paper is organized as follows: In "Related work" section, we review the peer research and work. "Data preparation" section describes the details of data preparation and the dataset employed in the experiment. The details of three models are presented in "Scheme" section. The experiments as well as their results and analysis are covered in "Experiments" section. Finally, we conclude our work in "Conclusion" section.

Related work
Given the importance of precipitation, many researchers have tried to estimate it as accurately as possible. In meteorology, the Z-R relationship is the traditional estimation method. The model states that the radar reflectivity factor has a power-law relation with precipitation, with coefficients acquired from years of data. However, the model deviates considerably from observations, and its error becomes unacceptably large when the rain is heavy. In recent years, several approaches to precipitation estimation have been proposed in order to obtain better results.
Lazri et al. [30] developed a precipitation estimation scheme based on the multi-layer perceptron (MLP), which used high-spectral-resolution data from the SEVIRI satellite. Two MLPs were used: MLP1 identifies rain and no-rain pixels, and MLP2 estimates precipitation for the rain pixels. Together they support area-wide rainfall detection and quantification at high spatial and temporal resolution.
Hernández et al. [31] introduced a deep learning architecture to estimate the accumulated precipitation for the next day. Their model includes an autoencoder and a multilayer perceptron. The autoencoder is an unsupervised network used to reduce the attributes and capture the non-linear relationships between them, and the multilayer perceptron makes the predictions. Compared with previous proposals, their model improved the prediction. However, the improvement is limited when single meteorological factors are used.
Francesco Beritelli et al. [32] proposed a new method that classifies precipitation into four rainfall intensities. It is based on a probabilistic neural network that uses three local features of the received signal level of 4G/LTE. Their model obtained an overall correct classification rate of 96.7%. However, their work did not go on to estimate the specific precipitation.
Ouallouche et al. [33] introduced a precipitation estimation technique based on the random forest (RF) algorithm. The RF consists of two main parts, classification and regression, which are performed in turn on the MSG-retrieved data. The RF algorithm classifies the MSG images into three classes, and random forest regression then assigns a rainfall rate to the pixels in the convective and stratiform classes. However, the night-time precipitation estimates were not as good as those for daytime scenes.
Pengcheng Zhang et al. [34] proposed a solution called the Dynamic Regional Combined short-term rainfall Forecasting approach (DRCF), based on the multi-layer perceptron. They employed Principal Component Analysis to reduce the input dimension and then fed the output into an MLP to make the short-term rainfall forecast. They also applied the same process to the surrounding sites to predict the rainfall at the same time, and the final prediction was the average of the results from all sites. This method takes high-altitude weather information into consideration, but the improvement in performance is limited. Moreover, different areas have different numbers of sites, which makes the accuracy of the prediction fluctuate greatly.
Folino et al. [35] proposed a universal machine learning model based on a deep learning architecture, which integrates information derived from weather radars and satellites. The model consists of three components, Information Retrieval, Data Analytics and Evaluation, and it allows information extracted from many data sources to be combined. The Evaluation component is based on a deep neural network and uses a weighted MAE loss function to provide more accurate predictions for heavy rainfall cases. The modified loss function narrowed the gap between the observation and the evaluation, but the deep neural network was prone to overfitting even though the inverted dropout technique was employed.
Mojtaba et al. [36] proposed a CNN-based model that uses infrared and water vapor channels from geostationary satellites for precipitation estimation, and compared it with the baseline models PERSIANN-CCS and PERSIANN-SDAE using various evaluation indexes. The results demonstrated that the proposed model outperformed the baseline models and was also more efficient. However, the model estimates daily precipitation, which is of limited practical use because people pay more attention to precipitation over short periods.
To further improve precipitation estimation, we introduce two models based on the back propagation neural network and the convolutional neural network respectively, and compare them with the traditional Z-R relationship to find the better-performing method.

Data preparation
The data come from a meteorological observatory located in the central area of Taizhou in Zhejiang province. The Doppler radar transmits radar reflectivity factors (dBZ) together with the corresponding longitude and latitude every six minutes at eleven different elevation angles, and the data are stored sequentially in a binary file format in the private cloud [37][38][39]. In addition, four automatic weather stations record minute-by-minute precipitation ordered by date and time, which is taken as the authentic value for verifying the accuracy of precipitation estimation. The data used in our experiments cover 2013 to 2017.
The data used for the models contain two main fields: dBZ values and the corresponding precipitation from rain gauges. According to meteorologists, the data from the minimum elevation angle are most closely associated with precipitation. Therefore, the minimum-elevation data were used in our experiments and projected onto a horizontal plane 1200 meters above the ground to obtain a high-precision, integral mosaic of dBZ, which can be treated approximately as a "square grid" with a resolution of 1 km × 1°.
To make better use of the radar reflectivity information, area data rather than a single point value are taken into consideration. The center of the area matrix is the grid point nearest to the automatic weather station. Around the center, a 24 km × 24° area (a total of 625 dBZ values on grid points) is used, as shown in Fig. 1. In addition, because of the delay of the precipitation, the current value is replaced with the sum over the next 6 minutes (including the current precipitation) so that the precipitation is more precise [40]. As the unit of this sum is mm/6min, it is converted to mm/h. Each sample therefore consists of two fields: a 25×25 matrix that stores the dBZ values as input, and a one-hour precipitation value as the authentic label for the model estimation.
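The sample construction described above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' code: the function names `extract_window` and `label_mm_per_hour`, the nested-list grid layout, and the assumption that the station's grid point lies at least 12 cells from every edge are all hypothetical.

```python
def extract_window(grid, row, col, half=12):
    """Return the (2*half+1) x (2*half+1) dBZ window centred on (row, col).

    Assumes (row, col) is at least `half` cells away from every grid edge.
    """
    return [r[col - half: col + half + 1]
            for r in grid[row - half: row + half + 1]]


def label_mm_per_hour(minute_precip, t, window=6):
    """Sum the gauge readings for minutes t .. t+window-1 and scale to mm/h."""
    six_minute_total = sum(minute_precip[t: t + window])  # unit: mm/6min
    return six_minute_total * (60 / window)               # unit: mm/h


# Toy grid whose cell value encodes its own position, to make slicing visible.
grid = [[r * 100 + c for c in range(30)] for r in range(30)]
window = extract_window(grid, 15, 15)
label = label_mm_per_hour([0.1] * 10, t=0)
```

With the default `half=12` the window holds 25 × 25 = 625 values, and a steady 0.1 mm per minute corresponds to a label of 6 mm/h.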
According to the meteorological literature above, radar data values (dBZ) below fifteen, which are also called ground echoes, have hardly any impact on precipitation. Therefore, matrices whose element average is below fifteen are discarded, since they do not improve efficiency or accuracy.
The dataset is randomly divided into a training set and a test set containing 80% and 20% of the samples respectively. The training set is used to find the relationship between the radar reflectivity factors and precipitation, which determines the parameters, while the test set is used to verify the accuracy of that relationship.
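A random 80/20 split of this kind might look like the sketch below. The function name `split_dataset` and the fixed seed are assumptions made for reproducibility of the example; the paper does not state how the shuffle was seeded.

```python
import random


def split_dataset(samples, train_fraction=0.8, seed=0):
    """Shuffle sample indices and split them into training and test subsets."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)          # seeded, so the split is repeatable
    cut = int(round(len(samples) * train_fraction))
    train = [samples[i] for i in indices[:cut]]
    test = [samples[i] for i in indices[cut:]]
    return train, test


train_set, test_set = split_dataset(list(range(100)))
```

Every sample ends up in exactly one of the two subsets, so the test set stays unseen during training.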

Overview
When artificial neural networks are applied to practical tasks, the main differences lie in the architecture and parameters of the networks. To find a better model for estimating precipitation, we propose two models, Precipitation Estimation from Radar using Back Propagation Neural Network (PERBPNN) and Precipitation Estimation from Radar using Convolutional Neural Network (PERCNN), and compare their performance with the traditional Z-R relationship, which is set as the baseline model. Figure 2 illustrates the overview of our scheme.

Baseline model
At present, precipitation is mainly calculated from radar data in the industry through the relationship between the radar echo intensity and the rain intensity given in Eq. (1):

$$Z = aR^{b} \tag{1}$$

where Z is the radar echo intensity, R is the one-hour precipitation and a, b are empirical coefficients. Due to the complexity of meteorological problems, the coefficients may differ between regions [41]. Figure 3 shows the details of this method. It differs slightly from the original method in that the average of the dBZ matrix is used instead of the single dBZ value at the center of the matrix. As a result, the effect of the surroundings is taken into account, which is expected to improve performance.
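As a hedged illustration of the baseline, the sketch below averages a dBZ matrix, converts the mean to Z through the standard definition dBZ = 10·log10(Z), and inverts Eq. (1) for R. The function name `zr_rain_rate` is hypothetical, and the default coefficients a = 200 and b = 1.6 are the classical Marshall-Palmer values used only as placeholders; they are not the coefficients fitted in this study.

```python
import math


def zr_rain_rate(dbz_matrix, a=200.0, b=1.6):
    """Estimate R (mm/h) from the mean of a dBZ matrix by inverting Z = a * R**b."""
    values = [v for row in dbz_matrix for v in row]
    mean_dbz = sum(values) / len(values)   # the baseline averages the whole matrix
    z = 10 ** (mean_dbz / 10)              # from dBZ = 10 * log10(Z)
    return (z / a) ** (1 / b)


uniform_patch = [[30.0] * 25 for _ in range(25)]   # hypothetical 30 dBZ everywhere
rate = zr_rain_rate(uniform_patch)
```

For a uniform 30 dBZ patch, Z = 1000 and the placeholder coefficients give roughly 2.7 mm/h. Note that the average is taken in dBZ, as the text describes, rather than in linear Z.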

PERBPNN
With the development of hardware, deep learning has become a popular approach that attracts scholars not only from the computer industry but also from other industries, including the meteorological industry. BPNN is one of the most representative models in deep learning [42]. Once the data are fed into the network, BPNN optimizes the parameters automatically, and after some epochs of learning the parameters are determined. Figure 4 shows the details of the BPNN. The dBZ matrix is reshaped into a column vector as the input of the model. Unlike the baseline model, this model uses the individual values in the dBZ matrix, which is the advantage of the BPNN: it takes more features into consideration to enhance the estimation accuracy. The key component in the BPNN is the computation of each neuron, which is expressed in Formula (2):

$$z^{[l]} = w^{[l]} a^{[l-1]} + b^{[l]}, \qquad a^{[l]} = g\left(z^{[l]}\right), \qquad l = 1, 2, \ldots, n \tag{2}$$

where n is the number of layers, the superscript [l] indicates that a variable belongs to the l-th layer, $w^{[l]}$ is the parameter matrix and $b^{[l]}$ is the bias, $a^{[l]}$ is the output matrix of each layer, and g is a suitable activation function. Therefore, the neurons of each layer are computed at the same time.
The estimated value is obtained through a series of computations in the hidden layers. Then, the stochastic gradient descent method shown in Formula (3) is applied in the back propagation to adapt the parameters so that the accuracy improves [43,44].

$$w^{[l]} := w^{[l]} - \alpha \frac{\partial J}{\partial w^{[l]}}, \qquad b^{[l]} := b^{[l]} - \alpha \frac{\partial J}{\partial b^{[l]}} \tag{3}$$

where J is the loss function, α is the learning rate, and $w^{[l]}$ and $b^{[l]}$ are the parameter matrix and bias vector in layer l. After some epochs of forward propagation and back propagation, the final model is determined.
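The forward pass and the gradient update described above can be sketched on a toy network. This is an assumption-laden illustration, not PERBPNN itself: it uses one hidden layer of two ReLU units, a linear output neuron, three inputs instead of 625, a squared-error loss on a single sample, and hand-picked initial weights. All names (`forward`, `sgd_step`, `params`) are hypothetical.

```python
def forward(x, params):
    """Forward pass: z = w a + b followed by ReLU, then a linear output neuron."""
    w1, b1, w2, b2 = params
    z1 = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(w1, b1)]
    a1 = [max(0.0, z) for z in z1]
    y_hat = sum(w * a for w, a in zip(w2, a1)) + b2
    return z1, a1, y_hat


def sgd_step(x, y, params, lr):
    """One gradient-descent update on J = (y_hat - y)**2 for a single sample."""
    w1, b1, w2, b2 = params
    z1, a1, y_hat = forward(x, params)
    d_out = 2.0 * (y_hat - y)                                 # dJ/dy_hat
    d_hidden = [d_out * w * (1.0 if z > 0 else 0.0)           # dJ/dz1 through ReLU
                for w, z in zip(w2, z1)]
    new_w2 = [w - lr * d_out * a for w, a in zip(w2, a1)]
    new_b2 = b2 - lr * d_out
    new_w1 = [[w - lr * d * xi for w, xi in zip(row, x)]
              for row, d in zip(w1, d_hidden)]
    new_b1 = [b - lr * d for b, d in zip(b1, d_hidden)]
    return (new_w1, new_b1, new_w2, new_b2), (y_hat - y) ** 2


# Hand-picked toy parameters and one toy sample.
params = ([[0.1, -0.2, 0.3], [0.2, 0.1, -0.1]], [0.0, 0.0], [0.5, -0.5], 0.0)
x, y = [1.0, 2.0, 3.0], 2.0
history = []
for _ in range(100):
    params, loss = sgd_step(x, y, params, lr=0.01)
    history.append(loss)
```

The list `history` records the loss before each update, so comparing its first and last entries shows the effect of repeated forward and backward passes.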

PERCNN
CNN is distinguished by its ability to extract features automatically and hierarchically. The convolutional kernel avoids one-to-one connections among all units and reduces the number of parameters through weight sharing. Moreover, it helps to reduce over-fitting and to improve computing speed and fault tolerance [45][46][47]. Figure 5 shows the details of PERCNN. The entire dBZ matrix, which includes the neighborhood information, is used as the input so that features across the area can be extracted. When the dBZ matrix is fed into the model, many feature maps are calculated through several convolution and max-pooling operations. The output after each convolution is given in Formula (4):

$$z^{[l]}_{i,j} = \sum_{m=0}^{f-1} \sum_{n=0}^{f-1} w^{[l]}_{m,n}\, a^{[l-1]}_{m+i,\,n+j} + b^{[l]}, \qquad a^{[l]}_{i,j} = g\left(z^{[l]}_{i,j}\right) \tag{4}$$

where f is the kernel size, $w^{[l]}_{m,n}$ is the weight at position (m, n) in the kernel, $a^{[l-1]}_{m+i,n+j}$ is the value in the receptive field of layer l−1 at position (m+i, n+j), $b^{[l]}$ is the bias matrix of layer l, $z^{[l]}_{i,j}$ is the direct result of each convolution step in layer l, g is a suitable activation function and $a^{[l]}_{i,j}$ is the final output of layer l. The size of the output is then determined by Formula (5):

$$n^{[l]} = \left\lfloor \frac{n^{[l-1]} + 2p - f}{s} \right\rfloor + 1 \tag{5}$$

where $n^{[l-1]}$ and $n^{[l]}$ represent the feature size of layers l−1 and l respectively, and f is the kernel size applied with stride s and padding p.
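The output-size rule can be checked numerically with a naive convolution. The sketch below assumes stride 1 and no padding for the convolution itself; the kernel sizes are hypothetical examples and are not taken from the network configuration of this study. The names `conv_output_size` and `conv2d_valid` are illustrative.

```python
def conv_output_size(n_in, f, s=1, p=0):
    """Output width for input width n_in, kernel size f, stride s, padding p."""
    return (n_in + 2 * p - f) // s + 1


def conv2d_valid(a, w, b=0.0):
    """Naive stride-1, no-padding convolution of a square map with a square kernel."""
    f = len(w)
    n_out = conv_output_size(len(a), f)
    return [[sum(w[m][n] * a[i + m][j + n] for m in range(f) for n in range(f)) + b
             for j in range(n_out)]
            for i in range(n_out)]


dbz = [[1.0] * 25 for _ in range(25)]     # toy 25x25 input of ones
kernel = [[1.0] * 3 for _ in range(3)]    # hypothetical 3x3 kernel of ones
feature_map = conv2d_valid(dbz, kernel)
```

A 3×3 kernel on the 25×25 input yields a 23×23 map without padding, while padding p = 1 would keep the size at 25.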
In addition, a pooling operation, especially max pooling, is usually applied between the convolution and the activation function to improve the robustness of feature extraction and reduce the dimension of the model. It is given in Formula (6), where $R_p$ represents the pooling domain of each stride, (a, b) represents a position in the pooling domain, $m^{[l]}_{i,j}$ is the result of the max pooling and the remaining parameters are the same as in Formula (4). Then, the final feature maps of the first portion are flattened into a one-dimensional vector and passed to the fully connected (FC) layers to estimate the precipitation. The FC layers are similar to the BPNN, and their computation follows Formula (2). The output of the FC layers is the precipitation estimate.
After the forward propagation, the stochastic gradient descent algorithm is again used to minimize the value of the loss function, in the same way as in Formula (3). After some epochs of training, the final model is determined.

Experiments
The experimental environment of this study was built on the private cloud operated by the meteorological department. Data processing, model training and verification were all completed in the private cloud.
Our networks were trained and tested on an AMD Ryzen 5 3600 6-core CPU and an NVIDIA GeForce GTX 1660 GPU with 6 GB of memory. Training for 10000 epochs took around 3 hours. During the experiments, we used mini-batches to stay within the limits of the video memory. Each training epoch runs through all mini-batches so that the whole training dataset is covered. Several algorithms were also applied to optimize the training process. We trained our models with the PyTorch 1.3.1 framework (in Python 3.6), which supports CUDA 10.0.
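The mini-batch scheme can be illustrated with a small generator. This is a framework-free sketch rather than the PyTorch data loading used in the experiments; the name `iterate_minibatches`, the batch size of 4 and the seed are hypothetical.

```python
import random


def iterate_minibatches(samples, batch_size, seed=None):
    """Yield shuffled mini-batches that together cover every sample exactly once."""
    order = list(range(len(samples)))
    random.Random(seed).shuffle(order)
    for start in range(0, len(order), batch_size):
        yield [samples[i] for i in order[start:start + batch_size]]


# One epoch over ten toy samples with a batch size of four.
batches = list(iterate_minibatches(list(range(10)), batch_size=4, seed=1))
```

The last batch is smaller when the dataset size is not a multiple of the batch size, which keeps memory use bounded by `batch_size` samples at a time.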
During the experiments, the mean square error (MSE) and root mean square error (RMSE) are used as the loss function and to evaluate the performance of the models, as shown in Formula (7):

$$\mathrm{MSE} = \frac{1}{m} \sum_{i=1}^{m} \left(\hat{y}_i - y_i\right)^2, \qquad \mathrm{RMSE} = \sqrt{\mathrm{MSE}} \tag{7}$$

where m is the sample size, and $\hat{y}_i$ and $y_i$ respectively represent the estimated and authentic value of sample i. The specific experimental process is as follows.
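The two error measures can be computed as in the short sketch below; the function names and the three toy values are illustrative only.

```python
import math


def mse(estimates, targets):
    """Mean square error between estimated and authentic values."""
    return sum((e - t) ** 2 for e, t in zip(estimates, targets)) / len(targets)


def rmse(estimates, targets):
    """Root mean square error, in the same unit as the targets (mm/h here)."""
    return math.sqrt(mse(estimates, targets))


toy_mse = mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
toy_rmse = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

Because RMSE is the square root of MSE, a given relative reduction in MSE corresponds to a smaller relative reduction in RMSE.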

Baseline model
In this model, the dBZ matrix is reduced to a single value, the average of the 625 elements in the matrix. The relationship between dBZ and Z is shown in Eq. (8):

$$\mathrm{dBZ} = 10 \log_{10} Z \tag{8}$$

Formula (1) is then transformed into the form of Eq. (9), and, more specifically, Eq. (10) is used for the linear regression. The least squares method is used to solve the problem, which determines the parameters as well as the estimates. The computation gives a = 0.762 and b = 0.003.
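One common way to fit the Z-R coefficients by least squares is to take the base-10 logarithm of Formula (1), which together with Eq. (8) gives the linear form dBZ = 10·log10(a) + 10·b·log10(R). The sketch below fits that line on synthetic data. It is an assumption that this matches the regression form used in this study, and the coefficients 200 and 1.6 used to generate the data are placeholders, not the fitted values reported above. The name `fit_zr` is hypothetical.

```python
import math


def fit_zr(dbz_values, rain_rates):
    """Ordinary least squares on dBZ = 10*log10(a) + 10*b*log10(R); returns (a, b)."""
    xs = [math.log10(r) for r in rain_rates]   # requires R > 0 for every sample
    ys = list(dbz_values)
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return 10 ** (intercept / 10), slope / 10


# Synthetic, noise-free data generated from placeholder coefficients.
true_a, true_b = 200.0, 1.6
rates = [0.5, 1.0, 2.0, 5.0, 10.0, 20.0]
dbz = [10 * math.log10(true_a * r ** true_b) for r in rates]
a_hat, b_hat = fit_zr(dbz, rates)
```

On noise-free data the fit recovers the generating coefficients. Samples with zero rainfall have to be excluded before the logarithm is taken, and regressing in the opposite direction (rainfall on dBZ) would in general give different coefficients.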

Neural network optimization
In order to make the performances of neural networks better and accelerate the training speed, several means are employed during the training.

Z-Score normalization
Z-Score normalization is a standardization method which transforms the input data to zero mean and unit standard deviation, as shown in Formula (11). Standardization is essentially a linear transformation, which means that it does not invalidate the data but improves how they behave in training. It eliminates the effects caused by differences in value range, which makes training faster and the estimation more accurate. This operation is applied before the radar data are fed into the model.

$$x_i^{*} = \frac{x_i - \mu}{\sigma}, \qquad \mu = \frac{1}{n} \sum_{i=1}^{n} x_i \tag{11}$$

where n is the sample size, $x_i$ is the i-th input, μ is the sample average, σ is the sample standard deviation and $x_i^{*}$ represents the normalization result of $x_i$.
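A minimal sketch of this normalization is given below. It assumes the population standard deviation (division by n); whether the study divides by n or n − 1 is not stated. The function name `z_score` and the five toy values are illustrative.

```python
import statistics


def z_score(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)   # assumption: population form, divides by n
    return [(x - mu) / sigma for x in values]


normalized = z_score([10.0, 20.0, 30.0, 40.0, 50.0])
```

The transformation is undefined for constant input, where σ is zero, so such a case would need separate handling.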

Batch normalization
This operation is similar to Z-Score normalization. The difference is that the normalization is not applied to all of the data at once; instead, it is applied to the feature map before the convolution operation of each layer. Batch normalization alleviates the vanishing gradient problem, which speeds up training considerably. Formula (11) applies to batch normalization as well, except that n represents the number of elements in the feature map.

Inverted dropout
During training, the network easily overfits: the error on the training set becomes very low while the performance on the test set remains poor. Inverted dropout is a regularization technique that effectively reduces overfitting to the training set. It is applied to a hidden layer and sets some of the layer's outputs to zero with a certain probability, which is similar to randomly deleting some neuron nodes of the layer during forward propagation, while the back propagation is not influenced. That is to say, the network differs from pass to pass during forward propagation, but the gradient descent works on the original network. As a result, the parameters do not rely too heavily on the training set, which reduces overfitting. The dropout rate is empirically set to 0.5.
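The mechanism can be sketched as follows. In the usual formulation of inverted dropout, the units that are kept are scaled by 1/(1 − rate) during training so that the expected activation is unchanged and no rescaling is needed at inference. The function name `inverted_dropout` and the seeded generator are assumptions for the example.

```python
import random


def inverted_dropout(activations, drop_rate=0.5, training=True, rng=random):
    """Zero each activation with probability drop_rate and rescale the survivors."""
    if not training or drop_rate == 0.0:
        return list(activations)          # inference: the layer passes through unchanged
    keep = 1.0 - drop_rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)
dropped = inverted_dropout([1.0] * 10000, drop_rate=0.5, training=True, rng=rng)
unchanged = inverted_dropout([1.0, 2.0, 3.0], training=False)
```

With a rate of 0.5, each surviving activation is doubled, so the mean of `dropped` stays close to the original mean of 1.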

PERBPNN
The input layer and output layer are fixed at 625 neurons and 1 neuron respectively, while the structure of the hidden layers is determined through trial and error. Table 1 presents the parameters adopted for the model after many tests. Figure 6a displays the training and testing process. The orange and blue curves show the loss against the epoch on the training set and the test set respectively. The training set loss decreases rapidly at the beginning and then fluctuates strongly as the epoch increases. In addition, the test set loss becomes smooth and steady after 100 epochs. With

PERCNN
The structure of PERCNN is determined by extensive trials as well. More specifically, a greedy search is conducted over the following set of parameters, and Table 2 presents the values adopted for each parameter. Notably, the pooling technique is not applied in our model because the pooling layer is not effective for estimation. The input of the model is a matrix of values rather than an image, so there are no textural features that need to be extracted. For instance, we consider three matrices whose values represent the dBZ at some locations. In addition, during training the model without pooling performed better than the one with it. Therefore, the pooling technique is removed from our model.
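One way to see why pooling can hurt a value matrix is that max pooling keeps only the largest value in each window and discards where inside the window it occurred. The sketch below shows three different dBZ patches that pool to the same output. The patches are hypothetical and are not the three matrices of the original illustration; the names `max_pool` and `patch_with_peak` are illustrative.

```python
def max_pool(a, size=2):
    """Non-overlapping max pooling with stride equal to the window size."""
    n = len(a) // size
    return [[max(a[i * size + di][j * size + dj]
                 for di in range(size) for dj in range(size))
             for j in range(n)]
            for i in range(n)]


def patch_with_peak(row, col, n=4, base=20, peak=40):
    """Build an n x n patch of `base` dBZ with a single `peak` dBZ cell."""
    patch = [[base] * n for _ in range(n)]
    patch[row][col] = peak
    return patch


# The 40 dBZ cell sits at three different positions inside the top-left 2x2 window.
patches = [patch_with_peak(0, 0), patch_with_peak(0, 1), patch_with_peak(1, 1)]
pooled = [max_pool(p) for p in patches]
```

All three patches produce the pooled map [[40, 20], [20, 20]], so a later layer cannot tell which cell held the strong echo.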
The overview of the network is shown in Fig. 7. The specific configuration and parameter information of the whole network are detailed in Table 3. To make the comparison with PERBPNN fair, the activation function, optimizer and number of training epochs are the same as for PERBPNN. The training and testing process is shown in Fig. 6b. Unlike PERBPNN, after the rapid decrease at the beginning, both the training set and test set losses are smooth and steady.

Results
The experiments on the three models demonstrate that estimation with deep learning is effective and yields more accurate results, as presented in Fig. 8. The precipitation estimates obtained from ground radar information by PERBPNN and PERCNN correspond to the authentic values very well. The errors of each model are shown in Table 4, which accords with the results above.
More specifically, the estimates are compared with the authentic values for individual data instances. Figure 9 shows the relation between the estimates of the three models and the authentic precipitation. The x-axis represents the authentic value, while the y-axis represents the estimate. A good estimate lies on a straight line with a slope of one and an intercept of zero. The data points from the deep learning models are distributed around this reference line, while the traditional method performs poorly. CNN performs best, BPNN second, and the Z-R model is unsatisfactory. The baseline model performs well only when the precipitation is low and shows a large bias as the precipitation increases. In contrast, PERBPNN and PERCNN perform well almost all the time, and the latter is more accurate, as its data points are closer to the reference line in the graph. This indicates that the data neighboring the center have a great effect on the estimation accuracy. BPNN captures the contribution of each individual value to the estimate while leaving out the integrity of the data. However, precipitation is a continuous process: it is unlikely that a small area rains heavily while the neighboring area suddenly has no rain. Therefore, CNN is more suitable for the estimation owing to its ability to extract features from the data as a whole.

Conclusion
In this study, we implemented three different models to estimate precipitation from ground radar information. The experimental results show that the deep learning models perform better than the traditional model. In addition, the RMSE of PERCNN is 14.41% lower than that of PERBPNN. This indicates that precipitation estimation with surrounding radar data achieves more accurate results, especially when the neighborhood information is used as an integrated feature. In the future, we will explore more effective methods to enhance the accuracy of the estimation and study precipitation prediction.

Fig. 9 The correlation between estimation and authentic value of different models
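The percentages reported in this paper can be cross-checked with a few lines of arithmetic. The sketch below assumes that all figures were measured on the same test set and that RMSE is the square root of MSE; the variable names are illustrative.

```python
import math

# MSE reductions relative to the Z-R baseline, as reported in the abstract.
mse_drop_bpnn = 0.7584
mse_drop_cnn = 0.8230

# MSE of PERCNN relative to PERBPNN, and the implied percentage reduction.
mse_ratio = (1 - mse_drop_cnn) / (1 - mse_drop_bpnn)
mse_gain_pct = 100 * (1 - mse_ratio)

# RMSE reduction implied by a 26.75% MSE reduction, since RMSE = sqrt(MSE).
rmse_gain_pct = 100 * (1 - math.sqrt(1 - 0.2675))
```

Under these assumptions the two baseline-relative figures imply an MSE reduction of about 26.7% for PERCNN over PERBPNN, and a 26.75% MSE reduction corresponds to a 14.41% RMSE reduction, so the reported numbers are mutually consistent up to rounding.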