About

Using deep learning spatial-temporal graph models for seasonal forecasting of extreme temperature events

ClimateDL is a project of the CLIMAT-AmSud regional program.

Goals

The project centers on climate forecasting, and more specifically on predicting extreme temperatures at a seasonal scale in South America. It uses machine learning and statistical methods, in both cases with a focus on network analysis techniques. Probability theory, stochastic processes, and risk analysis are related areas of the research program. In terms of applications, extreme climate events have a large impact on agriculture, human health, water resources, and network infrastructures (e.g. telecommunication and energy distribution, a field in which the participants have long experience), among many others.

Abstract

Forecasting extreme seasonal temperature events is especially relevant to society and its ecosystems because of their potentially dangerous impacts. Nevertheless, the study of this problem is still at an early stage in South America, where most of the methods proposed in the literature are process-based dynamical models and classical statistical methods. More accurate predictions of the probability of occurrence of extreme temperatures at the seasonal scale would have a major impact on economic life (food production, energy consumption), on health, and on other areas. Innovative approaches that use network science and machine learning are currently emerging in climate forecasting. The complex interrelations between temporal and spatial variables in climate time series justify the use of climate networks as an underlying, automatically derived model of the complex physical processes involved. The use of machine learning methods, and in particular deep learning architectures, is justified by their success in several forecasting problems. In this project, we will evaluate the precision of the most prominent deep learning architectures on our forecasting problem, in particular those implementing spatio-temporal models based on graphs (climate networks). The most promising precedents for this methodology are several works based on climate networks and on the effect of “El Niño” in similar forecasting problems. We expect to propose improvements to these deep learning architectures to increase their precision on our particular problems, for example by changing the scale of nodes, changing the similarity measure used for edges, proposing new architectures that include multiple networks, or training the model with augmented data. We will also implement classical statistical methods, such as principal component regression and wavelet transform-based methods, among others, and compare their results with those of the deep learning models.
Assuming we achieve good predictive quality, we will evaluate the impact of these predictions on some human activities. We will focus on risk analysis of specific infrastructures (telecommunication networks, energy distribution) and on other concrete case studies where the extreme events considered have a strong negative impact and for which historical data are available.
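
To make the idea of a climate network concrete, here is a minimal sketch in which nodes are weather stations and an edge joins two stations whose temperature series are strongly correlated. It is only an illustration, not the project's actual construction: the choice of Pearson correlation as the similarity measure, the threshold of 0.8, the station names, and the toy values are all assumptions.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / sqrt(var_x * var_y)


def build_climate_network(series_by_station, threshold=0.8):
    """Return undirected edges (station_a, station_b, r) with |r| >= threshold.

    The threshold is a hypothetical value chosen for illustration.
    """
    edges = []
    for a, b in combinations(sorted(series_by_station), 2):
        r = pearson(series_by_station[a], series_by_station[b])
        if abs(r) >= threshold:
            edges.append((a, b, r))
    return edges


# Toy daily maximum temperatures (degrees Celsius) for three hypothetical stations.
toy_series = {
    "station_A": [30.0, 31.0, 33.0, 29.0, 35.0],
    "station_B": [28.0, 29.5, 31.0, 27.5, 33.5],
    "station_C": [15.0, 12.0, 18.0, 20.0, 11.0],
}
network = build_climate_network(toy_series, threshold=0.8)
for a, b, r in network:
    print(f"{a} -- {b} (r = {r:.3f})")
```

With these toy values only stations A and B end up connected. Replacing `pearson` with another similarity function, or aggregating stations into coarser nodes, corresponds to the "similarity of edges" and "scale of nodes" variations mentioned above.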

Data

The main dataset used in our project comes from 137 weather stations in southern South America. It includes daily maximum and minimum temperatures and daily accumulated precipitation from 1977 to 2018. The data is provided by national meteorological services.

To create the dataset, the data was preprocessed by Solange Suli and Verónica Dankiewicz at the Departamento de Ciencias de la Atmósfera y los Océanos, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires. Errors and outliers were fixed, and weather stations with a high level of missing data were discarded (69 of the 206 stations analyzed).
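
As a rough illustration of the station-filtering step, the sketch below keeps only stations whose share of missing daily values is below a cutoff. The actual criterion used in preprocessing is not stated here, so the 20% cutoff, the use of `None` for missing values, and the station names and readings are all hypothetical.

```python
def missing_fraction(values):
    """Fraction of entries that are missing (represented here as None)."""
    if not values:
        return 1.0  # treat an empty record as entirely missing
    return sum(v is None for v in values) / len(values)


def keep_complete_stations(records, max_missing=0.2):
    """Keep stations whose missing fraction does not exceed max_missing.

    max_missing is a hypothetical cutoff chosen for illustration.
    """
    return {
        name: values
        for name, values in records.items()
        if missing_fraction(values) <= max_missing
    }


# Toy daily maximum temperatures for three hypothetical stations.
toy_records = {
    "station_A": [30.1, 31.4, None, 29.8, 33.0],  # 1 of 5 values missing
    "station_B": [None, None, 25.0, None, 26.5],  # 3 of 5 values missing
    "station_C": [18.2, 17.9, 19.4, 20.0, 18.8],  # no values missing
}
kept = keep_complete_stations(toy_records, max_missing=0.2)
print(sorted(kept))
```

In this toy example station_B is discarded, while station_A sits exactly at the cutoff and is kept.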

Contact us

Please contact us at: prbocca at fing.edu.uy