Weather Forecaster

Introduction:

Weather forecasting today relies mostly on Numerical Weather Prediction (NWP) models, which take in vast amounts of quantitative data and simulate the atmosphere using mathematical equations. This process is computationally intensive and prone to inaccuracies when simplifications and assumptions are made to keep the computation tractable. Building an NWP model that faithfully simulates the weather is difficult because weather data is inherently complex, with intricate interactions and non-linear relationships between variables. We therefore propose a deep-learning approach that has the potential to address these issues by training networks to learn representations and patterns among atmospheric variables and to make more robust and accurate predictions. Our model will treat forecasting as a regression task and predict the common weather variables.

Data:

We will be using the ECMWF Reanalysis v5 (ERA5) dataset, which provides a large amount of hourly weather information dating back to 1940. In terms of size, the dataset should have around 700,000 entries, one per hour (24 × 365 × 80 = 700,800). Although each entry is a large vector of information, we believe this is manageable, since the entries are not images or another data type that requires large amounts of storage. For preprocessing, our model will take in a sequence of weather data covering a specified period so that it can learn temporal patterns. We will therefore group each hour's data with the data for the preceding hours (e.g. hours 0, -1, -2, -3). In addition, we will match the dataset’s features to those of a weather API so that our model can be run on live weather data and make predictions for the future.
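The grouping step described above can be sketched as a sliding window over an hourly series. This is a minimal illustration, not the final preprocessing code: the function name `make_windows`, the `horizon` parameter, and the assumption that the data has already been loaded into a NumPy array of shape (hours, features) are all hypothetical.

```python
import numpy as np


def make_windows(series, window, horizon=1):
    """Group an hourly series into (input window, target) pairs.

    series  : array of shape (T, F), one row of F weather features per hour
    window  : number of consecutive past hours used as model input
    horizon : how many hours after the last input hour the target lies
              (hypothetical parameter, 1 = predict the next hour)

    Returns X of shape (N, window, F) and y of shape (N, F).
    """
    series = np.asarray(series, dtype=float)
    if series.ndim != 2:
        raise ValueError("series must have shape (T, F)")
    n = series.shape[0] - window - horizon + 1
    if n <= 0:
        raise ValueError("series is too short for this window and horizon")
    # Each sample holds `window` consecutive hours, oldest first.
    X = np.stack([series[i:i + window] for i in range(n)])
    # The target for sample i is the row `horizon` hours after its last input hour.
    first_target = window + horizon - 1
    y = series[first_target:first_target + n]
    return X, y


# Toy example: 10 hours of 2 made-up features, grouped into 4-hour windows.
toy = np.arange(20, dtype=float).reshape(10, 2)
X, y = make_windows(toy, window=4)
print(X.shape, y.shape)
```

With a window of 4, each input corresponds to the "hours 0, -1, -2, -3" grouping, and the target is the hour that follows.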

Methodology:

The architecture of our weather forecast model will initially employ a convolutional LSTM (ConvLSTM) design. This involves feeding a sequence of weather data vectors from the past window-size hours into several ConvLSTM layers. These layers combine learned convolutional filters with the long-term memory of an LSTM, a type of RNN, to produce a sequence of outputs. An alternative design we will explore uses CNN layers to detect spatial features in the data, followed by an LSTM that treats the CNN outputs as elements of a sequence for forecasting. The specific layer configurations and design will require experimentation and further research.

One key hyperparameter we will introduce is the "time window," which determines how many previous hours are used as input. Handling periodic features such as seasonal cycles will also be crucial and may require specialized activation functions. While we aim for a global-scale model capable of detecting local phenomena, data availability may dictate the model's scope. Training the model to handle extreme weather events is a further challenge. We may also consider alternative approaches such as one-shot learning or modeling conditional probability distributions with VAEs, GANs, or diffusion models, though time constraints may limit exploration in these areas.
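As one illustration of how periodic structure could be exposed to the model, the sketch below encodes the hour of day and the day of year as sine/cosine pairs. This is a commonly used input-encoding technique and an alternative to the specialized activation functions mentioned above, not necessarily what the final model will use. The function name `cyclical_time_features` and the choice of 365.25 days per year are assumptions made for the example.

```python
import math
from datetime import datetime


def cyclical_time_features(ts):
    """Map a timestamp to four values in [-1, 1].

    Returns (sin_hour, cos_hour, sin_day, cos_day). Because each cycle is
    mapped onto a circle, hour 23 ends up next to hour 0 and late December
    ends up next to early January.
    """
    hour_angle = 2 * math.pi * ts.hour / 24
    # tm_yday runs from 1 to 366; 365.25 is an approximate year length.
    day_angle = 2 * math.pi * (ts.timetuple().tm_yday - 1) / 365.25
    return (
        math.sin(hour_angle),
        math.cos(hour_angle),
        math.sin(day_angle),
        math.cos(day_angle),
    )


# These four values could be appended to each hourly feature vector.
print(cyclical_time_features(datetime(2021, 7, 1, 18)))
```

The benefit of this encoding is that the numeric distance between 23:00 and 00:00 is small, whereas a raw hour value of 23 versus 0 would suggest that they are far apart.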

Metrics:

Because our model performs regression, we will measure accuracy as prediction error on held-out test data. We plan to assess the model’s performance by computing the mean squared error (MSE) against two sources of test data. Within the ERA5 dataset, we can evaluate the model’s predictions against historical weather data that was held out from training. For more recent or current weather, we can compare the model’s predictions with the values reported by the weather API.
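For reference, the MSE over N test samples is the average of the squared differences between the true and predicted values. The sketch below computes it separately for each weather variable, since variables with different units (for example temperature and pressure) are hard to compare in a single number. The function name `mse_per_variable` and the sample values are made up for illustration.

```python
import numpy as np


def mse_per_variable(y_true, y_pred):
    """Mean squared error for each weather variable.

    y_true, y_pred : arrays of shape (N, F), N samples of F variables.
    Returns an array of shape (F,), one MSE value per variable.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    # Average the squared errors over the sample axis only.
    return np.mean((y_true - y_pred) ** 2, axis=0)


# Made-up values: two samples of two variables.
truth = [[1.0, 2.0], [3.0, 4.0]]
preds = [[1.0, 3.0], [1.0, 4.0]]
print(mse_per_variable(truth, preds))  # -> [2.  0.5]
```

The same function can be applied to both test sources: held-out ERA5 data and values collected from the weather API.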

Our base goal is to implement a model that can process input weather data points and output a prediction. Our target goal is to build on the base goal and achieve low prediction error (high accuracy) in the model’s forecasts. Our stretch goal is to expand the project’s scope and train the model to handle extreme weather events. Alternatively, we could implement a different architecture (such as a transformer, one-shot learning, or VAEs).

Ethics:

Why is Deep Learning a good approach to this problem?

Deep learning is a good approach to weather forecasting because it can potentially address some of the key challenges of traditional NWP models. NWP models require intensive computational resources and are prone to inaccuracies when simplifications and assumptions are made to facilitate the computation. This is due to the inherent complexity of weather data, which involves many interactions and non-linear relationships between atmospheric variables. In contrast, a deep learning approach has the potential to learn these complex, non-linear patterns directly from the data, without requiring the same level of computational resources or the same simplifying assumptions. By training neural networks to learn representations and patterns between the inputs and outputs, a deep learning model may be able to make more robust and accurate weather predictions. Additionally, deep learning models can potentially capture local weather phenomena and extreme events more effectively than traditional NWP models, which would be a key advantage in improving the overall accuracy and reliability of weather forecasting.

What is your dataset? Are there any concerns about how it was collected, or labeled? Is it representative? What kind of underlying historical or societal biases might it contain?

The main dataset we plan to use is the ECMWF Reanalysis v5 (ERA5) dataset, which provides hourly weather information dating back to 1940. The dataset is quite large, with around 700,000 entries. One potential concern is representativeness: we are not sure whether the dataset covers all geographic regions equally, or whether there are gaps or biases in coverage, especially for less populated areas. There may also be historical biases, since the older records reflect the state of weather observation and measurement technology at the time they were collected, going back as far as 1940. Additionally, we have not found information on the quality of the data labeling, which could affect how accurately the model is trained. These are important considerations, because biases or gaps in the training data could lead to biases in the model's predictions, which could disproportionately affect certain populations or regions.