
ParticleDA.jl

ParticleDA.jl is a Julia package to perform data assimilation with particle filters, distributed using MPI.

Installation

To install the latest stable version of the package, open the Julia REPL, enter the package manager with ], then run the command

add ParticleDA

If you plan to develop the package (make changes, submit pull requests, etc), in the package manager mode run this command

dev ParticleDA

This will automatically clone the repository to your local directory ~/.julia/dev/ParticleDA.

You can exit from the package manager mode by pressing CTRL + C or, alternatively, the backspace key when there is no input in the prompt.

Usage

After installing the package, you can start using it in Julia's REPL with

using ParticleDA

To run the particle filter, use the function run_particle_filter:

run_particle_filter

To simulate observations from the model (which can be used, for example, to test the filtering algorithms), use the function simulate_observations_from_model:

simulate_observations_from_model

The filter_type argument to run_particle_filter should be a concrete subtype of the ParticleFilter abstract type.

ParticleDA.ParticleFilter
BootstrapFilter
OptimalFilter

The summary_stat_type argument to run_particle_filter should be a concrete subtype of the AbstractSummaryStat abstract type.

ParticleDA.AbstractSummaryStat
ParticleDA.AbstractCustomReductionSummaryStat
ParticleDA.AbstractSumReductionSummaryStat
ParticleDA.MeanAndVarSummaryStat
ParticleDA.MeanSummaryStat
ParticleDA.NaiveMeanAndVarSummaryStat
ParticleDA.NaiveMeanSummaryStat

The next section details how to write the interface between the model and the particle filter.

Interfacing the model

The model needs to define a custom data structure and a few functions that will be used by run_particle_filter:

  • a custom structure which holds the data about the model. This will be used to dispatch the methods to be defined, listed below;
  • an initialisation function with the following signature:
    init(params_dict::Dict, n_tasks::Integer) -> model
    with params_dict a dictionary of model parameters and n_tasks an integer specifying the maximum number of tasks (coroutines) that parallelisable operations will be scheduled over. This initialisation function should create an instance of the model data structure and return it. The value of n_tasks can be used to create task-specific buffers to write to when computing model updates, avoiding reallocation of data structures on each function call. As tasks may run in parallel over multiple threads, any buffer used in a function called within a task must be unique to that task. To facilitate this, functions in the model interface (see below) that may be called within tasks scheduled in parallel are passed a task_index argument: an integer index in 1:n_tasks that is guaranteed to be unique to a particular task and so can be used to index into task-specific buffers.
  • The model needs to extend the following methods, using the model type for dispatch:
ParticleDA.get_state_dimension
ParticleDA.get_observation_dimension
ParticleDA.get_state_eltype
ParticleDA.get_observation_eltype
ParticleDA.sample_initial_state!
ParticleDA.update_state_deterministic!
ParticleDA.update_state_stochastic!
ParticleDA.sample_observation_given_state!
ParticleDA.get_log_density_observation_given_state
ParticleDA.write_model_metadata
  • Optionally, if the model has additive Gaussian observation and state noise, it may also extend the following methods, again using the model type for dispatch, to allow use of the more statistically efficient OptimalFilter for filtering:
ParticleDA.get_observation_mean_given_state!
ParticleDA.get_covariance_state_noise
ParticleDA.get_covariance_observation_noise
ParticleDA.get_covariance_state_observation_given_previous_state
ParticleDA.get_covariance_observation_observation_given_previous_state
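The task-buffer pattern described above can be sketched in plain Julia, independently of ParticleDA; the `ToyModel` type and the `init_toy` and `update!` functions below are invented for illustration, not part of the package's API:

```julia
# Sketch of the task-specific-buffer pattern: each task indexes its own
# scratch buffer by task_index, so tasks running in parallel over multiple
# threads never write to shared state. ToyModel, init_toy and update! are
# hypothetical names, not part of ParticleDA.
struct ToyModel
    buffers::Vector{Vector{Float64}}  # one scratch buffer per task
end

init_toy(n_tasks::Integer) = ToyModel([zeros(3) for _ in 1:n_tasks])

function update!(state::Vector{Float64}, model::ToyModel, task_index::Integer)
    buf = model.buffers[task_index]  # safe: unique to this task
    buf .= 2 .* state                # compute into the task-local buffer
    state .= buf                     # write the result back into the state
    return state
end

n_tasks = 4
model = init_toy(n_tasks)
results = Vector{Vector{Float64}}(undef, n_tasks)
Threads.@threads for i in 1:n_tasks
    results[i] = update!(fill(Float64(i), 3), model, i)
end
```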

Functions are provided to run tests to check a model correctly implements the expected interfaces:

ParticleDA.run_unit_tests_for_generic_model_interface
ParticleDA.run_tests_for_optimal_proposal_model_interface
ParticleDA.run_tests_for_convergence_of_filter_estimates_against_kalman_filter

Other functions you may want to extend

ParticleDA.get_state_indices_correlated_to_observations
ParticleDA.get_initial_state_mean!
ParticleDA.write_snapshot
ParticleDA.write_state
ParticleDA.get_covariance_initial_state
ParticleDA.init_filter
ParticleDA.sample_proposal_and_compute_log_weights!
ParticleDA.write_observation
ParticleDA.write_weights
ParticleDA.get_initial_state_mean

Input parameters

You can store the input parameters in a YAML file with the following structure:

filter:
  key1: value1

model:
  model_name1:
    key2: value2
    key3: value3
  model_name2:
    key4: value4
    key5: value5

The parameters under filter relate to the particle filter; under model you can specify the parameters for different models.
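For reference, the YAML structure above corresponds to a nested dictionary of the following shape (keys and values are the placeholders from the example); presumably the section for the selected model is what ends up in the `params_dict` passed to the model's `init` function:

```julia
# Nested dictionary mirroring the YAML layout above; the key/value names
# are the placeholders from the example, not real parameters.
params = Dict(
    "filter" => Dict("key1" => "value1"),
    "model" => Dict(
        "model_name1" => Dict("key2" => "value2", "key3" => "value3"),
        "model_name2" => Dict("key4" => "value4", "key5" => "value5"),
    ),
)

# Selecting one model's parameter block:
model_params = params["model"]["model_name1"]
```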

The particle filter parameters are saved in the following data structure:

ParticleDA.FilterParameters

Example: estimating the state in a tsunami model

A full example of a model interfacing with ParticleDA is available in test/models/llw2d.jl. This model is a simple two-dimensional simulation of tsunami dynamics and is partly based on the tsunami data assimilation code by Takuto Maeda. The particle filter can be run with observations simulated from the model as follows:

# Load ParticleDA
using ParticleDA

# Save some variables for later use
test_dir = joinpath(dirname(pathof(ParticleDA)), "..", "test")
llw2d_src = joinpath(test_dir, "models", "llw2d.jl")
input_file = joinpath(test_dir, "integration_test_1.yaml")
observation_file = tempname()

# Instantiate the test environment
using Pkg
Pkg.activate(test_dir)
Pkg.instantiate()

# Include the sample model source code and load it
include(llw2d_src)
using .LLW2d

# Simulate observations from the model
simulate_observations_from_model(LLW2d.init, input_file, observation_file)

# Run the (optimal proposal) particle filter with simulated observations computing the
# mean and variance of the particle ensemble. On non-Intel architectures you may need
# to use NaiveMeanAndVarSummaryStat instead
final_states, final_statistics = run_particle_filter(
  LLW2d.init, input_file, observation_file, OptimalFilter, MeanAndVarSummaryStat
)

Parameters of the test model

The Input parameters section shows how to pass the input parameters for the filter and the model. For the model included in the test suite, called llw2d, you can set the following parameters:

LLW2d.LLW2dModelParameters

Observation station coordinates

The coordinates of the observation stations can be set in two different ways.

  1. Provide the coordinates in an input file. Set the parameter station_filename to the name of your input file. The input file is plain text; the format is one row per station, containing the x and y coordinates in metres. Here is a simple example with two stations:

    # Comment lines starting with '#' will be ignored by the code
    # This file contains two stations: at [1km, 1km] and [2km, 2km]
    1.0e3, 1.0e3
    2.0e3, 2.0e3
  2. Provide parameters for an equispaced rectilinear grid of observation stations. The following parameters should then be set:

    • n_stations_x : Number of observation stations in the x direction
    • n_stations_y : Number of observation stations in the y direction
    • station_distance_x : Distance between stations in the x direction (m)
    • station_distance_y : Distance between stations in the y direction (m)
    • station_boundary_x : Distance between bottom left edge of box and first station in the x direction (m)
    • station_boundary_y : Distance between bottom left edge of box and first station in the y direction (m)

    As an example, one could set

    n_stations_x=2,
    n_stations_y=2,
    station_distance_x=1.0e3,
    station_distance_y=1.0e3,
    station_boundary_x=10.0e3,
    station_boundary_y=10.0e3,

    to generate 4 stations at [10km, 10km], [10km, 11km], [11km, 10km] and [11km, 11km].
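A small sketch of how the grid parameters above presumably map to station coordinates (station i placed at boundary + (i - 1) * distance along each axis), reproducing the example values; the `station_coordinates` helper is invented for illustration:

```julia
# Hypothetical helper reproducing the station grid described above:
# station i sits at boundary + (i - 1) * distance along each axis.
function station_coordinates(n_x, n_y, dist_x, dist_y, bound_x, bound_y)
    [(bound_x + (i - 1) * dist_x, bound_y + (j - 1) * dist_y)
     for i in 1:n_x for j in 1:n_y]
end

# The example parameter values from the text:
coords = station_coordinates(2, 2, 1.0e3, 1.0e3, 10.0e3, 10.0e3)
```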

Output

If the filter parameter verbose is set to true, run_particle_filter will produce an HDF5 file in the run directory. The file name is particle_da.h5 by default (this is configurable using the output_filename filter parameter). The file contains the summary statistics of the estimated state distribution (by default the mean and variance), particle weights, parameters used, and other metadata at each time step at which observations were assimilated. To read the output file, use the HDF5 library.
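As a sketch of reading such a file with the HDF5.jl package: the internal group and dataset layout ParticleDA writes is not specified here, so the `"weights"` dataset name below is hypothetical and `keys` is used to discover the actual contents. A small file is written and read back so the example is self-contained:

```julia
# Sketch: the output is a standard HDF5 file, so any HDF5 reader works.
# "weights" is a hypothetical dataset name, not the layout ParticleDA
# actually uses; use keys() to discover the real contents of your file.
using HDF5

path = tempname() * ".h5"
h5open(path, "w") do file
    file["weights"] = [0.25, 0.25, 0.5]
end

names, weights = h5open(path, "r") do file
    collect(keys(file)), read(file, "weights")  # list contents, read a dataset
end
```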

A basic plotting tool is provided in a Jupyter notebook. This is intended as a template to build more sophisticated postprocessing tools, but can be used for some simple analysis. Set the variable timestamp in the third cell to plot different time slices from the output file. More functionality may be added as the package develops.

Running in parallel

The particle state update is parallelised using both MPI and threading. According to our preliminary tests both methods work well at small scale. To use threading, set the environment variable JULIA_NUM_THREADS to the number of threads you want to use before starting Julia, then call the run_particle_filter function as usual. You can check the number of threads Julia has available by calling, in Julia's REPL,

Threads.nthreads()

To use the MPI parallelisation, write a Julia script that calls the run_particle_filter function for the relevant model and observations, and run it in a Unix shell with

mpirun -np <your_number_of_processes> julia <your_julia_script>
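A minimal driver script for this invocation might look like the following sketch, reusing the llw2d test model and paths from the earlier example; whether MPI.Init() must be called explicitly before run_particle_filter is an assumption here, so check the package documentation for your version:

```julia
# run_filter.jl — hypothetical MPI driver script for the mpirun command above.
# Assumption: MPI is initialised explicitly before running the filter.
using MPI
MPI.Init()

using ParticleDA

# Reuse the llw2d test model shipped with the package (see the earlier example)
test_dir = joinpath(dirname(pathof(ParticleDA)), "..", "test")
include(joinpath(test_dir, "models", "llw2d.jl"))
using .LLW2d

input_file = joinpath(test_dir, "integration_test_1.yaml")
observation_file = joinpath(test_dir, "observations.h5")  # hypothetical path

# Each MPI rank calls run_particle_filter; the package distributes particles
run_particle_filter(
    LLW2d.init, input_file, observation_file, BootstrapFilter, MeanSummaryStat
)
```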

Note that parallel performance may vary with the statistical behaviour of the algorithm: in general, degeneracy of the particle weights leads to poor load balance and hence poor parallel performance. See this issue for more details.

Kalman filters

To allow validation that the filter implementations give the expected results when applied to linear-Gaussian state space models, dense and matrix-free Kalman filter implementations are available in the ParticleDA.Kalman module:

ParticleDA.Kalman.KalmanFilter
ParticleDA.Kalman.MatrixFreeKalmanFilter
ParticleDA.Kalman.run_kalman_filter
ParticleDA.Kalman.lmult_by_observation_matrix!
ParticleDA.Kalman.pre_and_postmultiply_by_state_transition_matrix!
ParticleDA.Kalman.pre_and_postmultiply_by_observation_matrix!
ParticleDA.Kalman.lmult_by_state_transition_matrix!

Testing

We have a basic test suite for ParticleDA.jl. You can run the tests by entering the package manager mode in Julia's REPL with ] and running the command

test ParticleDA

License

The ParticleDA.jl package is licensed under the MIT "Expat" License.