ParticleDA.jl is a Julia
package to perform data assimilation with particle filters, distributed using MPI.
To install the latest stable version of the package, open the Julia
REPL, enter the package manager
with ], then run the command
add ParticleDA
If you plan to develop the package (make changes, submit pull requests, etc.), run the following command in the package manager mode:
dev ParticleDA
This will automatically clone the repository to your local directory
~/.julia/dev/ParticleDA.
You can exit from the package manager mode by pressing CTRL + C or,
alternatively, the backspace key when there is no input in the prompt.
After installing the package, you can start using it in Julia's REPL with
using ParticleDA
To run the particle filter you can use the function run_particle_filter:
run_particle_filter
To simulate observations from the model (which can be used, for example, to test the
filtering algorithms) you can use the function simulate_observations_from_model:
simulate_observations_from_model
The filter_type argument to run_particle_filter should be a concrete subtype
of the ParticleFilter abstract type.
ParticleDA.ParticleFilter
BootstrapFilter
OptimalFilter
The summary_stat_type argument to run_particle_filter should be a concrete
subtype of the AbstractSummaryStat abstract type.
ParticleDA.AbstractSummaryStat
ParticleDA.AbstractCustomReductionSummaryStat
ParticleDA.AbstractSumReductionSummaryStat
ParticleDA.MeanAndVarSummaryStat
ParticleDA.MeanSummaryStat
ParticleDA.NaiveMeanAndVarSummaryStat
ParticleDA.NaiveMeanSummaryStat
The next section details how to write the interface between the model and the particle filter.
The model needs to define a custom data structure and a few functions that will
be used by run_particle_filter:
- a custom structure which holds the data about the model. This will be used to dispatch the methods to be defined, listed below;
- an initialisation function with the following signature:
init(params_dict::Dict, n_tasks::Integer) -> model
with params_dict a dictionary with the parameters of the model and n_tasks an integer specifying the maximum number of tasks (coroutines) that parallelisable operations will be scheduled over. This initialisation function should create an instance of the model data structure and return it. The value of n_tasks can be used to create task-specific buffers for writing to when computing the model updates, to avoid reallocating data structures on each function call. As tasks may be run in parallel over multiple threads, any buffers used in functions called within tasks should be unique to the task. To facilitate this, functions in the model interface (see below) which may be called within tasks scheduled in parallel are passed a task_index argument, an integer index in 1:n_tasks which is guaranteed to be unique to a particular task and so can be used to index into task-specific buffers.
- The model needs to extend the following methods, using the model type for dispatch:
ParticleDA.get_state_dimension
ParticleDA.get_observation_dimension
ParticleDA.get_state_eltype
ParticleDA.get_observation_eltype
ParticleDA.sample_initial_state!
ParticleDA.update_state_deterministic!
ParticleDA.update_state_stochastic!
ParticleDA.sample_observation_given_state!
ParticleDA.get_log_density_observation_given_state
ParticleDA.write_model_metadata
- Optionally, if the model has additive Gaussian observation and state noise, it may
also extend the following methods, again using the model type for dispatch, to allow
using the more statistically efficient OptimalFilter for filtering:
ParticleDA.get_observation_mean_given_state!
ParticleDA.get_covariance_state_noise
ParticleDA.get_covariance_observation_noise
ParticleDA.get_covariance_state_observation_given_previous_state
ParticleDA.get_covariance_observation_observation_given_previous_state
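To make the required generic interface concrete, the following is a minimal sketch for a hypothetical Gaussian random walk model observed directly with additive noise. The struct name, parameter keys, and especially the argument orders are assumptions based on the n_tasks / task_index convention described above; consult each method's docstring for the exact signatures before relying on this.

```julia
using Random
using ParticleDA

# Hypothetical minimal model: a Gaussian random walk observed directly with
# additive Gaussian noise. All names and defaults here are illustrative.
struct RandomWalkModel
    dim::Int                  # state (and observation) dimension
    state_noise_std::Float64  # standard deviation of the state noise
    obs_noise_std::Float64    # standard deviation of the observation noise
end

function init(params_dict::Dict, n_tasks::Integer)
    params = get(params_dict, "random_walk", Dict())
    return RandomWalkModel(
        get(params, "dim", 4),
        get(params, "state_noise_std", 0.5),
        get(params, "obs_noise_std", 0.1),
    )
end

ParticleDA.get_state_dimension(m::RandomWalkModel) = m.dim
ParticleDA.get_observation_dimension(m::RandomWalkModel) = m.dim
ParticleDA.get_state_eltype(m::RandomWalkModel) = Float64
ParticleDA.get_observation_eltype(m::RandomWalkModel) = Float64

function ParticleDA.sample_initial_state!(
    state, m::RandomWalkModel, rng::Random.AbstractRNG, task_index::Integer
)
    randn!(rng, state)
    return state
end

# A pure random walk has no deterministic dynamics, so this is a no-op
function ParticleDA.update_state_deterministic!(
    state, m::RandomWalkModel, time_index::Integer, task_index::Integer
)
    return state
end

function ParticleDA.update_state_stochastic!(
    state, m::RandomWalkModel, rng::Random.AbstractRNG, task_index::Integer
)
    state .+= m.state_noise_std .* randn(rng, length(state))
    return state
end

function ParticleDA.sample_observation_given_state!(
    observation, state, m::RandomWalkModel, rng::Random.AbstractRNG, task_index::Integer
)
    observation .= state .+ m.obs_noise_std .* randn(rng, length(observation))
    return observation
end

# Log density of a diagonal Gaussian, up to an additive constant
function ParticleDA.get_log_density_observation_given_state(
    observation, state, m::RandomWalkModel, task_index::Integer
)
    return -sum(abs2, (observation .- state) ./ m.obs_noise_std) / 2
end

ParticleDA.write_model_metadata(file, m::RandomWalkModel) = nothing
```

With these methods defined, run_particle_filter could in principle be called with this model's init function and the BootstrapFilter type.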
Functions are provided to run tests to check a model correctly implements the expected interfaces:
ParticleDA.run_unit_tests_for_generic_model_interface
ParticleDA.run_tests_for_optimal_proposal_model_interface
ParticleDA.run_tests_for_convergence_of_filter_estimates_against_kalman_filter
Additional methods that models may extend, along with input/output utility functions, are listed below:
ParticleDA.get_state_indices_correlated_to_observations
ParticleDA.get_initial_state_mean!
ParticleDA.write_snapshot
ParticleDA.write_state
ParticleDA.get_covariance_initial_state
ParticleDA.init_filter
ParticleDA.sample_proposal_and_compute_log_weights!
ParticleDA.write_observation
ParticleDA.write_weights
ParticleDA.get_initial_state_mean
You can store the input parameters in a YAML file with the following structure:
filter:
key1: value1
model:
model_name1:
key2: value2
key3: value3
model_name2:
key4: value4
key5: value5
The parameters under filter are related to the particle filter; under model
you can specify the parameters for different models.
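As a hedged illustration, a complete parameter file might look like the sketch below. The filter keys shown (nprt, verbose, output_filename) and the model keys are assumptions for illustration only; consult ParticleDA.FilterParameters and the relevant model's parameter documentation for the definitive names.

```yaml
filter:
  nprt: 64                        # number of particles (assumed key name)
  verbose: true                   # write summary statistics to an HDF5 file
  output_filename: particle_da.h5 # assumed default output file name
model:
  llw2d:
    n_stations_x: 2               # observation station grid parameters
    n_stations_y: 2               # for the llw2d test model
```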
The particle filter parameters are saved in the following data structure:
ParticleDA.FilterParameters
A full example of a model interfacing with ParticleDA is available in test/models/llw2d.jl.
This model represents a simple two-dimensional simulation of tsunami dynamics and is
partly based on the tsunami data assimilation code by
Takuto Maeda. The particle filter can be run with observations simulated from the model
as follows
# Load ParticleDA
using ParticleDA
# Save some variables for later use
test_dir = joinpath(dirname(pathof(ParticleDA)), "..", "test")
llw2d_src = joinpath(test_dir, "models", "llw2d.jl")
input_file = joinpath(test_dir, "integration_test_1.yaml")
observation_file = tempname()
# Instantiate the test environment
using Pkg
Pkg.activate(test_dir)
Pkg.instantiate()
# Include the sample model source code and load it
include(llw2d_src)
using .LLW2d
# Simulate observations from the model to use
simulate_observations_from_model(LLW2d.init, input_file, observation_file)
# Run the (optimal proposal) particle filter with simulated observations computing the
# mean and variance of the particle ensemble. On non-Intel architectures you may need
# to use NaiveMeanAndVarSummaryStat instead
final_states, final_statistics = run_particle_filter(
LLW2d.init, input_file, observation_file, OptimalFilter, MeanAndVarSummaryStat
)
The Input parameters section shows how to pass the input parameters for
the filter and the model. For the model included in the test suite, called
llw2d, you can set the following parameters:
LLW2d.LLW2dModelParameters
The coordinates of the observation stations can be set in two different ways.
-
Provide the coordinates in an input file. Set the parameter station_filename to the name of your input file. The input file is plain text; the format is one row per station containing the x and y coordinates in metres. Here is a simple example with two stations:
# Comment lines starting with '#' will be ignored by the code
# This file contains two stations: at [1km, 1km] and [2km, 2km]
1.0e3, 1.0e3
2.0e3, 2.0e3
-
Provide parameters for an equispaced rectilinear grid of observation stations. The values of these parameters should then be set:
n_stations_x: Number of observation stations in the x direction
n_stations_y: Number of observation stations in the y direction
station_distance_x: Distance between stations in the x direction (m)
station_distance_y: Distance between stations in the y direction (m)
station_boundary_x: Distance between bottom left edge of box and first station in the x direction (m)
station_boundary_y: Distance between bottom left edge of box and first station in the y direction (m)
As an example, one could set
n_stations_x=2, n_stations_y=2, station_distance_x=1.0e3, station_distance_y=1.0e3, station_boundary_x=10.0e3, station_boundary_y=10.0e3,
to generate 4 stations at
[10km, 10km], [10km, 11km], [11km, 10km] and [11km, 11km].
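For instance, this grid could be written into the parameter file as in the sketch below, assuming (as in the test suite) that the station parameters sit directly under the model name in the model section:

```yaml
model:
  llw2d:
    n_stations_x: 2
    n_stations_y: 2
    station_distance_x: 1.0e3
    station_distance_y: 1.0e3
    station_boundary_x: 10.0e3
    station_boundary_y: 10.0e3
```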
If the filter parameter verbose is set to true, run_particle_filter will produce an HDF5 file in the run directory. The file name is particle_da.h5 by default (configurable via the output_filename filter parameter). The file contains the summary statistics of the estimated state distribution (by default the mean and variance), particle weights, parameters used, and other metadata at each time step at which observations were assimilated. To read the output file, use the HDF5 library.
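As a hedged sketch of inspecting the output with HDF5.jl, the snippet below makes no assumptions about the file's internal layout; it simply lists the top-level groups so you can discover which datasets are present before reading them.

```julia
using HDF5

# Open the ParticleDA output file read-only and list its top-level groups
# to discover the layout before reading specific datasets.
h5open("particle_da.h5", "r") do file
    for name in keys(file)
        println(name)
    end
end
```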
A basic plotting tool is provided in a Jupyter notebook. This is intended as a template to build more sophisticated postprocessing tools, but can be used for some simple analysis. Set the variable timestamp in the third cell to plot different time slices from the output file. More functionality may be added as the package develops.
The particle state update is parallelised using both MPI and threading. According to our preliminary tests both methods work well at small scale. To use the threading, set the environment variable JULIA_NUM_THREADS to the number of threads you want to use before starting Julia and then call the run_particle_filter function normally. You can check the number of threads julia has available by calling in Julia's REPL
Threads.nthreads()
To use the MPI parallelisation, write a Julia script that calls the run_particle_filter function for the relevant model and observations and run it in a Unix shell with
mpirun -np <your_number_of_processes> julia <your_julia_script>
Note that the parallel performance may vary depending on the statistical behaviour of the algorithm: in general, degeneracy of the particle weights will lead to poor load balance and parallel performance. See this issue for more details.
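As a sketch of such a script, the driver below reuses the example model from the test suite. The observation file path is illustrative, and the explicit MPI.Init() call is an assumption about how MPI.jl should be initialised in your setup; adapt it as needed.

```julia
# Hypothetical driver script: save as e.g. run_filter.jl and launch with mpirun.
# File paths and the choice of model are illustrative; adapt them to your setup.
using MPI
MPI.Init()  # assumed to be required before running the filter under MPI

using ParticleDA

# Load the example model shipped in the test suite
test_dir = joinpath(dirname(pathof(ParticleDA)), "..", "test")
include(joinpath(test_dir, "models", "llw2d.jl"))
using .LLW2d

run_particle_filter(
    LLW2d.init,
    joinpath(test_dir, "integration_test_1.yaml"),
    "observations.h5",  # assumed path to previously simulated observations
    OptimalFilter,
    MeanAndVarSummaryStat,
)
```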
To allow validation that the filter implementations give the expected results when
applied to linear-Gaussian state space models, dense and matrix-free Kalman filter
implementations are available in the ParticleDA.Kalman module
ParticleDA.Kalman.KalmanFilter
ParticleDA.Kalman.MatrixFreeKalmanFilter
ParticleDA.Kalman.run_kalman_filter
ParticleDA.Kalman.lmult_by_observation_matrix!
ParticleDA.Kalman.pre_and_postmultiply_by_state_transition_matrix!
ParticleDA.Kalman.pre_and_postmultiply_by_observation_matrix!
ParticleDA.Kalman.lmult_by_state_transition_matrix!
We have a basic test suite for ParticleDA.jl. You can run the tests by entering the
package manager mode in Julia's REPL with ] and running the command
test ParticleDA
The ParticleDA.jl package is licensed under the MIT "Expat" License.