In this project, we tackle control from partial, noisy observations.
We propose a framework, Data-assimilated Model-informed Reinforcement Learning (DA-MIRL), that has three components:
- A predictive model of the system's dynamics (physical or data-driven)
- An ensemble-based data assimilation method for real-time state estimation (Ensemble Kalman Filter, EnKF)
- An off-policy actor-critic reinforcement learning (RL) algorithm for learning the control policy (Deep Deterministic Policy Gradient, DDPG)
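To make the data-assimilation component concrete, the EnKF analysis step can be sketched in JAX as below. This is a minimal stochastic-EnKF sketch, not the implementation used in the codebase; the function name, variable names, and the perturbed-observation variant are assumptions.

```python
import jax
import jax.numpy as jnp

def enkf_update(ensemble, y, H, R, key):
    """One stochastic EnKF analysis step (illustrative sketch).

    ensemble: (m, n) forecast ensemble of m state members
    y:        (p,)   observation vector
    H:        (p, n) linear observation operator
    R:        (p, p) observation-noise covariance
    """
    m = ensemble.shape[0]
    x_mean = ensemble.mean(axis=0)
    A = ensemble - x_mean                                # ensemble anomalies, (m, n)
    P = A.T @ A / (m - 1)                                # sample covariance, (n, n)
    K = P @ H.T @ jnp.linalg.inv(H @ P @ H.T + R)        # Kalman gain, (n, p)
    # Perturb the observation for each member (stochastic EnKF).
    eps = jax.random.multivariate_normal(
        key, jnp.zeros(y.shape[0]), R, shape=(m,))       # (m, p)
    innovations = (y + eps) - ensemble @ H.T             # (m, p)
    return ensemble + innovations @ K.T                  # analysis ensemble, (m, n)
```

The analysis ensemble is pulled towards the observation in the observed components, by an amount set by the ratio of forecast spread to observation noise.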
The environment is the Kuramoto–Sivashinsky (KS) equation, a one-dimensional partial differential equation that exhibits spatiotemporal chaos. The figure below shows the KS system first evolving without control and then being controlled by a trained RL agent, which stabilises the flow. Sensors placed along the x-axis provide measurements of the system. The aim is to learn a stabilising optimal control policy from these partial and noisy measurements, i.e., the observations. Control is applied by actuators exerting a Gaussian-mixture forcing.
The codebase is written in JAX. The following experiments can be run:

- Model-free RL:

  ```shell
  python ddpg_experiment_v3.py
  ```

- Data-assimilated model-informed RL using a physical truncated-Fourier-basis model of the system:

  ```shell
  python ddpg_with_enkf_experiment_v3.py
  ```

- Data-assimilated model-informed RL using a data-driven, control-aware Echo State Network (ESN) model of the system:

  ```shell
  python ddpg_with_enkf_esn.py
  ```
Running an experiment creates a folder in `local_results/` where configurations, model weights, and (optionally) plots are saved.
The results are visualised in the Jupyter notebooks *Model-free*, *Model-informed Fourier*, and *Model-informed ESN*. The trained weights from these runs can be found in Results.
The experiments can be configured using `ml_collections`. You can find sample config files in the `configs/` directory.
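A config file in this style typically defines a `get_config()` function returning a `ConfigDict`. The sketch below is hypothetical: the field names (`nu`, `enKF.std_obs`) are inferred from the command-line example in this README, and the actual files in `configs/` may contain different or additional fields.

```python
# Hypothetical sketch of an ml_collections config file, in the style of
# configs/KS_config.py. Field names here are assumptions for illustration.
import ml_collections


def get_config():
    config = ml_collections.ConfigDict()
    config.nu = 0.08                # KS equation parameter
    config.enKF = ml_collections.ConfigDict()
    config.enKF.std_obs = 0.1       # observation-noise standard deviation
    return config
```

Any field defined this way can then be overridden on the command line, e.g. `--config.enKF.std_obs 0.1`.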
To specify a configuration file when running an experiment, use the `--config` flag; individual settings can also be overridden on the command line. For example:

```shell
python ddpg_with_enkf_esn.py --config configs/enKF_config_mb.py --config.enKF.std_obs 0.1 --env_config configs/KS_config.py --env_config.nu 0.08
```

Additional flags:
- `--make_plots` – generate plots of the episodes during training
- `--log_wandb` – log losses and metrics to Weights and Biases
Previous runs can be accessed at:
