Stochastic Optimal Control Matching
Authors: Carles Domingo-Enrich, Jiequn Han, Brandon Amos, Joan Bruna, Ricky T. Q. Chen
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimentally, our algorithm achieves lower error than all the existing IDO techniques for stochastic optimal control for three out of four control problems, in some cases by an order of magnitude. |
| Researcher Affiliation | Collaboration | Carles Domingo-Enrich (NYU & FAIR, Meta, cd2754@nyu.edu); Jiequn Han (Flatiron Institute, jhan@flatironinstitute.org); Brandon Amos (FAIR, Meta, bda@meta.com); Joan Bruna (NYU & Flatiron Institute, bruna@cims.nyu.edu); Ricky T. Q. Chen (FAIR, Meta, rtqichen@meta.com) |
| Pseudocode | Yes | Algorithm 1 Iterative Diffusion Optimization (IDO) algorithms for stochastic optimal control |
| Open Source Code | Yes | Code can be found at https://github.com/facebookresearch/SOC-matching. |
| Open Datasets | No | The paper defines the functions and parameters for its experimental settings (Quadratic Ornstein-Uhlenbeck, Linear Ornstein-Uhlenbeck, Double Well, and Path Integral Sampler on a mixture of Gaussians) rather than using pre-existing, publicly available datasets. |
| Dataset Splits | No | The paper describes dynamic simulation of trajectories and Monte Carlo estimation for evaluation, specifying batch sizes for these simulations, but does not refer to static training/validation/test dataset splits in the conventional sense for a fixed dataset. |
| Hardware Specification | Yes | Each algorithm was run using a 16GB V100 GPU. |
| Software Dependencies | No | The paper mentions using 'Adam' for optimization but does not provide specific version numbers for software libraries, programming languages (e.g., Python, PyTorch), or other dependencies. |
| Experiment Setup | Yes | For all of them, we train the control using Adam with learning rate 1 × 10⁻⁴. For SOCM, we train the reparametrization matrices using Adam with learning rate 1 × 10⁻². We use batch size m = 128 unless otherwise specified. |
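The reported setup uses two separate Adam learning rates (one for the control network, one for SOCM's reparametrization matrices) and a default batch size of 128. A minimal sketch of that configuration is below; the class and field names are illustrative and not taken from the released SOC-matching code.

```python
from dataclasses import dataclass

@dataclass
class SOCMTrainingConfig:
    """Hypothetical container for the hyperparameters reported in the paper."""
    control_lr: float = 1e-4  # Adam learning rate for the control network
    matrix_lr: float = 1e-2   # Adam learning rate for SOCM's reparametrization matrices
    batch_size: int = 128     # batch size m, unless otherwise specified per experiment

cfg = SOCMTrainingConfig()
```

In a PyTorch implementation, this would typically translate into two optimizer instances (or two parameter groups) so the control parameters and the reparametrization matrices are updated at their respective rates.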