Critic Sequential Monte Carlo

Authors: Vasileios Lioutas, Jonathan Wilder Lavington, Justice Sefas, Matthew Niedoba, Yunpeng Liu, Berend Zwartsenberg, Setareh Dabiri, Frank Wood, Adam Scibior

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments on collision avoidance in a high-dimensional simulated driving task show that Critic SMC significantly reduces collision rates at a low computational cost while maintaining realism and diversity of driving behaviors across vehicles and environment scenarios.
Researcher Affiliation | Collaboration | Vasileios Lioutas [1,2], J. Wilder Lavington [1,2], Justice Sefas [1,2], Matthew Niedoba [1,2], Yunpeng Liu [1,2], Berend Zwartsenberg [1], Setareh Dabiri [1], Frank Wood [1,2,3], Adam Scibior [1]; [1] Inverted AI, [2] University of British Columbia, [3] Mila
Pseudocode | Yes | Algorithm 1 Critic Sequential Monte Carlo. Algorithm 2 Sequential Monte Carlo.
Open Source Code | Yes | We include in the supplementary material a demo code implementation of Critic SMC applied to the following linear Gaussian state-space model (LGSSM) with well-defined critic function
Open Datasets | Yes | The prior model is trained on the INTERACTION (Zhan et al., 2019) dataset and the task is that given 10 timesteps of observed behavior, predict the next 30 timesteps of future trajectories.
Dataset Splits | Yes | The evaluation is performed using the validation split of the INTERACTION dataset, which neither ITRA nor the critic saw during training.
Hardware Specification | Yes | We train the model using a single Nvidia RTX 2080Ti GPU.
Software Dependencies | No | The paper mentions "stable-baselines3 (Brockman et al., 2016)" for SAC implementation, but does not provide specific version numbers for this or other software components.
Experiment Setup | Yes | We train the model using a single Nvidia RTX 2080Ti GPU. The prioritized experience replay buffer has a size of 1 million stored experiences. The discount factor is set to 0.99, the batch size to 256 and the learning rate to 0.001. Finally, we sample 1024 actions during running Critic SMC while training the critic model.
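
For readers who want to reuse the reported setup, the hyperparameters quoted in the Experiment Setup entry can be collected in one place. The following is a minimal sketch, not the authors' code: the class name `CriticTrainingConfig` and all field names are hypothetical, and only the numeric values and the GPU model come from the quoted text.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class CriticTrainingConfig:
    """Hypothetical container; field names are illustrative assumptions.

    Only the values are taken from the quoted experiment setup.
    """

    replay_buffer_size: int = 1_000_000  # prioritized experience replay capacity
    discount_factor: float = 0.99
    batch_size: int = 256
    learning_rate: float = 1e-3
    num_sampled_actions: int = 1024  # actions sampled when running Critic SMC during critic training
    gpu: str = "Nvidia RTX 2080Ti"  # a single GPU is reported


config = CriticTrainingConfig()

# Print the settings as a plain dictionary, e.g. for logging.
print(asdict(config))
```

Because the dataclass is frozen, a variant setup would be created with `dataclasses.replace(config, batch_size=...)` rather than by mutating `config`. Library versions are deliberately absent here, since the Software Dependencies entry notes that the paper does not state them.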