Generative Modelling of Stochastic Actions with Arbitrary Constraints in Reinforcement Learning

Authors: Changyu Chen, Ramesha Karunasena, Thanh Nguyen, Arunesh Sinha, Pradeep Varakantham

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, we conduct extensive experiments to show the scalability of our approach compared to prior methods and the ability to enforce arbitrary state-conditional constraints on the support of the distribution of actions in any state. We evaluate IAR-A2C against prior works across a diverse set of environments, including low-dimensional discrete control tasks such as Cart Pole and Acrobot, the visually challenging Pistonball task with high-dimensional image inputs and an extremely large action space (up to 59,049 categories), and an emergency resource allocation simulator in a city, referred to as Emergency Resource Allocation (ERA).
Researcher Affiliation | Academia | Singapore Management University, University of Oregon, Rutgers University
Pseudocode | Yes | Algorithm 1: ELBO Optimization; Algorithm 2: IAR-A2C (a hedged sketch of the rejection-sampling idea appears after the table)
Open Source Code | Yes | Our implementation is available at https://github.com/cameron-chen/flow-iar.
Open Datasets | Yes | We evaluate IAR-A2C against prior works across a diverse set of environments, including low-dimensional discrete control tasks such as Cart Pole and Acrobot [3], the visually challenging Pistonball task [31] with high-dimensional image inputs and an extremely large action space (up to 59,049 categories), and an emergency resource allocation simulator in a city, referred to as Emergency Resource Allocation (ERA). (An environment-setup sketch appears after the table.)
Dataset Splits | No | The paper does not provide specific details on how the datasets were split into training, validation, or test sets, such as percentages, absolute counts, or references to standard predefined splits.
Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory configurations.
Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python version, library versions like PyTorch or TensorFlow, or specific solvers).
Experiment Setup | No | The paper does not provide specific experimental setup details, such as hyperparameter values (e.g., learning rate, batch size, number of epochs) or specific optimizer settings, beyond mentioning that approaches are trained with a certain number of seeds.
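The Pseudocode row names IAR-A2C, and the Research Type row quotes the paper's claim of enforcing arbitrary state-conditional constraints on the support of the action distribution. The snippet below is a minimal sketch of the general idea the name suggests (rejecting sampled actions that violate a state-conditional constraint so that only valid actions remain in the support); it is not the paper's algorithm, and `is_valid` is a hypothetical constraint oracle introduced here only for illustration.

```python
# Minimal sketch (NOT the paper's IAR-A2C implementation): rejection sampling
# restricts a categorical policy's support to the state-conditionally valid
# actions. `is_valid` is a hypothetical black-box constraint oracle.
import numpy as np

rng = np.random.default_rng(0)

def is_valid(state, action):
    """Hypothetical constraint oracle: here, forbid actions larger than the state index."""
    return action <= state

def sample_valid_action(probs, state, max_tries=1000):
    """Sample from the policy, rejecting actions that violate the constraint.

    Equivalent to drawing from the policy renormalized over the valid set,
    provided at least one valid action has non-zero probability.
    """
    for _ in range(max_tries):
        action = rng.choice(len(probs), p=probs)
        if is_valid(state, action):
            return action
    raise RuntimeError("no valid action found; constraint may be infeasible")

probs = np.array([0.1, 0.2, 0.3, 0.4])  # toy policy over 4 discrete actions
print(sample_valid_action(probs, state=1))  # only actions {0, 1} can be returned
```

A complete algorithm would also have to account for the effect of rejection on the policy-gradient estimate; the sketch only illustrates constraint enforcement at sampling time.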
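The Open Datasets row lists simulation environments rather than fixed datasets. As rough orientation, the standard benchmark tasks named there can be instantiated as below; the package choices (gymnasium rather than the older gym, PettingZoo's pistonball_v6) and all version strings and keyword arguments are assumptions, since the paper does not pin releases, and the ERA simulator is not a standard public benchmark so it is omitted.

```python
# Illustrative setup of the named benchmark environments (a sketch, not the
# paper's exact configuration; packages and versions are assumptions).
import gymnasium as gym
from pettingzoo.butterfly import pistonball_v6

cartpole = gym.make("CartPole-v1")  # low-dimensional discrete control
acrobot = gym.make("Acrobot-v1")    # low-dimensional discrete control

# 59,049 = 3**10 joint actions is consistent with 10 pistons, each with 3
# discrete actions; treating that as the configuration is an assumption.
pistonball = pistonball_v6.parallel_env(n_pistons=10, continuous=False)

obs, info = cartpole.reset(seed=0)
print(cartpole.action_space)                # Discrete(2)
print(acrobot.action_space)                 # Discrete(3)
print(pistonball.action_space("piston_0"))  # Discrete(3) per piston
```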