VAST: Value Function Factorization with Variable Agent Sub-Teams

Authors: Thomy Phan, Fabian Ritz, Lenz Belzner, Philipp Altmann, Thomas Gabor, Claudia Linnhoff-Popien

NeurIPS 2021

Reproducibility assessment. Each variable below lists the assessed result together with the supporting evidence extracted from the paper.
Research Type: Experimental
Evidence: "We evaluate VAST in three multi-agent domains and show that VAST can significantly outperform state-of-the-art VFF, when the number of agents is sufficiently large. [...] 5 Experimental Setup [...] 6.1 Comparison of Value Function Factorization Operators for VAST [...] 6.2 State-of-the-Art Comparison"
Researcher Affiliation: Academia
Evidence: "Thomy Phan¹, Fabian Ritz¹, Lenz Belzner², Philipp Altmann¹, Thomas Gabor¹, Claudia Linnhoff-Popien¹ (¹LMU Munich, ²Technische Hochschule Ingolstadt)"
Pseudocode: Yes
Evidence: "Algorithm 1 Variable Agent Sub-Teams"
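The caption above names the paper's Algorithm 1, but the assessment only records its existence. As orientation, here is a minimal sketch of the sub-team idea the title suggests, assuming (our assumption, not the paper's Algorithm 1) that agents are grouped into sub-teams of possibly different sizes and that per-agent utilities are summed within each sub-team before a factorization operator combines the sub-team values; all names and tensor shapes are illustrative.

```python
import torch

def subteam_values(agent_utilities: torch.Tensor,
                   assignment: torch.Tensor,
                   num_subteams: int) -> torch.Tensor:
    # agent_utilities: (batch, num_agents) chosen-action utilities per agent.
    # assignment: (num_agents,) int64 sub-team index per agent; sub-teams may
    # differ in size, hence "variable agent sub-teams".
    batch = agent_utilities.shape[0]
    values = agent_utilities.new_zeros(batch, num_subteams)
    index = assignment.expand(batch, -1)  # broadcast to (batch, num_agents)
    return values.scatter_add(1, index, agent_utilities)

# Example: 5 agents grouped into sub-teams of sizes 3 and 2.
utils = torch.randn(32, 5)                             # batch of 32 transitions
groups = torch.tensor([0, 0, 0, 1, 1])
v_sub = subteam_values(utils, groups, num_subteams=2)  # shape (32, 2)
v_joint = v_sub.sum(dim=1)  # simplest factorization operator (VDN-style sum)
```

A monotonic mixing network (as in QMIX) could replace the final sum; Section 6.1 of the paper compares different factorization operators in this role.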
Open Source Code: Yes
Evidence: "Code and README are available at https://github.com/thomyphan/scalable-marl."
Open Datasets: No
Evidence: Does not apply; only simulated data from the domains described in Section 5 was used. The paper describes custom-built grid-world environments (Warehouse[N], Battle[N], Gaussian Squeeze[N]) but provides no concrete access information for a publicly available dataset used for training.
Dataset Splits: No
Evidence: Appendix A.1.1 (General Training Details): "We applied early stopping to prevent overfitting and selected hyperparameters based on the validation performance." Validation is mentioned for early stopping and hyperparameter selection, but the paper does not specify concrete dataset splits (e.g., percentages or sample counts) for training, validation, and test sets. The experiments are conducted in simulated environments rather than on traditional datasets with predefined splits.
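The quoted passage names early stopping on validation performance but gives no further detail. A generic sketch of that pattern follows; `train_step`, `evaluate`, and the patience value are illustrative assumptions, not details from the paper.

```python
def train_with_early_stopping(train_step, evaluate, max_episodes, patience=10):
    # Stop training once the validation score has not improved for
    # `patience` consecutive evaluations. The patience value is assumed;
    # the paper only states that early stopping was applied.
    best_score, stale = float("-inf"), 0
    for _ in range(max_episodes):
        train_step()                # one training episode/update
        score = evaluate()          # scalar validation score, higher is better
        if score > best_score:
            best_score, stale = score, 0  # improvement: reset the counter
        else:
            stale += 1
            if stale >= patience:
                break
    return best_score
```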
Hardware Specification: Yes
Evidence: "All experiments ran on compute servers equipped with Intel Xeon E5-2630 v4 (10 cores), NVIDIA Quadro RTX 5000 (16 GB), and 256 GB RAM."
Software Dependencies: Yes
Evidence: "We used Python 3.8.5, PyTorch 1.8.0, and CUDA 11.1 for all experiments."
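For reproduction attempts, the reported versions can be verified up front. This is a small sanity-check sketch; the assertions merely mirror the versions quoted above, and `torch.version.cuda` is `None` on CPU-only builds.

```python
import sys
import torch

# Reported environment: Python 3.8.5, PyTorch 1.8.0, CUDA 11.1.
assert sys.version_info[:2] == (3, 8), "paper reports Python 3.8.5"
assert torch.__version__.startswith("1.8"), "paper reports PyTorch 1.8.0"
assert torch.version.cuda == "11.1", "paper reports CUDA 11.1"
```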
Experiment Setup: Yes
Evidence: "Further details on the training setup and the experiments are specified in Appendix A.1 and A.2." Appendix A.1.1 (General Training Details): "All parameters for the networks are initialized uniformly randomly within [−0.01, 0.01]. We used Adam optimizer [17] with a learning rate of 5e-4 and epsilon 1e-5. [...] We used a batch size of 32." Appendix A.1.2 (Domain-Specific Training Details): "Warehouse[N]: number of episodes 30000, Battle[N]: number of episodes 100000, Gaussian Squeeze[N]: number of episodes 50000."
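The quoted hyperparameters translate directly into a few lines of PyTorch. Below is a minimal sketch, assuming a hypothetical stand-in network (the paper's actual architectures are described in its appendix); only the initialization range, the Adam settings, and the batch size are taken from the quotes above.

```python
import torch
import torch.nn as nn

def init_uniform(module: nn.Module) -> None:
    # All parameters initialized uniformly at random within [-0.01, 0.01],
    # as stated in Appendix A.1.1.
    for p in module.parameters():
        nn.init.uniform_(p, -0.01, 0.01)

# Hypothetical stand-in network; the paper's architectures differ.
q_net = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 5))
init_uniform(q_net)

# Adam with learning rate 5e-4 and epsilon 1e-5, batch size 32 (Appendix A.1.1).
optimizer = torch.optim.Adam(q_net.parameters(), lr=5e-4, eps=1e-5)
BATCH_SIZE = 32
```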