Adversarial Counterfactual Environment Model Learning
Authors: Xiong-Hui Chen, Yang Yu, Zhengmao Zhu, ZhiHua Yu, Chen Zhenjun, Chenghe Wang, Yinan Wu, Rong-Jun Qin, Hongqiu Wu, Ruijin Ding, Huang Fangsheng
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments are conducted on two synthetic tasks, three continuous-control tasks, and a real-world application. We first verify that GALILEO makes accurate predictions on counterfactual data queried by other policies, compared with baselines. |
| Researcher Affiliation | Collaboration | (1) National Key Laboratory for Novel Software Technology, Nanjing University; (2) School of Artificial Intelligence, Nanjing University; (3) Meituan; (4) Polixir.ai; (5) Tsinghua University |
| Pseudocode | Yes | Algorithm 1 Pseudocode for GALILEO |
| Open Source Code | Yes | Code: https://github.com/xionghuichen/galileo |
| Open Datasets | Yes | We select 3 MuJoCo environments from D4RL [17] to construct our model learning tasks. The Cancer Genome Atlas (TCGA) is a project that has profiled and analyzed large numbers of human tumors to discover molecular aberrations at the DNA, RNA, protein, and epigenetic levels. |
| Dataset Splits | Yes | In the D4RL benchmark, only the medium tasks are collected with a fixed policy... so we train models on the HalfCheetah-medium, Walker2d-medium, and Hopper-medium datasets. GALILEO is also deployed in a real-world large-scale food-delivery platform. |
| Hardware Specification | Yes | We use one Tesla V100 PCIe 32GB GPU and a 32-core Intel(R) Xeon(R) Gold 5118 CPU @ 2.30GHz to train all of our models. |
| Software Dependencies | No | The paper mentions optimization algorithms such as TRPO and PPO, but does not list specific software dependencies with version numbers (e.g., PyTorch 1.9, scikit-learn 0.24). |
| Experiment Setup | Yes | Table 6: Table of hyper-parameters for all of the tasks. This includes specific values for hidden layers, hidden units, batch size, learning rate, and other training parameters. |