Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Adversarial Counterfactual Environment Model Learning
Authors: Xiong-Hui Chen, Yang Yu, Zhengmao Zhu, ZhiHua Yu, Chen Zhenjun, Chenghe Wang, Yinan Wu, Rong-Jun Qin, Hongqiu Wu, Ruijin Ding, Huang Fangsheng
NeurIPS 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments are conducted in two synthetic tasks, three continuous-control tasks, and a real-world application. We first verify that GALILEO can make accurate predictions on counterfactual data queried by other policies compared with baselines. |
| Researcher Affiliation | Collaboration | 1 National Key Laboratory for Novel Software Technology, Nanjing University 2 School of Artificial Intelligence, Nanjing University, 3 Meituan, 4 Polixir.ai, 5 Tsinghua University |
| Pseudocode | Yes | Algorithm 1 Pseudocode for GALILEO |
| Open Source Code | Yes | Code: https://github.com/xionghuichen/galileo |
| Open Datasets | Yes | We select 3 MuJoCo environments from D4RL [17] to construct our model learning tasks. The Cancer Genomic Atlas (TCGA) is a project that has profiled and analyzed large numbers of human tumors to discover molecular aberrations at the DNA, RNA, protein, and epigenetic levels. |
| Dataset Splits | Yes | In D4RL benchmark, only the medium tasks is collected with a fixed policy... So we train models in datasets HalfCheetah-medium, Walker2d-medium, and Hopper-medium. A Real-world Large-scale Food-delivery Platform: We finally deploy GALILEO in a real-world large-scale food-delivery platform. |
| Hardware Specification | Yes | We use one Tesla V100 PCIe 32GB GPU and a 32-core Intel(R) Xeon(R) Gold 5118 CPU @ 2.30GHz to train all of our models. |
| Software Dependencies | No | The paper mentions optimization algorithms like TRPO and PPO, but does not provide specific software dependencies with version numbers (e.g., library names and their versions like PyTorch 1.9, scikit-learn 0.24). |
| Experiment Setup | Yes | Table 6: Table of hyper-parameters for all of the tasks. This includes specific values for hidden layers, hidden units, batch size, learning rate, and other training parameters. |