Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
An Enhanced Advising Model in Teacher-Student Framework using State Categorization
Authors: Daksh Anand, Vaibhav Gupta, Praveen Paruchuri, Balaraman Ravindran | Pages 6653-6660
AAAI 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the robustness of our approach by showcasing our experiments on multiple Atari 2600 games using a fixed set of hyper-parameters. |
| Researcher Affiliation | Academia | 1 Machine Learning Lab, IIIT Hyderabad 2 Robert Bosch Center for Data Science and AI, IIT Madras |
| Pseudocode | Yes | The complete algorithm is given in Algorithm 1. The routine getAction(s) returns the action to be taken by the student in the state s. |
| Open Source Code | No | No explicit statement or link for open-source code release is provided in the paper. |
| Open Datasets | Yes | We demonstrate the performance of our approach on three domains from the Arcade Learning Environment (Bellemare et al. 2013), namely Qbert, Boxing and Seaquest. |
| Dataset Splits | No | The paper describes training and testing epochs for evaluating performance but does not specify explicit training/validation/test dataset splits as commonly found in supervised learning. |
| Hardware Specification | No | No specific hardware details (e.g., CPU, GPU models, or cloud instance types) used for experiments are provided in the paper. |
| Software Dependencies | No | The paper mentions algorithms and architectures like Double-DQN and DQN, but does not specify software packages with version numbers (e.g., PyTorch, TensorFlow, or specific libraries). |
| Experiment Setup | Yes | The values of advice ratio α and the batch size were fixed to 0.01 and 8 respectively for this experiment. For all the games, we fix γ to 0.99. All the agents were trained for 30 million steps with the size of each training epoch being 40k steps. |
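
The values quoted in the Experiment Setup row can be collected into a small configuration object, which also makes the implied number of training epochs easy to check. This is a minimal sketch, not the authors' code: the class and field names are hypothetical, the reading of γ as the discount factor is an assumption, and the epoch count is derived here rather than stated in the source. The advice ratio and batch size are reported as fixed "for this experiment", so they may not apply to every experiment in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentSetup:
    """Values quoted in the Experiment Setup row; all field names are hypothetical."""

    advice_ratio: float = 0.01       # α, as reported for one experiment
    batch_size: int = 8              # as reported for the same experiment
    gamma: float = 0.99              # γ, assumed here to be the discount factor
    total_steps: int = 30_000_000    # 30 million training steps per agent
    steps_per_epoch: int = 40_000    # 40k steps per training epoch

    @property
    def num_epochs(self) -> int:
        # Derived value (not stated in the source): total steps / epoch size.
        return self.total_steps // self.steps_per_epoch


setup = ExperimentSetup()
print(setup.num_epochs)  # 750
```

Under these reported values, each agent's 30 million training steps correspond to 750 training epochs of 40k steps.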