An Enhanced Advising Model in Teacher-Student Framework using State Categorization
Authors: Daksh Anand, Vaibhav Gupta, Praveen Paruchuri, Balaraman Ravindran
AAAI 2021, pp. 6653-6660
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the robustness of our approach by showcasing our experiments on multiple Atari 2600 games using a fixed set of hyper-parameters. |
| Researcher Affiliation | Academia | 1 Machine Learning Lab, IIIT Hyderabad; 2 Robert Bosch Center for Data Science and AI, IIT Madras |
| Pseudocode | Yes | The complete algorithm is given in Algorithm 1. The routine getAction(s) returns the action to be taken by the student in the state s. (A hedged sketch of a generic advising routine appears below the table.) |
| Open Source Code | No | No explicit statement or link for open-source code release is provided in the paper. |
| Open Datasets | Yes | We demonstrate the performance of our approach on three domains from the Arcade Learning Environment (Bellemare et al. 2013), namely Qbert, Boxing and Seaquest. |
| Dataset Splits | No | The paper describes training and testing epochs for evaluating performance but does not specify explicit training/validation/test dataset splits as commonly found in supervised learning. |
| Hardware Specification | No | No specific hardware details (e.g., CPU, GPU models, or cloud instance types) used for experiments are provided in the paper. |
| Software Dependencies | No | The paper mentions algorithms and architectures like Double-DQN and DQN, but does not specify software packages with version numbers (e.g., PyTorch, TensorFlow, or specific libraries). |
| Experiment Setup | Yes | The values of advice ratio α and the batch size were fixed to 0.01 and 8, respectively, for this experiment. For all the games, we fix γ to 0.99. All the agents were trained for 30 million steps with the size of each training epoch being 40k steps. (See the configuration sketch below the table.) |
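The Pseudocode row refers to a getAction(s) routine. Since no code is released, the following is only a minimal sketch of a generic budgeted teacher-student action selector in the style of importance advising (Torrey and Taylor 2013), not the paper's Algorithm 1, which additionally categorizes states. The names `teacher_q`, `student_q`, `advice_budget`, `importance_threshold`, and `epsilon` are hypothetical stand-ins.

```python
import random

import numpy as np


class BudgetedAdvisingStudent:
    """Sketch of budgeted teacher-student action selection.

    Not the paper's Algorithm 1: the enhanced model additionally
    categorizes states before deciding whether to advise. All names
    here are illustrative.
    """

    def __init__(self, teacher_q, student_q, advice_budget,
                 importance_threshold, epsilon=0.05):
        self.teacher_q = teacher_q         # state -> array of teacher Q-values
        self.student_q = student_q         # state -> array of student Q-values
        self.budget = advice_budget        # remaining pieces of advice
        self.threshold = importance_threshold
        self.epsilon = epsilon

    def importance(self, state):
        # Importance-advising heuristic (Torrey and Taylor 2013):
        # the spread of the teacher's Q-values in this state.
        q = self.teacher_q(state)
        return float(np.max(q) - np.min(q))

    def get_action(self, state):
        """Return the action the student takes in state `state`."""
        if self.budget > 0 and self.importance(state) >= self.threshold:
            self.budget -= 1               # spend one unit of advice
            return int(np.argmax(self.teacher_q(state)))
        q = self.student_q(state)
        if random.random() < self.epsilon:  # epsilon-greedy exploration
            return random.randrange(len(q))
        return int(np.argmax(q))
```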
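For convenience, the hyper-parameters quoted in the Experiment Setup row can be collected into one configuration block. The key names below are ours, chosen for readability; the paper does not release code, so they match no official implementation.

```python
# Hyper-parameters quoted in the Experiment Setup row; key names are
# illustrative, since no official code accompanies the paper.
EXPERIMENT_CONFIG = {
    "advice_ratio_alpha": 0.01,          # advice ratio α
    "batch_size": 8,
    "gamma": 0.99,                       # discount factor, fixed for all games
    "total_training_steps": 30_000_000,  # 30 million steps per agent
    "steps_per_training_epoch": 40_000,  # 40k steps per training epoch
    "games": ["Qbert", "Boxing", "Seaquest"],  # ALE domains (Bellemare et al. 2013)
}
```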