Goal-Aware Cross-Entropy for Multi-Target Reinforcement Learning

Authors: Kibeom Kim, Min Whoo Lee, Yoonsung Kim, Je-Hwan Ryu, Minsu Lee, Byoung-Tak Zhang

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the proposed methods on visual navigation and robot arm manipulation tasks with multi-target environments and show that GDAN outperforms the state-of-the-art methods in terms of task success ratio, sample efficiency, and generalization. Additionally, qualitative analyses demonstrate that our proposed method can help the agent become aware of and focus on the given instruction clearly, promoting goal-directed behavior.
Researcher Affiliation | Collaboration | Kibeom Kim1,2, Min Whoo Lee1, Yoonsung Kim1, Je-Hwan Ryu1, Minsu Lee1,3, Byoung-Tak Zhang1,3; 1Seoul National University, 2Surromind, 3AIIS; {kbkim, mwlee, yskim, jhryu, mslee, btzhang}@bi.snu.ac.kr
Pseudocode | No | The paper refers to 'Appendix D for algorithm details' but does not include pseudocode or a clearly labeled algorithm block within the main text provided.
Open Source Code | Yes | Code available at https://github.com/kibeomKim/GACE-GDAN
Open Datasets | No | The paper mentions developing and making publicly available 'visual navigation and robot arm manipulation tasks as benchmarks' along with their implementation, which defines the environments used to generate data. However, it does not provide a concrete link, DOI, or citation for a pre-collected, static 'dataset'.
Dataset Splits | No | The paper mentions training, seen, and unseen environments for generalization evaluation, but it does not specify explicit training, validation, or test dataset splits (e.g., percentages or sample counts) for any dataset.
Hardware Specification | No | The paper does not specify any particular hardware components such as GPU models, CPU types, or cloud computing instance specifications used for running the experiments.
Software Dependencies | No | The paper mentions software components like A3C, Pixel-SAC, MuJoCo, ViZDoom, and LSTM, but it does not provide specific version numbers for any of them.
Experiment Setup | Yes | The rewards are set as r_success = 10, r_nongoal = 1, r_timeout = 0.1, r_step = 0.01. Full details of the environment are provided in Appendix C. ... We complete the training procedure by optimizing the overall loss L_total as the weighted sum of the two losses in Eq. 9. We focus on improving the policy for performing the main task and assign weight η to L_GACE for performing goal-aware representation learning for the feature extractor σ(·). ... All experiments are repeated five times.
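
The quoted setup combines the main task (policy) loss with the GACE auxiliary loss as a weighted sum, with weight η applied to L_GACE. Below is a minimal sketch of that combination, assuming a PyTorch implementation; the function name, the default value of eta, and the tensor shapes in the usage snippet are illustrative assumptions rather than the authors' code.

```python
# Minimal sketch (not the authors' implementation): L_total = L_task + eta * L_GACE,
# where L_GACE is a cross-entropy over goal/instruction classes predicted from the
# feature extractor's output. The default eta below is an arbitrary placeholder.
import torch
import torch.nn.functional as F


def total_loss(task_loss: torch.Tensor,
               goal_logits: torch.Tensor,
               goal_labels: torch.Tensor,
               eta: float = 0.1) -> torch.Tensor:
    # Goal-aware cross-entropy (GACE) auxiliary term.
    l_gace = F.cross_entropy(goal_logits, goal_labels)
    # Weighted sum of the two losses, following the description of Eq. 9.
    return task_loss + eta * l_gace


# Hypothetical usage: a batch of 8 feature vectors scored over 5 goal classes.
logits = torch.randn(8, 5)
labels = torch.randint(0, 5, (8,))
loss = total_loss(task_loss=torch.tensor(0.3), goal_logits=logits, goal_labels=labels)
```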