Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Enhancing User-Oriented Proactivity in Open-Domain Dialogues with Critic Guidance

Authors: Yufeng Wang, Jinwu Hu, Ziteng Huang, Kunyang Lin, Zitian Zhang, Peihao Chen, Yu Hu, Qianyue Wang, Zhuliang Yu, Bin Sun, Xiaofen Xing, Qingfang Zheng, Mingkui Tan

IJCAI 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The paper states: 'Experiments demonstrate that our proposed training method is applicable to different LLMs, improving user-oriented proactivity and attractiveness in open-domain dialogues. Code and appendix are available at github.com/wang678/LLM-UPC.' It contains a dedicated '5 Experiment' section presenting comparative results, ablation studies, and real-user evaluations with performance metrics (Tables 1 and 2, Figures 4 and 5).
Researcher Affiliation | Collaboration | The authors are affiliated with South China University of Technology (Academia), Peng Cheng Laboratory (Public Research), Pazhou Laboratory (Public Research), Tencent AI Lab (Industry), Tencent Robotics X Lab (Industry), Hong Kong Polytechnic University (Academia), and Hunan University (Academia). The mix of university and industry affiliations indicates a collaboration.
Pseudocode | Yes | The paper includes 'Algorithm 1 Dialogue corpus generation in iteration k' and 'Algorithm 2 Iterative Curriculum Learning', both clearly labeled algorithm blocks.
Open Source Code | Yes | Code and appendix are available at github.com/wang678/LLM-UPC.
Open Datasets | No | The paper states: 'Finally, we construct the ISCO-800, a dataset with 800 user backgrounds, to create diverse user agents.' and '3) Construction of a user background dataset ISCO-800.' Although a new dataset is constructed and described, no direct URL, DOI, or specific repository name for accessing the ISCO-800 dataset itself is provided, separate from the general code repository link.
Dataset Splits | Yes | The 800 user agents are divided into training, validation, and test sets (500, 100, and 200 users, respectively) for dialogue generation.
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, processors, or memory) used to run the experiments are mentioned; the paper only names the large language models used.
Software Dependencies | No | The paper mentions specific LLMs (e.g., Qwen1.5-32B-Chat, GPT-3.5, GPT-4) but does not list the software dependencies or library versions (e.g., Python, PyTorch, TensorFlow, CUDA) needed to replicate the experimental setup.
Experiment Setup | Yes | Appendix B gives the specific hyperparameters: 'In our experiment, α = 3, β = 2, R = 3, T = 5, and K = 4.' These parameters (α, β, maximum regeneration attempts R, dialogue turns T, and maximum number of iterations K) define the experimental setup.
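For convenience, the reported values can be collected into a small configuration sketch. The variable names below are hypothetical (the paper does not define them in code); the authors' repository at github.com/wang678/LLM-UPC should be treated as authoritative.

```python
# Hypothetical configuration mirroring the values reported in the paper's
# Appendix B and the dataset split noted above; names are illustrative,
# not taken from the authors' code.
CONFIG = {
    "alpha": 3,              # α (Appendix B)
    "beta": 2,               # β (Appendix B)
    "max_regenerations": 3,  # R, maximum regeneration attempts
    "dialogue_turns": 5,     # T, dialogue turns per conversation
    "max_iterations": 4,     # K, maximum number of iterations
}

# ISCO-800 user-agent split: 500 train / 100 validation / 200 test
SPLIT = {"train": 500, "val": 100, "test": 200}
assert sum(SPLIT.values()) == 800  # matches the 800 user backgrounds
```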