Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Enhancing User-Oriented Proactivity in Open-Domain Dialogues with Critic Guidance
Authors: Yufeng Wang, Jinwu Hu, Ziteng Huang, Kunyang Lin, Zitian Zhang, Peihao Chen, Yu Hu, Qianyue Wang, Zhuliang Yu, Bin Sun, Xiaofen Xing, Qingfang Zheng, Mingkui Tan
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments demonstrate that our proposed training method is applicable to different LLMs, improving user-oriented proactivity and attractiveness in open-domain dialogues. Code and appendix are available at github.com/wang678/LLM-UPC. The paper contains a dedicated '5 Experiment' section, presenting comparative results, ablation studies, and real-user evaluations with performance metrics (Tables 1, 2, and Figures 4, 5). |
| Researcher Affiliation | Collaboration | The authors are affiliated with: 1South China University of Technology (Academia), 2Peng Cheng Laboratory (Public Research), 3Pazhou Laboratory (Public Research), 4Tencent AI Lab (Industry), 5Tencent Robotics X Lab (Industry), 6Hong Kong Polytechnic University (Academia), 7Hunan University (Academia). The mix of university and industry affiliations indicates a collaboration. |
| Pseudocode | Yes | The paper includes 'Algorithm 1 Dialogue corpus generation in iteration k.' and 'Algorithm 2 Iterative Curriculum Learning.' which are clearly labeled algorithm blocks. |
| Open Source Code | Yes | Code and appendix are available at github.com/wang678/LLM-UPC. |
| Open Datasets | No | The paper states: 'Finally, we construct the ISCO-800, a dataset with 800 user backgrounds, to create diverse user agents.' and '3) Construction of a user background dataset ISCO-800.' While a new dataset is constructed and described, no direct URL, DOI, or specific repository name for accessing the ISCO-800 dataset itself is provided beyond the general code repository link. |
| Dataset Splits | Yes | The 800 user agents are divided into training, validation, and test sets (500, 100, and 200 users, respectively) for dialogue generation. |
| Hardware Specification | No | No specific hardware details (such as GPU/CPU models or memory) used for running experiments are mentioned in the paper. It only references the large language models used. |
| Software Dependencies | No | The paper mentions using specific LLM models (e.g., Qwen1.5-32B-Chat, GPT-3.5, GPT-4) but does not provide specific software dependencies or library versions (e.g., Python, PyTorch, TensorFlow, CUDA versions) needed to replicate the experimental setup. |
| Experiment Setup | Yes | The paper provides specific hyperparameters in Appendix B: 'In our experiment, α = 3, β = 2, R = 3, T = 5, and K = 4.' These parameters (alpha, beta, maximum regeneration attempts R, dialogue turns T, and maximum number of iterations K) define the experimental setup. |
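For anyone attempting replication, the reported hyperparameters can be gathered into a single configuration object. This is a minimal sketch: the class and field names are illustrative and not taken from the authors' repository; only the values and their meanings come from Appendix B of the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    """Hyperparameters reported in Appendix B (field names are ours)."""
    alpha: int = 3        # α: weighting term
    beta: int = 2         # β: weighting term
    max_regen: int = 3    # R: maximum regeneration attempts
    turns: int = 5        # T: dialogue turns per conversation
    iterations: int = 4   # K: maximum number of curriculum-learning iterations


config = ExperimentConfig()
print(config)
```

Freezing the dataclass makes the reported settings immutable, so a replication run cannot silently drift from the published values.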