Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Call for Customized Conversation: Customized Conversation Grounding Persona and Knowledge
Authors: Yoonna Jang, Jungwoo Lim, Yuna Hur, Dongsuk Oh, Suhyune Son, Yeonsoo Lee, Donghoon Shin, Seungryong Kim, Heuiseok Lim
Pages: 10803-10812
AAAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To evaluate the abilities to make informative and customized utterances of pre-trained language models, we utilize BART and GPT-2 as well as transformer-based models. We assess their generation abilities with automatic scores and conduct human evaluations for qualitative results. |
| Researcher Affiliation | Collaboration | Yoonna Jang¹, Jungwoo Lim¹, Yuna Hur¹, Dongsuk Oh¹, Suhyune Son¹, Yeonsoo Lee², Donghoon Shin², Seungryong Kim¹, and Heuiseok Lim¹. ¹Department of Computer Science and Engineering, Korea University; ²Language AI Lab, NCSOFT |
| Pseudocode | No | The paper describes the model architecture and objective functions but does not include any pseudocode or algorithm blocks. |
| Open Source Code | Yes | In this work, we introduce a new dataset, call For Customized conversation dataset¹ (called FoCus), that supports knowledge-grounded answers that reflect user's persona. ¹http://github.com/pkchat-focus/FoCus |
| Open Datasets | Yes | In this work, we introduce a new dataset, call For Customized conversation dataset¹ (called FoCus), that supports knowledge-grounded answers that reflect user's persona. ¹http://github.com/pkchat-focus/FoCus |
| Dataset Splits | Yes | We split the collected data into train, valid and test sets. The detailed statistics of our dataset are summarized in Table 2. ... Table 2: Statistics of FoCus dataset. # Dialogs: Train 11,562; Valid 1,445; Test 1,445 |
| Hardware Specification | Yes | Fine-tuning them on the entire data with 2 epochs takes approximately 10 hours with one RTX-8000 GPU. |
| Software Dependencies | No | The paper states, "We implement the models based on the source code of Hugging Face's transformers (Wolf et al. 2020, 2019)," but does not provide specific version numbers for software dependencies like Hugging Face Transformers, PyTorch, TensorFlow, or Python. |
| Experiment Setup | Yes | We use a batch size of 4 with a gradient accumulation of 32. Adam optimizer is used, and the learning rate is set as 6.25e-5, where β1 = 0.9, β2 = 0.999 with linear decay. ... For the utterance generation, we use the nucleus sampling with top-p = 0.9 and sampling temperature with 0.7. The maximum sequence length is set to 20. |
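
The hyperparameters quoted in the Experiment Setup row can be made concrete with a small sketch. The code below is not the authors' implementation; it is a dependency-free illustration of what those numbers imply. All function names (`effective_batch_size`, `linear_decay_lr`, `nucleus_filter`, `sample_token`) are hypothetical, and the schedule assumes a plain linear decay to zero with no warmup, which the quoted text does not specify.

```python
import math
import random

# Values quoted in the Experiment Setup row of the table.
BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 32
BASE_LR = 6.25e-5
TOP_P = 0.9
TEMPERATURE = 0.7


def effective_batch_size(batch_size=BATCH_SIZE, accum_steps=GRAD_ACCUM_STEPS):
    """Number of examples contributing to one optimizer update."""
    return batch_size * accum_steps


def linear_decay_lr(step, total_steps, base_lr=BASE_LR):
    """Learning rate under linear decay from base_lr to 0.

    Assumption: no warmup phase; the source only says "linear decay".
    """
    remaining = max(0.0, 1.0 - step / total_steps)
    return base_lr * remaining


def nucleus_filter(logits, top_p=TOP_P, temperature=TEMPERATURE):
    """Return {token_index: probability} restricted to the nucleus.

    The nucleus is the smallest set of highest-probability tokens whose
    cumulative probability reaches top_p; probabilities are renormalized.
    """
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


def sample_token(logits, rng=random):
    """Draw one token index from the temperature-scaled nucleus."""
    dist = nucleus_filter(logits)
    indices = list(dist)
    weights = [dist[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]
```

With these settings, each optimizer update aggregates gradients from 4 × 32 = 128 examples. In a full generation loop, `sample_token` would be called once per position until an end-of-sequence token or the length cap is reached.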