Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Personalized Visual Content Generation in Conversational Systems
Authors: Xianquan Wang, Zhaocheng Du, Huibo Xu, Shukang Yin, Yupeng Han, Jieming Zhu, Kai Zhang, Qi Liu
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on benchmark conversational datasets including objective metrics and GPT-based evaluations demonstrate that our framework outperforms strong baselines, which highlight its potential to redefine personalization in visual content generation for conversational scenarios like e-commerce and real-world recommendation. |
| Researcher Affiliation | Collaboration | ¹University of Science and Technology of China, ²Huawei Noah's Ark Lab |
| Pseudocode | Yes | D Pseudo Code |
| Open Source Code | Yes | The code is publicly available at https://github.com/xqwustc/PCG. |
| Open Datasets | Yes | Following previous works of conversational recommender systems [23, 40], we conduct experiments on two conversational recommendation datasets set in movie scenarios. The two datasets are classic benchmarks for movie conversational recommender systems, containing many high-quality interactions with the systems. |
| Dataset Splits | Yes | For both datasets, the original data was randomly split into training, validation, and test sets with a ratio of 8:1:1. |
| Hardware Specification | Yes | using a single A100-80G GPU with a batch size of 1 |
| Software Dependencies | No | As mentioned in Section H, we use Qwen3-8B as the LLM to generate user inclinations and GPT-4o for evaluation. When fine-tuning PCG LoRA based on EasyControl, we strictly follow its recommended settings. |
| Experiment Setup | Yes | We generate outputs with the following parameters: a maximum of 128 new tokens, sampling enabled with a temperature of 0.7, top-p sampling with a probability of 0.8, top-k sampling with a limit of 20, and a minimum probability of 0.0. When fine-tuning PCG LoRA based on EasyControl, we strictly follow its recommended settings. The overall learning rate is set to 1 × 10⁻⁴ (based on the FLUX.1-dev pre-trained model, using a single A100-80G GPU with a batch size of 1). The two types of LoRAs in the two-stage training share the same learning rate. The optimizer used is AdamW with the parameters: β1 = 0.9, β2 = 0.999, weight decay = 1 × 10⁻⁴. The dimension of the low-rank matrices is set to 128. |
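
For readers who want to reuse the reported settings, the Dataset Splits and Experiment Setup rows can be collected into a small configuration sketch. This is not the authors' code: the class names, field names, and the `split_8_1_1` helper are hypothetical, and the random seed is an assumption because the quoted text does not report one. The field names loosely follow common keyword names for sampling-based decoding and AdamW-style optimizers, but the table does not say which library calls the authors used. The learning rate and weight decay are read here as 1e-4, the most plausible reading of the extracted values.

```python
import random
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class GenerationSettings:
    # Decoding parameters quoted in the Experiment Setup row.
    max_new_tokens: int = 128
    do_sample: bool = True
    temperature: float = 0.7
    top_p: float = 0.8
    top_k: int = 20
    min_p: float = 0.0


@dataclass(frozen=True)
class FineTuneSettings:
    # LoRA fine-tuning parameters quoted in the Experiment Setup row.
    # learning_rate and weight_decay are read as 1e-4 (assumption, see lead-in).
    learning_rate: float = 1e-4
    betas: tuple = (0.9, 0.999)
    weight_decay: float = 1e-4
    lora_rank: int = 128
    batch_size: int = 1


def split_8_1_1(n_items: int, seed: int = 0):
    """Shuffle indices and cut them into train/validation/test at 8:1:1.

    The seed is a hypothetical choice; the source does not report one.
    Any remainder from integer division goes to the test set.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_train = n_items * 8 // 10
    n_val = n_items // 10
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


if __name__ == "__main__":
    print(asdict(GenerationSettings()))
    print(asdict(FineTuneSettings()))
    train, val, test = split_8_1_1(1000)
    print(len(train), len(val), len(test))
```

Keeping the values in frozen dataclasses makes them easy to log alongside results and hard to mutate by accident. Mapping them onto a specific training or inference library is left open, since the paper defers the remaining fine-tuning details to EasyControl's recommended settings.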