Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Foundation Cures Personalization: Improving Personalized Models’ Prompt Consistency via Hidden Foundation Knowledge

Authors: Yiyang Cai, Zhengkai Jiang, Yulong Liu, Chunyang Jiang, Wei Xue, Yike Guo, Wenhan Luo

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Through comprehensive experiments, FreeCure has been verified to effectively restore a wide variety of misaligned attributes produced by various state-of-the-art personalization models. Comprehensive experiments show that our approach improves prompt consistency while maintaining the well-trained ability for identity preservation. 5 Experiments
Researcher Affiliation	Collaboration	Yiyang Cai1, Zhengkai Jiang2, Yulong Liu1, Chunyang Jiang1, Wei Xue1, Yike Guo1, Wenhan Luo1; 1 Hong Kong University of Science and Technology (HKUST); 2 Tencent Hunyuan
Pseudocode	No	The paper describes methods in text and mathematical formulas but does not include any explicit pseudocode or algorithm blocks.
Open Source Code	Yes	Our dataset consists of public part and self-collected part. We will release them together with codes. See Sec.5.
Open Datasets	Yes	We collect an extensive dataset including 50 identities, with 30 derived from the CelebA-HQ [32] and the other 20 non-celebrity identities curated by our team. For attribute segmentation, we leverage BiSeNet [66] and Segment-Anything [28] for different facial attributes.
Dataset Splits	No	We collect an extensive dataset including 50 identities, with 30 derived from the CelebA-HQ [32] and the other 20 non-celebrity identities curated by our team. Each identity is represented by a single image, with a spectrum of facial characteristics. The prompt set consists of 20 prompts containing different facial attributes. For each (identity, prompt) pair, we produce 20 images.
Hardware Specification	Yes	We run experiments on one H800 GPU (80GB).
Software Dependencies	No	We adopt CLIP-T [22] to calculate prompt consistency (PC). To calculate identity fidelity (IF), we use MTCNN [68] and FaceNet [48] to extract the embedding of the generated/reference faces and compute the cosine similarity. Following PhotoMaker, we adopt the face diversity (Face Div.) metric which calculates LPIPS [70] between facial areas. For attribute segmentation, we leverage BiSeNet [66] and Segment-Anything [28] for different facial attributes.
Experiment Setup	Yes	FreeCure Settings. We set ω in FASA to 2.0, to ensure that attribute information from FD can be sufficiently integrated into PD. The γ in APG is set to 0.5 to maintain the balance between identity fidelity and prompt consistency. Our supplemental ablation analysis indicates that the hyperparameters ω and γ consistently fall within specific ranges across different baselines: ω ∈ [1.8, 2.4], γ ∈ [0.5, 0.6].
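
The table quotes an identity fidelity (IF) metric computed as the cosine similarity between face embeddings, and an evaluation grid of 50 identities, 20 prompts, and 20 images per (identity, prompt) pair. The following is a minimal sketch of that arithmetic, not the authors' code. The function names are hypothetical, the embeddings are plain lists of floats standing in for the output of a face-embedding model, and averaging over generated images is an assumption about how per-image scores would be aggregated.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def identity_fidelity(reference, generated):
    """Mean cosine similarity of generated embeddings to one reference.

    Assumption: scores are averaged over images; the quoted text only
    says that cosine similarity is computed.
    """
    scores = [cosine_similarity(reference, g) for g in generated]
    return sum(scores) / len(scores)


# Evaluation grid size implied by the quoted dataset description.
NUM_IDENTITIES = 50
NUM_PROMPTS = 20
IMAGES_PER_PAIR = 20
total_images = NUM_IDENTITIES * NUM_PROMPTS * IMAGES_PER_PAIR

# Toy 2-D embeddings for illustration only (real embeddings are much longer).
example_score = identity_fidelity([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

Under these assumptions, a single evaluated configuration would produce 50 × 20 × 20 = 20,000 images if every (identity, prompt) pair is generated.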