Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026].

PolyVivid: Vivid Multi-Subject Video Generation with Cross-Modal Interaction and Enhancement

Authors: Teng Hu, Zhentao Yu, Zhengguang Zhou, Jiangning Zhang, Yuan Zhou, Qinglin Lu, Ran Yi

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments demonstrate that PolyVivid achieves superior performance in identity fidelity, video realism, and subject alignment, outperforming existing open-source and commercial baselines. We compared PolyVivid with the state-of-the arts video customization methods, including commercial products (Vidu-2.0 [45], Kling-1.6 [25], Pika [37], and Hailuo [13]) and open-sourced methods (Skyreels-A2 [11] and VACE-1.3B [24]). For each model, we generate 100 videos, which are employed to compute the quantitative metrics.
Researcher Affiliation	Collaboration	Teng Hu¹, Zhentao Yu², Zhengguang Zhou², Jiangning Zhang³, Yuan Zhou², Qinglin Lu², Ran Yi¹; ¹Shanghai Jiao Tong University, ²Tencent Hunyuan, ³Zhejiang University
Pseudocode	No	The paper describes methods and processes like 'Clique-based Subject Consolidation' and 'Attention-inherited Identity Injection' using prose and mathematical formulations (equations 2-10), but does not present any explicitly labeled pseudocode or algorithm blocks.
Open Source Code	No	Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [No] Justification: This project is a relatively large project that contains a lot of real name information. We will open-source the code after the paper is published.
Open Datasets	Yes	We curate a large set of high-quality data from open-source datasets, including Panda-70M [6] and Koala-36M [47], as well as our own collected data.
Dataset Splits	No	The paper mentions two training stages, each involving 5,000 iterations, for single-subject and multi-subject data. It also describes the creation of a 'Test Dataset' consisting of 100 image pairs. However, it does not provide specific train/validation/test splits (e.g., percentages or counts) for the overall datasets used to train the PolyVivid model (e.g., Panda-70M, Koala-36M).
Hardware Specification	Yes	All training processes are conducted on 256 GPUs, each with more than 80GB of memory, using a batch size of 256.
Software Dependencies	No	The paper mentions various models and tools used, such as LLaVA [31], HunyuanVideo [27], Florence2 [51], YOLOv11 [26], and DINO-v2 [35], but it does not specify explicit version numbers for programming languages, libraries, or frameworks used for the implementation (e.g., Python 3.x, PyTorch 1.x).
Experiment Setup	Yes	To enhance the efficiency of the training process, we divide it into two distinct stages. The first stage focuses on modeling the identity preservation capability... This stage involves 5,000 iterations. Once the model has effectively learned identity preservation, we proceed to the second stage... This stage also comprises 5,000 iterations. Additionally, ... in each stage, we initially train the model at reduced sizes for 1,000 iterations (included in the total 5,000 iterations)... All training processes are conducted on 256 GPUs, each with more than 80GB of memory, using a batch size of 256.
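
The iteration counts quoted in the Experiment Setup row can be tallied with a short sketch. This is illustrative bookkeeping only, not the authors' training code: the `Phase` and `build_schedule` names are hypothetical, and the per-GPU figure assumes that the quoted batch size of 256 is the global batch size across all 256 GPUs, which the quoted text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    """One contiguous block of training iterations (hypothetical helper)."""
    stage: int        # 1 = identity preservation, 2 = second stage
    resolution: str   # "reduced" for the initial low-size phase, else "full"
    iterations: int


def build_schedule(stages=2, iters_per_stage=5000, reduced_iters=1000):
    """Expand the quoted setup into phases.

    Each stage runs `iters_per_stage` iterations in total, of which the
    first `reduced_iters` are trained at reduced sizes.
    """
    schedule = []
    for stage in range(1, stages + 1):
        schedule.append(Phase(stage, "reduced", reduced_iters))
        schedule.append(Phase(stage, "full", iters_per_stage - reduced_iters))
    return schedule


schedule = build_schedule()
total_iterations = sum(phase.iterations for phase in schedule)

# Assumption: 256 is the global batch size, not the per-GPU batch size.
GLOBAL_BATCH_SIZE = 256
NUM_GPUS = 256
samples_per_gpu = GLOBAL_BATCH_SIZE // NUM_GPUS
samples_seen = total_iterations * GLOBAL_BATCH_SIZE

for phase in schedule:
    print(f"stage {phase.stage}: {phase.iterations} iterations at {phase.resolution} size")
print(f"total iterations: {total_iterations}")
print(f"samples per GPU per step (under the assumption above): {samples_per_gpu}")
print(f"samples seen (under the assumption above): {samples_seen}")
```

If the batch size were instead per GPU, the per-step and total sample counts would be 256 times larger; the paper excerpt does not settle which reading is intended.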