Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026].
GeoVideo: Introducing Geometric Regularization into Video Generation Model
Authors: Yunpeng Bai, Shaoheng Fang, Chaohui Yu, Fan Wang, Qixing Huang
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Experiments across multiple datasets show that our approach produces significantly more stable and geometrically consistent results than existing baselines." (Abstract); see also Section 4, "Experimental Results". |
| Researcher Affiliation | Collaboration | (1) The University of Texas at Austin; (2) DAMO Academy, Alibaba Group; (3) Hupan Lab |
| Pseudocode | No | No explicit pseudocode or algorithm block found in the paper. The methodology is described through text and mathematical equations (1-8). |
| Open Source Code | No | At this stage, we do not release the code but provide generated video results from our model. |
| Open Datasets | Yes | For static scenes, we train on the DL3DV-10K [33] dataset. For dynamic videos, we collect a large-scale dataset of approximately 200,000 videos from online sources such as Pexels [40]. |
| Dataset Splits | No | We use 544 and 1000 videos from the two datasets for evaluation, respectively. This statement only indicates the number of videos used for evaluation, not specific training/test/validation splits or their proportions. |
| Hardware Specification | Yes | We train the model with a learning rate of 2e-5 on 8 H20 GPUs for 20,000 steps. |
| Software Dependencies | No | The paper does not explicitly list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions, or other libraries/solvers with versions) required for reproduction. |
| Experiment Setup | Yes | We train the model with a learning rate of 2e-5 on 8 H20 GPUs for 20,000 steps. The batch size is 1 per GPU, with 15K steps for stage 1 and 5K steps for stage 2. The video resolution is set to 768 × 1360 with 81 frames, following the standard configuration supported by CogVideoX. ... $\lambda_{\text{depth}}(t) = \min(1.0,\ 0.1 + \alpha t)$ (6), where $t$ is the training step and $\alpha$ is set to 0.0001. ... The final training objective becomes: $\mathcal{L}_{\text{total}} = \mathcal{L}^{\text{RGB}}_{\text{diff}} + \lambda_{\text{depth}} \mathcal{L}^{D}_{\text{diff}} + \lambda_{\text{geo}} \mathcal{L}_{\text{geo}}$ (8). Here, $\lambda_{\text{geo}}$ is set to 0.5. |
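
The depth-weight schedule (Eq. 6) and combined objective (Eq. 8) quoted in the Experiment Setup row can be illustrated with a minimal Python sketch. This is not the authors' code, which is unreleased. The function names `lambda_depth` and `total_loss` are hypothetical, the loss terms are plain floats standing in for the actual diffusion and geometric losses, and the sketch assumes that the $\lambda_{\text{depth}}$ in Eq. 8 is the scheduled value from Eq. 6 and that `step` is a single running training-step counter.

```python
# Constants quoted in the Experiment Setup row (Eqs. 6 and 8).
ALPHA = 1e-4       # slope of the depth-weight ramp
LAMBDA_GEO = 0.5   # weight of the geometric regularization term


def lambda_depth(step: int, alpha: float = ALPHA) -> float:
    """Depth-loss weight from Eq. 6: min(1.0, 0.1 + alpha * step)."""
    return min(1.0, 0.1 + alpha * step)


def total_loss(l_rgb: float, l_depth: float, l_geo: float, step: int) -> float:
    """Combined objective from Eq. 8 (hypothetical helper).

    Assumes the depth weight follows the Eq. 6 schedule at this step.
    """
    return l_rgb + lambda_depth(step) * l_depth + LAMBDA_GEO * l_geo


if __name__ == "__main__":
    # Print the depth weight at a few illustrative steps.
    for step in (0, 4500, 9000, 20000):
        print(step, round(lambda_depth(step), 4))
```

With $\alpha = 0.0001$, the weight starts at 0.1 and reaches its cap of 1.0 at step 9,000, since $0.1 + 0.0001 \cdot 9000 = 1.0$; it stays at 1.0 for all later steps.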