Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
MVInpainter: Learning Multi-View Consistent Inpainting to Bridge 2D and 3D Editing
Authors: Chenjie Cao, Chaohui Yu, Fan Wang, Xiangyang Xue, Yanwei Fu
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Sufficient scene-level experiments on both object-centric and forward-facing datasets verify the effectiveness of MVInpainter, including diverse tasks, such as multiview object removal, synthesis, insertion, and replacement. |
| Researcher Affiliation | Collaboration | Chenjie Cao (1, 2, 3), Chaohui Yu (2, 3), Fan Wang (2, 3), Xiangyang Xue (1), Yanwei Fu (1); (1) Fudan University, (2) DAMO Academy, Alibaba Group, (3) Hupan Lab |
| Pseudocode | No | The paper includes figures illustrating the pipeline and components (e.g., Figure 2, Figure 3), but it does not contain a clearly labeled pseudocode or algorithm block. |
| Open Source Code | Yes | our codes will also be open-released. |
| Open Datasets | Yes | MVInpainter-O is trained on the object-centric data that includes full categories of CO3D [57] and MVImgNet [95]. Moreover, we regard the Omni3D [6] as the zero-shot validation. MVInpainter-F is trained on the forward-facing data with Real10K [103], Scannet++ [89], and DL3DV [41], including both indoor and outdoor scenes. We further employ comparison on SPInNeRF [51] to verify the object removal ability. |
| Dataset Splits | No | The paper mentions using 'zero-shot validation' for Omni3D and 'mixed scene-level validation' for Real10K, Scannet++, and DL3DV, and refers to test sets for these datasets (e.g., '10 scenes are selected from SPInNeRF [51] test set'), but it does not specify explicit numerical percentages or counts for training, validation, and test splits across all datasets used for training. |
| Hardware Specification | Yes | All trainings are accomplished on 8 A800 GPUs. |
| Software Dependencies | No | The paper mentions various software components and models used (e.g., 'SD1.5-inpainting', 'AnimateDiff', 'RAFT', 'SAM-tracking'), but it does not provide specific version numbers for any of them. |
| Experiment Setup | Yes | We train MVInpainter-O and MVInpainter-F for 100k and 60k steps with batch size 64, frame number 12, learning rate 1e-4 for 3 days and 2 days respectively. Then we fine-tune the model with dynamic frames for 10k steps. All images are resized and cropped into 256×256 for both inpainting and flow extraction. |
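
The Experiment Setup and Hardware Specification rows can be collected into a small configuration record, which makes the reported hyperparameters easier to compare across the two models. The following is a minimal sketch, not the authors' code: the class name `TrainConfig`, its field names, and the helper methods are hypothetical, and it assumes that the batch size of 64 counts multi-view sequences per optimizer step and that the 10k-step dynamic-frame fine-tuning applies to both models (the quoted text does not say either explicitly).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters as quoted in the Experiment Setup row.

    Field names are illustrative assumptions, not taken from the paper's code.
    """
    name: str
    steps: int                      # main training steps
    batch_size: int = 64            # assumed: sequences per optimizer step
    num_frames: int = 12            # frames per sequence
    learning_rate: float = 1e-4
    finetune_steps: int = 10_000    # dynamic-frame fine-tuning (assumed per model)
    image_size: int = 256           # images resized and cropped to 256x256
    num_gpus: int = 8               # "8 A800 GPUs"

    def sequences_seen(self) -> int:
        """Sequences processed during main training (steps * batch size)."""
        return self.steps * self.batch_size

    def frames_seen(self) -> int:
        """Individual frames processed during main training."""
        return self.sequences_seen() * self.num_frames


CONFIGS = {
    "MVInpainter-O": TrainConfig(name="MVInpainter-O", steps=100_000),
    "MVInpainter-F": TrainConfig(name="MVInpainter-F", steps=60_000),
}

if __name__ == "__main__":
    for cfg in CONFIGS.values():
        print(cfg.name, cfg.sequences_seen(), cfg.frames_seen())
```

Under these assumptions, main training covers 6.4 million sequences for MVInpainter-O and 3.84 million for MVInpainter-F. The record deliberately omits library versions, since the paper does not report them.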