Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Mitigating Occlusions in Virtual Try-On via A Simple-Yet-Effective Mask-Free Framework
Authors: Chenghu Du, Shengwu Xiong, Junyin Wang, Yi Rong, Shili Xiong
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on three VTON datasets validate the effectiveness and generalization ability of our method. Both qualitative and quantitative results demonstrate that our method outperforms recently proposed VTON benchmarks. |
| Researcher Affiliation | Academia | (1) School of Computer Science and Artificial Intelligence, Wuhan University of Technology; (2) Interdisciplinary Artificial Intelligence Research Institute, Wuhan College; (3) Shanghai Artificial Intelligence Laboratory |
| Pseudocode | Yes | Algorithm 1: Training and Inference Procedures |
| Open Source Code | Yes | We will release the code and data. |
| Open Datasets | Yes | We conduct experiments on three challenging datasets: VITON [20], VITON-HD [21], and Dress Code [22]. |
| Dataset Splits | Yes | VITON dataset contains 16,253 image groups, each with a resolution of 256 × 192. It is divided into a training set of 14,221 groups and a testing set of 2,032 groups. VITON-HD dataset, with a resolution of 512 × 384, comprises 13,679 image groups and is split into a training set of 11,647 groups and a testing set of 2,032 groups. Dress Code dataset, also with a resolution of 512 × 384, includes 15,363 image groups and is divided into a training set of 12,863 groups and a testing set of 2,500 groups. |
| Hardware Specification | Yes | It is fine-tuned for 100 epochs on six NVIDIA RTX 4090 GPUs under Ubuntu 22.04 LTS. |
| Software Dependencies | Yes | Our model is built on the Diffusers framework with Stable Diffusion v1.5 as the backbone and initialized from the official CatVTON checkpoint [9]. It is fine-tuned for 100 epochs on six NVIDIA RTX 4090 GPUs under Ubuntu 22.04 LTS. Training employs T = 1,000 denoising steps with a linear noise schedule, the AdamW [29] optimizer (β1 = 0.5, β2 = 0.999) in fp32 precision, a batch size of 8, and a learning rate of 1 × 10⁻⁵. |
| Experiment Setup | Yes | Training employs T = 1,000 denoising steps with a linear noise schedule, the AdamW [29] optimizer (β1 = 0.5, β2 = 0.999) in fp32 precision, a batch size of 8, and a learning rate of 1 × 10⁻⁵. |
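
For readers who want to reuse the reported numbers, the dataset splits and training settings quoted in the table can be collected into a small machine-readable record. The following is a minimal sketch, not the authors' configuration file: the names `DatasetSplit`, `SPLITS`, and `TRAINING_SETUP` are hypothetical, the values are copied from the table rows above, and details the table does not state (such as whether the batch size is per GPU or global, and the endpoints of the linear noise schedule) are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetSplit:
    """One dataset row as quoted in the table (hypothetical container)."""
    name: str
    resolution: tuple[int, int]  # as reported; axis order is not stated in the table
    total: int                   # total image groups
    train: int                   # training image groups
    test: int                    # testing image groups

    def is_consistent(self) -> bool:
        # The reported train and test counts should add up to the reported total.
        return self.train + self.test == self.total


# Values copied from the "Dataset Splits" row.
SPLITS = [
    DatasetSplit("VITON", (256, 192), 16_253, 14_221, 2_032),
    DatasetSplit("VITON-HD", (512, 384), 13_679, 11_647, 2_032),
    DatasetSplit("Dress Code", (512, 384), 15_363, 12_863, 2_500),
]

# Values copied from the "Hardware Specification", "Software Dependencies",
# and "Experiment Setup" rows. The key names are illustrative assumptions.
TRAINING_SETUP = {
    "backbone": "Stable Diffusion v1.5",
    "init_checkpoint": "CatVTON (official)",
    "epochs": 100,
    "gpus": "6x NVIDIA RTX 4090",
    "os": "Ubuntu 22.04 LTS",
    "denoising_steps": 1_000,
    "noise_schedule": "linear",
    "optimizer": "AdamW",
    "betas": (0.5, 0.999),
    "precision": "fp32",
    "batch_size": 8,
    "learning_rate": 1e-5,
}

if __name__ == "__main__":
    for split in SPLITS:
        status = "ok" if split.is_consistent() else "MISMATCH"
        print(f"{split.name}: {split.train} + {split.test} vs {split.total} ({status})")
```

Running the script prints one line per dataset and flags any split whose training and testing counts do not sum to the reported total.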