Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Mitigating Occlusions in Virtual Try-On via A Simple-Yet-Effective Mask-Free Framework

Authors: Chenghu Du, Shengwu Xiong, Junyin Wang, Yi Rong, Shili Xiong

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on three VTON datasets validate the effectiveness and generalization ability of our method. Both qualitative and quantitative results demonstrate that our method outperforms recently proposed VTON benchmarks.
Researcher Affiliation | Academia | (1) School of Computer Science and Artificial Intelligence, Wuhan University of Technology; (2) Interdisciplinary Artificial Intelligence Research Institute, Wuhan College; (3) Shanghai Artificial Intelligence Laboratory
Pseudocode | Yes | Algorithm 1: Training and Inference Procedures
Open Source Code | Yes | We will release of code and data.
Open Datasets | Yes | We conduct experiments on three challenging datasets: VITON [20], VITON-HD [21], and Dress Code [22].
Dataset Splits | Yes | VITON dataset contains 16,253 image groups, each with a resolution of 256 × 192. It is divided into a training set of 14,221 groups and a testing set of 2,032 groups. VITON-HD dataset, with a resolution of 512 × 384, comprises 13,679 image groups and is split into a training set of 11,647 groups and a testing set of 2,032 groups. Dress Code dataset, also with a resolution of 512 × 384, includes 15,363 image groups and is divided into a training set of 12,863 groups and a testing set of 2,500 groups.
Hardware Specification | Yes | It is fine-tuned for 100 epochs on six NVIDIA RTX 4090 GPUs under Ubuntu 22.04 LTS.
Software Dependencies | Yes | Our model is built on the Diffusers framework with Stable Diffusion v1.5 as the backbone and initialized from the official CatVTON checkpoint [9]. It is fine-tuned for 100 epochs on six NVIDIA RTX 4090 GPUs under Ubuntu 22.04 LTS. Training employs T = 1,000 denoising steps with a linear noise schedule, the AdamW [29] optimizer (β1 = 0.5, β2 = 0.999) in fp32 precision, a batch size of 8, and a learning rate of 1 × 10⁻⁵.
Experiment Setup | Yes | Training employs T = 1,000 denoising steps with a linear noise schedule, the AdamW [29] optimizer (β1 = 0.5, β2 = 0.999) in fp32 precision, a batch size of 8, and a learning rate of 1 × 10⁻⁵.