Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.

Vicinity-Guided Discriminative Latent Diffusion for Privacy-Preserving Domain Adaptation

Authors: Jing Wang, Wonho Bae, Jiahong Chen, Wenxu Wang, Junhyug Noh

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate DVD on three tasks: source-free domain adaptation (Section 4.1), supervised single-domain classification (Section 4.2), and domain generalization (Section 4.3). We then analyze hyperparameter sensitivity (Section 4.4) and runtime efficiency (Section 4.5). Additional results and ablation studies are provided in the Appendix. Results. Tables 1, 2, and 3 present adaptation results on VisDA-C 2017, Office-Home, and Office-31, respectively. Results. Table 4 presents the top-1 error (%). Results. Table 5 shows the classification accuracy (%) for the target domain. Figure 3 demonstrates that DVD's performance remains strong and stable across a wide range of settings, confirming that tuning for robust cross-domain classification is straightforward.
Researcher Affiliation	Collaboration	Jing Wang¹, Wonho Bae¹, Jiahong Chen³, Wenxu Wang⁴, Junhyug Noh². ¹University of British Columbia, ²Ewha Womans University, ³Amazon Inc., ⁴Ocean University of China.
Pseudocode	Yes	Algorithms 1 and 2 detail the training and sampling procedures. Algorithm 1: Latent Diffusion Training. Algorithm 2: Latent Diffusion Sampling.
Open Source Code	Yes	Code is available on our Github: https://github.com/JingWang18/DVD-SFDA.
Open Datasets	Yes	We conduct experiments on three widely recognized SFDA benchmarks: Office-31 [Saenko et al., 2010]: 4,652 images across 31 classes collected from three domains: Amazon (A), Webcam (W), and DSLR (D). Office-Home [Venkateswara et al., 2017]: 15,500 images in 65 classes from four domains: Artistic (Ar), Clipart (Cl), Product (Pr), and Real-World (Rw). VisDA-C 2017 [Peng et al., 2017]: 280,000 images spanning 12 classes, where the source domain is rendered via 3D models, and the target domain consists of real images captured by RGB cameras. Datasets. We evaluate DVD-based latent augmentation on three standard benchmarks (additional results on domain adaptation datasets are in Appendix C.3, demonstrating improved source-domain classification performance): CIFAR-10 and CIFAR-100 [Krizhevsky et al., 2009]: Each dataset has 60,000 images (50,000 for training, 10,000 for testing). CIFAR-10 contains 10 classes, and CIFAR-100 has 100 classes. ImageNet [Russakovsky et al., 2015]: The well-known ILSVRC 2012 benchmark with 1.2 million images in 1,000 object classes. Datasets. We evaluate DVD-generated features on four standard domain generalization benchmarks: PACS [Li et al., 2017]: 9,991 images across 7 classes, with each domain contributing roughly 2,000 images. VLCS [Fang et al., 2013]: 10,729 images spanning 5 categories, each domain providing about 2,000 to 3,000 images. Office-Home [Venkateswara et al., 2017]: The same 65-class dataset used in SFDA, consisting of four distinct domains. DomainNet [Li et al., 2017]: A large-scale benchmark with 586,575 images across 6 domains and 345 object categories.
Dataset Splits	Yes	CIFAR-10 and CIFAR-100 [Krizhevsky et al., 2009]: Each dataset has 60,000 images (50,000 for training, 10,000 for testing). CIFAR-10 contains 10 classes, and CIFAR-100 has 100 classes.
Hardware Specification	Yes	All timings were collected on an Nvidia V100 GPU and are reported as mean ± std over 5 runs using identical conditions.
Software Dependencies	No	ResNet-50 serves as the encoder G for Office-31 and Office-Home, while ResNet-101 is used for VisDA-C 2017 to align with existing SFDA baselines. A two-layer linear head F performs classification. We use a conditional UNet [Ho et al., 2020] for D, with 16 diffusion steps in both training and inference. An SGD optimizer (learning rate 3 × 10⁻³, momentum 0.9, batch size 128) is used for parameter updates. We employ the InfoNCE objective from SimCLR [Chen et al., 2020]
Experiment Setup	Yes	Experimental Setup. To simplify the implementation of DVD, we adopt a consistent framework for all benchmarks. ResNet-50 serves as the encoder G for Office-31 and Office-Home, while ResNet-101 is used for VisDA-C 2017 to align with existing SFDA baselines. A two-layer linear head F performs classification. We use a conditional UNet [Ho et al., 2020] for D, with 16 diffusion steps in both training and inference. An SGD optimizer (learning rate 3 × 10⁻³, momentum 0.9, batch size 128) is used for parameter updates. We employ the InfoNCE objective from SimCLR [Chen et al., 2020] with a temperature τ = 0.13. We define three parameters: k^dif_s and k^dif_t (number of neighbors used to parameterize the priors for DVD), and k_t (number of neighbors for preserving the local target feature structure). Unless otherwise stated, we set (k^dif_s, k^dif_t, k_t) = (15, 15, 6). Experimental Setup. We follow standard training protocols for supervised classification benchmarks. Specifically, we use SGD with momentum (0.9), a weight decay of 5 × 10⁻⁴, a mini-batch size of 128, and train for 200 epochs. The learning rate starts at 0.1 and follows a cosine annealing schedule [Loshchilov and Hutter, 2017]. DVD is trained with the same hyperparameters used in the SFDA experiments. At test time, for each data, we identify its latent k-NNs, generate an augmented feature via DVD, and pass it into the classifier.
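
Illustrative sketch (not taken from the paper or its repository): the Experiment Setup excerpt quotes a cosine-annealed learning rate starting at 0.1 over 200 epochs and an InfoNCE objective with temperature τ = 0.13. The standard-library Python below shows what those two settings could look like in isolation. The function names, the minimum learning rate of 0, the per-epoch (rather than per-step) schedule, and the single-anchor form of the loss are all assumptions made for illustration; the authors' actual implementation may differ.

```python
import math


def cosine_annealed_lr(epoch, total_epochs=200, base_lr=0.1, min_lr=0.0):
    """Learning rate at `epoch` under cosine annealing without restarts.

    base_lr=0.1 and total_epochs=200 come from the quoted setup;
    min_lr=0.0 is an assumption.
    """
    progress = epoch / total_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.13):
    """Single-anchor InfoNCE loss with cosine-similarity logits.

    temperature=0.13 comes from the quoted setup; how positives and
    negatives are batched in the paper is not stated in the excerpt.
    """
    pos_logit = cosine_similarity(anchor, positive) / temperature
    logits = [pos_logit] + [
        cosine_similarity(anchor, neg) / temperature for neg in negatives
    ]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(
        sum(math.exp(logit - max_logit) for logit in logits)
    )
    # Equivalent to -log(softmax probability of the positive pair).
    return log_sum_exp - pos_logit


if __name__ == "__main__":
    for epoch in (0, 50, 100, 150, 200):
        print(epoch, cosine_annealed_lr(epoch))
    print(info_nce([1.0, 0.0], [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]]))
```

In this sketch the learning rate falls from 0.1 at epoch 0 to the assumed minimum at epoch 200, and the loss shrinks as the anchor becomes more similar to its positive than to any negative.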