Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Unsupervised Modality Adaptation with Text-to-Image Diffusion Models for Semantic Segmentation
Authors: Ruihao Xia, Yu Liang, Peng-Tao Jiang, Hao Zhang, Bo Li, Yang Tang, Pan Zhou
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results demonstrate that MADM achieves state-of-the-art adaptation performance across various modality tasks, including images to depth, infrared, and event modalities. |
| Researcher Affiliation | Collaboration | Ruihao Xia (1, 2), Yu Liang (2), Peng-Tao Jiang (2), Hao Zhang (2), Bo Li (2), Yang Tang (1, 3), Pan Zhou (4). (1) East China University of Science and Technology, (2) vivo Mobile Communication Co., Ltd, (3) Peng Cheng Laboratory, (4) Singapore Management University |
| Pseudocode | No | The paper provides a framework diagram (Figure 2) but does not include a formal pseudocode or algorithm block. |
| Open Source Code | Yes | We open-source our code and models at https://github.com/Xia-Rho/MADM. |
| Open Datasets | Yes | In our experiments, we adopt the Cityscapes-Image [13] dataset as the source modality and the DELIVER-Depth [5], FMB-Infrared [6], and DSEC-Event [7] datasets as the target modalities. |
| Dataset Splits | Yes | Cityscapes [13] is the source dataset in our experiments... split into 2,975 training images and 500 validation images... DELIVER [5]... contains 3,983/2,005/1,897 samples for training/validation/testing... |
| Hardware Specification | Yes | Experiments are conducted on a NVIDIA H800 GPU, occupying about 57G memory. |
| Software Dependencies | No | The paper mentions using the Stable Diffusion v1-4 model and DAFormer components but does not specify version numbers for general software dependencies like Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | We train our MADM for 10k iterations with a batch size of 2 and an image resolution of 512 × 512. The optimization is instantiated with AdamW [45] with a learning rate of 5e-6. For hyperparameters β, γ, and λ_reg in DPLG and LPLR, we set them to {5000, 60, 1.0}/{8000, 50, 1.0}/{8000, 50, 10.0} for depth/infrared/event modalities, respectively. |
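
The experiment setup quoted in the last row can be read as a small configuration record. The sketch below is only one way to write those values down, assuming each triple is ordered as (β, γ, λ_reg) as the quoted sentence suggests. The names `MADMSetup`, `PER_MODALITY`, and `setup_for` are hypothetical and are not taken from the authors' repository.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MADMSetup:
    """Hypothetical container for the training values quoted in the table."""
    modality: str
    beta: float          # β used in DPLG/LPLR (assumed first value of each triple)
    gamma: float         # γ used in DPLG/LPLR (assumed second value)
    lambda_reg: float    # λ_reg used in DPLG/LPLR (assumed third value)
    iterations: int = 10_000
    batch_size: int = 2
    resolution: Tuple[int, int] = (512, 512)
    optimizer: str = "AdamW"
    learning_rate: float = 5e-6


# Per-modality (β, γ, λ_reg) triples as listed in the Experiment Setup row.
PER_MODALITY = {
    "depth": (5000, 60, 1.0),
    "infrared": (8000, 50, 1.0),
    "event": (8000, 50, 10.0),
}


def setup_for(modality: str) -> MADMSetup:
    """Return the quoted setup for one target modality."""
    if modality not in PER_MODALITY:
        raise ValueError(f"unknown modality: {modality!r}")
    beta, gamma, lambda_reg = PER_MODALITY[modality]
    return MADMSetup(modality, beta, gamma, lambda_reg)


if __name__ == "__main__":
    for name in PER_MODALITY:
        print(setup_for(name))
```

Keeping the shared values as defaults and only the three modality-specific numbers in a lookup table makes it easy to see which settings differ between depth, infrared, and event runs. Software versions (Python, PyTorch, CUDA) are deliberately absent, since the paper does not report them.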