Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

VQ-Seg: Vector-Quantized Token Perturbation for Semi-Supervised Medical Image Segmentation

Authors: Sicheng Yang, Zhaohu Xing, Lei Zhu

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments on the LC dataset and other public benchmarks demonstrate the effectiveness of our method, which outperforms state-of-the-art approaches. Extensive experiments conducted on our collected Lung Cancer (LC) dataset (comprising 828 annotated cases) and the open-source ACDC dataset demonstrate that VQ-Seg significantly outperforms state-of-the-art semi-supervised segmentation methods across key evaluation metrics, including Dice, Jaccard, HD95, and ASD. Detailed ablation studies further validate the efficacy and synergistic effects of the individual components of VQ-Seg.
Researcher Affiliation	Academia	(1) The Hong Kong University of Science and Technology (Guangzhou); (2) The Hong Kong University of Science and Technology
Pseudocode	No	The paper describes the methodology in prose and refers to figures, but does not contain a clearly labeled 'Pseudocode' or 'Algorithm' block.
Open Source Code	Yes	Codes will be released. [Footnote 1: https://github.com/script-Yang/VQ-Seg]
Open Datasets	Yes	Extensive experiments conducted on our collected Lung Cancer (LC) dataset (comprising 828 annotated cases) and the open-source ACDC dataset demonstrate that VQ-Seg significantly outperforms state-of-the-art semi-supervised segmentation methods... ACDC dataset. This dataset [43] is a cardiac MRI collection comprising 100 short-axis cine-MRI scans, acquired using both 3T and 1.5T scanners.
Dataset Splits	Yes	Following previous studies, we applied the 70/10/20 split ratio for training, validation, and testing on the LC dataset.
Hardware Specification	Yes	All experiments are conducted on a cloud computing platform equipped with four NVIDIA GeForce RTX 4090 GPUs.
Software Dependencies	Yes	Our model is implemented using PyTorch 2.0.1 with CUDA 11.8 and MONAI 1.3.0.
Experiment Setup	Yes	All 2D slices are cropped to 128 × 128 and used as input, with a batch size of 4 per GPU. The training process runs for 100 epochs, employing the cross-entropy loss and an SGD optimizer with a polynomial learning rate scheduler (initial learning rate of 1 × 10⁻⁴ and decay of 3 × 10⁻⁵). ... In our main experiments, we set the VQ codebook size to K = 16,384 ... α = 0.99 denotes the EMA decay rate.
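
The split ratio and learning rate schedule quoted in the table can be illustrated with a minimal Python sketch. It is not the authors' code, and several details are assumptions: the helper names `split_cases` and `poly_lr` are hypothetical, the split is assumed to be a seeded random shuffle at the case level, and the polynomial power of 0.9 is a common default that the quoted text does not state. The quoted "decay" value is not modeled here because the excerpt does not say what it applies to.

```python
import random


def split_cases(case_ids, ratios=(70, 10, 20), seed=0):
    """Shuffle case IDs and split them into train/val/test by integer percentages.

    Assumption: the split is random at the case level; the seed is illustrative.
    """
    if sum(ratios) != 100:
        raise ValueError("ratios must sum to 100")
    ids = list(case_ids)
    random.Random(seed).shuffle(ids)
    n_train = len(ids) * ratios[0] // 100
    n_val = len(ids) * ratios[1] // 100
    # Whatever remains after the train and validation cuts goes to the test set.
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]


def poly_lr(epoch, total_epochs=100, base_lr=1e-4, power=0.9):
    """Polynomial learning rate decay; power=0.9 is an assumed value."""
    return base_lr * (1 - epoch / total_epochs) ** power


# 828 annotated LC cases, as reported in the table above.
train, val, test = split_cases(range(828))
print(len(train), len(val), len(test))  # 579 82 167
```

Integer percentages are used for the cut points so the subset sizes do not depend on floating-point rounding; any remainder falls into the test set.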