Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

InfoSAM: Fine-Tuning the Segment Anything Model from An Information-Theoretic Perspective

Authors: Yuanhong Zhang, Muyao Yuan, Weizhan Zhang, Tieliang Gong, Wen Wen, Jiangyong Ying, Weijie Shi

ICML 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments across diverse benchmarks validate InfoSAM's effectiveness in improving the SAM family's performance on real-world tasks, demonstrating its adaptability and superiority in handling specialized scenarios.
Researcher Affiliation | Collaboration | (1) School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, China; (2) Ministry of Education Key Laboratory of Intelligent Networks and Network Security, Xi'an Jiaotong University, Xi'an, China; (3) Shaanxi Province Key Laboratory of Big Data Knowledge Engineering, Xi'an Jiaotong University, Xi'an, China; (4) China Telecom E-surfing Vision Technology Co., Ltd, Hangzhou, China; (5) School of Electrical Engineering, Xi'an Jiaotong University, Xi'an, China.
Pseudocode | Yes | Further details and PyTorch-style pseudocode for InfoSAM are provided in Appendices A.2 and A.3.
Open Source Code | Yes | The code and models are available at the InfoSAM project page.
Open Datasets | Yes | In the natural image domain, we focus on camouflaged object segmentation (Skurowski et al., 2018; Le et al., 2019; Fan et al., 2020a). For medical imaging, we investigate polyp segmentation (Bernal et al., 2015; Jha et al., 2020) and skin lesion segmentation (Codella et al., 2018). In agriculture and remote sensing, we use leaf disease segmentation (Rath, 2023) and road segmentation datasets (Mnih, 2013) as representative examples, respectively.
Dataset Splits | Yes | COD10K contains 3,040 training and 2,026 testing samples, CHAMELEON provides 76 testing images, and CAMO includes 1,000 training and 250 testing images. The combined dataset of COD10K and CAMO training images is used, with 10% randomly split for validation, and testing is performed on all three datasets. ... (Fan et al., 2020b) splits the images into a 9:1 ratio for training and testing. Additionally, 20% of the training set is randomly selected as a validation set for use during training. ... ISIC 2017 dataset (Codella et al., 2018) for skin lesion segmentation, which contains 2,000 images for training, 150 images for validation, and 600 images for testing. ... Leaf Disease Segmentation dataset (Rath, 2023), which includes 498 images for training and 90 for testing, with 20% of the training set randomly split for validation. ... Massachusetts Roads Dataset (Mnih, 2013), containing 1,107 images for training, 13 for validation, and 48 for testing.
Hardware Specification | No | No specific hardware details (such as GPU models or CPU types) are mentioned in the paper for running experiments.
Software Dependencies | No | No specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9) are mentioned in the paper.
Experiment Setup | Yes | We use a batch size of 4 and the Adam optimizer with an initial learning rate of 2×10⁻⁴, utilizing a Cosine Annealing scheduler that decays to a final learning rate of 2×10⁻⁵. All the methods are trained for 10 epochs with structure loss (i.e., the combination of weighted IoU loss and binary cross-entropy loss) unless otherwise specified. During training, prompts are randomly selected from noised ground-truth boxes and points at a 1:1 ratio.
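The learning-rate schedule quoted above can be sketched concretely. The following is a minimal sketch, not the authors' code: it assumes the standard cosine-annealing formula (as implemented by PyTorch's `CosineAnnealingLR`), and the helper name `cosine_annealed_lr` is hypothetical.

```python
import math

def cosine_annealed_lr(epoch, total_epochs=10, lr_init=2e-4, lr_final=2e-5):
    """Hypothetical helper mirroring the reported schedule: the learning rate
    follows a half cosine from 2e-4 down to 2e-5 over the 10 training epochs."""
    return lr_final + 0.5 * (lr_init - lr_final) * (1 + math.cos(math.pi * epoch / total_epochs))
```

At epoch 0 this returns the initial rate 2e-4 and at epoch 10 the final rate 2e-5, matching the quoted values; in a PyTorch training loop the equivalent would be `torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=10, eta_min=2e-5)` wrapped around an `Adam` optimizer with `lr=2e-4`.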