Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

MM-CamObj: A Comprehensive Multimodal Dataset for Camouflaged Object Scenarios

Authors: Jiacheng Ruan, Wenzhen Yuan, Zehao Lin, Ning Liao, Zhiyu Li, Feiyu Xiong, Ting Liu, Yuzhuo Fu

AAAI 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments are conducted on the CamObj-Bench with CamObj-LLaVA, 8 existing open-source and 3 closed-source LVLMs. Surprisingly, the results indicate that our model achieves a 25.84% improvement in 4 out of 7 tasks compared to GPT-4o.
Researcher Affiliation Academia 1) Shanghai Jiao Tong University, Shanghai, China; 2) Institute for Advanced Algorithms Research, Shanghai, China
Pseudocode No No structured pseudocode or algorithm blocks are present in the paper. Methodologies are described in narrative text.
Open Source Code Yes Code: https://github.com/JCruan519/MM-CamObj
Open Datasets Yes In our study, the images in the MM-CamObj dataset are sourced from publicly available camouflage scene understanding datasets. These datasets not only provide accurate category annotations but also include segmentation masks for camouflaged objects. Specifically, as shown in Table 1, we carefully select 11,963 camouflaged target images from (Pang et al. 2023; Fan et al. 2020; Cheng et al. 2022; Yang 2023; Zheng et al. 2018).
Dataset Splits Yes Of these, 11,363 images are used to construct CamObj-Align and CamObj-Instruct, while the remaining 600 images are utilized to construct CamObj-Bench... To validate the performance of our CamObj-LLaVA-7B with limited data, we randomly selected 10%, 20%, and 50% of the samples from the CamObj-Align and CamObj-Instruct datasets for training, while keeping the rest of the experimental settings consistent with those described in Sec. Training Details.
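The limited-data ablation above (training on random 10%, 20%, and 50% subsets) can be sketched as follows. This is a minimal illustration, not the authors' code: the sample IDs and the fixed seed are assumptions for reproducibility of the sketch itself.

```python
import random

def sample_subset(samples, fraction, seed=42):
    """Draw a random fraction of the training samples without replacement."""
    rng = random.Random(seed)  # fixed seed is illustrative, not from the paper
    k = int(len(samples) * fraction)
    return rng.sample(samples, k)

# Hypothetical IDs standing in for the 11,363 CamObj-Align/CamObj-Instruct samples.
dataset = [f"sample_{i}" for i in range(11363)]
subsets = {frac: sample_subset(dataset, frac) for frac in (0.1, 0.2, 0.5)}
for frac, subset in subsets.items():
    print(frac, len(subset))
```

Each subset is then used for training with all other settings unchanged, matching the ablation protocol quoted above.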
Hardware Specification Yes All experiments are conducted on 8 NVIDIA A800 GPUs.
Software Dependencies Yes We utilize BGE-M3 (Chen et al. 2023) and BGE-v1.5-en (Xiao et al. 2023) to obtain embeddings for the image and text, calculating the cosine similarity between them.
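The image-text matching step above reduces to a cosine similarity between two embedding vectors. A minimal sketch follows; the toy vectors are placeholders, whereas in the paper the embeddings come from BGE-M3 and BGE-v1.5-en.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-d embeddings standing in for BGE image/text embeddings.
image_emb = [0.2, 0.8, 0.1]
text_emb = [0.3, 0.7, 0.0]
print(cosine_similarity(image_emb, text_emb))
```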
Experiment Setup Yes During the alignment stage, we set the learning rate to 5e-4, and for the instruction fine-tuning stage, the learning rate is adjusted to 2.5e-4. Both stages are trained for 1 epoch using the AdamW (Loshchilov and Hutter 2017) optimizer, with a cosine decay strategy (Loshchilov and Hutter 2016) to dynamically adjust the learning rate. Specifically, we apply LoRA (Hu et al. 2021) modules to each linear layer of the large language model, with the rank r and scaling factor α set to 128 and 256, respectively.
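The cosine decay schedule referenced above can be written in closed form. This is a sketch under stated assumptions: the function name, step granularity, and a minimum learning rate of 0 are illustrative choices, not details from the paper.

```python
import math

def cosine_decay_lr(step, total_steps, base_lr, min_lr=0.0):
    """Cosine-annealed learning rate (Loshchilov and Hutter 2016)."""
    progress = step / max(1, total_steps)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))

# Alignment stage starts at 5e-4; instruction fine-tuning starts at 2.5e-4.
print(cosine_decay_lr(0, 1000, 5e-4))     # full base learning rate at step 0
print(cosine_decay_lr(1000, 1000, 5e-4))  # fully decayed at the final step
```

The rate starts at `base_lr`, reaches half of it at the schedule midpoint, and decays smoothly to `min_lr` by the end of the single training epoch.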