Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Generalized Debiased Semi-Supervised Hashing for Large-Scale Image Retrieval

Authors: Xingbo Liu, Xuening Zhang, Xiushan Nie, Yang Shi, Yilong Yin

AAAI 2025

Reproducibility variables, the classified result for each, and the supporting LLM response:

Research Type: Experimental
  "Experimental results on three single-label and three multi-label image benchmarks demonstrate that GDSH remarkably outperforms the state-of-the-arts in different semi-supervised settings."

Researcher Affiliation: Collaboration
  "Xingbo Liu (1), Xuening Zhang (2*), Xiushan Nie (1,3), Yang Shi (4), Yilong Yin (4). (1) School of Computer Science and Technology, Shandong Jianzhu University, Jinan 250101, China; (2) School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen 518055, China; (3) Shandong Yunhai Guochuang Cloud Computing Equipment Industry Innovation Co., Ltd, Jinan, China; (4) School of Software, Shandong University, Jinan 250101, China."

Pseudocode: No
  "The paper refers to 'Algorithm 1' in sections such as 'Out-of-sample Extension' and 'Theoretical Analysis', but the pseudocode or algorithm block for 'Algorithm 1' is not provided within the main text of the paper."

Open Source Code: No
  "The paper does not contain an explicit statement about releasing source code or a link to a code repository."

Open Datasets: Yes
  "To verify the superiority of the proposed method, we carried out experiments using six widely-used image benchmarks, including three single-label datasets, CALTECH-101 (Fei-Fei, Fergus, and Perona 2007), CIFAR-10 (Krizhevsky and Hinton 2009), and ImageNet, and three multi-label datasets, MS-COCO (Lin et al. 2014), NUS-WIDE (Chua et al. 2009), and MIRFlickr (Huiskes and Lew 2008)."

Dataset Splits: No
  "The paper mentions using '30% supervision' and that 'the labeled subsets for all six benchmarks were kept the same throughout the experiments', but it does not specify exact training/validation/test splits (e.g., percentages or counts) for the overall datasets."

Hardware Specification: Yes
  "All the experiments were conducted on a computer with an Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz, 64GB RAM and a 64-bit Windows operating system."

Software Dependencies: No
  "The paper mentions 'existing tools in MATLAB' but does not provide specific version numbers for any software dependencies used in the implementation."

Experiment Setup: Yes
  "For comparison with baselines, we empirically set β = 0.3, μ = 10^5, θ = 10^5, ρ = 10^4, α = 10^-6. The best choice of γ is equally set to 100 on CALTECH-101, CIFAR-10, MS-COCO, NUS-WIDE, and MIRFlickr, and 10^4 on ImageNet. k = 1 for single-label datasets, while k = 2 for multi-label datasets. The iteration numbers t and T are respectively set to 10 and 4. ... we finely tuned δ1 and δ2 via grid search, and use δ1 = 1 for all datasets. δ2 is set to 10^-6 for single-label datasets, and 10^-3 for multi-label datasets."
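For reuse, the reported settings can be collected into a small configuration helper. This is an illustrative sketch only: the function and key names are our own, and the negative exponents (e.g. α = 10^-6, δ2 = 10^-6 or 10^-3) reflect a reconstruction of values whose minus signs appear lost in extraction, not a confirmed reading of the paper.

```python
# Shared hyperparameters reported for GDSH (key names are illustrative;
# exponent signs are a reconstruction of the extracted values).
GDSH_PARAMS = {
    "beta": 0.3,
    "mu": 1e5,
    "theta": 1e5,
    "rho": 1e4,
    "alpha": 1e-6,
    "iterations_t": 10,   # inner iteration count t
    "iterations_T": 4,    # outer iteration count T
    "delta1": 1.0,
}

# Dataset- and label-type-dependent values from the quoted setup.
GAMMA = {
    "CALTECH-101": 100, "CIFAR-10": 100, "MS-COCO": 100,
    "NUS-WIDE": 100, "MIRFlickr": 100, "ImageNet": 1e4,
}
K_NEIGHBORS = {"single-label": 1, "multi-label": 2}
DELTA2 = {"single-label": 1e-6, "multi-label": 1e-3}


def params_for(dataset: str, label_type: str) -> dict:
    """Assemble the full hyperparameter setting for one benchmark."""
    params = dict(GDSH_PARAMS)  # copy so callers can mutate safely
    params["gamma"] = GAMMA[dataset]
    params["k"] = K_NEIGHBORS[label_type]
    params["delta2"] = DELTA2[label_type]
    return params
```

Keeping the dataset-dependent values in separate lookup tables makes it explicit which settings vary per benchmark (γ, k, δ2) and which are shared, mirroring how the paper states them.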