Group-Invariant Cross-Modal Subspace Learning
Authors: Jian Liang, Ran He, Zhenan Sun, Tieniu Tan
IJCAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on two benchmark datasets demonstrate that the proposed unsupervised algorithm even achieves comparable performance to some state-of-the-art supervised cross-modal algorithms. (Section 4, Experiments) |
| Researcher Affiliation | Academia | 1 National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences (CASIA) 2 Center for Research on Intelligent Perception and Computing, CASIA 3 Center for Excellence in Brain Science and Intelligence Technology, CAS |
| Pseudocode | Yes | Algorithm 1 Simultaneous Pairwise and Groupwise Correspondences Maximization (SPGCM) |
| Open Source Code | No | No explicit statement providing concrete access to source code for the methodology (e.g., a specific repository link, an explicit code release statement, or code in supplementary materials) was found. |
| Open Datasets | Yes | Experiments are conducted on the Wiki [Rasiwasia et al., 2010] and Pascal VOC [Hwang and Grauman, 2012] datasets. |
| Dataset Splits | Yes | In [Costa Pereira et al., 2014], the authors randomly split the whole set into 2,173/693 (training/testing) sets, which is adopted in the following experiments as Protocol I. However, taking the unbalanced distribution into consideration as in [Wang et al., 2013], we split it into 1,300/1,566 (130 pairs per class for training/testing) as Protocol II. For the Pascal VOC dataset, the split results in 2,808 training and 2,841 testing data. |
| Hardware Specification | No | No specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running experiments were mentioned. |
| Software Dependencies | No | No specific ancillary software details (e.g., library or solver names with version numbers) were found. The paper mentions Caffe as a tool for feature extraction, but without version information. |
| Experiment Setup | Yes | For our proposed SPGCM, we set the empirical value of β to 0.01, and a second trade-off parameter likewise to 0.01. Regarding the group size K, we directly fix it as the number of different ground-truth labels, i.e., K = 10 for the Wiki dataset and K = 20 for the VOC dataset. The subspace dimension c is validated for the best performance for all methods; we further investigate its influence in Section 4.4. Besides, for the initialization of F, we simply utilize the cluster indicator obtained by spherical K-means clustering on the text modality. |
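The paper initializes the group-indicator matrix F via spherical K-means on the text features. A minimal NumPy sketch of that initialization step is shown below; the function names and the plain random-sample centroid initialization are illustrative assumptions, not the authors' released code.

```python
import numpy as np

def spherical_kmeans(X, K, n_iter=50, seed=0):
    """Spherical K-means: cluster L2-normalized rows by cosine similarity.

    Illustrative sketch of the initialization described in the paper;
    details (iterations, centroid seeding) are assumptions.
    """
    rng = np.random.default_rng(seed)
    # Normalize samples to unit length so a dot product equals cosine similarity.
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    # Seed centroids with K distinct random samples (assumed initialization).
    C = Xn[rng.choice(len(Xn), K, replace=False)]
    for _ in range(n_iter):
        # Assign each sample to the centroid with highest cosine similarity.
        labels = (Xn @ C.T).argmax(axis=1)
        # Recompute each centroid as the normalized mean of its members.
        for k in range(K):
            members = Xn[labels == k]
            if len(members):
                c = members.sum(axis=0)
                C[k] = c / np.linalg.norm(c)
    return labels

def cluster_indicator(labels, K):
    """Binary indicator matrix F (n x K): F[i, k] = 1 iff sample i is in group k."""
    F = np.zeros((len(labels), K))
    F[np.arange(len(labels)), labels] = 1.0
    return F
```

For the Wiki dataset, one would run `spherical_kmeans(text_features, K=10)` and pass the resulting indicator matrix as the initial F.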