Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
MOGIC: Metadata-infused Oracle Guidance for Improved Extreme Classification
Authors: Suchith Chidananda Prabhu, Bhavyajeet Singh, Anshul Mittal, Siddarth Asokan, Shikhar Mohan, Deepak Saini, Yashoteja Prabhu, Lakshya Kumar, Jian Jiao, Amit S, Niket Tandon, Manish Gupta, Sumeet Agarwal, Manik Varma
ICML 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The MOGIC algorithm improves precision@1 and propensity-scored precision@1 of XC disciple models by 1-2% on six standard datasets, at no additional inference-time cost. We show that MOGIC can be used in a plug-and-play manner to enhance memory-free XC models such as NGAME or DEXA. Lastly, we demonstrate the robustness of the MOGIC algorithm to missing and noisy metadata. |
| Researcher Affiliation | Collaboration | 1Yardi School of Artificial Intelligence, IIT Delhi, India 2Microsoft Research, India 3Microsoft, India 4Microsoft, USA. Correspondence to: Suchith Chidananda Prabhu <EMAIL>. |
| Pseudocode | No | The paper describes the MOGIC framework and its two-phase training (oracle training and oracle-guided disciple training) in detail within Section 3. The overall architecture is visualized in Figure 1, Figure 3, and Figure 4. However, it does not contain any explicit block labeled "Pseudocode" or "Algorithm" with structured steps. |
| Open Source Code | Yes | The code is publicly available at https://github.com/suchith720/mogic. |
| Open Datasets | Yes | The XML Repository (Bhatia et al., 2016) provides various public XC datasets... The Wikipedia datasets were created from publicly available Wikipedia dumps 1 dated 2022-05-20. ... the Amazon datasets are created from publicly available Amazon Product review dumps2. 1https://dumps.wikimedia.org/enwiki// 2https://cseweb.ucsd.edu/~jmcauley/datasets.html#amazon_reviews |
| Dataset Splits | Yes | Table 9 in Appendix F.1 summarizes the dataset statistics. ... LF-Wiki See Also Titles-320K # Train Queries (Q) 693K # Test Queries 177K ... LF-Wiki Titles-500K # Train Queries (Q) 1.8M # Test Queries 783K ... LF-Amazon Titles-131K # Train Queries (Q) 294K # Test Queries 134K |
| Hardware Specification | Yes | All models were trained using the PyTorch library on a machine with 4 AMD MI200 GPUs. ... Training: All models were trained on AMD 4x MI200 GPUs. |
| Software Dependencies | No | The paper mentions using the "PyTorch library" and specific models like "DistilBERT", "LLaMA-2", and "Phi-2" along with "LoRA finetuning", but does not provide specific version numbers for any of these software components. |
| Experiment Setup | Yes | Table 10 in Appendix F.2 summarizes the various hyperparameters used for each dataset. We remark that MOGIC uses golden linkages for the metadata only at training time... Table 10: Hyper-parameter values for MOGIC on all datasets to enable reproducibility. Most hyperparameters were set to their default values across all datasets. LR is learning rate. Margin γ = 0.3 was used for contrastive loss. ... L_MOGIC = L_Disciple + α·L_Alignment + β·L_Matching, where α, β are tunable hyper-parameters and set to 1.0 and 0.1 respectively in our experiments. |
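The quoted loss combination can be sketched as a small function. This is a minimal illustration of the reported hyperparameter settings only; the function and argument names are hypothetical, and the individual loss terms (disciple, alignment, matching) are assumed to be computed elsewhere as in the paper.

```python
def mogic_loss(l_disciple, l_alignment, l_matching, alpha=1.0, beta=0.1):
    """L_MOGIC = L_Disciple + alpha * L_Alignment + beta * L_Matching,
    with alpha = 1.0 and beta = 0.1 as reported in the paper."""
    return l_disciple + alpha * l_alignment + beta * l_matching


def contrastive_term(pos_sim, neg_sim, gamma=0.3):
    """Illustrative margin-based contrastive penalty with the reported
    margin gamma = 0.3: max(0, gamma - s_pos + s_neg). The paper's exact
    contrastive formulation may differ."""
    return max(0.0, gamma - pos_sim + neg_sim)
```

With scalar stand-ins for the three loss terms, `mogic_loss(1.0, 2.0, 3.0)` yields `1.0 + 1.0*2.0 + 0.1*3.0 = 3.3`.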