Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026].
Mind the Gap: Cross-Lingual Information Retrieval with Hierarchical Knowledge Enhancement
Authors: Fuwei Zhang, Zhao Zhang, Xiang Ao, Dehong Gao, Fuzhen Zhuang, Yi Wei, Qing He
Pages: 4345-4353
AAAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, experimental results demonstrate that HIKE achieves substantial improvements over state-of-the-art competitors. |
| Researcher Affiliation | Collaboration | 1 Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China; 2 University of Chinese Academy of Sciences, Beijing 100049, China; 3 Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; 4 Institute of Intelligent Computing Technology, Suzhou, CAS; 5 Alibaba Group, Hangzhou, China; 6 Institute of Artificial Intelligence, Beihang University, Beijing 100191, China; 7 SKLSDE, School of Computer Science, Beihang University, Beijing 100191, China |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code. |
| Open Datasets | Yes | We evaluate the HIKE model in a public CLIR dataset CLIRMatrix (Sun and Duh 2020). Specifically, we use the MULTI-8 set in CLIRMatrix, in which queries and documents are jointly aligned in 8 different languages. |
| Dataset Splits | Yes | The training sets of every language pair contain 10,000 queries, while the validation and the test sets contain 1,000 queries. |
| Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments. |
| Software Dependencies | No | The paper mentions using 'multilingual BERT' and 'BERT-base, multilingual cased' but does not provide specific version numbers for software dependencies or frameworks. |
| Experiment Setup | Yes | In the training stage, the number of heads for the multi-head attention mechanism in knowledge-level fusion is set to 6. The learning rates are divided into two parts: the BERT lr1 and the other modules lr2. And we set lr1 to 1e-5 and lr2 to 1e-3. We set the number of neighboring entities in KG as 3. We randomly sample 1600 query-document pairs as our training data per epoch. The maximum training epochs are set to 15. |
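
The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration, which may help anyone attempting a reimplementation, since the paper's code is not available. The following is a minimal sketch in plain Python, not the authors' implementation. The names `HikeSetup`, `learning_rate_for`, and `sample_epoch`, the `"bert."` parameter-name prefix, sampling without replacement, the fixed seed, and the placeholder pair pool are all assumptions made for illustration; only the numeric values come from the quoted text.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class HikeSetup:
    """Hyperparameter values quoted in the Experiment Setup row."""
    attention_heads: int = 6      # multi-head attention in knowledge-level fusion
    lr_bert: float = 1e-5         # lr1: BERT parameters
    lr_other: float = 1e-3        # lr2: all other modules
    kg_neighbors: int = 3         # neighboring entities in the KG
    pairs_per_epoch: int = 1600   # query-document pairs sampled per epoch
    max_epochs: int = 15          # maximum number of training epochs


def learning_rate_for(param_name: str, setup: HikeSetup,
                      bert_prefix: str = "bert.") -> float:
    """Return lr1 for BERT parameters and lr2 for everything else.

    The "bert." prefix is an assumed naming convention, not from the paper.
    """
    if param_name.startswith(bert_prefix):
        return setup.lr_bert
    return setup.lr_other


def sample_epoch(pairs, setup: HikeSetup, rng: random.Random):
    """Draw one epoch of training pairs (without replacement: an assumption)."""
    return rng.sample(pairs, min(setup.pairs_per_epoch, len(pairs)))


setup = HikeSetup()
rng = random.Random(0)  # fixed seed, used here only to make the sketch repeatable

# Placeholder pool of (query_id, doc_id) pairs; real pairs would come from CLIRMatrix.
pool = [(q, d) for q in range(100) for d in range(20)]

for epoch in range(setup.max_epochs):
    batch = sample_epoch(pool, setup, rng)  # 1600 pairs for this epoch
```

In a deep learning framework, the two learning rates would typically map onto two optimizer parameter groups; the name-based split above is just one way to decide which group a parameter belongs to.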