BERT-PLI: Modeling Paragraph-Level Interactions for Legal Case Retrieval

Authors: Yunqiu Shao, Jiaxin Mao, Yiqun Liu, Weizhi Ma, Ken Satoh, Min Zhang, Shaoping Ma

IJCAI 2020

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We conduct extensive experiments on the benchmark of the relevant case retrieval task in COLIEE 2019. Experimental results demonstrate that our proposed method outperforms existing solutions." |
| Researcher Affiliation | Academia | BNRist, DCST, Tsinghua University, Beijing, China; National Institute of Informatics, Tokyo, Japan |
| Pseudocode | No | The paper describes the model architecture and steps in text and mathematical equations, but it does not include a dedicated pseudocode block or algorithm figure (a minimal sketch follows this table). |
| Open Source Code | Yes | The implementation of BERT-PLI is available at https://github.com/ThuYShao/BERT-PLI-IJCAI2020. |
| Open Datasets | Yes | "Our experiments are conducted based on the COLIEE 2019 datasets [Rabelo et al., 2019]." Table 1 of the paper gives a statistical summary of the raw train/test datasets for Task 1 and Task 2. |
| Dataset Splits | Yes | "The training set is split into two parts": 20% of the training queries, together with all of their candidates, are treated as the validation set (a query-level split sketch follows this table). |
| Hardware Specification | No | The paper states that experiments were conducted, but it does not specify any hardware, such as GPU models, CPU types, or memory sizes. |
| Software Dependencies | No | The paper mentions BERT, the Adam optimizer, and that ARC-II and MatchPyramid are implemented with MatchZoo, but it provides no version numbers for any software dependency (e.g., Python 3.x, PyTorch 1.x). |
| Experiment Setup | Yes | K = 50 (top 50 candidates per query); Adam optimizer with learning rate 10^-5 for BERT fine-tuning; N = 54 query paragraphs and M = 40 candidate paragraphs; H_B = 768 in BERT-PLI; H_R = 256 with a single hidden layer for both LSTM and GRU; the second stage is trained for no more than 60 epochs with an initial learning rate of 10^-4 and a weight decay of 10^-6 (see the sketch below). |