BERT-PLI: Modeling Paragraph-Level Interactions for Legal Case Retrieval

Authors: Yunqiu Shao, Jiaxin Mao, Yiqun Liu, Weizhi Ma, Ken Satoh, Min Zhang, Shaoping Ma

IJCAI 2020

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We conduct extensive experiments on the benchmark of the relevant case retrieval task in COLIEE 2019. Experimental results demonstrate that our proposed method outperforms existing solutions." |
| Researcher Affiliation | Academia | BNRist, DCST, Tsinghua University, Beijing, China; National Institute of Informatics, Tokyo, Japan |
| Pseudocode | No | The paper describes the model architecture and steps in text and mathematical equations, but it does not include a dedicated pseudocode block or algorithm figure (a minimal sketch follows this table). |
| Open Source Code | Yes | The implementation of BERT-PLI is available at https://github.com/ThuYShao/BERT-PLI-IJCAI2020. |
| Open Datasets | Yes | "Our experiments are conducted based on the COLIEE 2019 datasets [Rabelo et al., 2019]." Table 1 of the paper gives a statistical summary of the raw train/test datasets for Task 1 and Task 2. |
| Dataset Splits | Yes | "The training set is split into two parts": 20% of the training queries, together with all of their candidates, are treated as the validation set (a query-level split sketch follows this table). |
| Hardware Specification | No | The paper states that experiments were conducted, but it does not specify any hardware, such as GPU models, CPU types, or memory sizes. |
| Software Dependencies | No | The paper mentions BERT, the Adam optimizer, and that ARC-II and MatchPyramid are implemented with MatchZoo, but it provides no version numbers for any software dependency (e.g., Python 3.x, PyTorch 1.x). |
| Experiment Setup | Yes | K = 50 (top 50 candidates per query); Adam optimizer with learning rate 10^-5 for BERT fine-tuning; N = 54 query paragraphs and M = 40 candidate paragraphs; H_B = 768 in BERT-PLI; H_R = 256 with a single hidden layer for both LSTM and GRU; the second stage is trained for no more than 60 epochs with an initial learning rate of 10^-4 and a weight decay of 10^-6 (see the sketch below). |