Unsupervised Legal Evidence Retrieval via Contrastive Learning with Approximate Aggregated Positive

Authors: Feng Yao, Jingyuan Zhang, Yating Zhang, Xiaozhong Liu, Changlong Sun, Yun Liu, Weixing Shen

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We train our models on tens of thousands of unlabeled cases and evaluate them on a labeled dataset containing 919 cases and 4,336 queries. Experimental results indicate that our approach is effective and outperforms other state-of-the-art representation and retrieval models. The dataset and code are available at https://github.com/yaof20/LER.
Researcher Affiliation | Collaboration | Feng Yao (1), Jingyuan Zhang (2)*, Yating Zhang (2), Xiaozhong Liu (3), Changlong Sun (2), Yun Liu (1)*, Weixing Shen (1)*. (1) School of Law, Institute for AI and Law, Tsinghua University, Beijing, China; (2) DAMO Academy, Alibaba Group, Hangzhou, Zhejiang, China; (3) Worcester Polytechnic Institute, MA, USA
Pseudocode | No | The paper describes its methodology using equations and textual explanations, but it does not contain a clearly labeled 'Pseudocode' or 'Algorithm' block.
Open Source Code | Yes | The dataset and code are available at https://github.com/yaof20/LER. To motivate other scholars, the dataset and code are publicly available.
Open Datasets | Yes | To facilitate the research of LER, we propose a large-scale dataset named LERD, consisting of more than 300k fact queries and over 11 million query (fact) and candidate (evidence) pairs, within which 4,436 queries and their corresponding 234,693 candidates are annotated with relevance ranking scores. The dataset and code are available at https://github.com/yaof20/LER.
Dataset Splits | Yes | In the unsupervised setting, we use LERD-usp for training and split LERD-sup into validation and test sets for evaluation; for the supervised experiments, we split LERD-sup into train, validation, and test sets. The statistics of the data splits in both settings are shown in Table 2. (An illustrative split sketch follows the table.)
Hardware Specification | Yes | We train SWAP on 1 Tesla-A100 80G GPU with a batch size of 8 and optimize the model with AdamW with a learning rate of 1e-5, 10% steps for warmup and 5 epochs.
Software Dependencies | No | The paper mentions specific models like RoBERTa and optimizers like AdamW, but it does not specify the versions of general software dependencies such as Python, PyTorch, TensorFlow, or CUDA.
Experiment Setup | Yes | During the training stage for SWAP, we use case-level examples to retain the structure information of each case in the mini-batch. We randomly sample cases from the training data and set the maximum input length of facts and evidence to 128 tokens. ... We train SWAP on 1 Tesla-A100 80G GPU with a batch size of 8 and optimize the model with AdamW with a learning rate of 1e-5, 10% steps for warmup and 5 epochs. The temperature hyper-parameter τ is 0.1. The dense representations of the facts and evidence are obtained by the average pooling strategy. We use cosine similarity as the function to measure the similarity between the fact and evidence representations. (A minimal training sketch follows the table.)
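
The Dataset Splits row quotes the split protocol but not the ratios or file layout. The sketch below only illustrates that protocol: the JSONL file names, the split ratios, and the random seed are assumptions for illustration, and the actual splits are released with the LER repository (https://github.com/yaof20/LER).

```python
# Hypothetical sketch of the data-split protocol quoted above.
# File names, ratios, and seed are assumptions, not values from the paper.
import json
import random

def load_cases(path):
    """Load one JSON object (a case) per line."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f]

def split_cases(cases, ratios, seed=42):
    """Shuffle cases and partition them according to the given ratios."""
    rng = random.Random(seed)
    cases = cases[:]
    rng.shuffle(cases)
    splits, start = [], 0
    for r in ratios[:-1]:
        end = start + int(r * len(cases))
        splits.append(cases[start:end])
        start = end
    splits.append(cases[start:])
    return splits

# Unsupervised setting: train on unlabeled LERD-usp, evaluate on labeled LERD-sup.
usp_train = load_cases("LERD-usp.jsonl")
sup_valid, sup_test = split_cases(load_cases("LERD-sup.jsonl"), ratios=(0.5, 0.5))

# Supervised setting: LERD-sup itself supplies train, validation, and test sets.
sup_train, sup_valid, sup_test = split_cases(load_cases("LERD-sup.jsonl"), ratios=(0.8, 0.1, 0.1))
```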
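To make the quoted Experiment Setup concrete, here is a minimal PyTorch sketch of that configuration: masked average pooling, cosine similarity scaled by τ = 0.1, AdamW at 1e-5 with 10% warmup, batch size 8, 5 epochs, and 128-token inputs. The encoder checkpoint, the steps-per-epoch placeholder, and the plain in-batch InfoNCE objective are assumptions; this is not the paper's SWAP objective with approximate aggregated positives, whose reference implementation is at https://github.com/yaof20/LER.

```python
# Sketch of the quoted training configuration. The in-batch InfoNCE loss below
# is a generic stand-in for the paper's SWAP contrastive objective.
import torch
import torch.nn.functional as F
from torch.optim import AdamW
from transformers import AutoModel, AutoTokenizer, get_linear_schedule_with_warmup

MODEL_NAME = "hfl/chinese-roberta-wwm-ext"  # assumed encoder; the paper only names RoBERTa
device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME).to(device)

def encode(texts):
    """Encode texts and average-pool token embeddings into dense representations."""
    batch = tokenizer(texts, padding=True, truncation=True, max_length=128,
                      return_tensors="pt").to(device)
    hidden = encoder(**batch).last_hidden_state            # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1).float()   # (B, T, 1)
    return (hidden * mask).sum(1) / mask.sum(1)            # masked average pooling

def contrastive_loss(fact_emb, evid_emb, tau=0.1):
    """In-batch contrastive loss over temperature-scaled cosine similarities."""
    sims = F.cosine_similarity(fact_emb.unsqueeze(1), evid_emb.unsqueeze(0), dim=-1) / tau
    targets = torch.arange(sims.size(0), device=sims.device)
    return F.cross_entropy(sims, targets)

# Optimizer and schedule as quoted: AdamW, lr 1e-5, 10% warmup steps, 5 epochs.
EPOCHS, BATCH_SIZE = 5, 8
steps_per_epoch = 1000                                      # placeholder; depends on the dataset
total_steps = EPOCHS * steps_per_epoch
optimizer = AdamW(encoder.parameters(), lr=1e-5)
scheduler = get_linear_schedule_with_warmup(optimizer, int(0.1 * total_steps), total_steps)

for facts, evidences in []:                                 # replace with a DataLoader over case-level batches
    loss = contrastive_loss(encode(facts), encode(evidences))
    loss.backward()
    optimizer.step()
    scheduler.step()
    optimizer.zero_grad()
```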