Judgment Prediction via Injecting Legal Knowledge into Neural Networks

Authors: Leilei Gan, Kun Kuang, Yi Yang, Fei Wu (pp. 12866-12874)

AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We take the private loan scenario as a case study and demonstrate the effectiveness of the proposed method through comprehensive experiments and analyses conducted on the collected dataset. ... The effectiveness of the proposed method is evaluated through comprehensive experiments and analyses conducted on the collected datasets. ... In this section, we compare our method with other deep learning-based baselines on a collected private loan dataset, discussing the role legal knowledge plays in the performance. ... We use Macro F1 and Micro F1 (Mac.F1 and Mic.F1 for short) as the main metrics for algorithm evaluation.
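To make the evaluation metric concrete, here is a minimal sketch of computing Mac.F1 and Mic.F1 over the three claim outcomes; the use of scikit-learn and the 0/1/2 label encoding are assumptions, since the paper does not name its evaluation tooling.

```python
# Sketch of the Macro/Micro F1 evaluation quoted above.
# scikit-learn and the label encoding (0 = Reject, 1 = Partially Support,
# 2 = Support) are assumptions, not details taken from the paper.
from sklearn.metrics import f1_score

y_true = [2, 2, 1, 0, 2, 1]  # gold judgments for a handful of claims
y_pred = [2, 1, 1, 0, 2, 2]  # model predictions for the same claims

mac_f1 = f1_score(y_true, y_pred, average="macro")  # unweighted mean of per-class F1
mic_f1 = f1_score(y_true, y_pred, average="micro")  # F1 over all claims pooled together

print(f"Mac.F1 = {mac_f1:.4f}  Mic.F1 = {mic_f1:.4f}")
```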
Researcher Affiliation | Academia | Leilei Gan, Kun Kuang*, Yi Yang and Fei Wu*, College of Computer Science and Technology, Zhejiang University, China. {leileigan, kunkuang, yangyics, wufei}@zju.edu.cn
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks.
Open Source Code | Yes | Code and dataset will be publicly available at https://github.com/leileigan/Law Reasoning.
Open Datasets | Yes | In the experiments, we collected a total of 61,611 private loan law cases. Each instance in the dataset consists of a fact description and the plaintiff's multiple claims. ... To the best of our knowledge, this is the very first large private loan judgment prediction dataset. We will release all the experiment data to motivate other scholars to further investigate this problem. (A footnote gives the link: https://github.com/leileigan/Law Reasoning.)
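For illustration, a single instance as described above might look like the following; the field names and example strings are hypothetical, not the released schema.

```python
# Hypothetical layout of one dataset instance: a fact description paired with
# the plaintiff's claims and their judgments. Keys and label strings are
# assumptions; consult the released data for the actual format.
example_instance = {
    "fact": "The plaintiff lent the defendant 50,000 yuan, which was not repaid on time ...",
    "claims": [
        {"text": "Repay the principal of 50,000 yuan", "judgment": "Support"},
        {"text": "Pay interest at an annual rate of 24%", "judgment": "Partially Support"},
        {"text": "Bear all litigation costs", "judgment": "Reject"},
    ],
}
```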
Dataset Splits | Yes | Table 1 (Statistics of the private loan dataset):
Split | Support | Partially Support | Reject
Training Set | 70,386 | 18,921 | 6,438
Validation Set | 8,777 | 2,440 | 858
Test Set | 8,839 | 2,293 | 855
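The split statistics above are heavily skewed toward the Support class; a short snippet makes the per-split totals and label proportions explicit (the counts are copied from Table 1).

```python
# Per-split totals and label proportions, using the claim counts from Table 1.
splits = {
    "train": {"Support": 70386, "Partially Support": 18921, "Reject": 6438},
    "valid": {"Support": 8777, "Partially Support": 2440, "Reject": 858},
    "test":  {"Support": 8839, "Partially Support": 2293, "Reject": 855},
}

for name, counts in splits.items():
    total = sum(counts.values())
    ratios = {label: round(n / total, 3) for label, n in counts.items()}
    print(name, total, ratios)  # e.g. train 95745 {'Support': 0.735, ...}
```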
Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments (e.g., GPU/CPU models, memory).
Software Dependencies | No | The paper mentions methods such as the Skip-Gram model, Adam optimization, and BERT, but it does not specify the software libraries, frameworks, or version numbers used to implement them.
Experiment Setup | Yes | We use the Skip-Gram model (Mikolov et al. 2013) to train word embeddings on the judgment documents. The dimension of the word embeddings is set to 300. The size of the hidden states of the bidirectional LSTM is 256. The neural networks are trained with Adam optimization (Kingma and Ba 2014) using a learning rate of 0.001, and mini-batch gradient descent is performed with a batch size of 16. For BERT, the learning rate and batch size are set to 5e-6 and 1, respectively. ... An early stopping strategy is used: if the sum of Mac.F1 and Mic.F1 on the development set does not increase for ten epochs, training is terminated. Table 2 lists the model hyper-parameters:
Parameter | Value
Word emb size | 300
BERT emb size | 768
LSTM layer | 1
LSTM hidden | 256
Dropout | 0.2
Batch size | 16
Learning rate decay | 0.05
Early stopping (epochs) | 10
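As a rough reference, the quoted setup can be mirrored in a short PyTorch sketch; the model class, vocabulary size, and data pipeline are assumptions, and only the hyper-parameter values (300-dim embeddings, 256-unit BiLSTM, dropout 0.2, Adam with learning rate 0.001, batch size 16, patience of 10 epochs) come from the paper.

```python
# Minimal PyTorch sketch of the quoted training configuration. Only the
# hyper-parameter values are taken from the paper; everything else
# (class names, vocab size, the placeholder dev score) is illustrative.
import torch
import torch.nn as nn

class BiLSTMClassifier(nn.Module):
    def __init__(self, vocab_size, emb_dim=300, hidden=256, num_labels=3, dropout=0.2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)      # 300-dim word embeddings
        self.lstm = nn.LSTM(emb_dim, hidden, num_layers=1,
                            batch_first=True, bidirectional=True)
        self.drop = nn.Dropout(dropout)
        self.out = nn.Linear(2 * hidden, num_labels)      # Support / Partially Support / Reject

    def forward(self, token_ids):
        states, _ = self.lstm(self.emb(token_ids))
        return self.out(self.drop(states[:, -1]))         # last time step of the BiLSTM

model = BiLSTMClassifier(vocab_size=50_000)               # vocab size is an assumption
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3) # Adam, learning rate 0.001

best_score, patience, bad_epochs = 0.0, 10, 0
for epoch in range(200):
    # ... train one epoch with mini-batches of size 16 ...
    dev_score = 0.0  # placeholder for Mac.F1 + Mic.F1 on the development set
    if dev_score > best_score:
        best_score, bad_epochs = dev_score, 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break  # early stopping after 10 epochs without improvement
```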