GoLLIE: Annotation Guidelines improve Zero-Shot Information Extraction
Authors: Oscar Sainz, Iker García-Ferrero, Rodrigo Agerri, Oier Lopez de Lacalle, German Rigau, Eneko Agirre
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive evaluation empirically demonstrates that GoLLIE is able to generalize to and follow unseen guidelines, outperforming previous attempts at zero-shot information extraction. The ablation study shows that detailed guidelines are key for good results. |
| Researcher Affiliation | Academia | Oscar Sainz, Iker García-Ferrero, Rodrigo Agerri, Oier Lopez de Lacalle, German Rigau, Eneko Agirre; HiTZ Basque Center for Language Technology, Ixa NLP Group, University of the Basque Country (UPV/EHU). {oscar.sainz, iker.garciaf}@ehu.eus |
| Pseudocode | No | The paper includes Python code examples for its input/output representation (e.g., Figures 2, 3, 5, and 6), but these illustrate the data format rather than structured pseudocode or algorithm blocks. (A minimal sketch of this format appears after the table.) |
| Open Source Code | Yes | Code, data, and models are publicly available: https://github.com/hitz-zentroa/GoLLIE. |
| Open Datasets | Yes | Table 1: Datasets used in the experiments. The table shows the domain, tasks, and whether each is used for training, evaluation, or both. ACE05 (Walker et al., 2006) News ... CoNLL 2003 (Tjong Kim Sang & De Meulder, 2003) News |
| Dataset Splits | Yes | Regarding the splits, we use the standard train, dev, and test splits for every dataset. In the case of ACE, we follow the split provided by Lin et al. (2020). In the case of CASIE, we took the first 200 instances as validation and the last 2000 as test. (See the index-slicing sketch after the table.) |
| Hardware Specification | Yes | Our training infrastructure was 2 NVIDIA A100s with 80GB each. ... The QLoRA approach was trained using just one NVIDIA A100 80GB GPU thanks to the 4-bit quantization of the frozen model (Dettmers et al., 2023). Training the full model required a minimum of four NVIDIA A100 80GB GPUs to fit the model into memory. |
| Software Dependencies | No | The paper mentions using "QLoRA" and "DeepSpeed" but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | The models were trained for 3 epochs with an effective batch size of 32 and a learning rate of 3e-4 with a cosine scheduler. (A hedged sketch of this setup appears after the table.) |
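
For context on the Pseudocode row, the following is a minimal sketch of GoLLIE's Python-based input/output representation. The class names, guideline docstrings, and example text are illustrative paraphrases of the paper's figures, not its exact prompts.

```python
from dataclasses import dataclass

# Entity definitions: the class name acts as the label, and the docstring
# carries the annotation guideline the model is asked to follow.
# Names and guideline wording here are paraphrased for illustration.

@dataclass
class Person:
    """Each distinct person or set of people mentioned in the text,
    including references by name, nickname, or pronoun."""
    span: str  # the exact surface form found in the text

@dataclass
class Organization:
    """Companies, agencies, institutions, and other groups of people
    acting as a single entity."""
    span: str

text = "Bill Gates founded Microsoft in 1975."

# The model receives the class definitions plus the text and must complete
# a `result` list with instantiated annotations:
result = [
    Person(span="Bill Gates"),
    Organization(span="Microsoft"),
]
```

Per the ablation quoted in the Research Type row, the docstring guidelines, rather than bare label names, are what drive the zero-shot gains.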
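
The CASIE split quoted in the Dataset Splits row amounts to plain index slicing; a minimal sketch follows, with a dummy list standing in for the loaded instances (the paper names no specific loader, and the size here is hypothetical).

```python
# Dummy stand-in for the loaded CASIE instances, in corpus order.
instances = [{"id": i} for i in range(2500)]  # hypothetical size

dev = instances[:200]     # first 200 instances -> validation
test = instances[-2000:]  # last 2,000 instances -> test
# The quoted text does not state how the remaining instances are used.
```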
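
Finally, a hedged sketch of the reported training setup, combining the QLoRA detail from the Hardware row with the hyperparameters from the Experiment Setup row. The library stack (transformers, peft, bitsandbytes), the Code-LLaMA hub id, the LoRA rank/alpha, and the batch-size decomposition are assumptions; the report quotes no versions or PEFT hyperparameters.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model

# 4-bit quantization of the frozen base model, as in QLoRA (Dettmers et al., 2023).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Base checkpoint: GoLLIE builds on Code-LLaMA; the exact hub id is an assumption.
model = AutoModelForCausalLM.from_pretrained(
    "codellama/CodeLlama-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA adapter on top of the frozen 4-bit model; r and alpha are illustrative,
# not values quoted in the report.
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM"))

# Hyperparameters quoted in the Experiment Setup row: 3 epochs, effective batch
# size 32, lr 3e-4, cosine scheduler. 8 x 4 gradient accumulation is one way
# to reach the reported effective batch size on a single GPU.
args = TrainingArguments(
    output_dir="gollie-qlora",  # hypothetical path
    num_train_epochs=3,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,
    learning_rate=3e-4,
    lr_scheduler_type="cosine",
)
```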