Language Independent Feature Extractor

Authors: Young-Seob Jeong, Ho-Jin Choi

AAAI 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We performed time information extraction using LIFE with the dataset of TempEval-2. We used CRF++ library to extract TIMEX3 extents, and predict their types, with English and Korean datasets. For both languages, we observed that the features obtained from LIFE are better than the features of a base-line model, in which the base-line is the probabilistic model (Griffiths et al. 2004). The performance comparison for Korean is shown in Fig. 3. We argue that LIFE provides the most language independent features for the following three reasons: (1) it does not require any preprocessing (e.g., stop-word filtering, stemming, morpheme analysis), (2) it does not require labeled dataset, and (3) it generates the features which capture letter-level patterns using the BOLS scheme. We proved the usefulness of LIFE by experimental results of time information extraction.
Researcher Affiliation | Academia | Young-Seob Jeong and Ho-Jin Choi, Department of Computer Science, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon 305-701, South Korea; {pinode,hojinc}@kaist.ac.kr
Pseudocode | No | The paper describes the components and functions of LIFE but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | More detailed explanation can be found in https://sites.google.com/site/pinodewaider/home/life.
Open Datasets | No | We performed time information extraction using LIFE with the dataset of TempEval-2.
Dataset Splits | No | We performed time information extraction using LIFE with the dataset of TempEval-2.
Hardware Specification | No | No specific hardware details used for running experiments were mentioned in the paper.
Software Dependencies | No | We used CRF++ library to extract TIMEX3 extents.
Experiment Setup | No | We performed time information extraction using LIFE with the dataset of TempEval-2. We used CRF++ library to extract TIMEX3 extents, and predict their types, with English and Korean datasets.
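
The Experiment Setup entry says CRF++ was used to extract TIMEX3 extents and predict their types, but gives no input format or settings. As a minimal sketch of how such training data might be laid out, the Python below writes the one-token-per-line, whitespace-separated column layout that CRF++ reads, with the label in the last column and a blank line ending each sentence. The BIO labelling, the single placeholder feature column, the example sentence, and the helper names `bio_labels` and `to_crfpp` are all assumptions made for illustration; none of them is taken from the paper, and the sketch does not reproduce LIFE features.

```python
from typing import List, Tuple

# (start token index, end token index exclusive, TIMEX3 type such as DATE or TIME)
Span = Tuple[int, int, str]


def bio_labels(n_tokens: int, spans: List[Span]) -> List[str]:
    """Assign B-/I-/O labels to tokens from annotated TIMEX3 spans (assumed scheme)."""
    labels = ["O"] * n_tokens
    for start, end, timex_type in spans:
        labels[start] = f"B-{timex_type}"
        for i in range(start + 1, end):
            labels[i] = f"I-{timex_type}"
    return labels


def to_crfpp(tokens: List[str], features: List[str], spans: List[Span]) -> str:
    """Render one sentence as CRF++ columns: token, feature, label (label last)."""
    labels = bio_labels(len(tokens), spans)
    rows = [f"{tok}\t{feat}\t{lab}" for tok, feat, lab in zip(tokens, features, labels)]
    # A blank line marks the end of the sentence.
    return "\n".join(rows) + "\n\n"


# Hypothetical example sentence with one DATE expression covering "March 3".
tokens = ["We", "met", "on", "March", "3", "."]
features = ["f0", "f1", "f2", "f3", "f4", "f5"]  # placeholder feature values
spans = [(3, 5, "DATE")]

sentence_block = to_crfpp(tokens, features, spans)
print(sentence_block, end="")
```

In this layout, extent detection and type prediction are folded into a single label per token (for example `B-DATE`); whether the paper combined the two tasks this way or ran them separately is not stated in the entries above.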