Learning to Learn Morphological Inflection for Resource-Poor Languages
Authors: Katharina Kann, Samuel R. Bowman, Kyunghyun Cho (pp. 8058–8065)
AAAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments with two model architectures on 29 target languages from 3 families show that our suggested approach outperforms all baselines. In particular, it obtains a 31.7% higher absolute accuracy than a previously proposed cross-lingual transfer model and outperforms the previous state of the art by 1.7% absolute accuracy on average over languages. |
| Researcher Affiliation | Academia | Katharina Kann, Samuel R. Bowman, Kyunghyun Cho New York University, USA {kann, kyunghyun.cho, bowman}@nyu.edu |
| Pseudocode | Yes | Algorithm 1: First-order approximation of the MAML algorithm (a hedged sketch of a first-order MAML update follows the table). |
| Open Source Code | No | The paper refers to code for baseline models by other authors (e.g., 'We use the code and hyperparameters from Makarov and Clematide (2018a)' and 'We use the hyperparameters suggested by Sharma, Katrapati, and Sharma (2018)'). There is no explicit statement or link indicating that the authors' own source code for their proposed method is publicly available. |
| Open Datasets | Yes | To simplify comparison with other work, we experiment on a collection of datasets provided by Cotterell et al. (2018) for the 2018 edition of the CoNLL–SIGMORPHON shared task on morphological inflection. |
| Dataset Splits | Yes | For all considered resource-poor settings, we use their low datasets, which contain 100 examples each. For resource-rich languages, we take their high datasets, which contain 10,000 examples each. ... Portuguese (Romance), Macedonian (Slavic) and Finnish (Uralic) are our development languages, which we use for hyperparameter tuning, and all other languages in the dataset which belong to either the Romance, Slavic, or Uralic family are used for testing. |
| Hardware Specification | Yes | It further benefited from the donation of a Titan V GPU by NVIDIA Corporation. |
| Software Dependencies | No | The paper mentions optimizers (Adadelta, Adam) and models (MED, PG) but does not provide specific version numbers for any software dependencies or libraries (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | For all MED models, we use the hyperparameters suggested by Kann and Schütze (2016): In particular, the encoder and decoder hidden states are 100-dimensional, and embeddings are 300-dimensional. For training, we use Adadelta (Zeiler 2012) with a batch size of 20. ... Both our hidden states and embeddings are 100-dimensional. We use Adam (Kingma and Ba 2014) for training and dropout (Srivastava et al. 2014) with a probability parameter of 0.5. ... Multi-task training or training with MAML is carried out for 60 epochs for all model architectures. We fine-tune for at least 300 epochs... (A hedged configuration sketch of these reported settings follows the table.) |
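
The Pseudocode row cites Algorithm 1, a first-order approximation of MAML. The following is a minimal PyTorch-style sketch of one first-order MAML (FOMAML) meta-update under our own assumptions; `fomaml_step`, `loss_fn`, and the `(support_batch, query_batch)` task structure are illustrative names, not the paper's released implementation.

```python
# Minimal sketch of a first-order MAML (FOMAML) meta-update, assuming a PyTorch
# model and a user-supplied loss_fn(model, batch) -> scalar loss.
# All names here are hypothetical placeholders, not from the paper's code.
import copy
import torch


def fomaml_step(model, meta_optimizer, loss_fn, task_batches,
                inner_lr=0.01, inner_steps=1):
    """One meta-update over sampled tasks (e.g., resource-rich languages).

    task_batches: list of (support_batch, query_batch) pairs, one per task.
    """
    meta_optimizer.zero_grad()
    for support_batch, query_batch in task_batches:
        # Clone the current parameters and adapt the copy on the task's support set.
        task_model = copy.deepcopy(model)
        inner_opt = torch.optim.SGD(task_model.parameters(), lr=inner_lr)
        for _ in range(inner_steps):
            inner_opt.zero_grad()
            loss_fn(task_model, support_batch).backward()
            inner_opt.step()

        # First-order approximation: take the query-set gradient at the adapted
        # parameters and accumulate it on the original parameters, ignoring
        # second-order terms through the inner-loop update.
        loss_fn(task_model, query_batch).backward()
        for p, task_p in zip(model.parameters(), task_model.parameters()):
            if task_p.grad is None:
                continue
            p.grad = task_p.grad.clone() if p.grad is None else p.grad + task_p.grad

    # Average the accumulated task gradients before the meta-step.
    for p in model.parameters():
        if p.grad is not None:
            p.grad /= len(task_batches)
    meta_optimizer.step()
```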
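As a compact restatement of the Experiment Setup row, the configuration below collects the hyperparameters reported in the paper; the key names are our own shorthand and do not come from the paper or any released code.

```python
# Hedged summary of the reported hyperparameters as a plain configuration dict;
# values are taken from the Experiment Setup row above, key names are ours.
EXPERIMENT_CONFIG = {
    "MED": {
        "encoder_hidden_dim": 100,
        "decoder_hidden_dim": 100,
        "embedding_dim": 300,
        "optimizer": "Adadelta",
        "batch_size": 20,
    },
    "PG": {
        "hidden_dim": 100,
        "embedding_dim": 100,
        "optimizer": "Adam",
        "dropout": 0.5,
    },
    "training": {
        "pretraining_epochs": 60,      # multi-task or MAML training
        "min_finetuning_epochs": 300,  # fine-tuning on the target language
    },
}
```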