All-in Text: Learning Document, Label, and Word Representations Jointly
Authors: Jinseok Nam, Eneldo Loza Mencía, Johannes Fürnkranz
AAAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The potential of our method is demonstrated on the multi-label classification task of assigning keywords from the Medical Subject Headings (MeSH) to publications in biomedical research, both in a conventional and in a zero-shot learning setting. |
| Researcher Affiliation | Academia | Jinseok Nam, Eneldo Loza Mencía, Johannes Fürnkranz. Knowledge Discovery in Scientific Literature, TU Darmstadt; Knowledge Engineering Group, TU Darmstadt; Research Training Group AIPHES, TU Darmstadt |
| Pseudocode | Yes | Algorithm 1: Training AiTextML |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a link to a code repository. |
| Open Datasets | Yes | We use the BioASQ Task 3a dataset, a collection of scientific publications in biomedical research, to examine our proposed method. It contains about 12 million publications, each of which is associated with around 11 descriptors on average out of 27,455, which come from the Medical Subject Headings (MeSH) hierarchy. http://www.bioasq.org/participate/data |
| Dataset Splits | Yes | We split the dataset by year so that the training set includes all papers by 2004 and the rest of the papers published between 2005 and 2015 belongs to the test set. Thus, descriptors introduced to the MeSH hierarchy after 2004 can be considered as unseen labels. 100,000 papers before 2005 were randomly sampled and set aside as the validation set for tuning hyperparameters. |
| Hardware Specification | Yes | We performed all experiments on a machine with two Intel Xeon E5-2670 CPUs and 32GB of memory. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers, such as programming languages or libraries like Python, PyTorch, or TensorFlow. |
| Experiment Setup | Yes | We used the validation set to set our hyperparameters as follows: the number of negative samples κ = 5, the dimensionality of all representations 100, the size of the context window c = 5, learning rate η = 0.025, margin m = 0.1, and the control variables α = 1/3, β = 1/3, γ = 1/3. |
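For anyone attempting a reimplementation, the reported settings above can be collected in one place. The sketch below is a hypothetical reconstruction (not the authors' code): it records the hyperparameters exactly as quoted in the table and implements the year-based train/test split described under Dataset Splits; the field name `"year"` is an assumption about how a paper record might be stored.

```python
# Hypothetical reproduction aid; all values are taken verbatim from the
# paper's reported hyperparameters, but names and structure are assumptions.
HYPERPARAMS = {
    "num_negative_samples": 5,   # kappa
    "embedding_dim": 100,        # dimensionality of all representations
    "context_window": 5,         # c
    "learning_rate": 0.025,      # eta
    "margin": 0.1,               # m
    "alpha": 1 / 3,              # control variables alpha, beta, gamma
    "beta": 1 / 3,
    "gamma": 1 / 3,
}

def split_by_year(papers):
    """Year-based split as described in the paper: training set contains
    all papers through 2004; papers from 2005-2015 form the test set.
    (The 100k-paper validation set is sampled separately from pre-2005
    training papers, which is not done here.)"""
    train = [p for p in papers if p["year"] <= 2004]
    test = [p for p in papers if 2005 <= p["year"] <= 2015]
    return train, test
```

Note that because new MeSH descriptors introduced after 2004 only appear on the test side of this split, they naturally serve as unseen labels for the zero-shot evaluation.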