Group-Sparse Matrix Factorization for Transfer Learning of Word Embeddings
Authors: Kan Xu, Xuanyi Zhao, Hamsa Bastani, Osbert Bastani
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Furthermore, we empirically evaluate its effectiveness, both in terms of prediction accuracy in downstream tasks as well as the interpretability of the results. |
| Researcher Affiliation | Academia | Kan Xu, Xuanyi Zhao, Hamsa Bastani, Osbert Bastani (University of Pennsylvania, Pennsylvania, USA). |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Source code is available at https://github.com/kanxu526/GroupTLWordEmbedding. |
| Open Datasets | Yes | All the Wikipedia text data were downloaded from English Wikipedia database dumps in January 2020. The data provides a label (eligible or not eligible), and a corresponding short free-text statement that describes the eligibility criterion and the study intervention and condition... has been made publicly available by the authors on Kaggle. |
| Dataset Splits | Yes | We use 5-fold cross validation to tune λ and we keep 20% of the gold data as the cross validation set. We split the 50 observations into 20% for testing and 80% for training and cross-validation. We use 5-fold cross-validation to tune the hyperparameters in regularized logistic regression. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions several tools and models like GloVe, Word2Vec, and Dict2Vec, but it does not specify the version numbers of any software dependencies or libraries used for implementation or experiments. |
| Experiment Setup | Yes | We take the pretrained GloVe word embedding as described above. Similar to GloVe, we create the co-occurrence matrix using a symmetric context window of length 5. We choose the dimension of the word embedding to be 100 and use the default weighting function of GloVe. We fix λ = 0.05 for both approaches; we found our results to be robust to this choice. In this task, we set the word embedding dimension to 100. We tune our regularization parameter λ separately for each type of pre-trained embeddings. We use logistic regression with an ℓ2 penalty. |
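
The experiment-setup row above describes building a co-occurrence matrix with a symmetric context window of length 5 and the default GloVe weighting function. Below is a minimal sketch of that preprocessing step, assuming tokenized sentences as input; the function names are illustrative and are not taken from the authors' repository.

```python
# Hypothetical sketch: symmetric window-5 co-occurrence counts and the
# standard GloVe weighting f(x) = (x / x_max)^alpha capped at 1.
from collections import defaultdict

def cooccurrence_counts(sentences, window=5):
    """Count word co-occurrences within a symmetric context window."""
    counts = defaultdict(float)
    for tokens in sentences:
        for i, w in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    # Distance weighting as in GloVe: closer contexts count more.
                    counts[(w, tokens[j])] += 1.0 / abs(j - i)
    return counts

def glove_weight(x, x_max=100.0, alpha=0.75):
    """Default GloVe weighting of a co-occurrence count x."""
    return (x / x_max) ** alpha if x < x_max else 1.0
```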
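
The dataset-splits and experiment-setup rows also describe the downstream evaluation protocol: an 80/20 train/test split with 5-fold cross-validation to tune the regularization of an ℓ2-penalized logistic regression. The sketch below shows that protocol with scikit-learn; building document features from word embeddings (e.g., by averaging) is an assumption made here for illustration.

```python
# Hypothetical sketch of the downstream evaluation: 80/20 split, then 5-fold
# cross-validation to tune the strength of an l2-regularized logistic regression.
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

def evaluate_embeddings(X, y, seed=0):
    """X: document features built from word embeddings; y: binary labels."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed)
    clf = GridSearchCV(
        LogisticRegression(penalty="l2", max_iter=1000),
        param_grid={"C": [0.01, 0.1, 1.0, 10.0, 100.0]},
        cv=5)  # 5-fold cross-validation on the training portion
    clf.fit(X_train, y_train)
    return clf.score(X_test, y_test)  # accuracy on the held-out 20%
```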