Group-Sparse Matrix Factorization for Transfer Learning of Word Embeddings

Authors: Kan Xu, Xuanyi Zhao, Hamsa Bastani, Osbert Bastani

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Furthermore, we empirically evaluate its effectiveness, both in terms of prediction accuracy in downstream tasks as well as the interpretability of the results.
Researcher Affiliation | Academia | Kan Xu, Xuanyi Zhao, Hamsa Bastani, Osbert Bastani (University of Pennsylvania, Pennsylvania, USA).
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | Source code is available at https://github.com/kanxu526/GroupTLWordEmbedding.
Open Datasets | Yes | All the Wikipedia text data were downloaded from English Wikipedia database dumps in January 2020. The data provides a label (eligible or not eligible), and a corresponding short free-text statement that describes the eligibility criterion and the study intervention and condition... has been made publicly available by the authors on Kaggle.
Dataset Splits | Yes | We use 5-fold cross validation to tune λ and we keep 20% of the gold data as the cross validation set. We split the 50 observations into 20% for testing and 80% for training and cross-validation. We use 5-fold cross-validation to tune the hyperparameters in regularized logistic regression.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions several tools and models like GloVe, Word2Vec, and Dict2Vec, but it does not specify the version numbers of any software dependencies or libraries used for implementation or experiments.
Experiment Setup | Yes | We take the pretrained GloVe word embedding as described above. Similar to GloVe, we create the co-occurrence matrix using a symmetric context window of length 5. We choose the dimension of the word embedding to be 100 and use the default weighting function of GloVe. We fix λ = 0.05 for both approaches; we found our results to be robust to this choice. In this task, we set the word embedding dimension to 100. We tune our regularization parameter λ separately for each type of pre-trained embeddings. We use logistic regression with an ℓ2 penalty.
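
The Experiment Setup row quotes a GloVe-style construction: a co-occurrence matrix built with a symmetric context window of length 5, 100-dimensional embeddings, and GloVe's default weighting function. Below is a minimal sketch of those two ingredients, assuming a tokenized corpus and a fixed vocabulary; the function and variable names are illustrative and not taken from the authors' repository.

```python
from collections import defaultdict

def glove_weight(x, x_max=100.0, alpha=0.75):
    """GloVe's default weighting function f(x) = min((x / x_max)**alpha, 1)."""
    return min((x / x_max) ** alpha, 1.0)

def build_cooccurrence(sentences, vocab, window=5):
    """Count co-occurrences with a symmetric context window of length `window`.

    GloVe-style: a context word at distance d from the center word
    contributes 1/d to the count, on both sides of the window.
    (Sketch only; the authors' pipeline may differ in details.)
    """
    counts = defaultdict(float)
    for tokens in sentences:
        ids = [vocab[t] for t in tokens if t in vocab]
        for i, wi in enumerate(ids):
            for d in range(1, window + 1):
                j = i - d
                if j < 0:
                    break
                wj = ids[j]
                # symmetric window: update both orderings of the pair
                counts[(wi, wj)] += 1.0 / d
                counts[(wj, wi)] += 1.0 / d
    return counts

# Toy usage: two short "sentences" and a vocabulary built from them.
sentences = [["clinical", "trial", "eligibility", "criteria"],
             ["embedding", "for", "clinical", "text"]]
vocab = {w: i for i, w in enumerate(sorted({t for s in sentences for t in s}))}
X = build_cooccurrence(sentences, vocab, window=5)
weights = {pair: glove_weight(x) for pair, x in X.items()}
```

The GloVe defaults used here (x_max = 100, α = 0.75) only make the sketch concrete; the paper relies on GloVe's standard implementation rather than redefining it.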
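
The Dataset Splits and Experiment Setup rows together describe the downstream evaluation protocol: 20% of the labeled data held out for testing, the remaining 80% used for training, and 5-fold cross-validation to tune an ℓ2-penalized logistic regression. A minimal scikit-learn sketch of that protocol follows, with synthetic features standing in for the embedding-derived document features; the feature construction and parameter grid are assumptions, not the authors' exact choices.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-ins: 50 observations with 100-dimensional features
# (in the paper these would be embedding-derived features and real labels).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 100))
y = rng.integers(0, 2, size=50)

# 20% held out for testing, 80% for training and cross-validation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

# 5-fold CV over the l2 penalty strength; in scikit-learn C = 1/lambda,
# so this grid plays the role of the lambda tuning described in the paper.
search = GridSearchCV(
    LogisticRegression(penalty="l2", max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
    cv=5,
)
search.fit(X_train, y_train)
print("best C:", search.best_params_["C"])
print("held-out accuracy:", search.score(X_test, y_test))
```

Since scikit-learn parameterizes the ℓ2 penalty by its inverse strength C, tuning the grid over C corresponds to the λ tuning described in the paper's quotes.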