CoVeR: Learning Covariate-Specific Vector Representations with Tensor Decompositions

Authors: Kevin Tian, Teng Zhang, James Zou

Venue: ICML 2018

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Our experiments demonstrate that our joint model learns substantially better covariate-specific embeddings compared to the standard approach of learning a separate embedding for each covariate using only the relevant subset of data, as well as other related methods. We empirically evaluate the benefits of our algorithm on datasets, and demonstrate how it can be used to address many natural questions about covariate effects." |
| Researcher Affiliation | Academia | "(1) Department of Computer Science, Stanford University; (2) Department of Management Science and Engineering, Stanford University; (3) Department of Biomedical Data Science, Stanford University." |
| Pseudocode | No | The paper presents its 'Objective Function and Discussion' and 'Algorithm Details' sections in prose but does not include a clearly labeled pseudocode block or algorithm (a hedged sketch of such a training loop appears below the table). |
| Open Source Code | Yes | Accompanying code to this paper can be found at http://github.com/kjtian/CoVeR. |
| Open Datasets | No | The paper describes the 'book dataset' and the 'politics dataset' used in experiments but does not provide concrete access information (e.g., a specific URL, DOI, or a formal citation with authors and year to a publicly available source) for these datasets. |
| Dataset Splits | No | The paper states 'individual books contained between 26747 and 355814 words' and 'The vocabulary size was 5,020' but does not specify training/validation/test splits by percentage, sample count, or reference to predefined standard splits. It also mentions 'tuning our algorithm', which might imply a validation set, but no explicit details are given. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU specifications, or cloud computing resources used for running the experiments. |
| Software Dependencies | No | The paper mentions using the 'Adam (Kingma & Ba, 2014) algorithm' and refers to 'GloVe (Pennington et al., 2014)' but does not provide specific version numbers for any software libraries, dependencies, or programming languages used. |
| Experiment Setup | Yes | "The vocabulary size was 5,020, and after tuning our algorithm to embed this dataset, we used 100 dimensions and a learning rate of 10^-5." A second embedding experiment reports: "The embedding used 200 dimensions and a learning rate of 10^-5." |
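
Since the paper itself includes no pseudocode, the following is a minimal PyTorch-style sketch of what the quoted setup could look like: a GloVe-style weighted least-squares objective with covariate-specific weight vectors, optimized with Adam at the reported learning rate of 10^-5 and 100-dimensional embeddings. The tensor names, placeholder sizes, and the exact form of the objective are illustrative assumptions, not taken from the paper or its repository.

```python
import torch

# Illustrative sizes only. The paper reports a vocabulary of 5,020 and
# 100-dimensional embeddings; V is shrunk here so the sketch runs quickly.
V, K, D = 500, 10, 100  # vocabulary size, number of covariates, embedding dim

# Placeholder co-occurrence tensor A[k, i, j]: how often word j appears
# near word i under covariate k. A real run would use corpus counts.
A = torch.rand(K, V, V) * 10 + 1e-2

# Shared word vectors, per-covariate diagonal weights, and biases
# (an assumed GloVe-style factorization with covariate-specific weights).
v = torch.randn(V, D, requires_grad=True)
c = torch.ones(K, D, requires_grad=True)
b = torch.zeros(K, V, requires_grad=True)

def glove_weight(x, x_max=100.0, alpha=0.75):
    # Standard GloVe weighting f(x) = min(1, (x / x_max)^alpha).
    return torch.clamp(x / x_max, max=1.0) ** alpha

opt = torch.optim.Adam([v, c, b], lr=1e-5)  # Adam and lr quoted in the paper

for step in range(100):
    opt.zero_grad()
    w = c[:, None, :] * v[None, :, :]            # covariate-weighted vectors, (K, V, D)
    scores = torch.einsum('kid,kjd->kij', w, w)  # inner products per covariate
    scores = scores + b[:, :, None] + b[:, None, :]
    loss = (glove_weight(A) * (scores - torch.log(A)) ** 2).sum()
    loss.backward()
    opt.step()
```

The design point being illustrated is the one the Research Type row quotes: rather than fitting an independent embedding per covariate on its data subset, the word vectors are shared across covariates and only the per-covariate weights and biases vary, which is what lets sparse covariates borrow strength from the full corpus.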