Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Learning and Inference via Maximum Inner Product Search

Authors: Stephen Mussmann, Stefano Ermon

ICML 2016 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We show that it performs well both on synthetic data and neural language models with large output spaces. The main purpose of our empirical evaluation is to demonstrate that our MIPS reduction using Gumbels (MRG) doesn't affect the accuracy of sampling or inference. We show the results of the reduction on model averaging and learning via gradient descent, two tasks introduced in the Background section. We also show the empirical speedup achieved using a particular MIPS technique."
Researcher Affiliation | Academia | "Stephen Mussmann EMAIL Stanford University, 450 Serra Mall, Stanford, CA 94305 USA"
Pseudocode | Yes | "Algorithm 1 MIPS-Gumbel Initialization", "Algorithm 2 MIPS-Gumbel t Samples", "Algorithm 3 MIPS-Gumbel Inverse Partition Estimate"
Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | "The real data we use is the word2vec dataset, a word embedding dataset released by Google (Mikolov et al., 2013a;b)."
Dataset Splits | No | The paper mentions using synthetic data and word2vec data but does not provide explicit training, validation, or test dataset splits.
Hardware Specification | No | The paper only states "a single core running in python" without specifying CPU or GPU models or other hardware details.
Software Dependencies | No | The paper mentions Python but does not specify its version or any other software dependencies with version numbers.
Experiment Setup | Yes | "For the MIPS reduction, k = 5 and t = 100. In general, a larger k will make the samples for different θ less dependent and a larger t will decrease the variance of the estimate." "...a Gaussian prior is put on the parameters. We achieve this by using Equation 5 with an extra term for the Gaussian prior."
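The table above refers to the paper's MIPS reduction using Gumbel perturbations. As background for readers unfamiliar with the idea, here is a minimal sketch of the underlying Gumbel-max trick: when item scores are inner products θ·xᵢ (the MIPS objective), perturbing each score with i.i.d. Gumbel(0, 1) noise and taking the argmax yields an exact sample from the softmax distribution over scores. All names and dimensions below are illustrative, not taken from the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup: n items with d-dimensional embeddings and a
# parameter vector theta; scores are the inner products theta . x_i.
n, d = 5, 3
X = rng.normal(size=(n, d))   # item embeddings (hypothetical)
theta = rng.normal(size=d)    # parameter vector (hypothetical)
scores = X @ theta            # the MIPS objective values

def gumbel_max_sample(scores, rng):
    # Perturb each score with i.i.d. Gumbel(0, 1) noise; the argmax
    # of the perturbed scores is an exact softmax sample.
    g = rng.gumbel(size=scores.shape)
    return int(np.argmax(scores + g))

# Empirical check: sample frequencies should approach softmax(scores).
samples = np.array([gumbel_max_sample(scores, rng) for _ in range(20000)])
freq = np.bincount(samples, minlength=n) / len(samples)
softmax = np.exp(scores - scores.max())
softmax /= softmax.sum()
```

The paper's contribution goes further: it replaces the exact argmax over perturbed inner products with an approximate maximum inner product search, which is what the k and t parameters in the table control.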