Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Zero-Inflated Exponential Family Embeddings
Authors: Li-Ping Liu, David M. Blei
ICML 2017 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we empirically evaluate the Zero-Inflated Embeddings. We compare four models, two baselines and two variants of our model, in the following subsections: 1) EFE is the basic exponential family embedding model; 2) EFE-dz assigns weight 0.1 to zero entries in the training data (same as (Rudolph et al., 2016)); 3) ZIE-0 is the zero-inflated embedding model and fits the exposure probabilities with the intercept term only; and 4) ZIE-cov fits exposure probabilities with covariates. |
| Researcher Affiliation | Academia | ¹Columbia University, 500 W 120 St., New York, NY 10027; ²Tufts University, 161 College Ave., Medford, MA 02155. |
| Pseudocode | No | No explicit pseudocode or algorithm blocks were found. |
| Open Source Code | No | The code is from the repository constructed by Jastrzebski et al. (2017). This refers to a third-party repository used for evaluation, not the authors' own source code for the proposed method. The paper makes no statement about releasing the authors' own code. |
| Open Datasets | Yes | All models are evaluated with four datasets, eBird-PA, MovieLens-100K, Market, and Wiki-S, which will be introduced in detail in the following subsections. Their general information is tabulated in Table 1. |
| Dataset Splits | Yes | One tenth of the training set is separated out as the validation set, whose log-likelihood is used to check whether the optimization procedure converges. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, or memory) used for running the experiments. |
| Software Dependencies | No | All models are optimized by AdaGrad (Duchi et al., 2011) implemented in TensorFlow. No version numbers are provided for TensorFlow or any other software/libraries. |
| Experiment Setup | Yes | All models are optimized by AdaGrad (Duchi et al., 2011) implemented in TensorFlow, and the AdaGrad parameter η for step length is set to 0.1. One tenth of the training set is separated out as the validation set, whose log-likelihood is used to check whether the optimization procedure converges. The variance parameters of α, ρ, and w are set to 1 for all experiments. [...] The embedding dimension K iterates over the set {32, 64, 128}. |
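
The reported setup (one tenth of the training set held out for validation, AdaGrad with η = 0.1, K in {32, 64, 128}) can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names `split_train_validation` and `adagrad_step`, the random seed, and the `eps` stabilizer are assumptions made for illustration, and the update shown is the generic textbook form of AdaGrad, which may differ in detail from TensorFlow's implementation.

```python
import numpy as np

# Values reported in the Experiment Setup row.
ETA = 0.1                       # AdaGrad step-length parameter
EMBEDDING_DIMS = (32, 64, 128)  # embedding dimensions K that were tried
VAL_FRACTION = 0.1              # one tenth of the training set


def split_train_validation(n_rows, val_fraction=VAL_FRACTION, seed=0):
    """Hold out a random fraction of row indices as a validation set.

    The seed and the use of a random permutation are assumptions;
    the paper only states that one tenth is separated out.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_rows)
    n_val = int(round(n_rows * val_fraction))
    return perm[n_val:], perm[:n_val]  # (train indices, validation indices)


def adagrad_step(param, grad, accum, eta=ETA, eps=1e-8):
    """One textbook AdaGrad update (hypothetical helper, not TensorFlow's).

    accum holds the running sum of squared gradients; each coordinate's
    step is scaled by the inverse square root of that sum.
    """
    accum = accum + grad ** 2
    param = param - eta * grad / (np.sqrt(accum) + eps)
    return param, accum


if __name__ == "__main__":
    train_idx, val_idx = split_train_validation(1000)
    print(len(train_idx), len(val_idx))

    # Toy objective f(x) = x^2 with gradient 2x, only to exercise the update.
    x, acc = np.array([1.0]), np.zeros(1)
    for _ in range(5):
        x, acc = adagrad_step(x, 2 * x, acc)
    print(x)
```

In a full run, the validation log-likelihood would be monitored across such updates to decide when optimization has converged, and the loop would be repeated for each K in `EMBEDDING_DIMS`.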