Zero-Inflated Exponential Family Embeddings

Authors: Li-Ping Liu, David M. Blei

ICML 2017

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "In this section, we empirically evaluate the Zero-Inflated Embeddings. We compare four models, two baselines and two variants of our model, in the following subsections: 1) EFE is the basic exponential family embedding model; 2) EFE-dz assigns weight 0.1 to zero entries in the training data (same as Rudolph et al., 2016); 3) ZIE-0 is the zero-inflated embedding model and fits the exposure probabilities with the intercept term only; and 4) ZIE-cov fits exposure probabilities with covariates." |
| Researcher Affiliation | Academia | Columbia University, 500 W 120 St., New York, NY 10027; Tufts University, 161 College Ave., Medford, MA 02155. |
| Pseudocode | No | No explicit pseudocode or algorithm blocks were found. |
| Open Source Code | No | "The code is from the repository constructed by Jastrzebski et al. (2017)." This refers to a third-party repository used for evaluation, not the authors' own source code for the proposed method; there is no statement about releasing their own code. |
| Open Datasets | Yes | "All models are evaluated with four datasets, eBird-PA, MovieLens-100K, Market, and Wiki-S, which will be introduced in detail in the following subsections. Their general information is tabulated in Table 1." |
| Dataset Splits | Yes | "One tenth of the training set is separated out as the validation set, whose log-likelihood is used to check whether the optimization procedure converges." |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU or GPU models, memory) used to run the experiments. |
| Software Dependencies | No | "All models are optimized by AdaGrad (Duchi et al., 2011) implemented in TensorFlow." No version numbers are provided for TensorFlow or any other software or libraries. |
| Experiment Setup | Yes | "All models are optimized by AdaGrad (Duchi et al., 2011) implemented in TensorFlow, and the AdaGrad parameter η for step length is set to 0.1. One tenth of the training set is separated out as the validation set, whose log-likelihood is used to check whether the optimization procedure converges. The variance parameters of α, ρ, and w are set to 1 for all experiments. [...] The embedding dimension K iterates over the set {32, 64, 128}." |
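The models compared in the report share one idea: a zero-inflated likelihood that mixes a point mass at zero (an "unexposed" entry) with an exponential-family distribution, weighted by an exposure probability. The sketch below is illustrative only, not the authors' code; it uses a Poisson as the exponential-family component for concreteness, and the function names are hypothetical.

```python
import math

def poisson_pmf(x, lam):
    """Poisson probability mass function (the exponential-family component)."""
    return math.exp(-lam) * lam**x / math.factorial(x)

def zi_pmf(x, pi, lam):
    """Zero-inflated Poisson pmf.

    With probability (1 - pi) the entry is unexposed and forced to zero;
    with probability pi it is drawn from the Poisson component. A zero
    observation can therefore come from either branch.
    """
    p = pi * poisson_pmf(x, lam)
    if x == 0:
        p += 1.0 - pi
    return p
```

In this framing, ZIE-0 corresponds to modeling `pi` with an intercept term only, while ZIE-cov lets `pi` depend on covariates; the mixture still sums to one over the support.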
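The quoted experiment setup (AdaGrad with step length η = 0.1, one tenth of the training set held out for validation) can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the paper's TensorFlow pipeline: the data array and the toy quadratic objective are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training data; one tenth is held out as a validation set,
# as described in the quoted setup.
data = rng.normal(size=(1000, 8))
n_val = len(data) // 10
val, train = data[:n_val], data[n_val:]

def adagrad_step(param, grad, accum, eta=0.1, eps=1e-8):
    """One AdaGrad update (Duchi et al., 2011) with eta = 0.1 as in the paper."""
    accum = accum + grad**2                      # running sum of squared gradients
    param = param - eta * grad / (np.sqrt(accum) + eps)
    return param, accum

# Toy objective for illustration: 0.5 * ||mu - mean(train)||^2,
# whose gradient is mu - mean(train).
mu = np.zeros(8)
accum = np.zeros(8)
for _ in range(200):
    grad = mu - train.mean(axis=0)
    mu, accum = adagrad_step(mu, grad, accum)
```

In practice one would monitor the validation-set log-likelihood to decide convergence, and repeat the run for each embedding dimension K in {32, 64, 128}.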