Implicit MLE: Backpropagating Through Discrete Exponential Family Distributions

Authors: Mathias Niepert, Pasquale Minervini, Luca Franceschi

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The set of experiments can be divided into three parts. First, we analyze and compare the behavior of I-MLE with (i) the score function and (ii) the straight-through estimator using a toy problem. Second, we explore the latent variable setting where both h_v and f_u in Eq. (1) are neural networks and the optimal structure is not available during training. Finally, we address the problem of differentiating through black-box combinatorial optimization problems, where we use the target distribution derived in Section 4.
Researcher Affiliation | Collaboration | Mathias Niepert (NEC Laboratories Europe, mathias.niepert@neclab.eu); Pasquale Minervini (University College London, p.minervini@ucl.ac.uk); Luca Franceschi (Istituto Italiano di Tecnologia and University College London, ucablfr@ucl.ac.uk)
Pseudocode | Yes | Algorithm 1: Instance of I-MLE with perturbation-based implicit differentiation. (A minimal sketch of this procedure follows the table.)
Open Source Code | Yes | We provide implementations and Python notebooks at https://github.com/nec-research/tf-imle
Open Datasets | Yes | The BEERADVOCATE dataset [McAuley et al., 2012] consists of free-text reviews and ratings for 4 different aspects of beer: appearance, aroma, palate, and taste.
Dataset Splits | Yes | Since the original dataset [McAuley et al., 2012] did not provide separate validation and test sets, we compute 10 different evenly sized validation/test splits of the 10k held-out set and compute mean and standard deviation over 10 models, each trained on one split.
Hardware Specification | No | The paper does not explicitly describe the specific hardware used (e.g., GPU models, CPU types, memory amounts) for running its experiments.
Software Dependencies | No | The paper mentions 'modern deep learning pipelines' and 'Adam settings' but does not provide specific version numbers for software dependencies such as Python, PyTorch, TensorFlow, or other libraries.
Experiment Setup | Yes | Hyperparameters are optimized against L for all methods independently. Statistics are over 100 runs. We used the standard hyperparameter settings of Chen et al. [2018] and choose the temperature parameter t ∈ {0.1, 0.5, 1.0, 2.0}. For I-MLE we choose λ ∈ {10¹, 10², 10³}, while for both I-MLE and STE we choose τ ∈ {k, 2k, 3k} based on the validation MSE. We used the standard Adam settings.
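
The pseudocode row above refers to Algorithm 1, an instance of I-MLE with perturbation-based implicit differentiation. Below is a minimal NumPy sketch of that forward/backward scheme on a toy top-k (k-subset) selection problem, assuming standard Gumbel perturbations, a hypothetical map_topk MAP solver, an illustrative λ, and a toy squared-error loss; it is a sketch of the idea, not the authors' implementation, which is available at https://github.com/nec-research/tf-imle.

import numpy as np

rng = np.random.default_rng(0)

def map_topk(theta, k):
    # MAP solver for the k-subset distribution: indicator vector of the k largest scores.
    z = np.zeros_like(theta)
    z[np.argsort(-theta)[:k]] = 1.0
    return z

def imle_step(theta, grad_fn, k, lam=10.0):
    # Forward pass: perturb-and-MAP sample of a discrete structure.
    eps = rng.gumbel(size=theta.shape)          # Gumbel noise; the paper also proposes Sum-of-Gamma noise
    z = map_topk(theta + eps, k)
    # Backward pass: perturbation-based implicit differentiation via a target distribution.
    dl_dz = grad_fn(z)                          # gradient of the downstream loss w.r.t. the sample
    theta_target = theta - lam * dl_dz          # parameters of the target distribution
    z_target = map_topk(theta_target + eps, k)  # MAP under the target parameters, same noise
    grad_theta = (z - z_target) / lam           # I-MLE gradient estimate w.r.t. theta
    return z, grad_theta

# Toy usage: learn scores whose top-5 subset matches a fixed target mask.
theta = rng.normal(size=20)
target = map_topk(rng.normal(size=20), 5)
for _ in range(200):
    z, g = imle_step(theta, lambda z: 2.0 * (z - target), k=5)
    theta -= 0.1 * g
print("overlap with target subset:", int(map_topk(theta, 5) @ target))

In the backward pass, the 1/λ scaling of (z − z_target) can equivalently be folded into the learning rate.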
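
The experiment-setup row quotes a small grid search over t, λ, and τ, selected on validation MSE. The following is a hypothetical Python sketch of that grid; only the candidate values come from the quote, while the subset size k, the selection loop, and the placeholder objective are illustrative assumptions.

from itertools import product
import random

k = 10                                    # subset size k; the actual value is experiment-specific
grid = {
    "t": [0.1, 0.5, 1.0, 2.0],            # temperature for the relaxation-based baselines
    "lam": [10.0, 100.0, 1000.0],         # I-MLE target-distribution strength (10^1 .. 10^3)
    "tau": [k, 2 * k, 3 * k],             # perturbation scale for I-MLE and STE
}

def validation_mse(config):
    # Placeholder objective: a real run would train with standard Adam settings
    # and return the MSE on the held-out validation split.
    return random.random()

configs = [dict(zip(grid, values)) for values in product(*grid.values())]
best = min(configs, key=validation_mse)
print("selected configuration:", best)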