Maximum Likelihood Estimation for Learning Populations of Parameters

Authors: Ramya Korlakai Vinayak, Weihao Kong, Gregory Valiant, Sham Kakade

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 5. Numerical Experiments: Recall that the MLE (Equation 2) is a convex optimization problem, $\hat{P}_{mle} \in \arg\max_{Q \in \mathcal{D}} \sum_{s=0}^{t} h^{obs}_s \log E_Q[h_s]$, where $\mathcal{D}$ is the set of all distributions on [0, 1]. We discretize the interval [0, 1] into a uniform grid of width 1/m. Note that as long as the error due to discretization, O(1/m), is smaller than the expected error in earth mover's distance (EMD), we will not be losing much numerically. Unless otherwise specified, we use a grid length of m = 1000. The discretized set can then be written as $\hat{\mathcal{D}}_m := \{q \in \mathbb{R}^{m+1} : q \geq 0, \mathbf{1}^\top q = 1\}$. We then solve the MLE, which is convex on this discrete convex set, using cvx (Grant & Boyd, 2014; 2008) for MATLAB.
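The paper solves this convex program with cvx for MATLAB. As an illustrative alternative (not the authors' code), the discretized MLE over a binomial mixture can be approximated in pure NumPy with the classic multiplicative EM fixed-point update for mixing-distribution MLEs; the function name, grid size, and iteration count below are assumptions of this sketch.

```python
import numpy as np
from scipy.stats import binom

def mle_em(h_obs, t, m=100, iters=2000):
    """Discretized MLE for the population distribution over [0, 1],
    approximated by an EM-style multiplicative fixed-point iteration
    (a standard alternative to a generic convex solver; illustrative only).

    h_obs[s] = observed fraction of individuals with s successes in t trials.
    Returns the grid and a weight vector q over the grid of width 1/m.
    """
    grid = np.linspace(0.0, 1.0, m + 1)
    # A[s, j] = P(s successes | bias grid[j]) under Binomial(t, grid[j])
    A = binom.pmf(np.arange(t + 1)[:, None], t, grid[None, :])
    q = np.full(m + 1, 1.0 / (m + 1))      # uniform initial guess on the grid
    for _ in range(iters):
        mix = A @ q                        # model fingerprint E_Q[h_s]
        q = q * (A.T @ (h_obs / mix))      # multiplicative EM update
    return grid, q
```

Each update multiplies q elementwise by the gradient-like ratio A.T @ (h_obs / mix), which preserves nonnegativity and the sum-to-one constraint, so no explicit projection onto the simplex is needed.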
Researcher Affiliation | Academia | 1 Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle; 2 Department of Computer Science, Stanford University, Stanford. Correspondence to: Ramya Korlakai Vinayak <ramya@cs.washington.edu>.
Pseudocode | No | The paper describes mathematical derivations and theoretical proofs but does not include any explicit pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | We ran the MLE on two real datasets used in (Tian et al., 2017): (1) a dataset on political leanings of counties in the US, with data on whether a county leaned Democratic or Republican for N = 3116 counties in t = 8 presidential elections from 1976 to 2004; (2) a dataset of flight delays with N = 25,156 flights.
Dataset Splits | No | The paper focuses on estimating a population distribution from observed data rather than training a predictive model on explicit dataset splits. It reports the number of individuals (N) and observations per individual (t), but gives no training, validation, or test splits in the conventional machine learning sense.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions using cvx and MATLAB for solving the MLE, but it does not specify version numbers for these software components.
Experiment Setup | Yes | We use a grid length of m = 1000. With population size N = 10^6, we vary t from 2 to 12. For t = 10, we vary the population size N from 10 to 10^8 in multiples of 10. For N = 10^6, we vary the number of tosses t from 2 to 10 in steps of two, and then t = [50, 100, 500, 1000], to illustrate the performance of the MLE as t varies widely.
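As a hypothetical sketch of this setup (the function name and the choice of a uniform bias distribution are illustrative, not from the paper), one can simulate N coins tossed t times each and form the observed fingerprint h_obs that the MLE consumes:

```python
import numpy as np

def simulate_fingerprint(N, t, rng):
    """Draw one bias per individual, toss each coin t times, and return
    h_obs, where h_obs[s] = fraction of individuals with s heads (length t + 1)."""
    p = rng.uniform(0.0, 1.0, N)    # illustrative choice: biases uniform on [0, 1]
    heads = rng.binomial(t, p)      # t independent tosses per individual
    return np.bincount(heads, minlength=t + 1) / N

rng = np.random.default_rng(0)
h_obs = simulate_fingerprint(10**6, 10, rng)   # N = 10^6, t = 10, as in the paper's sweep
```

With biases drawn uniformly on [0, 1], the marginal distribution of the head count is uniform over {0, ..., t} (a Beta-binomial with a uniform prior), so each entry of h_obs concentrates near 1/(t + 1) for large N.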