Adaptive Sampling for Estimating Probability Distributions

Authors: Shubhanshu Shekhar, Tara Javidi, Mohammad Ghavamzadeh

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We verify our theoretical findings through some experiments."
Researcher Affiliation | Collaboration | 1. ECE Department, University of California, San Diego; 2. Google Research.
Pseudocode | Yes | Algorithm 1: Optimistic Tracking Algorithm.
Open Source Code | No | The paper does not provide a link to, or an explicit statement about the release of, source code for the described methodology.
Open Datasets | No | The paper uses
Dataset Splits | No | The paper does not specify traditional train/validation/test dataset splits. It describes running simulations with a total sample budget and a number of trials, but no partitioning of a dataset into distinct training, validation, and test sets.
Hardware Specification | No | The paper does not mention the hardware (e.g., GPU models, CPU types, memory) used to run the experiments.
Software Dependencies | No | The paper does not name specific software packages with version numbers for reproducibility.
Experiment Setup | Yes | Setup. We study the performance of the proposed adaptive schemes on a problem with K = 2 and l = 10. We set P1 as the uniform distribution on [l] and P2 = P_ϵ for ϵ ∈ {0.1, 0.2, ..., 0.9}, where P_ϵ = (p_j)_{j=1}^l with p_1 = ϵ and p_j = (1 − ϵ)/(l − 1) for 1 < j ≤ l. To compare the performance of the adaptive schemes, we used three baseline schemes: (i) Uniform allocation, in which each arm is allocated n/K samples (uniform allocation is the oracle scheme for D_χ²; see Appendix G.4); (ii) Greedy allocation, in which arms are pulled by plugging the current empirical estimates (ĉ_{i,t})_{i=1}^K of (c_i)_{i=1}^K into the objective function; and (iii) Forced Exploration, in which arms are pulled according to the greedy scheme while ensuring that at any time t, each arm has been pulled at least √t times. This scheme is motivated by the strategy of Antos et al. (2008). For every value of ϵ, we ran 500 trials of each allocation scheme with budget n = 5000.
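The setup above can be sketched in a minimal simulation. Note the per-arm score `c_hat` below is a hypothetical stand-in for the paper's plug-in estimates ĉ_{i,t} (the exact constants c_i are not reproduced here), and the √t forced-exploration threshold is an assumption in the spirit of Antos et al. (2008); only the problem instance (K = 2, l = 10, P1 uniform, P2 = P_ϵ, n = 5000) is taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Problem instance from the Setup: K = 2 arms over a support of size l = 10.
l, K, n = 10, 2, 5000
eps = 0.3  # one value from {0.1, ..., 0.9}
P = np.vstack([
    np.full(l, 1.0 / l),                               # P1: uniform on [l]
    np.r_[eps, np.full(l - 1, (1 - eps) / (l - 1))],   # P2 = P_eps
])

def c_hat(counts):
    # Hypothetical stand-in for the paper's plug-in estimate c^_{i,t}:
    # a smoothed functional of the empirical distribution (assumption,
    # not the paper's exact formula).
    phat = (counts + 0.5) / (counts.sum() + 0.5 * l)
    return np.sqrt(phat).sum() ** 2

def run(scheme, n=n):
    counts = np.zeros((K, l))            # per-arm sample histograms
    pulls = np.zeros(K, dtype=int)       # per-arm pull counts
    for t in range(n):
        if scheme == "uniform":
            i = t % K                    # round-robin: n/K samples per arm
        elif scheme == "forced" and pulls.min() < np.sqrt(t + 1):
            i = int(pulls.argmin())      # force under-pulled arms up to sqrt(t)
        else:
            # greedy: pull the arm with the largest estimated remaining loss
            i = int(np.argmax([c_hat(counts[j]) / max(pulls[j], 1)
                               for j in range(K)]))
        x = rng.choice(l, p=P[i])        # draw one sample from arm i
        counts[i, x] += 1
        pulls[i] += 1
    return pulls

for scheme in ("uniform", "greedy", "forced"):
    print(scheme, run(scheme))
```

With a single loss-insensitive score like this one, greedy and forced exploration end up near the uniform split; the paper's point is that for loss functions whose oracle allocation is non-uniform, the adaptive schemes track that oracle.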