Performance Bounds for Active Binary Testing with Information Maximization

Authors: Aditya Chattopadhyay, Benjamin David Haeffele, René Vidal, Donald Geman

ICML 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Next, we demonstrate on two machine learning datasets (CUB-200-2011 (Wah et al., 2011) and AwA2 (Xian et al., 2018)) that the given set of tests T is δ-unpredictable for modest values of δ (0.22 and 0.17, respectively) and subsequently show that our bound is closer to the true mean number of tests the greedy strategy requires on these datasets to identify Y than previously known bounds. ... Table 1. Comparison of different bounds with the empirical performance of the greedy strategy (InfoMax in column 4)" |
| Researcher Affiliation | Academia | "1 Johns Hopkins University, USA; 2 University of Pennsylvania, USA." |
| Pseudocode | No | The paper describes the InfoMax algorithm and refers to a flowchart (Figure 4 in the appendix), but it does not provide pseudocode or a clearly labeled algorithm block. |
| Open Source Code | No | The paper makes no explicit statement about releasing source code and provides no link to a code repository for the described methodology. |
| Open Datasets | Yes | "Next, we demonstrate on two machine learning datasets (CUB-200-2011 (Wah et al., 2011) and AwA2 (Xian et al., 2018))" |
| Dataset Splits | No | The paper mentions using empirical probabilities and simulating prior distributions but does not specify explicit training, validation, or test splits for the data used in experiments or model training. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments. |
| Software Dependencies | No | The paper does not provide version numbers for any software libraries or dependencies used in the experiments. |
| Experiment Setup | Yes | "We use the empirical probabilities in the dataset to compute all the entropic quantities required for running the greedy strategy (algorithm in equation 3). ... Construct an augmented dataset by repeating every label Y = y (in the original dataset) ⌊1000·P(y)⌋ times, where ⌊·⌋ is the floor function to ensure an integer value and 1000 is a chosen hyper-parameter to ensure we have enough samples to accurately estimate the sampled prior P(Y) (obtained in the previous step)." |
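The two technical steps quoted in the Experiment Setup row — the greedy information-maximization test selection (equation 3 in the paper) and the label-augmentation step with the 1000 hyper-parameter — can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the function names, the `rows` representation (dicts holding a label `y` and 0/1 answers per test name), and the `scale` parameter are assumptions made here for concreteness.

```python
import math
from collections import Counter

def entropy(probs):
    """Shannon entropy in bits of a probability vector."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def label_entropy(rows):
    """Empirical entropy of the label Y over a set of samples."""
    n = len(rows)
    counts = Counter(r["y"] for r in rows)
    return entropy([c / n for c in counts.values()])

def greedy_next_test(rows, remaining_tests):
    """Greedy InfoMax step: return the binary test with the largest
    empirical information gain about Y on the (already filtered) rows."""
    n = len(rows)
    base = label_entropy(rows)
    best_test, best_gain = None, -1.0
    for t in remaining_tests:
        split = [[r for r in rows if r[t] == b] for b in (0, 1)]
        # Expected conditional entropy H(Y | X_t) under the empirical law.
        cond = sum(len(part) / n * label_entropy(part)
                   for part in split if part)
        gain = base - cond  # empirical mutual information I(Y; X_t)
        if gain > best_gain:
            best_test, best_gain = t, gain
    return best_test, best_gain

def augment_labels(labels, scale=1000):
    """Repeat each label y floor(scale * P(y)) times, where P(y) is the
    empirical label frequency; `scale` plays the role of the 1000
    hyper-parameter in the quoted setup."""
    n = len(labels)
    augmented = []
    for y, c in Counter(labels).items():
        augmented.extend([y] * math.floor(scale * c / n))
    return augmented
```

For example, on four samples where test `t1` perfectly separates the two labels and `t2` is uninformative, `greedy_next_test` selects `t1` with a gain of 1 bit, matching the intuition behind the greedy strategy.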