Performance Bounds for Active Binary Testing with Information Maximization

Authors: Aditya Chattopadhyay, Benjamin David Haeffele, René Vidal, Donald Geman

ICML 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Next, we demonstrate on two machine learning datasets (CUB-200-2011 (Wah et al., 2011) and AwA2 (Xian et al., 2018)) that the given set of tests T is δ-unpredictable for modest values of δ (0.22 and 0.17, respectively) and subsequently show that our bound is closer to the true mean number of tests the greedy strategy requires on these datasets to identify Y than previously known bounds. ... Table 1. Comparison of different bounds with the empirical performance of the greedy strategy (InfoMax in column 4)" |
| Researcher Affiliation | Academia | "1 Johns Hopkins University, USA; 2 University of Pennsylvania, USA." |
| Pseudocode | No | The paper describes the InfoMax algorithm and refers to a flowchart (Figure 4 in the appendix), but it does not provide pseudocode or a clearly labeled algorithm block. |
| Open Source Code | No | The paper makes no explicit statement about releasing source code and provides no link to a code repository for the described methodology. |
| Open Datasets | Yes | "Next, we demonstrate on two machine learning datasets (CUB-200-2011 (Wah et al., 2011) and AwA2 (Xian et al., 2018))" |
| Dataset Splits | No | The paper mentions using empirical probabilities and simulating prior distributions but does not specify explicit training, validation, or test splits for the data used in experiments or model training. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments. |
| Software Dependencies | No | The paper does not provide version numbers for any software libraries or dependencies used in the experiments. |
| Experiment Setup | Yes | "We use the empirical probabilities in the dataset to compute all the entropic quantities required for running the greedy strategy (algorithm in equation 3). ... Construct an augmented dataset by repeating every label Y = y (in the original dataset) ⌊1000·P(y)⌋ times, where ⌊·⌋ is the floor function to ensure an integer value and 1000 is a chosen hyper-parameter to ensure we have enough samples to accurately estimate the sampled prior P(Y) (obtained in the previous step)." |
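The two technical steps quoted in the Experiment Setup row — the greedy information-maximization test selection (equation 3 in the paper) and the label-augmentation step with the 1000 hyper-parameter — can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the function names, the `rows` representation (dicts holding a label `y` and 0/1 answers per test name), and the `scale` parameter are assumptions made here for concreteness.

```python
import math
from collections import Counter

def entropy(probs):
    """Shannon entropy in bits of a probability vector."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def label_entropy(rows):
    """Empirical entropy of the label Y over a set of samples."""
    n = len(rows)
    counts = Counter(r["y"] for r in rows)
    return entropy([c / n for c in counts.values()])

def greedy_next_test(rows, remaining_tests):
    """Greedy InfoMax step: return the binary test with the largest
    empirical information gain about Y on the (already filtered) rows."""
    n = len(rows)
    base = label_entropy(rows)
    best_test, best_gain = None, -1.0
    for t in remaining_tests:
        split = [[r for r in rows if r[t] == b] for b in (0, 1)]
        # Expected conditional entropy H(Y | X_t) under the empirical law.
        cond = sum(len(part) / n * label_entropy(part)
                   for part in split if part)
        gain = base - cond  # empirical mutual information I(Y; X_t)
        if gain > best_gain:
            best_test, best_gain = t, gain
    return best_test, best_gain

def augment_labels(labels, scale=1000):
    """Repeat each label y floor(scale * P(y)) times, where P(y) is the
    empirical label frequency; `scale` plays the role of the 1000
    hyper-parameter in the quoted setup."""
    n = len(labels)
    augmented = []
    for y, c in Counter(labels).items():
        augmented.extend([y] * math.floor(scale * c / n))
    return augmented
```

For example, on four samples where test `t1` perfectly separates the two labels and `t2` is uninformative, `greedy_next_test` selects `t1` with a gain of 1 bit, matching the intuition behind the greedy strategy.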