Active Statistical Inference

Authors: Tijana Zrnić, Emmanuel Candès

ICML 2024

Reproducibility Variable | Result | LLM Response

Research Type | Experimental | "We evaluate active inference on datasets from public opinion research, census analysis, and proteomics. [...] We will show in our experiments that active inference can save over 80% of the sample budget required by classical methods. [...] Experiments"

Researcher Affiliation | Academia | "1 Department of Statistics and Stanford Data Science, Stanford University, USA; 2 Department of Statistics and Department of Mathematics, Stanford University, USA. Correspondence to: Tijana Zrnić <tijana.zrnic@stanford.edu>, Emmanuel J. Candès <candes@stanford.edu>."

Pseudocode | Yes | "The batch and sequential active inference methods used in our experiments are outlined in Algorithm 1 and Algorithm 2 in the Appendix."

Open Source Code | Yes | "Code for reproducing the experiments is available at this link." The link points to https://github.com/tijana-z/active-statistical-inference.

Open Datasets | Yes | "The Pew dataset is available at (Pew, 2020); the census dataset is available through Folktables (Ding et al., 2021); the Alphafold dataset is available at (Angelopoulos et al., 2023b)."

Dataset Splits | No | The paper mentions training initial models on small labeled sets (e.g., "10 labeled examples", "100 labeled examples") and states that "the underlying data points (Xi, Yi) are fixed and the randomness comes from the labeling decisions ξi", but it provides no train/validation/test percentages or counts for the overall experimental setup and no cross-validation details.

Hardware Specification | No | The paper gives no details about the hardware used to run the experiments (e.g., CPU or GPU models, memory) and mentions no particular machines or cloud resources.

Software Dependencies | No | The paper mentions using an "XGBoost model" but specifies no version numbers for XGBoost or any other key software dependency (e.g., Python, scikit-learn, PyTorch, TensorFlow).

Experiment Setup | Yes | "The target error level is α = 0.1 throughout. We report the average interval width and coverage for varying sample sizes nb, averaged over 1000 and 100 trials for the batch and sequential settings, respectively. [...] We train an XGBoost model on only 10 labeled examples [...] Active inference with fine-tuning continues to finetune the model with every B = 100 new survey responses. [...] We train initial XGBoost models f1 and e1 on 100 labeled examples."
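The Experiment Setup row above quotes the protocol (α = 0.1 intervals, averaged width and coverage over many trials) without showing the estimator it evaluates. Below is a minimal sketch of the inverse-probability-weighted mean estimator and CLT-style interval that active inference of this kind is built on; all names are illustrative, and the paper's Algorithm 1 additionally chooses the sampling probabilities from a model of label uncertainty rather than taking them as given.

```python
import numpy as np

# Normal quantile for a two-sided 90% interval, matching the paper's
# target error level alpha = 0.1.
Z_90 = 1.6449

def active_mean_estimate(f_pred, y, xi, pi):
    """Unbiased mean estimate combining model predictions with actively
    collected labels via inverse-probability weighting.

    f_pred : model predictions f(X_i) for all n data points
    y      : labels Y_i (only consulted where xi == 1)
    xi     : 0/1 labeling decisions
    pi     : sampling probabilities P(xi_i = 1 | X_i), all > 0
    """
    # Each term has expectation E[Y_i]: the prediction plus an
    # IPW-corrected residual that is nonzero only for labeled points.
    terms = f_pred + xi * (y - f_pred) / pi
    point = terms.mean()
    se = terms.std(ddof=1) / np.sqrt(len(terms))
    return point, (point - Z_90 * se, point + Z_90 * se)

# Sanity check: with every point labeled (xi = 1, pi = 1) the estimator
# collapses to the ordinary sample mean.
rng = np.random.default_rng(0)
n = 1000
y = rng.normal(size=n)
f_pred = y + rng.normal(scale=0.1, size=n)  # accurate predictions
point, (lo, hi) = active_mean_estimate(f_pred, y,
                                       xi=np.ones(n), pi=np.ones(n))
assert np.isclose(point, y.mean())
```

Accurate predictions shrink the variance of the per-point terms, which is what narrows the intervals and yields the sample-budget savings the Research Type row cites.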