Active Statistical Inference

Authors: Tijana Zrnić, Emmanuel Candès

ICML 2024

Reproducibility Variable | Result | LLM Response

Research Type | Experimental | "We evaluate active inference on datasets from public opinion research, census analysis, and proteomics. [...] We will show in our experiments that active inference can save over 80% of the sample budget required by classical methods. [...] Experiments"

Researcher Affiliation | Academia | "1 Department of Statistics and Stanford Data Science, Stanford University, USA; 2 Department of Statistics and Department of Mathematics, Stanford University, USA. Correspondence to: Tijana Zrnić <tijana.zrnic@stanford.edu>, Emmanuel J. Candès <candes@stanford.edu>."

Pseudocode | Yes | "The batch and sequential active inference methods used in our experiments are outlined in Algorithm 1 and Algorithm 2 in the Appendix."

Open Source Code | Yes | "Code for reproducing the experiments is available at this link." The link points to https://github.com/tijana-z/active-statistical-inference.

Open Datasets | Yes | "The Pew dataset is available at (Pew, 2020); the census dataset is available through Folktables (Ding et al., 2021); the Alphafold dataset is available at (Angelopoulos et al., 2023b)."

Dataset Splits | No | The paper mentions training initial models on small labeled sets (e.g., "10 labeled examples", "100 labeled examples") and states that "the underlying data points (Xi, Yi) are fixed and the randomness comes from the labeling decisions ξi", but it provides no train/validation/test percentages or counts for the overall experimental setup and no cross-validation details.

Hardware Specification | No | The paper gives no details about the hardware used to run the experiments (e.g., CPU or GPU models, memory) and mentions no particular machines or cloud resources.

Software Dependencies | No | The paper mentions using an "XGBoost model" but specifies no version numbers for XGBoost or any other key software dependency (e.g., Python, scikit-learn, PyTorch, TensorFlow).

Experiment Setup | Yes | "The target error level is α = 0.1 throughout. We report the average interval width and coverage for varying sample sizes nb, averaged over 1000 and 100 trials for the batch and sequential settings, respectively. [...] We train an XGBoost model on only 10 labeled examples [...] Active inference with fine-tuning continues to finetune the model with every B = 100 new survey responses. [...] We train initial XGBoost models f1 and e1 on 100 labeled examples."
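The Experiment Setup row above quotes the protocol (α = 0.1 intervals, averaged width and coverage over many trials) without showing the estimator it evaluates. Below is a minimal sketch of the inverse-probability-weighted mean estimator and CLT-style interval that active inference of this kind is built on; all names are illustrative, and the paper's Algorithm 1 additionally chooses the sampling probabilities from a model of label uncertainty rather than taking them as given.

```python
import numpy as np

# Normal quantile for a two-sided 90% interval, matching the paper's
# target error level alpha = 0.1.
Z_90 = 1.6449

def active_mean_estimate(f_pred, y, xi, pi):
    """Unbiased mean estimate combining model predictions with actively
    collected labels via inverse-probability weighting.

    f_pred : model predictions f(X_i) for all n data points
    y      : labels Y_i (only consulted where xi == 1)
    xi     : 0/1 labeling decisions
    pi     : sampling probabilities P(xi_i = 1 | X_i), all > 0
    """
    # Each term has expectation E[Y_i]: the prediction plus an
    # IPW-corrected residual that is nonzero only for labeled points.
    terms = f_pred + xi * (y - f_pred) / pi
    point = terms.mean()
    se = terms.std(ddof=1) / np.sqrt(len(terms))
    return point, (point - Z_90 * se, point + Z_90 * se)

# Sanity check: with every point labeled (xi = 1, pi = 1) the estimator
# collapses to the ordinary sample mean.
rng = np.random.default_rng(0)
n = 1000
y = rng.normal(size=n)
f_pred = y + rng.normal(scale=0.1, size=n)  # accurate predictions
point, (lo, hi) = active_mean_estimate(f_pred, y,
                                       xi=np.ones(n), pi=np.ones(n))
assert np.isclose(point, y.mean())
```

Accurate predictions shrink the variance of the per-point terms, which is what narrows the intervals and yields the sample-budget savings the Research Type row cites.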