Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Real-Time Selection Under General Constraints via Predictive Inference
Authors: Yuyang Huo, Lin Lu, Haojie Ren, Changliang Zou
NeurIPS 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate the breadth of applicability of the II-COS procedure by experiments on simulated data and real-data applications. |
| Researcher Affiliation | Academia | 1School of Statistics and Data Sciences, LPMC, KLMDASR and LEBPS, Nankai University, Tianjin, China 2School of Mathematical Sciences, Shanghai Jiao Tong University, Shanghai, China |
| Pseudocode | Yes | Algorithm 1 The data-driven II-COS procedure |
| Open Source Code | Yes | Code for implementing II-COS and reproducing the experiments and figures in our paper is available at https://github.com/lulin2023/II-COS. |
| Open Datasets | Yes | We consider the recruitment dataset from Kaggle [22] that contains 45,372 candidates... The other problem is to use 1994 Census Bureau dataset [8] to select a subset of individuals who may have high incomes in precision marketing. |
| Dataset Splits | Yes | we resort to a data-splitting strategy: randomly split historical data $D$ into two parts, the training set $D_{tr}$ and the calibration one $D_{cal}$ of sizes $n_0$ and $n_1$ respectively... For each dataset, we randomly partition the data into three parts: $n_{tr}$ = 1,000 training data, $n_{cal}$ = 1,000 calibration data and the rest which are used as the online observations. |
| Hardware Specification | Yes | All the experiments were conducted on 3.11 GHz Intel Gen i5-11300H processors with 16 Gb memory at a Lenovo personal computer |
| Software Dependencies | Yes | R platform with version 4.2.1... implemented by R package nnet and R packages kernlab |
| Experiment Setup | Yes | As an example, we set the stopping rule as selecting total m = 100 samples, i.e., $T = T_m = \inf_t\{t : \sum_{i=1}^{t} \delta_i = m\}$. The predictor H is taken as random forest with defaulted parameters. We fix training data size $n_{tr}$ = 1,000. Take α = 0.1 and K = 0.045 for FSR and m ES, respectively. |
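
The data split and stopping rule quoted in the table can be illustrated with a minimal Python sketch. This is not the authors' implementation (their released code is in R); the function names `split_indices` and `stopping_time`, the fixed seed, and the use of 0/1 lists for the selection decisions are assumptions made purely for illustration.

```python
import random


def split_indices(n_total, n_tr=1000, n_cal=1000, seed=0):
    """Randomly partition range(n_total) into training, calibration,
    and online index lists (hypothetical helper, for illustration only)."""
    if n_total < n_tr + n_cal:
        raise ValueError("not enough observations for the requested split")
    idx = list(range(n_total))
    random.Random(seed).shuffle(idx)  # seeded shuffle for repeatability
    train = idx[:n_tr]
    calib = idx[n_tr:n_tr + n_cal]
    online = idx[n_tr + n_cal:]       # everything left is treated as the online stream
    return train, calib, online


def stopping_time(decisions, m=100):
    """Return the first 1-based time t at which the running count of
    selections (entries equal to 1) reaches m, or None if it never does."""
    count = 0
    for t, d in enumerate(decisions, start=1):
        count += d
        if count == m:
            return t
    return None
```

For a dataset of 45,372 records, `split_indices(45372)` would leave 43,372 indices for the online stream, and `stopping_time` applied to the sequence of selection decisions gives the time at which the m-th selection is made.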