Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Selection by Prediction with Conformal p-values
Authors: Ying Jin, Emmanuel J. Candès
JMLR 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the empirical performance of our method via simulations, and apply it to job hiring and drug discovery datasets. |
| Researcher Affiliation | Academia | Ying Jin, Department of Statistics, Stanford University, Stanford, CA 94305, USA; Emmanuel J. Candès, Department of Statistics and Department of Mathematics, Stanford University, Stanford, CA 94305, USA |
| Pseudocode | Yes | The whole procedure for cfBH is summarized in Algorithm 1. Algorithm 1 (cfBH): Selection by prediction with conformal p-values. Algorithm 2 (cfBH0): Selection by prediction with same-class calibration. |
| Open Source Code | Yes | The reproduction codes for this part can be found at https://github.com/ying531/selcf_paper. |
| Open Datasets | Yes | We use a small-scale recruitment dataset from Kaggle (Roshan, 2020), as recruitment datasets from companies are often confidential. We use the DAVIS dataset published in Davis et al. (2011), which records real-valued binding affinities for ntot = 30060 drug-target pairs. |
| Dataset Splits | Yes | We randomly split the data into a training set of size \|Dtrain\| = 86 and a test set of size \|Dtest\| = 43. We randomly split the data into three folds with ratio 6:2:2 in size. In particular, we randomly split the dataset into three folds of size 2:2:6. |
| Hardware Specification | No | We train a small neural network in only 3 epochs so that the whole procedure works well with CPUs; We train a small neural network over 10 epochs. These choices are suitable for experiments on CPUs (one might of course use other more computationally intensive alternatives). |
| Software Dependencies | No | We use gradient boosting, SVM with rbf kernel, and random forest to fit a regression model µ̂(·) for E[Y \| X], all from the scikit-learn Python library without fine-tuning; prediction pipelines established in the DeepPurpose library (Huang et al., 2020). |
| Experiment Setup | Yes | We train a small neural network in only 3 epochs so that the whole procedure works well with CPUs; we train a small neural network over 10 epochs. We use gradient boosting, SVM with rbf kernel, and random forest to fit a regression model µ̂(·) for E[Y \| X], all from the scikit-learn Python library without fine-tuning. |
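The cfBH procedure referenced in the Pseudocode row combines conformal p-values with the Benjamini-Hochberg procedure. As a rough illustration only (not the authors' exact Algorithm 1, which uses a more refined tie-breaking/randomized p-value construction), the core idea can be sketched as: rank each test point's predicted score against a held-out calibration set to get a p-value, then run BH at a target FDR level. Function names and the level `q=0.1` below are illustrative choices, not from the paper.

```python
import numpy as np

def conformal_pvalues(calib_scores, test_scores):
    """Crude conformal p-value per test point: rank of the test score among
    calibration scores; a larger test score yields a smaller p-value."""
    calib = np.asarray(calib_scores)
    n = len(calib)
    # p_j = (1 + #{i : V_i >= V_hat_j}) / (n + 1)
    return np.array([(1 + np.sum(calib >= v)) / (n + 1) for v in test_scores])

def benjamini_hochberg(pvals, q=0.1):
    """Indices selected by the BH procedure at nominal FDR level q."""
    p = np.asarray(pvals)
    m = len(p)
    order = np.argsort(p)
    thresholds = q * np.arange(1, m + 1) / m
    passing = p[order] <= thresholds
    if not passing.any():
        return np.array([], dtype=int)
    k = np.max(np.nonzero(passing)[0])  # largest rank meeting its threshold
    return np.sort(order[: k + 1])      # reject all hypotheses up to rank k

# Toy usage: a clearly extreme test score gets the smallest possible p-value.
calib = np.arange(10)                       # 10 calibration scores
pvals = conformal_pvalues(calib, [100.0])   # -> array([1/11])
selected = benjamini_hochberg([0.001, 0.02, 0.9], q=0.1)  # selects indices 0, 1
```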