Bayesian Nonparametrics Meets Data-Driven Distributionally Robust Optimization
Authors: Nicola Bariletto, Nhat Ho
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we provide insights into the workings of our method by applying it to a variety of tasks based on simulated and real datasets. |
| Researcher Affiliation | Academia | Nicola Bariletto, Department of Statistics and Data Sciences, The University of Texas at Austin, Austin, TX 78712, nicola.bariletto@utexas.edu; Nhat Ho, Department of Statistics and Data Sciences, The University of Texas at Austin, Austin, TX 78712, minhnhat@utexas.edu |
| Pseudocode | Yes | Algorithm 1 in Appendix B details the procedure... |
| Open Source Code | Yes | Code to replicate our experiments can be found at the following link: https://github.com/nbariletto/BNP_for_DRO. |
| Open Datasets | Yes | we applied our method to predict diabetes development based on a host of features, as collected in the popular and public Pima Indian Diabetes dataset. ... The Wine Quality dataset [8] and the Liver Disorders dataset [15]. |
| Dataset Splits | Yes | To test our method, we randomly select 300 training observations and leave out the rest as a test sample. Then, we randomly split the training data into 15 folds of size 20 and select, via k-fold cross validation, the optimal DP concentration parameter α over a wide grid of values. |
| Hardware Specification | Yes | All experiments were performed on a desktop with 12th Gen Intel(R) Core(TM) i9-12900H, 2500 MHz, 14 Core(s), 20 Logical Processor(s) and 32.0 GB RAM. |
| Software Dependencies | No | The paper mentions 'scikit-learn' [31] as a used library, but does not provide specific version numbers for it or any other software dependencies crucial for reproduction. |
| Experiment Setup | Yes | Robust Criterion Parameters. For each simulated sample, we run our robust procedure setting the following parameter values: ϕ(t) = β exp(t/β) − β, β ∈ {1, ∞}, α = a/n for a ∈ {1, 2, 5, 10}, and p0 = N(0, I), where the β = ∞ setting corresponds to Ridge regression with regularization parameter α (see Proposition 2.1). Finally, we run 300 Monte Carlo simulations to approximate the criterion, and truncate the Multinomial-Dirichlet approximation at T = 50. Stochastic Gradient Descent Parameters. We initialize the algorithm at θ = (0, . . . , 0) and set the step size at ηt = 50/(100 + t). The number of passes over data is set after visual inspection of convergence of the criterion value. |
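The experiment-setup row can be illustrated with a minimal sketch, not the authors' code: the simulated data, the single Dirichlet reweighting used below (a simplified stand-in for the paper's truncated Multinomial-Dirichlet approximation at T = 50), and the loss are all hypothetical choices; only the smoothing function ϕ(t) = β exp(t/β) − β, the α = a/n parameterization, the 300 Monte Carlo draws, the zero initialization, and the step-size schedule ηt = 50/(100 + t) come from the reported setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical simulated regression data (not the paper's exact setup).
n, d = 50, 3
X = rng.normal(scale=0.5, size=(n, d))
theta_true = rng.normal(size=d)
y = X @ theta_true + rng.normal(scale=0.5, size=n)

beta = 1.0     # smoothing parameter; the beta -> infinity limit recovers Ridge
alpha = 2 / n  # DP concentration alpha = a/n, here with a = 2
n_mc = 300     # Monte Carlo draws used to approximate the criterion

def phi(t, beta):
    # Exponential smoothing function from the paper: phi(t) = beta*exp(t/beta) - beta.
    return beta * np.exp(t / beta) - beta

def criterion(theta):
    # Monte Carlo approximation of the robust criterion: average phi of a
    # randomly reweighted empirical loss. The Dirichlet draw here is a
    # simplified stand-in for the paper's Multinomial-Dirichlet scheme.
    losses = 0.5 * (y - X @ theta) ** 2
    weights = rng.dirichlet(np.full(n, 1.0 + alpha), size=n_mc)  # (n_mc, n)
    return np.mean(phi(weights @ losses, beta))

# SGD with the reported settings: theta initialized at zero and
# step size eta_t = 50 / (100 + t).
theta = np.zeros(d)
for t in range(500):
    eta = 50 / (100 + t)
    i = rng.integers(n)                   # sample one observation
    grad = -(y[i] - X[i] @ theta) * X[i]  # gradient of 0.5 * squared loss
    theta -= eta * grad
```

With these choices the fitted `theta` should attain a lower Monte Carlo criterion value than the zero initialization, mirroring the convergence check the authors report doing by visual inspection.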