Differentiable Distributionally Robust Optimization Layers

Authors: Xutao Ma, Chao Ning, Wenli Du

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To validate the effectiveness of the proposed differentiable DRO layers, we conduct experiments on a toy example and the portfolio management problem.
Researcher Affiliation | Academia | ¹Department of Automation, Shanghai Jiao Tong University, Shanghai 200240, China. ²The Key Laboratory of Smart Manufacturing in Energy Chemical Process, Ministry of Education, East China University of Science and Technology, Shanghai 200237, China.
Pseudocode | No | The paper describes algorithmic steps in narrative text, but it does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Source code of all the experiments is available at https://github.com/DOCU-Lab/Differentiable_DRO_Layers.
Open Datasets | No | In generating data, we first sample covariate z, and then sample y conditioned on z. ... In the multi-item newsvendor problem (19) with n = 4, we set a_c = (0.25, 0.5, 0.75, 1), a_d = 0.95 a_c, v = (10.0, 13.0, 16.0, 19.0), b = (2.0, 4.0, 6.0, 8.0), d = (0.5, 1.0, 1.5, 2.0). The covariate z follows the uniform distribution on [0, 1], and conditioned on z, we set the distribution of demand y by ... (A hedged sketch of this sampling procedure follows the table.)
Dataset Splits | Yes | We use 2,500 data for training, 500 data for validation, and 1,000 data for testing. ... In each case, the size of the validation data set is set to N/5. (The sketch after the table also illustrates this split.)
Hardware Specification | Yes | The experiments are conducted on a laptop with an i7 CPU and 32G RAM.
Software Dependencies | No | Processes 2 and 4 are built on Cvxpylayers (Agrawal et al., 2019a), and Process 3 is built on commercial solver Gurobi. The paper names these software tools but does not provide version numbers for them. (A minimal Cvxpylayers usage sketch follows the table.)
Experiment Setup | Yes | We use 2,500 data for training, 500 data for validation, and 1,000 data for testing. ... In the gradient estimation, we solve the DRO 3 times to construct (15) and use 4 samples to estimate the gradient term by importance sampling. For the energy parameter λ in (10), we initially set it to 10, and subsequently reduce it by one-third every 30 epochs. ... We use the following two-layer full-connected neural network to learn the ambiguity set parameter in both our method and the method proposed in Costa & Iyengar (2023): FC(5, 22) → FC(22, 27) → FC(27, m). (A hedged training-setup sketch follows the table.)
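
The "Open Datasets" row quotes a synthetic data-generation procedure whose conditional demand distribution is truncated in the excerpt. The sketch below illustrates only the covariate-then-demand sampling pattern and the quoted 2,500/500/1,000 split; the conditional distribution of y is a hypothetical placeholder, and `sample_dataset` is a name invented here, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_dataset(num_samples, n_items=4):
    # Covariate z is uniform on [0, 1], as quoted from the paper.
    z = rng.uniform(0.0, 1.0, size=num_samples)
    # HYPOTHETICAL conditional: the paper's specification of y given z is
    # truncated in the excerpt, so this linear-plus-noise form is a placeholder.
    base = np.linspace(0.5, 2.0, n_items)
    y = np.outer(z, base) + 0.1 * rng.standard_normal((num_samples, n_items))
    return z, y

# Quoted split: 2,500 train / 500 validation / 1,000 test, with the
# validation set sized at N/5 relative to the N training samples.
N_train, N_val, N_test = 2500, 500, 1000
z, y = sample_dataset(N_train + N_val + N_test)
z_train, y_train = z[:N_train], y[:N_train]
z_val, y_val = z[N_train:N_train + N_val], y[N_train:N_train + N_val]
z_test, y_test = z[N_train + N_val:], y[N_train + N_val:]
```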
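The "Software Dependencies" row reports that Processes 2 and 4 are built on Cvxpylayers. As a minimal sketch of that dependency, the snippet below wraps a generic DPP-compliant convex program as a differentiable PyTorch layer; the simplex-constrained quadratic program is an illustrative stand-in, not the paper's DRO reformulation.

```python
import cvxpy as cp
import torch
from cvxpylayers.torch import CvxpyLayer

# Illustrative problem only: a simplex-constrained QP, not the paper's DRO layer.
n = 4
x = cp.Variable(n)
c = cp.Parameter(n)  # parameter fed in from upstream network layers
problem = cp.Problem(
    cp.Minimize(c @ x + cp.sum_squares(x)),
    [cp.sum(x) == 1, x >= 0],
)
assert problem.is_dpp()  # CvxpyLayer requires a DPP-compliant problem

layer = CvxpyLayer(problem, parameters=[c], variables=[x])

c_torch = torch.randn(n, requires_grad=True)
(x_star,) = layer(c_torch)   # forward: solve the convex program
x_star.sum().backward()      # backward: differentiate the solution w.r.t. c
```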
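The "Experiment Setup" row specifies the layer widths FC(5, 22) → FC(22, 27) → FC(27, m) and an energy-parameter schedule (λ starts at 10 and is reduced by one-third every 30 epochs). A minimal PyTorch sketch of both is below; the ReLU activations, the output dimension `m = 8`, and the reading of "reduce by one-third" as multiplying by 2/3 are assumptions not stated in the excerpt.

```python
import torch.nn as nn

m = 8  # ASSUMED output dimension of the ambiguity-set parameterization

# Layer widths quoted from the paper; the ReLU activations are an assumption.
net = nn.Sequential(
    nn.Linear(5, 22), nn.ReLU(),
    nn.Linear(22, 27), nn.ReLU(),
    nn.Linear(27, m),
)

# Energy parameter lambda: initialized to 10 and reduced by one-third every
# 30 epochs (read here as lam <- (2/3) * lam; the excerpt's phrasing is ambiguous).
lam = 10.0
num_epochs = 120  # illustrative
for epoch in range(num_epochs):
    if epoch > 0 and epoch % 30 == 0:
        lam *= 2.0 / 3.0
    # ... one training epoch using `net` and `lam` would go here ...
```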