Deep Contract Design via Discontinuous Networks

Authors: Tonghan Wang, Paul Dütting, Dmitry Ivanov, Inbal Talgam-Cohen, David C. Parkes

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide empirical results that demonstrate success in approximating the principal's utility function with a small number of training samples and scaling to find approximately optimal contracts on problems with a large number of actions and outcomes.
Researcher Affiliation | Collaboration | Tonghan Wang (Harvard University, twang1@g.harvard.edu); Paul Dütting (Google, Switzerland, duetting@google.com); Dmitry Ivanov (Israel Institute of Technology, divanov@campus.technion.ac.il); Inbal Talgam-Cohen (Israel Institute of Technology, italgam@cs.technion.ac.il); David C. Parkes* (Harvard University, parkes@eecs.harvard.edu). *Also DeepMind, London, UK.
Pseudocode | Yes | Algorithm 1: Parallel Gradient-Based Inference. (See the inference sketch below.)
Open Source Code | No | The paper does not provide a direct link to open-source code or explicitly state that the code is publicly available.
Open Datasets | No | Experiments are carried out on random synthetic examples. The outcome distributions p(·|a) are generated by applying a softmax to a Gaussian random vector in R^m. The outcome value v_o is uniform on [0, 10]. The action cost is a mixture, c(a) = (1 − β_p) c_r(a) + β_p c_i(a), where c_r(a) = α_p · E_{o∼p(·|a)}[v_o], for a scaling factor α_p > 0, is a correlated cost proportional to the expected value of the action; c_i(a) is an independent cost, uniform on [0, 1]; and β_p controls the weight of the independent cost. Problem size is varied by changing the number of outcomes m and actions n. (See the data-generation sketch below.)
Dataset Splits | No | The paper mentions using '50K random samples' for training but does not specify a distinct validation split or explicit percentages for training, validation, and testing.
Hardware Specification | Yes | Across all experiments, DeLU and the baseline ReLU networks are trained on an NVIDIA A100 GPU. Gradient-based inference is also parallelized on the A100 GPU.
Software Dependencies | No | The direct LP solver (Oracle LP) and the LP-based inference algorithm are based on the linear programming toolkit PuLP [47]; however, no specific version number for PuLP is provided. (See the LP sketch below.)
Experiment Setup | Yes | We train a DeLU network with one hidden layer of 32 hidden units. The bias network ζ has a hidden layer of 512 Tanh-activated neurons. The two sub-networks η and ζ are trained end-to-end with the MSE loss (Eq. 5). Optimization uses RMSprop with a learning rate of 1e-3, α = 0.99, and no momentum or weight decay, for 100 epochs on 50K random samples. (See the training sketch below.)
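
The synthetic problem generator quoted under Open Datasets can be reconstructed as a short script. This is a minimal sketch assembled from the description above, not the authors' code; the function and parameter names (sample_problem, n_actions, m_outcomes, alpha_p, beta_p) are illustrative.

import numpy as np

def sample_problem(n_actions, m_outcomes, alpha_p, beta_p, seed=None):
    rng = np.random.default_rng(seed)
    # Outcome distributions p(.|a): softmax of a Gaussian random vector in R^m.
    logits = rng.standard_normal((n_actions, m_outcomes))
    p = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    # Outcome values v_o, uniform on [0, 10].
    v = rng.uniform(0.0, 10.0, size=m_outcomes)
    # Correlated cost c_r(a) = alpha_p * E_{o~p(.|a)}[v_o].
    c_r = alpha_p * (p @ v)
    # Independent cost c_i(a), uniform on [0, 1].
    c_i = rng.uniform(0.0, 1.0, size=n_actions)
    # Mixture cost c(a) = (1 - beta_p) * c_r(a) + beta_p * c_i(a).
    c = (1.0 - beta_p) * c_r + beta_p * c_i
    return p, v, c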
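
The Experiment Setup row translates into a small PyTorch training configuration. The exact DeLU wiring is an assumption on our part: here the ReLU sub-network η provides the continuous piecewise-linear part, and the bias network ζ maps the binary activation pattern of η's hidden layer to a per-region bias, which is what introduces discontinuities across linear regions. The input dimension and the one-step training example are hypothetical.

import torch
import torch.nn as nn

class DeLU(nn.Module):
    def __init__(self, in_dim, eta_hidden=32, zeta_hidden=512):
        super().__init__()
        # ReLU sub-network eta: one hidden layer of 32 units, scalar output.
        self.eta_in = nn.Linear(in_dim, eta_hidden)
        self.eta_out = nn.Linear(eta_hidden, 1)
        # Bias network zeta: one hidden layer of 512 Tanh-activated neurons.
        self.zeta = nn.Sequential(
            nn.Linear(eta_hidden, zeta_hidden),
            nn.Tanh(),
            nn.Linear(zeta_hidden, 1),
        )

    def forward(self, x):
        pre = self.eta_in(x)
        # The binary activation pattern identifies eta's linear region;
        # it enters zeta as a non-differentiable input (assumed wiring).
        pattern = (pre > 0).float()
        return self.eta_out(torch.relu(pre)) + self.zeta(pattern)

# Reported optimizer settings: RMSprop, lr 1e-3, alpha 0.99, no momentum
# or weight decay; MSE loss; 100 epochs on 50K random samples.
model = DeLU(in_dim=10)  # in_dim = number of outcomes m (value hypothetical)
opt = torch.optim.RMSprop(model.parameters(), lr=1e-3, alpha=0.99,
                          momentum=0.0, weight_decay=0.0)
loss_fn = nn.MSELoss()

# One illustrative step on random data; real targets would be the sampled
# principal utilities the paper regresses on (its Eq. 5).
x, y = torch.rand(256, 10), torch.rand(256, 1)
loss = loss_fn(model(x), y)
opt.zero_grad(); loss.backward(); opt.step()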
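
The Pseudocode row refers to Algorithm 1, Parallel Gradient-Based Inference. A hedged sketch of the idea, ascending the learned utility surface from many random starting contracts in one GPU batch and keeping the best, follows; the optimizer choice, step count, and the non-negativity projection are our assumptions, not the paper's exact procedure.

import torch

def gradient_inference(model, m_outcomes, n_starts=1024, steps=200, lr=0.1):
    # One row per candidate contract; all candidates are optimized in parallel.
    t = torch.rand(n_starts, m_outcomes, requires_grad=True)
    opt = torch.optim.Adam([t], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        utility = model(t).squeeze(-1)  # predicted principal utility per contract
        (-utility).sum().backward()     # ascent on each candidate independently
        opt.step()
        with torch.no_grad():
            t.clamp_(min=0.0)           # keep payments non-negative (limited liability)
    with torch.no_grad():
        utility = model(t).squeeze(-1)
        best = int(utility.argmax())
    return t[best].detach(), float(utility[best])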
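
The Software Dependencies row names PuLP as the LP toolkit behind the Oracle LP baseline. The paper's exact formulation is not quoted, so the following is an illustrative sketch of the textbook min-pay contract LP one could solve with PuLP: for a target action, minimize the expected payment subject to limited liability, individual rationality, and incentive compatibility.

import pulp

def min_pay_contract(p, v, c, a_star):
    # p: n x m outcome distributions, v: m outcome values, c: n action costs.
    m, n = len(v), len(c)
    prob = pulp.LpProblem("min_expected_payment", pulp.LpMinimize)
    # Limited liability: all outcome-contingent payments are non-negative.
    t = [pulp.LpVariable(f"t_{o}", lowBound=0) for o in range(m)]

    def exp_pay(a):
        return pulp.lpSum(p[a][o] * t[o] for o in range(m))

    prob += exp_pay(a_star)                           # minimize expected payment
    prob += exp_pay(a_star) - c[a_star] >= 0          # individual rationality
    for a in range(n):                                # incentive compatibility
        if a != a_star:
            prob += exp_pay(a_star) - c[a_star] >= exp_pay(a) - c[a]
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    # Note: prob.status can signal infeasibility for non-implementable actions.
    pay = [t_o.value() for t_o in t]
    principal = sum(p[a_star][o] * (v[o] - pay[o]) for o in range(m))
    return pay, principal

Running this for every a_star and keeping the action with the highest returned principal utility yields the oracle's optimal contract by enumeration.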