Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Deep Contract Design via Discontinuous Networks

Authors: Tonghan Wang, Paul Duetting, Dmitry Ivanov, Inbal Talgam-Cohen, David C. Parkes

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide empirical results that demonstrate success in approximating the principal's utility function with a small number of training samples and scaling to find approximately optimal contracts on problems with a large number of actions and outcomes.
Researcher Affiliation | Collaboration | Tonghan Wang (Harvard University); Paul Dütting (Google Switzerland); Dmitry Ivanov (Israel Institute of Technology); Inbal Talgam-Cohen (Israel Institute of Technology); David C. Parkes (Harvard University). *Also DeepMind, London, UK
Pseudocode | Yes | Algorithm 1: Parallel Gradient-Based Inference
Open Source Code | No | The paper does not provide a direct link to open-source code or explicitly state that the code is publicly available.
Open Datasets | No | Experiments are carried out on random synthetic examples. The outcome distributions $p(\cdot \mid a)$ are generated by applying SoftMax on a Gaussian random vector in $\mathbb{R}^m$. The outcome value $v_o$ is uniform on $[0, 10]$. The action cost is a mixture, $c(a) = (1 - \beta_p)\, c_r(a) + \beta_p\, c_i(a)$, where $c_r(a) = \alpha_p\, \mathbb{E}_{o \sim p(\cdot \mid a)}[v_o]$ for scaling factor $\alpha_p > 0$ is a correlated cost that is proportional to the expected value of the action, $c_i(a)$ is an independent cost and uniform on $[0, 1]$, and $\beta_p$ controls the weight of the independent cost. We test different problem sizes by changing the number of outcomes $m$ and actions $n$.
Dataset Splits | No | The paper mentions using '50K random samples' for training but does not specify a distinct validation split or explicit percentages for training, validation, and testing.
Hardware Specification | Yes | Across all experiments, DeLU and the baseline ReLU networks are trained on an NVIDIA A100 GPU. Gradient-based inference is also parallelized on the A100 GPU.
Software Dependencies | No | The direct LP solver (Oracle LP) and the LP-based inference algorithm are based on the linear programming toolkit PuLP [47]. However, a specific version number for PuLP is not provided.
Experiment Setup | Yes | We train a DeLU network with one hidden layer of 32 hidden units. The bias network ζ has a hidden layer with 512 (Tanh-activated) neurons. The two sub-networks η and ζ are trained in an end-to-end manner by the MSE loss (Eq. 5). The optimization is conducted using RMSprop with a learning rate of 1e-3, α of 0.99, and with no momentum or weight decay. They are trained for 100 epochs with 50K random samples.
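
Because no dataset is released, the synthetic instances described under Open Datasets have to be regenerated by the reader. The following is a minimal sketch of that recipe in plain Python, not the authors' code. The function names (`softmax`, `make_instance`), the use of standard normal entries for the Gaussian vector, and the example values of `alpha_p` and `beta_p` are assumptions made for illustration; the summary above does not state them.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def make_instance(n_actions, n_outcomes, alpha_p, beta_p, rng):
    """Sample one synthetic principal-agent instance (hypothetical helper).

    Returns (p, v, c): p[a][o] is the outcome distribution of action a,
    v[o] is the value of outcome o, and c[a] is the cost of action a.
    """
    # Outcome values v_o: uniform on [0, 10].
    v = [rng.uniform(0.0, 10.0) for _ in range(n_outcomes)]
    p, c = [], []
    for _ in range(n_actions):
        # Outcome distribution p(.|a): softmax of a Gaussian vector in R^m.
        # Assumption: i.i.d. standard normal entries (variance not given above).
        dist = softmax([rng.gauss(0.0, 1.0) for _ in range(n_outcomes)])
        expected_value = sum(pi * vi for pi, vi in zip(dist, v))
        correlated = alpha_p * expected_value   # c_r(a) = alpha_p * E[v_o]
        independent = rng.uniform(0.0, 1.0)     # c_i(a): uniform on [0, 1]
        p.append(dist)
        # Mixture cost: c(a) = (1 - beta_p) * c_r(a) + beta_p * c_i(a).
        c.append((1.0 - beta_p) * correlated + beta_p * independent)
    return p, v, c


# Example call with illustrative (assumed) parameter values.
rng = random.Random(0)
p, v, c = make_instance(n_actions=4, n_outcomes=5, alpha_p=0.5, beta_p=0.5, rng=rng)
```

Passing an explicit `random.Random` instance keeps the draw reproducible for a fixed seed, which matters here because the paper's exact instances are not available. Setting `beta_p=1.0` yields purely independent costs, and `beta_p=0.0` yields costs fully proportional to each action's expected value.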