Statistical Cost Sharing
Authors: Eric Balkanski, Umar Syed, Sergei Vassilvitskii
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We formalize this approach, which we call STATISTICAL COST SHARING, and consider the computation of the core and the Shapley value. Expanding on the work by Balcan et al. [2015], we give precise sample complexity bounds for computing cost shares that satisfy the core property with high probability for any function with a non-empty core. For the Shapley value, which has never been studied in this setting, we show that for submodular cost functions with bounded curvature κ it can be approximated from samples from the uniform distribution to a √(1 − κ) factor, and that the bound is tight. We then define statistical analogues of the Shapley axioms, derive a notion of statistical Shapley value, and show that these can be approximated arbitrarily well from samples from any distribution and for any function. |
| Researcher Affiliation | Collaboration | Eric Balkanski (Harvard University, ericbalkanski@g.harvard.edu); Umar Syed (Google NYC, usyed@google.com); Sergei Vassilvitskii (Google NYC, sergeiv@google.com) |
| Pseudocode | No | The paper describes algorithms conceptually and discusses their properties (e.g., 'Our algorithm, which runs in polynomial time, directly computes a cost allocation...'), but it does not include any formal pseudocode blocks or algorithm listings. An illustrative sketch of the paper's sampling model appears after this table. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code or provide links to a code repository. |
| Open Datasets | No | The paper studies a theoretical model in which samples are drawn from a distribution (e.g., the uniform distribution); it neither uses nor provides access information for any named, publicly available dataset. |
| Dataset Splits | No | The paper does not provide specific dataset split information (e.g., percentages or counts for training, validation, and test sets), as its focus is theoretical analysis and sample complexity rather than empirical validation. |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU/CPU models, memory) used for running experiments, as it is a theoretical paper focusing on mathematical analysis and algorithms rather than empirical implementation. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., libraries, frameworks, solvers), as its content is primarily theoretical. |
| Experiment Setup | No | The paper does not provide specific experimental setup details such as hyperparameters or system-level training configurations, as it focuses on theoretical proofs and algorithms rather than empirical implementation. |
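
Since the paper ships neither pseudocode nor source code, the sketch below is purely illustrative of the setting its abstract describes: an estimator that sees only i.i.d. samples (S, C(S)) of coalitions and their costs, never the cost function itself. The difference-of-means estimator and the helper name `estimate_shapley_from_samples` are assumptions of ours for illustration, not the authors' algorithm; on a player-symmetric sampling distribution with an additive cost function, the estimate converges to the true Shapley value.

```python
import random

def estimate_shapley_from_samples(samples, n):
    """Crude cost-share estimate from i.i.d. samples (S, C(S)).

    For each player i, compare the average cost of sampled coalitions
    containing i with the average cost of those that do not. This
    difference-of-means statistic is a stand-in for player i's expected
    marginal contribution; it illustrates the data model only and is
    not the paper's algorithm.
    """
    shares = []
    for i in range(n):
        with_i = [c for s, c in samples if i in s]
        without_i = [c for s, c in samples if i not in s]
        if with_i and without_i:
            shares.append(sum(with_i) / len(with_i)
                          - sum(without_i) / len(without_i))
        else:
            shares.append(0.0)  # player i never (or always) sampled
    return shares

# Toy additive cost function C(S) = sum of per-player costs; here the
# Shapley value of player i is exactly true_costs[i].
n = 5
true_costs = [1.0, 2.0, 3.0, 4.0, 5.0]
cost = lambda s: sum(true_costs[i] for i in s)

# Uniform samples: each player joins a coalition independently w.p. 1/2.
random.seed(0)
samples = [(s, cost(s)) for s in
           (frozenset(i for i in range(n) if random.random() < 0.5)
            for _ in range(20000))]

print(estimate_shapley_from_samples(samples, n))  # ~ [1.0, 2.0, ..., 5.0]
```

The paper's actual guarantees are sharper than this toy case: for submodular cost functions with curvature κ, the Shapley value can be approximated from uniform samples to a √(1 − κ) factor, and that bound is tight.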