Learning to Price Homogeneous Data

Authors: Keran Chen, Joon Suk Huh, Kirthevasan Kandasamy

NeurIPS 2024

Reproducibility Variable Result LLM Response
Research Type Theoretical We study a data pricing problem, where a seller has access to $N$ homogeneous data points (e.g. drawn i.i.d. from some distribution). There are $m$ types of buyers in the market, where buyers of the same type $i$ have the same valuation curve $v_i : [N] \to [0, 1]$, where $v_i(n)$ is the value for having $n$ data points. A priori, the seller is unaware of the distribution of buyers, but can repeat the market for $T$ rounds so as to learn the revenue-optimal pricing curve $p : [N] \to [0, 1]$. To solve this online learning problem, we first develop novel discretization schemes to approximate any pricing curve. When compared to prior work, the size of our discretization schemes scales gracefully with the approximation parameter, which translates to better regret in online learning. Under assumptions like smoothness and diminishing returns which are satisfied by data, the discretization size can be reduced further. We then turn to the online learning problem, both in the stochastic and adversarial settings. On each round, the seller chooses an anonymous pricing curve $p_t$. A new buyer appears and may choose to purchase some amount of data. She then reveals her type only if she makes a purchase. Our online algorithms build on classical algorithms such as UCB and FTPL, but require novel ideas to account for the asymmetric nature of this feedback and to deal with the vastness of the space of pricing curves. Our algorithms achieve $\tilde{O}(m\sqrt{T})$ regret in the stochastic setting and $\tilde{O}(m^{3/2}\sqrt{T})$ regret in the adversarial setting.
Researcher Affiliation Academia Keran Chen UW-Madison kchen429@wisc.edu Joon Suk Huh UW-Madison jhuh23@wisc.edu Kirthevasan Kandasamy UW-Madison kandasamy@cs.wisc.edu
Pseudocode Yes Algorithm 1 Price discretization scheme under monotonicity; Algorithm 2 Price discretization scheme monotonic valuations under diminishing returns; Algorithm 3 Online data pricing in the stochastic setting; Algorithm 4 Online data pricing in the adversarial setting; Algorithm 6 Auxiliary Price Adjustment
Open Source Code No The paper does not provide any explicit statements about open-source code release or links to code repositories for the described methodology.
Open Datasets Yes For instance, with ImageNet's [21] $N \approx 1.4$ million data points, different types of buyers could perform different learning tasks such as object detection, identification, and segmentation, and/or train different models such as AlexNet [36], ResNet [26], and GoogLeNet [42].
Dataset Splits No The paper is theoretical and does not describe dataset splits for its own experimental methodology.
Hardware Specification No The paper is theoretical and does not include details on hardware specifications as it does not conduct experiments.
Software Dependencies No The paper is theoretical and does not mention specific software dependencies with version numbers as it does not conduct experiments.
Experiment Setup No The paper is theoretical and does not provide details on an experimental setup, hyperparameters, or training configurations as it does not conduct experiments.
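
The pricing setup described in the Research Type row can be illustrated with a small sketch. It is not taken from the paper: it assumes a buyer of type i picks the amount n that maximizes v_i(n) - p(n), buys only when that utility is nonnegative, and breaks ties toward the larger amount. The function names `best_purchase` and `expected_revenue` and the example numbers are hypothetical.

```python
from typing import Sequence


def best_purchase(valuation: Sequence[float], prices: Sequence[float]) -> int:
    """Return the amount n in 1..N the buyer purchases, or 0 for no purchase.

    Assumption (not stated in the summary above): the buyer maximizes
    valuation[n-1] - prices[n-1], buys only if that utility is >= 0,
    and breaks ties in favor of the larger amount.
    """
    best_n, best_utility = 0, 0.0
    for n in range(1, len(prices) + 1):
        utility = valuation[n - 1] - prices[n - 1]
        if utility >= best_utility:
            best_n, best_utility = n, utility
    return best_n


def expected_revenue(
    valuations: Sequence[Sequence[float]],
    type_probs: Sequence[float],
    prices: Sequence[float],
) -> float:
    """Expected one-round revenue of a pricing curve over the buyer types."""
    revenue = 0.0
    for valuation, prob in zip(valuations, type_probs):
        n = best_purchase(valuation, prices)
        if n > 0:
            revenue += prob * prices[n - 1]
    return revenue


# Hypothetical example: N = 3 data points, m = 2 equally likely buyer types.
valuations = [
    [0.25, 0.5, 0.5],   # type 1: value saturates after two points
    [0.5, 0.75, 1.0],   # type 2: value keeps growing
]
type_probs = [0.5, 0.5]
prices = [0.25, 0.5, 0.75]

revenue = expected_revenue(valuations, type_probs, prices)
```

Under these assumptions, type 1 buys two points and type 2 buys three, so `revenue` is 0.5 * 0.5 + 0.5 * 0.75 = 0.625. In the online problem the seller does not know `type_probs` and has to learn a good pricing curve from purchases over T rounds.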