Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Multi-scale Online Learning: Theory and Applications to Online Auctions and Pricing

Authors: Sébastien Bubeck, Nikhil R. Devanur, Zhiyi Huang, Rad Niazadeh

JMLR 2019 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Theoretical We consider revenue maximization in online auction/pricing problems. A seller sells an identical item in each period to a new buyer, or a new set of buyers. For the online pricing problem, both when the arriving buyer bids or only responds to the posted price, we design algorithms whose regret bounds scale with the best fixed price in-hindsight, rather than the range of the values. [...] These results are obtained by generalizing the classical learning from experts and multiarmed bandit problems to their multi-scale versions. [...] We obtain almost optimal multi-scale regret bounds by introducing a new Online Mirror Descent (OMD) algorithm whose mirror map is the multi-scale version of the negative entropy function. We further generalize to the bandit setting by introducing the stochastic variant of this OMD algorithm.
Researcher Affiliation Collaboration Sébastien Bubeck EMAIL Microsoft Research, 1 Microsoft Way, Redmond, WA 98052, USA. Nikhil R. Devanur EMAIL Microsoft Research, 1 Microsoft Way, Redmond, WA 98052, USA. Zhiyi Huang EMAIL Department of Computer Science, The University of Hong Kong, Pokfulam, Hong Kong. Rad Niazadeh EMAIL Department of Computer Science, Stanford University, Stanford, CA 94305, USA.
Pseudocode Yes Algorithm 1 MSMW Algorithm 2 Online single buyer auction (for unknown h) Algorithm 3 Bandit-MSMW
Open Source Code No The paper includes a license for the publication itself (CC-BY 4.0) but does not provide any specific links or statements about the availability of source code for the described algorithms or methodology.
Open Datasets No The paper is theoretical, focusing on online learning algorithms for auctions and pricing. It discusses 'valuations' and 'rewards' as part of the theoretical model but does not use or refer to any specific publicly available empirical datasets for experimental evaluation.
Dataset Splits No The paper is theoretical and does not conduct experiments on datasets, therefore no dataset splits are provided.
Hardware Specification No The paper is theoretical and focuses on algorithm design and regret bounds; it does not describe any experimental hardware used for empirical evaluation.
Software Dependencies No The paper is theoretical and does not describe any specific software or programming libraries with version numbers used for implementation or experimentation.
Experiment Setup No The paper is theoretical, presenting algorithms and proving regret bounds. It does not contain an experimental setup with specific hyperparameter values, training configurations, or system-level settings.