On the convergence of no-regret learning in selfish routing

Authors: Walid Krichene, Benjamin Drighès, Alexandre Bayen

ICML 2014

Reproducibility variables, with the result and the LLM response for each:
Research Type: Theoretical
LLM Response: "We show that convergence holds for a large class of online learning algorithms, inspired from the continuous-time replicator dynamics. In particular, the discounted Hedge algorithm is proved to belong to this class, which guarantees its convergence." and "Theorem 2. If for all k, the population strategies (µ^k(τ))_τ satisfy an AREP algorithm with sublinear regret, then the sequence (µ(τ)) converges to the set of Nash equilibria."
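For context, "sublinear regret" means that every player's time-averaged regret vanishes. A standard undiscounted statement of this condition, consistent with the paper's notation (the paper itself works with a discounted variant), is:

```latex
% Regret of a player x of class k after \tau rounds (undiscounted form):
R^x(\tau) = \sum_{t=0}^{\tau} \left\langle \pi^{(t)}(x),\, \ell^k(\mu(t)) \right\rangle
          - \min_{p \in \mathcal{P}_k} \sum_{t=0}^{\tau} \ell^k_p(\mu(t))
% Sublinear regret: the average regret is asymptotically non-positive.
\limsup_{\tau \to \infty} \frac{R^x(\tau)}{\tau} \le 0
```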
Researcher Affiliation: Academia
LLM Response: "Walid Krichene (WALID@EECS.BERKELEY.EDU), University of California, 652 Sutardja Dai Hall, Berkeley, CA 94720 USA; Benjamin Drighès (BENJAMIN.DRIGHES@POLYTECHNIQUE.EDU), École Polytechnique, Route de Saclay, 91120 Palaiseau, France; Alexandre Bayen (BAYEN@BERKELEY.EDU), University of California, 642 Sutardja Dai Hall, Berkeley, CA 94720 USA"
Pseudocode: Yes
LLM Response: "Algorithm 1 Online learning setting
Input: For every player x ∈ X_k, a learning algorithm (h^x_τ)_τ and an initial distribution π^(0)(x) ∈ ∆^{P_k}.
1: for each time step τ do
2: Every player x independently draws a path A^(τ)(x) ∼ π^(τ)(x).
3: For all k, the vector of path losses ℓ^k(µ(τ)) is revealed to players in X_k. Players incur losses corresponding to their path choice.
4: Every player updates her strategy: π^(τ+1)(x) = h^x_τ((ℓ^k(µ(t)))_{t≤τ}, π^(τ)(x))."
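As a concrete instance of the update rule h^x_τ, here is a minimal Python sketch of the Hedge step the paper analyzes. The exponential-weights form is standard, but the function name and interface are illustrative assumptions, not the authors' code:

```python
import numpy as np

def hedge_update(strategy, path_losses, gamma):
    """One Hedge step: reweight each path's probability by
    exp(-gamma * loss), then renormalize onto the simplex.

    strategy    -- current distribution pi^(tau)(x) over paths
    path_losses -- revealed path-loss vector l^k(mu(tau))
    gamma       -- learning rate gamma_tau for this round
    """
    weights = strategy * np.exp(-gamma * np.asarray(path_losses))
    return weights / weights.sum()
```

With an appropriately decreasing learning-rate sequence, this update achieves the sublinear regret that Theorem 2 assumes of each player's algorithm.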
Open Source Code: No
LLM Response: The paper does not provide concrete access to source code for the methodology described, nor does it include specific repository links or explicit code release statements.

Open Datasets: No
LLM Response: Figure 2 mentions that "Latency functions are taken to be quadratic increasing, and generated randomly", indicating that no publicly available or open dataset with concrete access information was used.

Dataset Splits: No
LLM Response: The paper does not provide specific dataset split information, exact percentages, sample counts, or detailed splitting methodology needed to reproduce data partitioning.

Hardware Specification: No
LLM Response: The paper does not provide specific hardware details (exact GPU/CPU models, processor types, memory amounts, or detailed computer specifications) used for running its experiments or illustrations.

Software Dependencies: No
LLM Response: The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate any computational aspects.
Experiment Setup: Yes
LLM Response: Figure 2 states "Latency functions are taken to be quadratic increasing, and generated randomly." and provides specific learning rates: "With a constant learning rate γ = 0.7, (µ^k(τ))_τ does not converge (left). With a harmonic sequence of learning rates, γ_τ = 1/(1 + τ/10), (µ^k(τ))_τ converges to the set of Nash equilibria."
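To make the quoted setup concrete, the sketch below replays the experiment on a small parallel-links network with random increasing quadratic latencies and the two learning-rate schedules from Figure 2. The network size, coefficient ranges, and horizon are assumptions; the paper does not specify them:

```python
import numpy as np

def simulate(schedule, num_paths=3, T=300, seed=0):
    """Run the online learning loop (Algorithm 1) with Hedge updates on
    parallel links with randomly generated increasing quadratic latencies.
    All players share the same losses and initialization, so we track the
    population strategy mu directly (a homogeneous-population simplification)."""
    rng = np.random.default_rng(seed)
    a = rng.uniform(0.5, 2.0, num_paths)       # quadratic coefficients (assumed range)
    b = rng.uniform(0.0, 1.0, num_paths)       # constant terms (assumed range)
    mu = np.full(num_paths, 1.0 / num_paths)   # uniform initial strategy
    history = [mu.copy()]
    for tau in range(T):
        losses = a * mu**2 + b                 # path losses under current flow mu
        weights = mu * np.exp(-schedule(tau) * losses)
        mu = weights / weights.sum()           # Hedge step, as sketched above
        history.append(mu.copy())
    return np.array(history)

constant_rate = lambda tau: 0.7                       # Figure 2 (left): reported not to converge
harmonic_rate = lambda tau: 1.0 / (1.0 + tau / 10.0)  # Figure 2 (right): reported to converge

trajectory_constant = simulate(constant_rate)
trajectory_harmonic = simulate(harmonic_rate)
```

Tracking µ directly is a simplification of the per-player updates in Algorithm 1; with independent players the population strategy is the empirical average of individual strategies.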