Target Tracking for Contextual Bandits: Application to Demand Side Management
Authors: Margaux Brégère, Pierre Gaillard, Yannig Goude, Gilles Stoltz
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Simulations on a real data set gathered by UK Power Networks, in which price incentives were offered, show that our strategies are effective and may indeed manage demand response by suitably picking the price levels. |
| Researcher Affiliation | Collaboration | ¹EDF R&D, Palaiseau, France; ²Laboratoire de mathématiques d'Orsay, Université Paris-Sud, CNRS, Université Paris-Saclay, Orsay, France; ³INRIA - Département d'Informatique de l'École Normale Supérieure, PSL Research University, Paris, France. |
| Pseudocode | Yes | Protocol 1 Target Tracking for Contextual Bandits |
| Open Source Code | No | The paper does not include an unambiguous statement that the authors are releasing the code for the work described, nor does it provide a direct link to a source-code repository. |
| Open Datasets | Yes | We consider open data published by UK Power Networks and containing energy consumption (in kWh per half hour) at half hourly intervals of a thousand customers subjected to dynamic energy prices... Smart Meter Energy Consumption Data in London Households, see https://data.london.gov.uk/dataset/smartmeter-energyuse-data-in-london-households |
| Dataset Splits | No | The paper describes a 'training period' and a 'testing period' but does not specify a separate validation split, nor does it provide exact split percentages or sample counts for each partition (train/validation/test). |
| Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments, such as specific CPU or GPU models, or detailed cloud resource specifications. |
| Software Dependencies | No | The paper mentions the use of 'the R package mgcv' but does not specify its version number or any other software dependencies with their respective versions. |
| Experiment Setup | Yes | We create one year of data using historical contexts and assume that only Normal tariffs are picked at first: $p_t = (0, 1, 0)$; this is a training period... Then the provider starts exploring the effects of tariffs for an additional month (a January month, based on the historical contexts) and freely picks the $p_t$ according to our algorithm; this is the testing period... For learning to then focus on the parameters indexed by $j$, as other parameters were decently estimated in the training period, we modify the exploration term $\alpha_{t,p}$ of (3) into $\alpha_{t,p} = 2 C B_{t-1}(\delta t^{-2}) \lVert V_{t-1}^{-1/2} \varphi(x_t, p) \rVert$ with $V_{t-1} = \lambda I_d + \sum_{s=1}^{t-1} \varphi(x_s, p_s) \varphi(x_s, p_s)^{\mathsf{T}}$. We pick a convenient $\lambda$. |
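
The exploration term quoted in the Experiment Setup row has the form $2 C B \lVert V^{-1/2} \varphi(x_t, p) \rVert$, where $V$ is a regularized Gram matrix of past feature vectors. The following is a minimal NumPy sketch of that computation, not the authors' code (the paper reports using the R package mgcv and releases no source). The function names, the toy feature vectors, and the values of `C`, `B`, and `lam` are all hypothetical; in particular, the confidence radius $B_{t-1}(\delta t^{-2})$ is simply passed in as a precomputed scalar `B`.

```python
import numpy as np


def design_matrix(features: np.ndarray, lam: float) -> np.ndarray:
    """Return V = lam * I_d + sum_s phi_s phi_s^T, where the rows of
    `features` are the past feature vectors phi_s = phi(x_s, p_s)."""
    d = features.shape[1]
    return lam * np.eye(d) + features.T @ features


def exploration_term(phi: np.ndarray, V: np.ndarray, C: float, B: float) -> float:
    """Return 2 * C * B * ||V^{-1/2} phi||.

    Uses the identity ||V^{-1/2} phi||^2 = phi^T V^{-1} phi, so only a
    linear solve is needed and no matrix square root is formed.
    """
    quad = float(phi @ np.linalg.solve(V, phi))
    return float(2.0 * C * B * np.sqrt(quad))


# Hypothetical toy data: three past rounds, feature dimension d = 2.
past_features = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
V = design_matrix(past_features, lam=1.0)

# C and B are placeholder constants (assumptions, not values from the paper).
seen_often = exploration_term(np.array([1.0, 0.0]), V, C=1.0, B=1.0)
seen_once = exploration_term(np.array([0.0, 1.0]), V, C=1.0, B=1.0)
```

In this toy setting the direction observed only once receives the larger bonus (`seen_once > seen_often`), which is the intended behavior of such a term: feature directions that have been explored less are given more optimism.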