The Forget-me-not Process
Authors: Kieran Milan, Joel Veness, James Kirkpatrick, Michael Bowling, Anna Koop, Demis Hassabis
NeurIPS 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide regret guarantees with respect to piecewise stationary data sources under the logarithmic loss, and validate the method empirically across a range of sequence prediction and task identification problems. |
| Researcher Affiliation | Collaboration | Kieran Milan, Joel Veness, James Kirkpatrick, Demis Hassabis, Google DeepMind, {kmilan,aixi,kirkpatrick,demishassabis}@google.com; Anna Koop, Michael Bowling, University of Alberta, {anna,bowling}@cs.ualberta.ca |
| Pseudocode | Yes | Algorithm 1 FORGET-ME-NOT FMN_d(x_{1:n}) |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. It mentions a video demonstrating results but no code repository. |
| Open Datasets | Yes | We partitioned the MNIST data into m = 10 classes, one for each distinct digit, which we used to derive ten digit-specific empirical distributions. |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) needed to reproduce the data partitioning for training, validation, and testing. It mentions sequence lengths and repeated runs, but not explicit dataset splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions software like "KT-estimator", "MADE [9]", and "ADAGRAD [8]" but does not specify their version numbers, which are required for reproducible software dependency details. |
| Experiment Setup | Yes | For A Fistful of Digits (FOD), we used MADE [9], a recently introduced general-purpose neural density estimator, with 500 hidden units, trained online using ADAGRAD [8] with a learning rate of 0.1; MADE was also the base model for the Continual Atari task, but here a smaller network consisting of 50 neurons was used for reasons of computational efficiency. For the FMN results, the MBOC hyper-parameters are k = 15, α = 0, β = 0, c = 4 and sub-sample sizes of 100; the FOD hyper-parameters are k = 30, α = 0.2, β = 0.06, c = 4 with sub-sample sizes of 10. |
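The Software Dependencies row notes that the paper relies on a KT-estimator as a base model. For orientation, here is a minimal sketch of the standard Krichevsky-Trofimov estimator for binary sequences; this is the textbook formulation, not code released with the paper, and the class and method names are ours.

```python
import math


class KTEstimator:
    """Minimal Krichevsky-Trofimov estimator for a binary alphabet.

    Standard textbook formulation (Beta(1/2, 1/2) prior); illustrative
    sketch only, not code from the paper.
    """

    def __init__(self):
        self.ones = 0
        self.zeros = 0

    def prob(self, bit):
        # Predictive probability of the next symbol given the counts so far.
        count = self.ones if bit == 1 else self.zeros
        return (count + 0.5) / (self.ones + self.zeros + 1.0)

    def update(self, bit):
        if bit == 1:
            self.ones += 1
        else:
            self.zeros += 1

    def log_prob_sequence(self, bits):
        # Log-probability of an entire sequence under sequential prediction.
        total = 0.0
        for b in bits:
            total += math.log(self.prob(b))
            self.update(b)
        return total
```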
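The Pseudocode row refers to Algorithm 1, FMN_d(x_{1:n}), which extends Partition Tree Weighting (PTW) with a pool of reusable models. The sketch below shows only the batch-form PTW recursion over a binary temporal partition with a closed-form KT base model, to convey the mixture structure being extended; it is not the authors' Algorithm 1, and the function names and the choice of base model are our assumptions.

```python
import math


def log_kt(bits):
    """Closed-form log KT probability of a binary string (a ones, b zeros)."""
    a = sum(bits)
    b = len(bits) - a
    return (math.lgamma(a + 0.5) + math.lgamma(b + 0.5)
            - math.log(math.pi) - math.lgamma(a + b + 1))


def logaddexp(x, y):
    # Numerically stable log(exp(x) + exp(y)).
    m = max(x, y)
    return m + math.log(math.exp(x - m) + math.exp(y - m))


def log_ptw(bits, depth, base=log_kt):
    """Batch-form Partition Tree Weighting over a binary temporal partition.

    Sketch of the recursion shape only (requires len(bits) <= 2**depth);
    the paper's FMN process additionally maintains a memory pool of models
    shared across segments, which is not shown here.
    """
    n = len(bits)
    assert n <= 2 ** depth
    if depth == 0 or n <= 1:
        return base(bits)
    half = 2 ** (depth - 1)
    left = log_ptw(bits[:half], depth - 1, base)
    right = log_ptw(bits[half:], depth - 1, base) if n > half else 0.0
    # Mix (weight 1/2 each) the unsegmented base model against the product
    # of the two recursively weighted halves.
    return math.log(0.5) + logaddexp(base(bits), left + right)
```

On a sequence with an abrupt switch, for example log_ptw([1]*8 + [0]*8, depth=4), the partition mixture should assign noticeably higher log-probability than log_kt applied to the whole string, which is the piecewise-stationary advantage the paper's regret bounds formalize.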
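The Open Datasets row quotes the paper's partitioning of MNIST into m = 10 digit classes to derive digit-specific empirical distributions. A minimal sketch of that partitioning step follows; the array layout, the binarization threshold, and the smoothing are our assumptions, since the quoted text does not spell out those details.

```python
import numpy as np


def digit_distributions(images, labels):
    """Partition MNIST into its 10 digit classes and derive a per-class
    empirical pixel distribution.

    images: (N, 784) float array of pixel intensities in [0, 1]
    labels: (N,) integer array of digit labels 0-9
    """
    dists = {}
    for digit in range(10):
        # Binarize the images belonging to this digit class.
        cls = (images[labels == digit] > 0.5).astype(np.float64)
        # Smoothed per-pixel Bernoulli parameters for this class.
        dists[digit] = (cls.sum(axis=0) + 0.5) / (cls.shape[0] + 1.0)
    return dists
```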
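The Experiment Setup row reports MADE trained online with ADAGRAD at a learning rate of 0.1. For reference, here is a minimal per-parameter ADAGRAD step in its standard formulation, not the authors' training code; the epsilon term is our assumption.

```python
import numpy as np


def adagrad_step(params, grads, accum, lr=0.1, eps=1e-8):
    """One ADAGRAD update: per-parameter step sizes shrink with the
    accumulated squared gradients. `accum` should start as zeros with the
    same shape as `params`; lr=0.1 matches the value quoted above.
    """
    accum = accum + grads ** 2
    params = params - lr * grads / (np.sqrt(accum) + eps)
    return params, accum
```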