Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

An Imitation Learning Approach for Cache Replacement

Authors: Evan Liu, Milad Hashemi, Kevin Swersky, Parthasarathy Ranganathan, Junwhan Ahn

ICML 2020

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "When evaluated on 13 of the most memory-intensive SPEC applications, PARROT increases cache hit rates by 20% over the current state of the art. In addition, on a large-scale web search benchmark, PARROT increases cache hit rates by 61% over a conventional LRU policy." |
| Researcher Affiliation | Collaboration | "¹Department of Computer Science, Stanford University, California, USA. ²Google Research, Sunnyvale, California, USA." |
| Pseudocode | Yes | "Algorithm 1: PARROT training algorithm" |
| Open Source Code | Yes | "Code for PARROT and our cache replacement Gym environment is available at https://github.com/google-research/google-research/tree/master/cache_replacement." |
| Open Datasets | Yes | "For benchmark workloads, we evaluate on the memory-intensive SPEC CPU2006 (Henning, 2006) applications used by Shi et al. (2019). In addition, we evaluate on Google Web Search, an industrial-scale application that serves billions of queries per day..." |
| Dataset Splits | Yes | "We train replacement policies on the first 80% of this sequence, validate on the next 10%, and report test results on the final 10%." |
| Hardware Specification | No | The paper discusses cache architectures (L1/L2/last-level caches) but does not specify the particular CPU, GPU, or other hardware used to run the experiments, only general terms such as "CPU caches". |
| Software Dependencies | No | The paper mentions using a Gym environment and PyTorch (via a citation), but it does not specify version numbers for these or any other software dependencies required to reproduce the experiments. |
| Experiment Setup | Yes | "For PARROT, we report results averaged over 3 random seeds, using the same minimally-tuned hyperparameters in all domains. These hyperparameters were tuned exclusively on the validation set of omnetpp (full details in Appendix B)." |
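The dataset-split protocol quoted above (train on the first 80% of the access sequence, validate on the next 10%, test on the final 10%) can be sketched in a few lines. This is a minimal illustration, not code from the paper's repository; the function name and the representation of the trace as an in-order Python list are assumptions for the example.

```python
def split_trace(trace):
    """Split an ordered memory-access trace into train/val/test.

    Mirrors the quoted protocol: first 80% for training, next 10% for
    validation, final 10% for test. The order of accesses is preserved,
    since cache replacement operates on sequential data. (Illustrative
    sketch only; not the paper's actual implementation.)
    """
    n = len(trace)
    train_end = int(n * 0.8)  # end of first 80%
    val_end = int(n * 0.9)    # end of next 10%
    return trace[:train_end], trace[train_end:val_end], trace[val_end:]

# Example with a toy trace of 100 accesses.
train, val, test = split_trace(list(range(100)))
print(len(train), len(val), len(test))  # 80 10 10
```

Because the split is positional rather than shuffled, the test segment is strictly later in time than the training segment, which avoids leaking future access patterns into training.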