Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..
Parsimonious Learning-Augmented Caching
Authors: Sungjin Im, Ravi Kumar, Aditya Petety, Manish Purohit
ICML 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally evaluate our algorithm on a real-world dataset and demonstrate the empirical dependence of the competitive ratio on the number of queries made as well as on the prediction errors. |
| Researcher Affiliation | Collaboration | 1UC Merced 2Google Mountain View. |
| Pseudocode | Yes | Algorithm 1: A generic marking algorithm. Algorithm 2: Naive eviction. Algorithm 3: Adaptive query eviction. |
| Open Source Code | No | The paper does not provide a specific link or statement indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We use the Citi Bike dataset as in Lykouris & Vassilvitskii (2018). The dataset comes from a publicly-available1 bike sharing platform in New York City. |
| Dataset Splits | No | The paper describes using each month's data as an input sequence of 25,000 events to evaluate the online caching algorithms. It does not provide specific train/validation/test splits for these sequences. |
| Hardware Specification | No | The paper does not provide specific details regarding the hardware used to run the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | Finally we set the cache size k = 500. For each page p in the cache, its predicted next request time is set to its actual next request time plus a noise, which is drawn i.i.d. from a lognormal distribution whose underlying normal distribution has mean 0 and standard deviation σ. Our algorithm is parameterized by b, the number of queries made per cache miss. |