reproducibilityindex.ai

Why Are Learned Indexes So Effective?

Authors: Paolo Ferragina, Fabrizio Lillo, Giorgio Vinciguerra

ICML 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our general result is then specialised to five well-known distributions: Uniform, Lognormal, Pareto, Exponential, and Gamma; and it is corroborated in precision and robustness by a large set of experiments.
Researcher Affiliation	Academia	(1) Department of Computer Science, University of Pisa, Italy; (2) Department of Mathematics, University of Bologna, Italy.
Pseudocode	No	The paper describes algorithms but does not include any structured pseudocode blocks or clearly labeled algorithm figures.
Open Source Code	Yes	The code to reproduce the experiments is available at https://github.com/gvinciguerra/Learnedindexes-effectiveness.
Open Datasets	Yes	Figure 6 shows the results of our final experiment, which measured the average segment length of OPT on real-world datasets of 200 million elements from Kipf et al. (2019). The books dataset represents book sale popularity from Amazon, while fb contains Facebook user IDs.
Dataset Splits	No	The paper does not provide specific training/validation/test dataset splits (e.g., percentages, sample counts, or explicit splitting methodology).
Hardware Specification	Yes	The experiments were run on an Intel Xeon Gold 6132 CPU.
Software Dependencies	No	The paper mentions that code is available but does not specify any software dependencies (e.g., programming languages, libraries, or solvers) with version numbers.
Experiment Setup	No	The paper does not provide specific details about the experimental setup, such as hyperparameter values (e.g., learning rate, batch size) or system-level training settings for the algorithms.