Why Are Learned Indexes So Effective?
Authors: Paolo Ferragina, Fabrizio Lillo, Giorgio Vinciguerra
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our general result is then specialised to five well-known distributions: Uniform, Lognormal, Pareto, Exponential, and Gamma; and it is corroborated in precision and robustness by a large set of experiments. |
| Researcher Affiliation | Academia | 1Department of Computer Science, University of Pisa, Italy 2Department of Mathematics, University of Bologna, Italy. |
| Pseudocode | No | The paper describes algorithms but does not include any structured pseudocode blocks or clearly labeled algorithm figures. |
| Open Source Code | Yes | The code to reproduce the experiments is available at https://github.com/gvinciguerra/Learnedindexes-effectiveness. |
| Open Datasets | Yes | Figure 6 shows the results of our final experiment, which measured the average segment length of OPT on real-world datasets of 200 million elements from Kipf et al. (2019). The books dataset represents book sale popularity from Amazon, while fb contains Facebook user IDs. |
| Dataset Splits | No | The paper does not provide specific training/validation/test dataset splits (e.g., percentages, sample counts, or explicit splitting methodology). |
| Hardware Specification | Yes | The experiments were run on an Intel Xeon Gold 6132 CPU. |
| Software Dependencies | No | The paper mentions that code is available but does not specify any software dependencies (e.g., programming languages, libraries, or solvers) with version numbers. |
| Experiment Setup | No | The paper does not provide specific details about the experimental setup, such as hyperparameter values (e.g., learning rate, batch size) or system-level training settings for the algorithms. |
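The Figure 6 experiment quoted above measures how many consecutive keys a single ε-approximate linear model can cover (the "average segment length of OPT"). A minimal sketch of that idea, assuming sorted distinct numeric keys and a hypothetical `count_segments` helper; the greedy "shrinking cone" rule used here is illustrative only, not the paper's optimal segmentation algorithm:

```python
# Hedged sketch: greedily partition sorted keys into linear segments whose
# predicted rank is within eps of the true rank. This approximates, but does
# not equal, the OPT segmentation studied in the paper; all names are ours.

def count_segments(keys, eps):
    """Count eps-approximate linear segments covering sorted distinct keys."""
    n = len(keys)
    if n == 0:
        return 0
    segments = 1
    k0, r0 = keys[0], 0                       # anchor of the current segment
    lo, hi = float("-inf"), float("inf")      # admissible slope interval
    for r in range(1, n):
        dx = keys[r] - k0                     # > 0 for distinct sorted keys
        # any usable slope s must satisfy |r0 + s*dx - r| <= eps
        lo = max(lo, (r - eps - r0) / dx)
        hi = min(hi, (r + eps - r0) / dx)
        if lo > hi:                           # no slope fits: open a new segment
            segments += 1
            k0, r0 = keys[r], r
            lo, hi = float("-inf"), float("inf")
    return segments

# Uniformly spaced keys are exactly linear in their rank, so one segment suffices.
uniform = list(range(0, 1000, 2))
print(count_segments(uniform, eps=8))   # → 1
```

This mirrors the paper's qualitative finding: the smoother the gap distribution, the longer (and fewer) the segments, which is why learned indexes compress rank information so effectively.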