Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
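A validation like the one described above amounts to comparing LLM-assigned labels against manual labels and reporting agreement per variable. The sketch below is purely illustrative (the function name, record format, and data are assumptions, not taken from [1]):

```python
# Hypothetical sketch of the validation step: compare LLM-assigned labels
# against manually assigned labels and compute per-variable accuracy.
# Record format and example data are illustrative only.
from collections import defaultdict

def per_variable_accuracy(records):
    """records: iterable of (variable, llm_label, manual_label) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for variable, llm_label, manual_label in records:
        total[variable] += 1
        if llm_label == manual_label:
            correct[variable] += 1
    # Fraction of records where the LLM agrees with the manual label.
    return {v: correct[v] / total[v] for v in total}

records = [
    ("Open Source Code", "No", "No"),
    ("Open Source Code", "Yes", "No"),
    ("Dataset Splits", "No", "No"),
    ("Dataset Splits", "No", "No"),
]
print(per_variable_accuracy(records))
# {'Open Source Code': 0.5, 'Dataset Splits': 1.0}
```

In practice one would also report per-class metrics (precision/recall per variable), since "No" labels can dominate and inflate raw accuracy.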

Towards Establishing Guaranteed Error for Learned Database Operations

Authors: Sepanta Zeighami, Cyrus Shahabi

ICLR 2024 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response

Research Type: Experimental
  Evidence: "In this paper, we embark on the first theoretical study of such guarantees for learned methods, presenting the necessary conditions for such guarantees to hold when using machine learning to perform indexing, cardinality estimation and range-sum estimation." and, from Section 4 (Empirical Results): "We present experiments comparing our bounds with the error obtained by training different models on datasets sampled from different distributions."

Researcher Affiliation: Academia
  Evidence: Sepanta Zeighami, UC Berkeley; Cyrus Shahabi, University of Southern California.

Pseudocode: No
  The paper describes its methods and proofs using mathematical notation and prose but does not include any clearly labeled pseudocode or algorithm blocks.

Open Source Code: No
  The paper makes no statement about open-sourcing its code and provides no link to a code repository for the described methodology.

Open Datasets: No
  The paper states: "We consider 1-dimensional datasets sampled from uniform and 2-component Gaussian mixture model distributions." It describes how the data were generated but provides no concrete access information (a link, DOI, or specific citation with authors and year) for a pre-existing publicly available dataset.

Dataset Splits: No
  The paper discusses training and testing in the context of learned models but does not specify train/validation/test splits (e.g., percentages, counts, or references to standard splits) for its own empirical experiments.

Hardware Specification: No
  The paper does not describe the hardware (e.g., GPU models, CPU types) used to run its experiments.

Software Dependencies: No
  The paper mentions model types and general techniques but does not list specific software dependencies (libraries, frameworks) with version numbers.

Experiment Setup: No
  The paper mentions "empirical hyperparameter tuning" in general terms but does not give concrete hyperparameter values (e.g., learning rate, batch size, epochs) or training configurations for the models evaluated in its empirical section.