reproducibilityindex.ai

Learning Augmented Binary Search Trees

Authors: Honghao Lin, Tian Luo, David Woodruff

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Further, we experimentally evaluate our learned treap on synthetic datasets and demonstrate a performance advantage over other search tree data structures. We also present experiments on real world datasets with known frequency estimation oracles and show improvements as well.
Researcher Affiliation	Academia	1Computer Science Department, Carnegie Mellon University, Pittsburgh, PA, USA. Correspondence to: Honghao Lin <honghaol@andrew.cmu.edu>, Tian Luo <tianl1@andrew.cmu.edu>, David P. Woodruff <dwoodruf@andrew.cmu.edu>.
Pseudocode	No	The paper describes algorithms and operations in paragraph form, but does not include structured pseudocode or clearly labeled algorithm blocks.
Open Source Code	No	The paper discusses the use of machine learning models and oracles from other published works (e.g., Hsu et al. (2019) and Du et al. (2021)) but does not provide a statement or link for the open-source code of their own described methodology.
Open Datasets	Yes	Data: The internet traffic data was collected by CAIDA using a commercial backbone link (Tier 1 ISP) (CAIDA, 2016). URL https://www.caida.org/catalog/datasets/passive_dataset. Data: This dataset contains approximately 21 million queries on AOL collected over 90 days in 2006. The distribution follows Zipf s Law (see Hsu et al. (2019)).
Dataset Splits	Yes	The first 7 minutes of the dataset was used as training sets with the next 2 minutes used as the validation sets. The rest of the dataset was used for testing.
Hardware Specification	No	The paper describes the use of deep learning models (RNN, LSTM) for the frequency estimation oracle but does not provide any specific details about the hardware used for training or running experiments (e.g., CPU, GPU models, or cloud computing specifications).
Software Dependencies	No	The paper mentions that an RNN was used for modeling by Hsu et al. (2019) but does not list specific software dependencies with version numbers for its own experimental implementation (e.g., programming languages, libraries, or frameworks with versions).
Experiment Setup	No	The paper describes the data splits for training, validation, and testing of the machine learning oracle and the type of neural network used (RNN, LSTM cells) from Hsu et al. (2019), but it does not specify hyperparameters (e.g., learning rate, batch size, epochs) or other system-level configuration details for its own experimental setup.