Learning Augmented Binary Search Trees
Authors: Honghao Lin, Tian Luo, David Woodruff
ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Further, we experimentally evaluate our learned treap on synthetic datasets and demonstrate a performance advantage over other search tree data structures. We also present experiments on real world datasets with known frequency estimation oracles and show improvements as well. |
| Researcher Affiliation | Academia | 1Computer Science Department, Carnegie Mellon University, Pittsburgh, PA, USA. Correspondence to: Honghao Lin <honghaol@andrew.cmu.edu>, Tian Luo <tianl1@andrew.cmu.edu>, David P. Woodruff <dwoodruf@andrew.cmu.edu>. |
| Pseudocode | No | The paper describes algorithms and operations in paragraph form, but does not include structured pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper discusses the use of machine learning models and oracles from other published works (e.g., Hsu et al. (2019) and Du et al. (2021)) but does not provide a statement or link for the open-source code of their own described methodology. |
| Open Datasets | Yes | Data: The internet traffic data was collected by CAIDA using a commercial backbone link (Tier 1 ISP) (CAIDA, 2016). URL: https://www.caida.org/catalog/datasets/passive_dataset. Data: This dataset contains approximately 21 million queries on AOL collected over 90 days in 2006. The distribution follows Zipf's Law (see Hsu et al. (2019)). |
| Dataset Splits | Yes | The first 7 minutes of the dataset were used as the training set, the next 2 minutes as the validation set, and the rest of the dataset for testing. |
| Hardware Specification | No | The paper describes the use of deep learning models (RNN, LSTM) for the frequency estimation oracle but does not provide any specific details about the hardware used for training or running experiments (e.g., CPU, GPU models, or cloud computing specifications). |
| Software Dependencies | No | The paper mentions that Hsu et al. (2019) used an RNN for the frequency-estimation model, but it does not list specific software dependencies with version numbers for its own experimental implementation (e.g., programming languages, libraries, or frameworks with versions). |
| Experiment Setup | No | The paper describes the data splits for training, validation, and testing of the machine learning oracle and the type of neural network adopted from Hsu et al. (2019) (an RNN with LSTM cells), but it does not specify hyperparameters (e.g., learning rate, batch size, epochs) or other system-level configuration details for its own experimental setup. |
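
The learned treap evaluated in the Research Type row is described in the paper in prose only (see the Pseudocode and Open Source Code rows). Below is a minimal sketch of the general idea, assuming priorities are derived from a frequency-estimation oracle so that keys predicted to be queried often sit near the root; the priority formula and oracle interface are illustrative assumptions, not the authors' implementation.

```python
import math
import random


class Node:
    """Treap node: BST-ordered by key, max-heap-ordered by priority."""
    def __init__(self, key, priority):
        self.key = key
        self.priority = priority
        self.left = None
        self.right = None


class LearnedTreap:
    """Treap whose priorities come from a frequency-estimation oracle.

    Keys with higher predicted query frequency get higher priorities and
    therefore shorter search paths. The oracle maps a key to an estimated
    frequency in (0, 1] (interface assumed here for illustration).
    """
    def __init__(self, predict_freq):
        self.root = None
        self.predict_freq = predict_freq

    def _priority(self, key):
        # Assumed formula: log of the predicted frequency, plus a small
        # random perturbation that randomizes ties between near-equal
        # predictions. Higher predicted frequency -> higher priority.
        p = max(self.predict_freq(key), 1e-12)
        return math.log2(p) + random.random()

    def insert(self, key):
        self.root = self._insert(self.root, key, self._priority(key))

    def _insert(self, node, key, priority):
        if node is None:
            return Node(key, priority)
        if key < node.key:
            node.left = self._insert(node.left, key, priority)
            if node.left.priority > node.priority:
                node = self._rotate_right(node)
        elif key > node.key:
            node.right = self._insert(node.right, key, priority)
            if node.right.priority > node.priority:
                node = self._rotate_left(node)
        return node

    def _rotate_right(self, node):
        left = node.left
        node.left, left.right = left.right, node
        return left

    def _rotate_left(self, node):
        right = node.right
        node.right, right.left = right.left, node
        return right

    def search(self, key):
        """Return the number of nodes inspected on the search path for `key`."""
        node, depth = self.root, 0
        while node is not None:
            depth += 1
            if key == node.key:
                return depth
            node = node.left if key < node.key else node.right
        return depth


# Hypothetical usage with a Zipf-like oracle over integer keys:
# oracle = lambda k: 1.0 / (k + 1)
# t = LearnedTreap(oracle)
# for k in range(1000):
#     t.insert(k)
# t.search(0)    # frequent key -> shallow
# t.search(999)  # rare key -> deeper on average
```

With a constant (uninformative) oracle the perturbation dominates and the structure behaves like an ordinary randomized treap, which is consistent with the paper's framing of the learned variant as an augmentation of a standard treap.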
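
The Dataset Splits row describes a time-based split of the CAIDA trace: the first 7 minutes for training, the next 2 minutes for validation, and the remainder for testing. A minimal sketch of such a split follows; the record format (a minute offset paired with an item) is an assumption, since the paper does not specify its preprocessing.

```python
def split_by_minutes(records, train_minutes=7, val_minutes=2):
    """Split timestamped (minute, item) records into train/val/test lists.

    Assumes each record carries a minute offset from the start of the
    trace; the actual timestamp field in the CAIDA data is not specified
    in the paper, so this interface is illustrative only.
    """
    start = min(minute for minute, _ in records)
    train, val, test = [], [], []
    for minute, item in records:
        offset = minute - start
        if offset < train_minutes:
            train.append(item)
        elif offset < train_minutes + val_minutes:
            val.append(item)
        else:
            test.append(item)
    return train, val, test


# records = [(minute_offset, flow_id), ...]  # hypothetical preprocessing output
# train, val, test = split_by_minutes(records)
```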
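
The Experiment Setup row notes that the frequency-estimation oracle follows the RNN with LSTM cells of Hsu et al. (2019), but that hyperparameters are not reported. The sketch below shows one way such an oracle could be structured in PyTorch; the layer sizes, the character-level input encoding, and the log-frequency regression target are all assumptions, not the configuration used in the paper.

```python
import torch
import torch.nn as nn


class FrequencyOracle(nn.Module):
    """Character-level LSTM mapping an item identifier (e.g., a flow ID
    rendered as a string) to an estimated log-frequency.

    All architecture sizes are placeholders; the paper does not report them.
    """
    def __init__(self, vocab_size=128, embed_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)

    def forward(self, char_ids):
        # char_ids: (batch, seq_len) integer-encoded characters
        x = self.embed(char_ids)
        _, (h_n, _) = self.lstm(x)
        return self.head(h_n[-1]).squeeze(-1)  # predicted log-frequency


def encode(item: str, max_len: int = 32) -> torch.Tensor:
    """ASCII-encode an identifier, padded/truncated to max_len (assumed scheme)."""
    ids = [min(ord(c), 127) for c in item[:max_len]]
    ids += [0] * (max_len - len(ids))
    return torch.tensor(ids)
```

Training such an oracle would regress the predictions against log counts observed in the training minutes (e.g., with `nn.MSELoss`); the learning rate, batch size, and number of epochs used by the authors are not given in the paper.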