Learning-Augmented Data Stream Algorithms
Authors: Tanqiu Jiang, Yi Li, Honghao Lin, Yisong Ruan, David P. Woodruff
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically validate our results, demonstrating also our improvements in practice. We conduct experiments for the distinct elements and the Fp moment (p > 2) problems, on both real-world and synthetic data, which demonstrate significant practical benefits. |
| Researcher Affiliation | Academia | Tanqiu Jiang, Department of Electrical and Computer Engineering, Lehigh University, Bethlehem, PA 18015, USA, taj320@lehigh.edu; Yi Li, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371, yili@ntu.edu.sg; Honghao Lin, Zhiyuan College, Shanghai Jiao Tong University, Shanghai, China 200240, honghao_lin@sjtu.edu.cn; Yisong Ruan, Department of Software engineering, Xiamen University, Xiamen, Fujian, China 361000, 24320152202802@stu.edu.xmu.cn; David P. Woodruff, Computer Science Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA, dwoodruf@cs.cmu.edu |
| Pseudocode | No | The paper describes algorithms such as ROUGHL0ESTIMATOR and EXACTCOUNT in textual form but does not provide pseudocode or formally labeled algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statement or link indicating that source code for the described methodology is publicly available. |
| Open Datasets | Yes | The traffic data is collected at a backbone link of a Tier1 ISP between Chicago and Seattle in 2016 (CAIDA). http://www.caida.org/data/monitors/passive-equinix-chicago.xml. |
| Dataset Splits | Yes | They use the first 7 minutes for training, the following minute for validation, and estimate the packet counts in subsequent minutes. [...] They use the first 5 days for training, the following day for validation, and estimate the number of times different search queries appear in subsequent days. |
| Hardware Specification | No | The paper does not specify the exact hardware components (e.g., CPU/GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper describes the use of algorithms and models (e.g., RNNs, LSTM) but does not list specific software dependencies with version numbers. |
| Experiment Setup | Yes | For ROUGHL0ESTIMATOR, we set c = 10 and η = 1/4. We use the heavy hitter oracle to predict whether the coordinate will be larger than 210. We randomly select a prime from [11, 31] for the hash buckets. [...] We plot the results for different values of k = 10, 20, 30. |
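
The Experiment Setup row quotes a few concrete settings (c = 10, η = 1/4, a random prime from [11, 31] for the hash buckets, and k = 10, 20, 30). As a rough illustration only, the sketch below shows one way those settings could be written down and how a random prime bucket count might be drawn. The helper names and the linear hash `(a * item + b) mod p` are assumptions made for this example; the summary above does not state which hash family the paper uses, and this is not the authors' code.

```python
import random

# Settings quoted in the Experiment Setup row.
C = 10
ETA = 1 / 4
K_VALUES = [10, 20, 30]


def primes_in_range(lo, hi):
    """Return all primes p with lo <= p <= hi, using trial division."""
    return [
        n
        for n in range(max(lo, 2), hi + 1)
        if all(n % d for d in range(2, int(n ** 0.5) + 1))
    ]


def pick_bucket_count(rng):
    """Pick a random prime from [11, 31] to use as the number of hash buckets."""
    return rng.choice(primes_in_range(11, 31))


def bucket_of(item, a, b, p):
    """Hypothetical hash (a * item + b) mod p; the paper's hash may differ."""
    return (a * item + b) % p


rng = random.Random(0)  # fixed seed so the sketch is repeatable
p = pick_bucket_count(rng)
a, b = rng.randrange(1, p), rng.randrange(p)
buckets = [bucket_of(x, a, b, p) for x in range(100)]
```

Using a prime modulus with a nonzero multiplier `a` keeps the map a bijection on residues mod `p`, which is why this toy hash spreads consecutive integer items across all buckets.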