Learning-based Support Estimation in Sublinear Time
Authors: Talya Eden, Piotr Indyk, Shyam Narayanan, Ronitt Rubinfeld, Sandeep Silwal, Tal Wagner
ICLR 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the proposed algorithms on a collection of data sets, using the neural-network-based estimators from Hsu et al., ICLR 2019 as predictors. Our experiments demonstrate substantial (up to 3x) improvements in the estimation accuracy compared to the state-of-the-art algorithm. |
| Researcher Affiliation | Collaboration | Computer Science and Artificial Intelligence Lab Massachusetts Institute of Technology Cambridge, MA 02139, USA |
| Pseudocode | Yes | Our algorithm is presented in Algorithm 1. |
| Open Source Code | No | The paper does not explicitly state that the source code for the described methodology is being released or provide a link to a repository. |
| Open Datasets | Yes | From CAIDA internet traces 2016, https://www.caida.org/data/monitors/passive-equinix-chicago.xml. |
| Dataset Splits | No | The paper describes how the *predictor* was trained and validated (e.g., 'trained on the first 5 days and the 6th day is used for validation' for AOL data), but it does not specify train/validation/test splits for the data used by the main support estimation algorithm, which operates on entire daily or per-minute distributions. |
| Hardware Specification | No | The paper does not specify the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'neural-network based estimators' and 'recurrent neural network (RNN)' but does not specify any software names with version numbers (e.g., 'TensorFlow 2.x' or 'PyTorch 1.x'). |
| Experiment Setup | Yes | Algorithm 1 uses two parameters that need to be set: The polynomial degree L, and the base parameter b. The performance of the algorithm is not very sensitive to L, and for simplicity we use the same setting as Wu & Yang (2019), L = 0.45 log n. The setting of b requires more care. ... In our implementation, we start by running Algorithm 1 with b = 2. If any interval fails the sanity check, we increment b and repeat the algorithm (with the same set of samples). The final base we use is twice the minimal one such that all intervals pass the sanity check, where the final doubling is to ensure we are indeed past the point where all checks succeed. |
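The base-selection procedure quoted in the Experiment Setup row above can be summarized with a minimal sketch. The sketch assumes caller-supplied implementations of the paper's Algorithm 1 and its per-interval sanity check; `run_algorithm_1`, `passes_sanity_check`, and `b_max` are hypothetical names introduced here for illustration. Only the increment-then-double control flow and the L = 0.45 log n setting are taken from the quoted text, and the base of the logarithm is assumed (not restated in the excerpt).

```python
import math


def polynomial_degree(n: int) -> int:
    """Degree L = 0.45 * log(n), as in the setting quoted above.

    Natural log is assumed here purely for illustration; the excerpt does
    not restate the base of the logarithm.
    """
    return max(1, int(0.45 * math.log(n)))


def choose_base(run_algorithm_1, passes_sanity_check, samples, b_init=2, b_max=64):
    """Select the base parameter b via the increment-then-double rule.

    `run_algorithm_1(samples, base)` and `passes_sanity_check(estimate)` are
    caller-supplied stand-ins for the paper's Algorithm 1 and its per-interval
    sanity check. Starting from b = 2, b is incremented until every interval
    passes the check; the returned base is twice the minimal passing one.
    """
    b = b_init
    while b <= b_max:
        # Re-run Algorithm 1 on the *same* set of samples with the current base.
        interval_estimates = run_algorithm_1(samples, base=b)
        if all(passes_sanity_check(est) for est in interval_estimates):
            # Final doubling, to be safely past the minimal base that passes.
            return 2 * b
        b += 1
    raise RuntimeError("no base up to b_max passed the sanity check")
```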