Learning-based Support Estimation in Sublinear Time

Authors: Talya Eden, Piotr Indyk, Shyam Narayanan, Ronitt Rubinfeld, Sandeep Silwal, Tal Wagner

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the proposed algorithms on a collection of data sets, using the neural-network based estimators from Hsu et al. (ICLR'19) as predictors. Our experiments demonstrate substantial (up to 3x) improvements in estimation accuracy compared to the state-of-the-art algorithm.
Researcher Affiliation | Collaboration | Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
Pseudocode | Yes | Our algorithm is presented in Algorithm 1.
Open Source Code | No | The paper does not explicitly state that the source code for the described methodology is being released, nor does it provide a link to a repository.
Open Datasets | Yes | From CAIDA internet traces 2016, https://www.caida.org/data/monitors/passive-equinix-chicago.xml.
Dataset Splits | No | The paper describes how the *predictor* was trained and validated (e.g., 'trained on the first 5 days and the 6th day is used for validation' for the AOL data), but it does not specify train/validation/test splits for the data used by the main support estimation algorithm, which operates on entire daily or per-minute distributions.
Hardware Specification | No | The paper does not specify the hardware (e.g., GPU/CPU models, memory) used to run the experiments.
Software Dependencies | No | The paper mentions using 'neural-network based estimators' and a 'recurrent neural network (RNN)' but does not name any software packages with version numbers (e.g., 'TensorFlow 2.x' or 'PyTorch 1.x').
Experiment Setup | Yes | Algorithm 1 uses two parameters that need to be set: the polynomial degree L, and the base parameter b. The performance of the algorithm is not very sensitive to L, and for simplicity we use the same setting as Wu & Yang (2019), L = 0.45 log n. The setting of b requires more care. ... In our implementation, we start by running Algorithm 1 with b = 2. If any interval fails the sanity check, we increment b and repeat the algorithm (with the same set of samples). The final base we use is twice the minimal one such that all intervals pass the sanity check, where the final doubling is to ensure we are indeed past the point where all checks succeed.
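
To make the quoted base-selection procedure concrete, the following is a minimal Python sketch under stated assumptions: run_algorithm1 is a hypothetical stand-in for Algorithm 1 that, given the samples, the degree L, and a base b, returns one pass/fail sanity-check outcome per interval, and choose_base is an illustrative wrapper; neither name comes from the paper.

import math

def choose_base(samples, n, run_algorithm1):
    # Degree setting as quoted (L = 0.45 log n); the log base is an assumption
    # here, since the paper's convention is not restated in the excerpt above.
    L = int(0.45 * math.log(n))
    b = 2
    while True:
        # Hypothetical call: Algorithm 1 returns a sanity-check result per interval.
        interval_passed = run_algorithm1(samples, L, b)
        if all(interval_passed):
            break  # b is now the minimal base at which every interval passes
        b += 1     # increment the base and rerun on the same set of samples
    # Final doubling, as described: return twice the minimal passing base.
    return 2 * b

Note that the loop reruns Algorithm 1 on the same drawn samples, so increasing b costs only recomputation, not additional samples.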