Statistical Estimation from Dependent Data
Authors: Vardis Kandiros, Yuval Dagan, Nishanth Dikkala, Surbhi Goel, Constantinos Daskalakis
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our estimation approach on real networked data, showing that it outperforms standard regression approaches that ignore dependencies, across three text classification datasets: Cora, Citeseer and Pubmed. |
| Researcher Affiliation | Collaboration | Yuval Dagan (MIT EECS), Constantinos Daskalakis (MIT EECS), Nishanth Dikkala (Google Research), Surbhi Goel (Microsoft Research NYC), Vardis Kandiros (MIT EECS). |
| Pseudocode | No | The paper describes the Maximum Pseudo-Likelihood Estimator (MPLE) and its use but does not provide a formally structured pseudocode block or algorithm listing (a hedged sketch of the objective appears after this table). |
| Open Source Code | No | The paper does not contain an explicit statement about releasing the source code for the described methodology or a link to a code repository. |
| Open Datasets | Yes | We utilize three public citation datasets Cora, Citeseer and Pubmed (Yang et al., 2016). |
| Dataset Splits | Yes | Following the semi-supervised setup of (Kipf & Welling, 2016; Feng et al., 2020) and others, we compare performance of MPLE-0 and MPLE-β over a public split which includes only 20 nodes per class for training, 500 nodes for validation and 1000 nodes for testing. Each split maintains the class distribution by splitting the set of nodes of each class 60% (train) / 20% (val) / 20% (test). (A sketch of such a split appears after this table.) |
| Hardware Specification | No | The paper states 'We run our code on a GPU' but does not specify the model or any other specific hardware details (CPU, memory, etc.) used for the experiments. |
| Software Dependencies | No | The paper mentions using 'a GPU' and 'Adam' for training but does not provide specific software dependencies with version numbers (e.g., Python, PyTorch/TensorFlow versions, or specific library versions). |
| Experiment Setup | No | The paper mentions using a '2-layer neural network with 32 units in the hidden layer and ReLU activations' and the 'Adam' optimizer, but states that 'for our algorithms we do not perform a hyper-parameter search but use the parameters used in prior work (Feng et al., 2020)', without explicitly listing critical hyper-parameters such as the learning rate, batch size, or number of epochs (a sketch of this setup appears after this table). |
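
As the Pseudocode row notes, no algorithm listing is given in the paper. The following is a minimal sketch only, assuming the ±1 Ising-style model the paper works with, in which a node's label interacts with its neighbours' labels through a strength parameter β and with its own features through a network f_θ; the function and variable names are ours, not the authors'.

```python
# Hedged sketch of a pseudo-likelihood objective for +/-1 labels on a graph.
# beta couples each node's label to its neighbours' labels; `field` holds the
# feature-driven term f_theta(x_i). Names and shapes are our assumptions.
import torch
import torch.nn.functional as F

def neg_log_pseudo_likelihood(y, A, field, beta):
    """y: (n,) tensor of +/-1 labels, A: (n, n) dense adjacency, field: (n,) f_theta(x_i)."""
    local = beta * (A @ y) + field                 # effective field seen by each node
    # Ising conditional: P(y_i | y_-i, x) = sigmoid(2 * y_i * local_i)
    return -F.logsigmoid(2.0 * y * local).sum()
```

Maximizing the pseudo-likelihood then amounts to minimizing this quantity jointly over β and the parameters of f_θ; fixing β = 0 reduces to standard regression that ignores dependencies, which is how we read the MPLE-0 / MPLE-β comparison in the results.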
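For the Dataset Splits row, the 60/20/20 per-class split can be reproduced with a standard stratified split. This is a sketch under the assumption that node labels are available as a NumPy array; it is not code from the paper.

```python
# Hedged sketch of a class-stratified 60/20/20 node split, as described in the
# Dataset Splits row. Variable names are ours.
import numpy as np
from sklearn.model_selection import train_test_split

def stratified_split(labels, seed=0):
    idx = np.arange(len(labels))
    train_idx, rest_idx = train_test_split(
        idx, train_size=0.6, stratify=labels, random_state=seed)
    val_idx, test_idx = train_test_split(
        rest_idx, train_size=0.5, stratify=labels[rest_idx], random_state=seed)
    return train_idx, val_idx, test_idx
```

The public split (20 training nodes per class, 500 validation nodes, 1000 test nodes) is the standard Planetoid split of Yang et al. (2016), which ships with common graph-learning loaders such as `torch_geometric.datasets.Planetoid`.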
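For the Experiment Setup row, the described architecture (a 2-layer network with 32 hidden units and ReLU activations, trained with Adam) maps onto a small MLP like the one below. The learning rate and the Cora dimensions are illustrative assumptions, since the paper defers hyper-parameters to Feng et al. (2020).

```python
# Hedged sketch of the stated architecture: 2 layers, 32 hidden units, ReLU,
# trained with Adam. The learning rate is a placeholder, not a reported value.
import torch
import torch.nn as nn

def make_model(in_dim, n_classes, hidden=32):
    return nn.Sequential(nn.Linear(in_dim, hidden),
                         nn.ReLU(),
                         nn.Linear(hidden, n_classes))

model = make_model(in_dim=1433, n_classes=7)                # Cora sizes, for illustration
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)   # lr is an assumption
```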