reproducibilityindex.ai

Robust Gradient-Based Markov Subsampling

Authors: Tieliang Gong, Quanhan Xi, Chen Xu

Pages: 4004-4011

AAAI 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	To assess the performance of GMS, we conduct experiments on both simulation studies and real data examples. All numerical studies are conducted in software R on Compute Canada clusters with 2.1 GHz CPUs and 128 GB memory. In simulation studies, we generate the data by y = Xβ + ε, where the n × d design matrix X is generated by a mixture of Gaussian distributions. Due to space limitation, we only show the results for the setting n = 1M, d = 500. Other results are given in the supplementary material. Figs. 3 and 4 record the boxplots based on 50 times empirical estimation error. The mean and standard deviation of EE are reported in Tables 1 and 2.
Researcher Affiliation	Academia	Tieliang Gong, Quanhan Xi, Chen Xu Department of Mathematics and Statistics, University of Ottawa, Ottawa, ON, K1N6N5, Canada
Pseudocode	Yes	Algorithm 1 Robust Gradient-based Markov Subsampling
Open Source Code	No	The paper does not provide any links or explicit statements about the availability of its source code.
Open Datasets	Yes	Online News Popularity (n = 39797, d = 61), Wave Energy Converters (n = 288000, d = 32) and Poker Hands (n = 25010, d = 11).¹ Footnote 1 refers to https://archive.ics.uci.edu/ml/datasets.php
Dataset Splits	No	The paper describes a subsampling strategy for estimation but does not provide explicit training, validation, and test dataset splits in the conventional sense (e.g., percentages or counts for each split).
Hardware Specification	Yes	All numerical studies are conducted in software R on Compute Canada clusters with 2.1 GHz CPUs and 128 GB memory.
Software Dependencies	No	The paper mentions 'software R' but does not specify a version number or any specific R packages with version numbers.
Experiment Setup	Yes	In all experiments, the subsample size is set by n_sub = sr × n, where sr represents the sampling ratio. We set sr = 0.001, 0.005, 0.01 for each model. If required, a pilot estimator is calculated by uniform subsampling of size n_0 = n_sub.
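
The experiment setup quoted above fixes the subsample size as n_sub = sr × n and the pilot subsample size as n_0 = n_sub. As a rough illustration of that arithmetic, the Python sketch below computes n_sub for the dataset sizes listed in this report and draws a uniform pilot subsample of row indices. It is not the authors' code (the paper reports using R and releases no source). The function names, the seed, and the choice to round fractional sizes up are assumptions made here; the report does not say how non-integer values of sr × n are handled.

```python
import math
import random
from fractions import Fraction

# Sampling ratios quoted in the report, kept as strings so that
# Fraction can represent them exactly (no floating-point drift).
SAMPLING_RATIOS = ("0.001", "0.005", "0.01")


def subsample_size(n, sr):
    """Return n_sub = sr * n, rounded up.

    Rounding up is an assumption; the source does not state a rule.
    """
    return math.ceil(Fraction(sr) * n)


def uniform_pilot_indices(n, n0, seed=0):
    """Draw n0 distinct row indices uniformly at random from range(n).

    Hypothetical helper: it only mirrors the 'uniform subsampling of
    size n_0' step, not the pilot estimator itself.
    """
    rng = random.Random(seed)
    return rng.sample(range(n), n0)


# Full-data sizes n as listed in the report.
datasets = {
    "simulation": 1_000_000,
    "Online News Popularity": 39797,
    "Wave Energy Converters": 288000,
    "Poker Hands": 25010,
}

for name, n in datasets.items():
    sizes = [subsample_size(n, sr) for sr in SAMPLING_RATIOS]
    print(name, sizes)

# Pilot subsample for the simulation setting at sr = 0.001.
n = datasets["simulation"]
n_sub = subsample_size(n, "0.001")
pilot = uniform_pilot_indices(n, n0=n_sub)
```

Even at the largest ratio, sr = 0.01, the subsample holds only about one percent of the rows, so for the smaller real datasets the pilot subsample contains a few hundred observations at most.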