reproducibilityindex.ai

Statistical Inference Using SGD

Authors: Tianyang Li, Liu Liu, Anastasios Kyrillidis, Constantine Caramanis

AAAI 2018

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	To show the merits of our scheme, we apply it to both synthetic and real data sets, and demonstrate that its accuracy is comparable to classical statistical methods, while requiring potentially far less computation. Finally, in the experimental section, we provide parts of our numerical experiments that illustrate the behavior of our algorithm, and corroborate our theoretical findings. We do this using synthetic data for linear and logistic regression, and also by considering the Higgs detection (Baldi, Sadowski, and Whiteson 2014) and the LIBSVM Splice data sets.
Researcher Affiliation	Collaboration	Tianyang Li The University of Texas at Austin lty@cs.utexas.edu Liu Liu The University of Texas at Austin liuliu@utexas.edu Anastasios Kyrillidis IBM T.J. Watson Research Center, Yorktown Heights anastasios.kyrillidis@ibm.com Constantine Caramanis The University of Texas at Austin constantine@utexas.edu
Pseudocode	No	The paper includes Figure 1 to illustrate the procedure, but it is a diagram and not a pseudocode block or a formally structured algorithm block.
Open Source Code	No	The paper does not provide any specific links to open-source code for the described methodology, nor does it explicitly state that the code is released or available in supplementary materials.
Open Datasets	Yes	We do this using synthetic data for linear and logistic regression, and also by considering the Higgs detection (Baldi, Sadowski, and Whiteson 2014) and the LIBSVM Splice data sets. The Splice data set contains 60 distinct features with 1000 data samples (footnote 3: https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html).
Dataset Splits	No	The paper mentions the total number of samples used in experiments (e.g., 'n = 20 i.i.d. samples', 'n = 100 samples', 'n = 1000 samples') but does not specify how these datasets were split into training, validation, or test sets.
Hardware Specification	No	The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory, or cloud instances) used to run the experiments.
Software Dependencies	No	The paper does not provide specific version numbers for any software dependencies or libraries used in the experiments.
Experiment Setup	Yes	For the parameters, we used η = 0.8, t = 5, d = 10, b = 20, and mini batch size of 2. For the parameters, we used η = 0.1, t = 100, d = 5, b = 100, and mini batch size of 5. In these experiments, we set d = 100, used mini-batch size of 4, and used 200 SGD samples. We used 10000 samples from both bootstrap and our SGD inference procedure with t = 500, d = 100, η = 0.2, and mini batch size of 6.
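
The setup row lists η, t, d, b, and a mini-batch size without defining them. The sketch below shows one plausible way such parameters could drive a fixed-step SGD run. It assumes that η is the step size, b is a number of burn-in steps, t is the number of consecutive iterates averaged per segment, and d is the number of iterates discarded between segments. These readings are assumptions, not definitions quoted from the paper. The toy objective (estimating a mean under squared loss), the function name `sgd_segment_averages`, the `num_segments` argument, the random seeds, and the pairing of n = 20 with the first parameter set are likewise illustrative choices.

```python
import random


def sgd_segment_averages(data, eta, t, d, b, batch_size, num_segments, seed=0):
    """Run fixed-step SGD on f(theta) = mean((theta - x_i)^2 / 2).

    Returns one averaged iterate per segment. The roles of b, t and d
    (burn-in, averaged steps, discarded steps) are assumed, not sourced.
    """
    rng = random.Random(seed)
    theta = 0.0

    def step(current):
        # Mini-batch drawn with replacement from the observed data.
        batch = [rng.choice(data) for _ in range(batch_size)]
        grad = sum(current - x for x in batch) / batch_size
        return current - eta * grad

    # Burn-in: run b steps and keep only the final iterate.
    for _ in range(b):
        theta = step(theta)

    averages = []
    for _ in range(num_segments):
        # Average t consecutive iterates.
        total = 0.0
        for _ in range(t):
            theta = step(theta)
            total += theta
        averages.append(total / t)
        # Discard d iterates to reduce correlation between segments.
        for _ in range(d):
            theta = step(theta)
    return averages


# Hypothetical usage: n = 20 synthetic Gaussian samples with the first
# parameter set from the table (eta = 0.8, t = 5, d = 10, b = 20, batch 2).
data_rng = random.Random(1)
data = [data_rng.gauss(0.0, 1.0) for _ in range(20)]
averages = sgd_segment_averages(
    data, eta=0.8, t=5, d=10, b=20, batch_size=2, num_segments=200
)
sample_mean = sum(data) / len(data)
```

The spread of `averages` around `sample_mean` is the raw material an inference procedure of this kind would work with. Any rescaling needed to turn that spread into confidence intervals is not covered here and should be taken from the paper itself.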