Statistical Inference Using SGD
Authors: Tianyang Li, Liu Liu, Anastasios Kyrillidis, Constantine Caramanis
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To show the merits of our scheme, we apply it to both synthetic and real data sets, and demonstrate that its accuracy is comparable to classical statistical methods, while requiring potentially far less computation. Finally, in the experimental section, we provide parts of our numerical experiments that illustrate the behavior of our algorithm, and corroborate our theoretical findings. We do this using synthetic data for linear and logistic regression, and also by considering the Higgs detection (Baldi, Sadowski, and Whiteson 2014) and the LIBSVM Splice data sets. |
| Researcher Affiliation | Collaboration | Tianyang Li (The University of Texas at Austin, lty@cs.utexas.edu); Liu Liu (The University of Texas at Austin, liuliu@utexas.edu); Anastasios Kyrillidis (IBM T.J. Watson Research Center, Yorktown Heights, anastasios.kyrillidis@ibm.com); Constantine Caramanis (The University of Texas at Austin, constantine@utexas.edu) |
| Pseudocode | No | The paper includes Figure 1 to illustrate the procedure, but it is a diagram and not a pseudocode block or a formally structured algorithm block. |
| Open Source Code | No | The paper does not provide any specific links to open-source code for the described methodology, nor does it explicitly state that the code is released or available in supplementary materials. |
| Open Datasets | Yes | We do this using synthetic data for linear and logistic regression, and also by considering the Higgs detection (Baldi, Sadowski, and Whiteson 2014) and the LIBSVM Splice data sets. The Splice data set contains 60 distinct features with 1000 data samples. (Footnote 3: https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html) |
| Dataset Splits | No | The paper mentions the total number of samples used in experiments (e.g., 'n = 20 i.i.d. samples', 'n = 100 samples', 'n = 1000 samples') but does not specify how these datasets were split into training, validation, or test sets. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory, or cloud instances) used to run the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | For the parameters, we used η = 0.8, t = 5, d = 10, b = 20, and mini batch size of 2. For the parameters, we used η = 0.1, t = 100, d = 5, b = 100, and mini batch size of 5. In these experiments, we set d = 100, used mini-batch size of 4, and used 200 SGD samples. We used 10000 samples from both bootstrap and our SGD inference procedure with t = 500, d = 100, η = 0.2, and mini batch size of 6. |
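
To make the Experiment Setup parameters more concrete, the following is a minimal sketch of how such a configuration might drive a segment-averaged SGD run. The table does not define the symbols, so the sketch assumes η is the step size, b a burn-in length, t the number of iterates averaged per segment, and d the number of iterates discarded between segments. The function names, the synthetic least-squares data, and the number of segments are hypothetical, and the paper's scaling of these averages into confidence intervals is not reproduced here.

```python
import random


def make_data(n, theta_true, noise_sd, seed):
    """Generate hypothetical synthetic linear-regression data (not from the paper)."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in theta_true] for _ in range(n)]
    y = [sum(xj * tj for xj, tj in zip(x, theta_true)) + rng.gauss(0.0, noise_sd)
         for x in X]
    return X, y


def sgd_segment_averages(X, y, eta, t, d, b, batch_size, n_segments, seed=0):
    """Run mini-batch SGD on least squares and return one averaged iterate per segment.

    Assumed symbol meanings (not stated in the table above):
    eta = step size, b = burn-in steps, t = iterates averaged per segment,
    d = iterates discarded between consecutive segments.
    """
    rng = random.Random(seed)
    n, dim = len(X), len(X[0])
    theta = [0.0] * dim

    def step(theta):
        # One mini-batch gradient step on the squared loss.
        grad = [0.0] * dim
        for _ in range(batch_size):
            i = rng.randrange(n)
            resid = sum(X[i][j] * theta[j] for j in range(dim)) - y[i]
            for j in range(dim):
                grad[j] += resid * X[i][j] / batch_size
        return [theta[j] - eta * grad[j] for j in range(dim)]

    # Burn-in: move away from the arbitrary starting point.
    for _ in range(b):
        theta = step(theta)

    averages = []
    for _ in range(n_segments):
        # Average t consecutive iterates.
        acc = [0.0] * dim
        for _ in range(t):
            theta = step(theta)
            for j in range(dim):
                acc[j] += theta[j] / t
        averages.append(acc)
        # Discard d iterates to reduce correlation between segments.
        for _ in range(d):
            theta = step(theta)
    return averages


# Usage with one of the quoted settings: eta = 0.1, t = 100, d = 5, b = 100, batch size 5.
X, y = make_data(n=100, theta_true=[1.0, -2.0], noise_sd=0.1, seed=1)
averages = sgd_segment_averages(X, y, eta=0.1, t=100, d=5, b=100,
                                batch_size=5, n_segments=50)
print(len(averages), averages[0])
```

The spread of the returned segment averages is the kind of quantity an SGD-based inference procedure would work from; how that spread is rescaled into uncertainty estimates depends on details the table does not quote.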