reproducibilityindex.ai

Linear Regression using Heterogeneous Data Batches

Authors: Ayush Jain, Rajat Sen, Weihao Kong, Abhimanyu Das, Alon Orlitsky

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We compare our algorithm with the one in [KSS+20] on simulated datasets, to show that our algorithm performs better in the setting of the latter paper as well as generalizes to settings that are outside the assumptions of [KSS+20].
Researcher Affiliation	Collaboration	Ayush Jain, Granica Computing Inc. (ayush.jain@granica.ai); Rajat Sen, Google Research (senrajat@google.com); Weihao Kong, Google Research (weihaokong@google.com); Abhimanyu Das, Google Research (abhidas@google.com); Alon Orlitsky, UC San Diego (alon@ucsd.edu)
Pseudocode	Yes	Algorithm 1 MAINALGORITHM ... Algorithm 2 GRADEST ... Algorithm 3 SELECTING THE REGRESSION VECTOR
Open Source Code	Yes	We also include the scripts to reproduce all our results in supplementary material.
Open Datasets	No	The paper uses 'simulated datasets' as described in Section 4 and Appendix L. While the generation process is detailed for reproducibility, it does not provide concrete access information (e.g., a link or citation to an existing repository) for a pre-existing, publicly available dataset.
Dataset Splits	No	The paper mentions 'training and test details' in the NeurIPS Paper Checklist, but does not explicitly provide information about 'validation' dataset splits or a validation set being used in its experiments.
Hardware Specification	Yes	Our experiments are not compute intensive and were run on a MacBook Pro 16-inch laptop with 16 GB RAM and an Intel processor (2020 model), and each of them took less than 6 hours to finish.
Software Dependencies	No	The paper states that scripts are included in the supplementary material but does not explicitly list specific software dependencies with their version numbers.
Experiment Setup	Yes	We fix data dimension $d = 100$, $\alpha = 1/16$, the number of small batches to $|B_s| = \min\{8dk^2, 8d/\alpha^2\}$ and the number of medium batches to $|B_m| = 256$. In all the plots, we averaged over 10 runs and report the standard error.
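
The batch counts in the Experiment Setup row can be worked out directly from the quoted formula. The following is a minimal sketch, not the authors' script: the function names `batch_counts` and `standard_error` are hypothetical, and the value of k (which the quoted setup leaves free) is an assumption chosen only for illustration. The standard error is computed here as the sample standard deviation divided by the square root of the number of runs, which is the usual definition; the source does not state which estimator was used.

```python
import math
import statistics


def batch_counts(d, alpha, k):
    """Return (|B_s|, |B_m|) for the quoted setup.

    |B_s| = min(8 d k^2, 8 d / alpha^2) and |B_m| = 256.
    k is not fixed in the quoted text, so it is a free parameter here.
    """
    small = min(8 * d * k**2, 8 * d / alpha**2)
    medium = 256
    return int(small), medium


def standard_error(values):
    """Sample standard deviation divided by sqrt(number of runs).

    Assumed estimator; the source only says the standard error is reported.
    """
    return statistics.stdev(values) / math.sqrt(len(values))


if __name__ == "__main__":
    d, alpha = 100, 1 / 16          # values quoted in the setup
    for k in (4, 16, 32):           # illustrative values of k (assumption)
        print(k, batch_counts(d, alpha, k))
```

With d = 100 and α = 1/16, the second term 8d/α² equals 204800, so the first term 8dk² determines |B_s| whenever k < 16, and the cap applies for larger k.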