Linear Regression using Heterogeneous Data Batches

Authors: Ayush Jain, Rajat Sen, Weihao Kong, Abhimanyu Das, Alon Orlitsky

NeurIPS 2024

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compare our algorithm with the one in [KSS+20] on simulated datasets, to show that our algorithm performs better in the setting of the latter paper as well as generalizes to settings that are outside the assumptions of [KSS+20]. |
| Researcher Affiliation | Collaboration | Ayush Jain (Granica Computing Inc., ayush.jain@granica.ai); Rajat Sen (Google Research, senrajat@google.com); Weihao Kong (Google Research, weihaokong@google.com); Abhimanyu Das (Google Research, abhidas@google.com); Alon Orlitsky (UC San Diego, alon@ucsd.edu) |
| Pseudocode | Yes | Algorithm 1 MAINALGORITHM ... Algorithm 2 GRADEST ... Algorithm 3 SELECTING THE REGRESSION VECTOR |
| Open Source Code | Yes | We also include the scripts to reproduce all our results in supplementary material. |
| Open Datasets | No | The paper uses 'simulated datasets' as described in Section 4 and Appendix L. While the generation process is detailed for reproducibility, it does not provide concrete access information (e.g., a link or citation to an existing repository) for a pre-existing, publicly available dataset. |
| Dataset Splits | No | The paper mentions 'training and test details' in the NeurIPS Paper Checklist, but does not explicitly provide information about 'validation' dataset splits or a validation set being used in its experiments. |
| Hardware Specification | Yes | Our experiments are not compute intensive and were run on a MacBook Pro 16-inch laptop with 16 GB RAM and an Intel processor (2020 model), and each of them took less than 6 hours to finish. |
| Software Dependencies | No | The paper states that scripts are included in the supplementary material but does not explicitly list specific software dependencies with their version numbers. |
| Experiment Setup | Yes | We fix data dimension $d = 100$, $\alpha = 1/16$, the number of small batches to $\lvert B_s \rvert = \min\{8dk^2, 8d/\alpha^2\}$ and the number of medium batches to $\lvert B_m \rvert = 256$. In all the plots, we averaged over 10 runs and report the standard error. |
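
The experiment setup above can be made concrete with a small sketch. It is not taken from the authors' scripts. It assumes the flattened exponents in the small-batch formula are squares, i.e. min{8dk^2, 8d/α^2}, and that k is a free parameter whose value the setup row does not state (k = 4 below is purely illustrative). The function names are hypothetical, and the standard error is assumed to be the sample standard deviation divided by the square root of the number of runs.

```python
import math
import statistics

# Values quoted in the experiment setup row.
D = 100                   # data dimension d
ALPHA = 1 / 16            # alpha
NUM_MEDIUM_BATCHES = 256  # |B_m|
NUM_RUNS = 10             # runs averaged per plotted point


def num_small_batches(d, k, alpha):
    """Number of small batches, read as min{8 d k^2, 8 d / alpha^2}.

    k is not given in the setup row; callers must supply it (assumption).
    """
    return int(min(8 * d * k**2, 8 * d / alpha**2))


def mean_and_standard_error(values):
    """Mean and standard error (sample std / sqrt(n)) over repeated runs."""
    n = len(values)
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, sem


if __name__ == "__main__":
    k = 4  # hypothetical value, chosen only for illustration
    print("small batches:", num_small_batches(D, k, ALPHA))
    print("cap from alpha:", int(8 * D / ALPHA**2))
```

With these values the α-dependent term caps the count at 8 · 100 · 256 = 204800, so the k-dependent term 8dk^2 determines the number of small batches only while k^2 is below 1/α^2 = 256, that is, for k < 16.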