Linear Regression using Heterogeneous Data Batches
Authors: Ayush Jain, Rajat Sen, Weihao Kong, Abhimanyu Das, Alon Orlitsky
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compare our algorithm with the one in [KSS+20] on simulated datasets, to show that our algorithm performs better in the setting of the latter paper as well as generalizes to settings that are outside the assumptions of [KSS+20]. |
| Researcher Affiliation | Collaboration | Ayush Jain, Granica Computing Inc., ayush.jain@granica.ai; Rajat Sen, Google Research, senrajat@google.com; Weihao Kong, Google Research, weihaokong@google.com; Abhimanyu Das, Google Research, abhidas@google.com; Alon Orlitsky, UC San Diego, alon@ucsd.edu |
| Pseudocode | Yes | Algorithm 1 MAINALGORITHM ... Algorithm 2 GRADEST ... Algorithm 3 SELECTING THE REGRESSION VECTOR |
| Open Source Code | Yes | We also include the scripts to reproduce all our results in supplementary material. |
| Open Datasets | No | The paper uses 'simulated datasets' as described in Section 4 and Appendix L. While the generation process is detailed for reproducibility, it does not provide concrete access information (e.g., a link or citation to an existing repository) for a pre-existing, publicly available dataset. |
| Dataset Splits | No | The paper mentions 'training and test details' in the NeurIPS Paper Checklist, but does not explicitly provide information about 'validation' dataset splits or a validation set being used in its experiments. |
| Hardware Specification | Yes | Our experiments are not compute intensive and were run on a MacBook Pro 16-inch laptop with 16 GB RAM and an Intel processor (2020 model), and each of them took less than 6 hours to finish. |
| Software Dependencies | No | The paper states that scripts are included in the supplementary material but does not explicitly list specific software dependencies with their version numbers. |
| Experiment Setup | Yes | We fix data dimension $d = 100$, $\alpha = 1/16$, the number of small batches to $\lvert B_s \rvert = \min\{8dk^2, 8d/\alpha^2\}$ and the number of medium batches to $\lvert B_m \rvert = 256$. In all the plots, we averaged over 10 runs and report the standard error. |
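
The batch counts in the Experiment Setup row can be worked out directly. The following is a minimal sketch, not the authors' script: it assumes the garbled exponents read as k² and α², treats `k` as a free parameter (its meaning is not stated in this summary), and uses hypothetical helper names `num_small_batches` and `standard_error`. The standard error is assumed to be the sample standard deviation divided by the square root of the number of runs.

```python
import math

# Values quoted in the Experiment Setup row.
D = 100                   # data dimension d
ALPHA = 1 / 16            # alpha
NUM_MEDIUM_BATCHES = 256  # |B_m|
NUM_RUNS = 10             # runs averaged per plot


def num_small_batches(k, d=D, alpha=ALPHA):
    """Return |B_s| = min{8 d k^2, 8 d / alpha^2} (hypothetical helper).

    Assumes the exponents lost in extraction are squares.
    """
    return int(min(8 * d * k**2, 8 * d / alpha**2))


def standard_error(values):
    """Standard error of the mean: sample std (n - 1 denominator) / sqrt(n)."""
    n = len(values)
    mean = sum(values) / n
    sample_var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return math.sqrt(sample_var / n)


if __name__ == "__main__":
    for k in (4, 16, 32):
        print(k, num_small_batches(k))
```

With these values the cap 8d/α² equals 204800, so the first term 8dk² is the smaller one only while k is below 1/α = 16.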