Coresets for Regressions with Panel Data
Authors: Lingxiao Huang, K Sudhir, Nisheeth Vishnoi
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we assess our approach with a synthetic and a real-world dataset; the coreset sizes constructed using our approach are much smaller than the full dataset, and coresets indeed accelerate the running time of computing the regression objective. We implement our coreset algorithms for GLSE, and compare the performance with uniform sampling on synthetic datasets and a real-world dataset. |
| Researcher Affiliation | Collaboration | Lingxiao Huang (Huawei); K. Sudhir (Yale University); Nisheeth K. Vishnoi (Yale University) |
| Pseudocode | Yes | Algorithm 1: CGLSE: Coreset construction of GLSE. Algorithm 2: CGLSEk: Coreset construction of GLSEk. |
| Open Source Code | Yes | Code is available at https://github.com/huanglx12/Coresets-for-regressions-with-panel-data. |
| Open Datasets | No | The paper describes the synthetic and real-world datasets used, but does not provide concrete access information (e.g., URL, DOI, specific citation for public availability) for either dataset. |
| Dataset Splits | No | The paper mentions running experiments on the 'full dataset' and 'coresets' but does not specify any train/validation/test splits, cross-validation, or other data partitioning strategies used for reproduction. |
| Hardware Specification | Yes | The experiments are conducted by PyCharm on a 4-Core desktop CPU with 8GB RAM. |
| Software Dependencies | No | The paper mentions using PyCharm as an IDE and implementing IRLS, but does not provide specific version numbers for any programming languages or libraries. |
| Experiment Setup | Yes | We vary ε = 0.1, 0.2, 0.3, 0.4, 0.5 and generate 100 independent random tuples ζ = (β, ρ) ∈ R^{d+q} (the same as described in the generation of the synthetic dataset). For each ε, we run our algorithm CGLSE and Uni to generate coresets. We also implement IRLS [32] for solving GLSE. We run IRLS on both the full dataset and coresets and record the runtime. |
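
The evaluation loop in the Experiment Setup row can be illustrated with a minimal sketch. This is not the authors' code: it assumes a plain least-squares objective in place of GLSE, implements only a uniform-sampling baseline in the spirit of Uni, and measures quality as the largest relative gap between the sampled and full objectives over 100 random query vectors. The function names, the data generator, and the sample size of 200 are all hypothetical choices made for illustration.

```python
import random


def objective(rows, weights, beta):
    """Weighted sum of squared residuals of a linear model (stand-in for GLSE)."""
    total = 0.0
    for (x, y), w in zip(rows, weights):
        pred = sum(xi * bi for xi, bi in zip(x, beta))
        total += w * (y - pred) ** 2
    return total


def uniform_sample(rows, m, rng):
    """Uniform baseline: draw m rows without replacement, each weighted n/m."""
    n = len(rows)
    picked = rng.sample(rows, m)
    return picked, [n / m] * m


def max_relative_error(rows, sample, weights, queries):
    """Largest relative gap between sampled and full objective over all queries."""
    full_weights = [1.0] * len(rows)
    worst = 0.0
    for beta in queries:
        full = objective(rows, full_weights, beta)
        approx = objective(sample, weights, beta)
        worst = max(worst, abs(approx - full) / full)
    return worst


rng = random.Random(0)
d = 3
true_beta = [1.0, -2.0, 0.5]  # hypothetical generating parameters

# Synthetic data: Gaussian features, linear response plus Gaussian noise.
rows = []
for _ in range(2000):
    x = [rng.gauss(0, 1) for _ in range(d)]
    y = sum(a * b for a, b in zip(x, true_beta)) + rng.gauss(0, 1)
    rows.append((x, y))

# 100 independent random query vectors, mirroring the 100 tuples in the setup.
queries = [[rng.uniform(-5, 5) for _ in range(d)] for _ in range(100)]

sample, weights = uniform_sample(rows, 200, rng)
err = max_relative_error(rows, sample, weights, queries)
print(f"max relative error over {len(queries)} queries: {err:.3f}")
```

A sample behaves like an ε-coreset on these queries when the reported error is at most ε, so repeating the run for each ε in the setup and for each sampling scheme would give a comparison of the kind the row describes.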