reproducibilityindex.ai

Data subsampling for Poisson regression with pth-root-link

Authors: Han Cheng Lie, Alexander Munteanu

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	G.3 Experimental illustration. All experiments were run on a commodity machine with Intel Core i7-7700K processor (4 cores, 4.2GHz, 32GB RAM) and took overall around 50 minutes to complete. The Python code of [34] was adapted to the Poisson regression setting. We applied it with the appropriate p ∈ {1, 2} to the datasets with dimensions n = 100 000, d = 7 generated as detailed in the previous section. We compared our method to uniform sampling as a baseline.
Researcher Affiliation	Academia	Han Cheng Lie Institut für Mathematik Universität Potsdam Germany hanlie@uni-potsdam.de. Alexander Munteanu Department of Statistics TU Dortmund University Germany alexander.munteanu@tu-dortmund.de
Pseudocode	Yes	G.1 Pseudocode. Here we give pseudocode for our coreset construction Algorithm 1 and for the subsequent optimization procedure Algorithm 2:
Open Source Code	Yes	Our new code is available at https://github.com/Tim907/poisson-regression/.
Open Datasets	No	G.2 Synthetic data generation. We generated for each p ∈ {1, 2} a dataset with dimensions n = 100 000, d = 7 with n labels corresponding to each point. (The paper describes generating synthetic data but does not state that it is publicly available or provide access information for it.)
Dataset Splits	No	The paper discusses 'reduced size' for subsampling but does not explicitly provide details about training, validation, or test dataset splits (percentages, counts, or splitting methodology).
Hardware Specification	Yes	All experiments were run on a commodity machine with Intel Core i7-7700K processor (4 cores, 4.2GHz, 32GB RAM)
Software Dependencies	No	The Python code of [34] was adapted to the Poisson regression setting. (It mentions 'Python code' but does not specify Python version or any other software dependencies with version numbers.)
Experiment Setup	Yes	We applied it with the appropriate p ∈ {1, 2} to the datasets with dimensions n = 100 000, d = 7 generated as detailed in the previous section. We compared our method to uniform sampling as a baseline. We varied the reduced size between 50 and 600 in equal increments of size 50. For each reduced size and each method, we performed 201 independent repetitions.
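
The experiment grid in the last row (reduced sizes 50 to 600 in steps of 50, 201 repetitions each, uniform sampling as baseline) can be sketched as follows. This is a minimal illustration and not the authors' code. Several things are assumptions made here: the placeholder data generator, the fixed parameter vector `beta`, sampling without replacement, and the use of the Poisson negative log-likelihood with mean (x·β)^p as the loss. All function names are hypothetical. The sketch only measures how well a uniformly subsampled, reweighted loss matches the full loss at a fixed `beta`; it does not implement the paper's coreset construction or any optimization.

```python
import math
import random
import statistics


def reduced_sizes(start=50, stop=600, step=50):
    """Reduced sizes 50, 100, ..., 600, as stated in the experiment setup."""
    return list(range(start, stop + 1, step))


def poisson_draw(mean, rng):
    """Draw one Poisson variate with Knuth's multiplication method."""
    limit = math.exp(-mean)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count


def make_pointwise_losses(n, d, p, rng):
    """Placeholder data (NOT the paper's generator) and per-point losses."""
    beta = [1.0] * d  # assumed fixed parameter vector, for illustration only
    losses = []
    for _ in range(n):
        x = [rng.uniform(0.1, 1.0) for _ in range(d)]
        z = sum(xj * bj for xj, bj in zip(x, beta))  # z > 0 by construction
        y = poisson_draw(z ** p, rng)
        # Assumed loss: Poisson negative log-likelihood with E[y] = z^p,
        # dropping the log(y!) term, which does not depend on beta.
        losses.append(z ** p - p * y * math.log(z))
    return losses


def uniform_estimate(losses, k, rng):
    """Estimate the full loss from k uniformly sampled points, each weighted n/k."""
    n = len(losses)
    picked = rng.sample(range(n), k)  # without replacement (an assumption)
    return (n / k) * sum(losses[i] for i in picked)


def run_grid(n=100_000, d=7, p=1, repetitions=201, seed=0):
    """Median relative error of the uniform estimate for each reduced size."""
    rng = random.Random(seed)
    losses = make_pointwise_losses(n, d, p, rng)
    full = sum(losses)
    result = {}
    for k in reduced_sizes():
        errors = [
            abs(uniform_estimate(losses, k, rng) - full) / abs(full)
            for _ in range(repetitions)
        ]
        result[k] = statistics.median(errors)
    return result
```

Calling `run_grid(p=1)` or `run_grid(p=2)` returns a dictionary mapping each of the 12 reduced sizes to a median relative error over 201 repetitions. Precomputing the per-point losses once keeps the 12 × 201 sampling runs cheap, since each run only sums the selected entries.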