DNNR: Differential Nearest Neighbors Regression

Authors: Youssef Nader, Leon Sixt, Tim Landgraf

ICML 2022

Each entry below lists a reproducibility variable, the result, and the supporting LLM response.
Research Type: Experimental. LLM response: In a large-scale evaluation on over 250 datasets, we find that DNNR performs comparably to state-of-the-art gradient boosting methods and MLPs while maintaining the simplicity and transparency of KNN. This allows us to derive theoretical error bounds and inspect failures. In times that call for transparency of ML models, DNNR provides a good balance between performance and interpretability.
Researcher Affiliation: Academia. LLM response: Department of Computer Science, Freie Universität Berlin, Germany. Correspondence to: Youssef Nader <youssef.nader@fu-berlin.de>, Leon Sixt <leon.sixt@fu-berlin.de>, Tim Landgraf <tim.landgraf@fu-berlin.de>.
Pseudocode: Yes. LLM response: Algorithm 1: Pseudocode of DNNR's prediction for a query point X. The feature scaling is omitted in this pseudocode. The OLS function solves an ordinary least-squares problem.
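
A minimal NumPy sketch of the prediction step that Algorithm 1 describes, not the authors' implementation: the function name dnnr_predict, the neighborhood sizes k and k_prime, and the plain Euclidean distance are illustrative assumptions, and feature scaling is omitted as in the paper's pseudocode.

    import numpy as np

    def dnnr_predict(X_train, y_train, x_query, k=3, k_prime=32):
        # k nearest training points to the query (Euclidean distance, no scaling).
        dists = np.linalg.norm(X_train - x_query, axis=1)
        neighbors = np.argsort(dists)[:k]

        estimates = []
        for m in neighbors:
            # Neighbors of the neighbor m (excluding m itself), used to estimate
            # its local derivative gamma_m with ordinary least squares (OLS).
            d_m = np.linalg.norm(X_train - X_train[m], axis=1)
            nn_m = np.argsort(d_m)[1:k_prime + 1]
            delta_x = X_train[nn_m] - X_train[m]      # shape (k_prime, n_features)
            delta_y = y_train[nn_m] - y_train[m]      # shape (k_prime,)
            gamma_m, *_ = np.linalg.lstsq(delta_x, delta_y, rcond=None)

            # First-order Taylor expansion around the neighbor m.
            estimates.append(y_train[m] + gamma_m @ (x_query - X_train[m]))

        return float(np.mean(estimates))

    # Example usage (X_train, y_train, x_query are NumPy arrays):
    # y_hat = dnnr_predict(X_train, y_train, x_query)
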
Open Source Code: Yes. LLM response: For code, see https://github.com/younader/DNNR_paper_code
Open Datasets: Yes. LLM response: Benchmark Datasets: The goal of this benchmark is to inspect DNNR's performance on eight real-world regression datasets: Yacht, California, Protein, Airfoil, Concrete, Sarcos, CO2 Emissions, and NOX emissions. The last two datasets are part of the Gas Emission dataset. All datasets were taken from the UCI repository (Dua & Graff, 2017), except California (Kelley Pace & Barry, 1997) and Sarcos (Rasmussen & Williams, 2006). These datasets were also used in previous work (Bui et al., 2016). Feynman Benchmark: As the second benchmark, we selected the Feynman Symbolic Regression Database, which consists of 119 datasets sampled from classical and quantum physics equations (Udrescu & Tegmark, 2020). PMLB Benchmark: The PMLB benchmark contains real and synthetic datasets with categorical features, discrete targets, and noisy data in general. In total, we used 133 PMLB datasets.
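
The paper does not ship loading code with this description; below is a hedged sketch of how such data can be fetched, using the pmlb package and scikit-learn's California housing loader. The dataset name "1027_ESL" is only an example of a PMLB regression task, not necessarily one of the 133 datasets used.

    # Illustrative data loading; dataset names are examples, not the paper's exact list.
    from pmlb import fetch_data
    from sklearn.datasets import fetch_california_housing

    # A PMLB regression dataset (example name).
    X_pmlb, y_pmlb = fetch_data("1027_ESL", return_X_y=True)

    # The California dataset (Kelley Pace & Barry, 1997).
    X_cal, y_cal = fetch_california_housing(return_X_y=True)
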
Dataset Splits: Yes. LLM response: Each model, except TabNet, was optimized using a grid search over multiple parameters, and the models were refit using their best parameters on the validation data before test inference. For the Feynman benchmark, the evaluation was executed with 10 different splits for each dataset and noise level (std = 0, 0.001, 0.01), similar to (Cava et al., 2021). For the first split, we divided the data into 70/5/25% train, validation, and test sets. The hyperparameter tuning was done with the validation data of the first split.
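
A minimal sketch of the 70/5/25% split described above, assuming generic arrays X and y and scikit-learn's train_test_split; the helper name and seed handling are assumptions, not the authors' code.

    from sklearn.model_selection import train_test_split

    def split_70_5_25(X, y, seed=0):
        # Hold out 25% of the full data for testing.
        X_rest, X_test, y_rest, y_test = train_test_split(
            X, y, test_size=0.25, random_state=seed)
        # Split the remaining 75% so that 5% of the full data becomes validation.
        X_train, X_val, y_train, y_val = train_test_split(
            X_rest, y_rest, test_size=0.05 / 0.75, random_state=seed)
        return X_train, X_val, X_test, y_train, y_val, y_test

    # For the Feynman benchmark, this would be repeated over 10 seeds and over the
    # noise levels std = 0, 0.001, and 0.01.
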
Hardware Specification: Yes. LLM response: We thank the HPC Service of ZEDAT, Freie Universität Berlin, for generous allocations of computation time (Bennett et al., 2020). Approx. 4.1k CPU hours were used to run the experiments.
Software Dependencies: No. LLM response: The paper does not explicitly state specific software dependencies with version numbers, such as Python versions or library versions (e.g., PyTorch, scikit-learn).
Experiment Setup: Yes. LLM response: Each model, except TabNet, was optimized using a grid search over multiple parameters, and the models were refit using their best parameters on the validation data before test inference. We ensured that each method had a comparable search space; the search spaces are listed in Appendix D.
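
A hedged sketch of that tuning protocol: a plain grid search over an illustrative search space (not the one from Appendix D), with KNN as a stand-in baseline, the splits from the sketch above, and R² as an example metric.

    from itertools import product
    import numpy as np
    from sklearn.metrics import r2_score
    from sklearn.neighbors import KNeighborsRegressor  # stand-in for any baseline

    param_grid = {"n_neighbors": [3, 5, 7, 9], "weights": ["uniform", "distance"]}

    best_params, best_score = None, -float("inf")
    for values in product(*param_grid.values()):
        params = dict(zip(param_grid.keys(), values))
        model = KNeighborsRegressor(**params).fit(X_train, y_train)
        score = r2_score(y_val, model.predict(X_val))
        if score > best_score:
            best_params, best_score = params, score

    # Refit with the best parameters before test inference; combining the train and
    # validation data here is one plausible reading of the protocol quoted above.
    final_model = KNeighborsRegressor(**best_params).fit(
        np.concatenate([X_train, X_val]), np.concatenate([y_train, y_val]))
    test_r2 = r2_score(y_test, final_model.predict(X_test))
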