DNNR: Differential Nearest Neighbors Regression

Authors: Youssef Nader, Leon Sixt, Tim Landgraf

ICML 2022

Each entry below lists a reproducibility variable, the result, and the supporting LLM response.
Research Type: Experimental. LLM response: In a large-scale evaluation on over 250 datasets, we find that DNNR performs comparably to state-of-the-art gradient boosting methods and MLPs while maintaining the simplicity and transparency of KNN. This allows us to derive theoretical error bounds and inspect failures. In times that call for transparency of ML models, DNNR provides a good balance between performance and interpretability.
Researcher Affiliation: Academia. LLM response: Department of Computer Science, Freie Universität Berlin, Germany. Correspondence to: Youssef Nader <youssef.nader@fu-berlin.de>, Leon Sixt <leon.sixt@fu-berlin.de>, Tim Landgraf <tim.landgraf@fu-berlin.de>.
Pseudocode: Yes. LLM response: Algorithm 1: Pseudocode of DNNR's prediction for a query point X. The feature scaling is omitted in this pseudocode. The OLS function solves an ordinary least-squares problem.
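
A minimal NumPy sketch of the prediction step that Algorithm 1 describes, not the authors' implementation: the function name dnnr_predict, the neighborhood sizes k and k_prime, and the plain Euclidean distance are illustrative assumptions, and feature scaling is omitted as in the paper's pseudocode.

    import numpy as np

    def dnnr_predict(X_train, y_train, x_query, k=3, k_prime=32):
        # k nearest training points to the query (Euclidean distance, no scaling).
        dists = np.linalg.norm(X_train - x_query, axis=1)
        neighbors = np.argsort(dists)[:k]

        estimates = []
        for m in neighbors:
            # Neighbors of the neighbor m (excluding m itself), used to estimate
            # its local derivative gamma_m with ordinary least squares (OLS).
            d_m = np.linalg.norm(X_train - X_train[m], axis=1)
            nn_m = np.argsort(d_m)[1:k_prime + 1]
            delta_x = X_train[nn_m] - X_train[m]      # shape (k_prime, n_features)
            delta_y = y_train[nn_m] - y_train[m]      # shape (k_prime,)
            gamma_m, *_ = np.linalg.lstsq(delta_x, delta_y, rcond=None)

            # First-order Taylor expansion around the neighbor m.
            estimates.append(y_train[m] + gamma_m @ (x_query - X_train[m]))

        return float(np.mean(estimates))

    # Example usage (X_train, y_train, x_query are NumPy arrays):
    # y_hat = dnnr_predict(X_train, y_train, x_query)
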
Open Source Code: Yes. LLM response: For code, see https://github.com/younader/DNNR_paper_code
Open Datasets: Yes. LLM response: Benchmark Datasets: The goal of this benchmark is to inspect DNNR's performance on eight real-world regression datasets: Yacht, California, Protein, Airfoil, Concrete, Sarcos, CO2 Emissions, and NOX emissions. The last two datasets are part of the Gas Emission dataset. All datasets were taken from the UCI repository (Dua & Graff, 2017), except California (Kelley Pace & Barry, 1997) and Sarcos (Rasmussen & Williams, 2006). These datasets were also used in previous work (Bui et al., 2016). Feynman Benchmark: As the second benchmark, we selected the Feynman Symbolic Regression Database, which consists of 119 datasets sampled from classical and quantum physics equations (Udrescu & Tegmark, 2020). PMLB Benchmark: The PMLB benchmark contains real and synthetic datasets with categorical features, discrete targets, and noisy data in general. In total, we used 133 PMLB datasets.
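
The paper does not ship loading code with this description; below is a hedged sketch of how such data can be fetched, using the pmlb package and scikit-learn's California housing loader. The dataset name "1027_ESL" is only an example of a PMLB regression task, not necessarily one of the 133 datasets used.

    # Illustrative data loading; dataset names are examples, not the paper's exact list.
    from pmlb import fetch_data
    from sklearn.datasets import fetch_california_housing

    # A PMLB regression dataset (example name).
    X_pmlb, y_pmlb = fetch_data("1027_ESL", return_X_y=True)

    # The California dataset (Kelley Pace & Barry, 1997).
    X_cal, y_cal = fetch_california_housing(return_X_y=True)
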
Dataset Splits: Yes. LLM response: Each model, except TabNet, was optimized using a grid search over multiple parameters, and the models were refit using their best parameters on the validation data before test inference. For the Feynman benchmark, the evaluation was executed with 10 different splits for each dataset and noise level (std = 0, 0.001, 0.01), similar to (Cava et al., 2021). For the first split, we divided the data into 70/5/25% train, validation, and test sets. The hyperparameter tuning was done with the validation data of the first split.
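
A minimal sketch of the 70/5/25% split described above, assuming generic arrays X and y and scikit-learn's train_test_split; the helper name and seed handling are assumptions, not the authors' code.

    from sklearn.model_selection import train_test_split

    def split_70_5_25(X, y, seed=0):
        # Hold out 25% of the full data for testing.
        X_rest, X_test, y_rest, y_test = train_test_split(
            X, y, test_size=0.25, random_state=seed)
        # Split the remaining 75% so that 5% of the full data becomes validation.
        X_train, X_val, y_train, y_val = train_test_split(
            X_rest, y_rest, test_size=0.05 / 0.75, random_state=seed)
        return X_train, X_val, X_test, y_train, y_val, y_test

    # For the Feynman benchmark, this would be repeated over 10 seeds and over the
    # noise levels std = 0, 0.001, and 0.01.
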
Hardware Specification: Yes. LLM response: We thank the HPC Service of ZEDAT, Freie Universität Berlin, for generous allocations of computation time (Bennett et al., 2020). Approx. 4.1k CPU hours were used to run the experiments.
Software Dependencies: No. LLM response: The paper does not explicitly state specific software dependencies with version numbers, such as Python versions or library versions (e.g., PyTorch, scikit-learn).
Experiment Setup: Yes. LLM response: Each model, except TabNet, was optimized using a grid search over multiple parameters, and the models were refit using their best parameters on the validation data before test inference. We ensured that each method had a comparable search space; the search spaces are listed in Appendix D.
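
A hedged sketch of that tuning protocol: a plain grid search over an illustrative search space (not the one from Appendix D), with KNN as a stand-in baseline, the splits from the sketch above, and R² as an example metric.

    from itertools import product
    import numpy as np
    from sklearn.metrics import r2_score
    from sklearn.neighbors import KNeighborsRegressor  # stand-in for any baseline

    param_grid = {"n_neighbors": [3, 5, 7, 9], "weights": ["uniform", "distance"]}

    best_params, best_score = None, -float("inf")
    for values in product(*param_grid.values()):
        params = dict(zip(param_grid.keys(), values))
        model = KNeighborsRegressor(**params).fit(X_train, y_train)
        score = r2_score(y_val, model.predict(X_val))
        if score > best_score:
            best_params, best_score = params, score

    # Refit with the best parameters before test inference; combining the train and
    # validation data here is one plausible reading of the protocol quoted above.
    final_model = KNeighborsRegressor(**best_params).fit(
        np.concatenate([X_train, X_val]), np.concatenate([y_train, y_val]))
    test_r2 = r2_score(y_test, final_model.predict(X_test))
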