TuneTables: Context Optimization for Scalable Prior-Data Fitted Networks
Authors: Benjamin Feuer, Robin Schirrmeister, Valeriia Cherepanova, Chinmay Hegde, Frank Hutter, Micah Goldblum, Niv Cohen, Colin White
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments on nineteen algorithms over 98 datasets and find that TuneTables achieves the best performance on average, outperforming boosted trees such as CatBoost, while optimizing fewer than 5% of TabPFN's parameters. |
| Researcher Affiliation | Collaboration | New York University, University of Freiburg, University of Maryland, Abacus.AI |
| Pseudocode | No | The paper describes implementation details and the algorithm in prose and tables (e.g., Appendix D), but it does not include a formal pseudocode block or algorithm listing; a hedged sketch of the prompt-tuning idea appears after this table. |
| Open Source Code | Yes | We open-source our code and raw results at https://github.com/penfever/TuneTables. |
| Open Datasets | Yes | We run the algorithms on the TabZilla Benchmark Suite introduced in [51]. This suite consists of 98 classification datasets from OpenML [74] with a diversity of sizes and number of features [51]. See Table 4 in Appendix C for a list of all datasets with their statistics. |
| Dataset Splits | Yes | For all algorithms, we report the test performance of the hyperparameter set with the best performance on the validation set and cross-validate on three train/test folds from OpenML. ... We validate our tuned prompts every epoch on a subset of the entire validation set if the validation set is large. |
| Hardware Specification | Yes | We conduct our experiments on an NVIDIA L4 GPU with 24GB VRAM. |
| Software Dependencies | No | The paper mentions software like 'PyTorch', 'Optuna', 'CatBoost', 'LightGBM', 'XGBoost', and 'scikit-learn' but does not provide specific version numbers for these dependencies, which are necessary for full reproducibility. |
| Experiment Setup | Yes | For all algorithms other than TuneTables, we perform light hyperparameter tuning by running one default setting and 29 iterations of random search using Optuna [4]; see Appendix C for details (a hedged Optuna sketch of this protocol follows the table). Following [51], all of the algorithms come with their default set of hyperparameters used in the official implementation, and we used all of these settings. For TuneTables, we optimize via a grid search described in Appendix D. ... We fit for up to 100 epochs with early stopping. ... Table 5: TuneTables and TabPFNs3000 hyperparameter configurations based on number of samples. |
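
Since the paper provides no pseudocode listing, below is a minimal sketch of the prompt-tuning idea behind TuneTables, assuming a frozen PFN classifier exposed as a callable mapping (context features, context labels, query features) to class logits. The names `TunedPrompt`, `tuning_step`, and `pfn` are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class TunedPrompt(nn.Module):
    """Learned context: a small set of synthetic feature rows and soft labels.

    Only these tensors receive gradients; the PFN itself stays frozen, which
    is how fewer than 5% of the parameters end up being optimized.
    """

    def __init__(self, num_prompt: int, num_features: int, num_classes: int):
        super().__init__()
        self.prompt_x = nn.Parameter(torch.randn(num_prompt, num_features))
        self.prompt_y = nn.Parameter(torch.randn(num_prompt, num_classes))


def tuning_step(pfn, prompt, x_batch, y_batch, optimizer):
    """One gradient step: the frozen PFN conditions on the learned prompt
    and is asked to classify a batch of real training rows."""
    optimizer.zero_grad()
    soft_labels = prompt.prompt_y.softmax(dim=-1)
    # `pfn` is a stand-in for the frozen TabPFN forward pass: it maps
    # (context features, context labels, query features) to class logits.
    logits = pfn(prompt.prompt_x, soft_labels, x_batch)
    loss = nn.functional.cross_entropy(logits, y_batch)
    loss.backward()
    optimizer.step()
    return loss.item()


# Usage: optimize only the prompt parameters, e.g.
#   prompt = TunedPrompt(num_prompt=100, num_features=20, num_classes=2)
#   optimizer = torch.optim.Adam(prompt.parameters(), lr=1e-3)
```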
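
For the baseline tuning protocol (one default configuration plus 29 iterations of random search), a minimal Optuna sketch might look like the following. The search space and the `train_and_eval` helper are illustrative assumptions; the actual spaces are given in Appendix C of the paper.

```python
import random

import optuna


def train_and_eval(params: dict) -> float:
    # Placeholder for fitting a baseline (e.g. CatBoost) with `params` and
    # returning validation accuracy on the current OpenML fold; a real run
    # would swap in the actual training loop here.
    return random.random()


def objective(trial: optuna.Trial) -> float:
    params = {
        "learning_rate": trial.suggest_float("learning_rate", 1e-4, 1e-1, log=True),
        "max_depth": trial.suggest_int("max_depth", 2, 12),
        "n_estimators": trial.suggest_int("n_estimators", 50, 1000),
    }
    return train_and_eval(params)


study = optuna.create_study(
    direction="maximize", sampler=optuna.samplers.RandomSampler(seed=0)
)
# One official-default setting, then 29 random-search iterations (30 trials total).
study.enqueue_trial({"learning_rate": 3e-2, "max_depth": 6, "n_estimators": 100})
study.optimize(objective, n_trials=30)
print(study.best_params)
```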