Rejection via Learning Density Ratios
Authors: Alexander Soen, Hisham Husain, Philip Schulz, Vu Nguyen
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our framework is tested empirically over clean and noisy datasets. |
| Researcher Affiliation | Collaboration | Alexander Soen (Amazon; The Australian National University) alexander.soen@anu.edu.au; Hisham Husain hisham.husain@protonmail.com; Philip Schulz (Amazon) phschulz@amazon.com; Vu Nguyen (Amazon) vutngn@amazon.com |
| Pseudocode | Yes | Algorithm 1: Density-Ratio Rejection (a hedged sketch of a thresholded rejection rule is given after the table). |
| Open Source Code | Yes | Our rejector's code is public at: https://github.com/alexandersoen/density-ratio-rejection. |
| Open Datasets | Yes | We consider 6 multiclass classification datasets. For tabular datasets, we consider the gas drift dataset [68] and the human activity recognition (HAR) dataset [5]...the MNIST image dataset [46] (10 classes)...CIFAR-10 [44] (10 classes); and OrganMNIST / OrganSMNIST (11 classes) and OCTMNIST (4 classes) from the MedMNIST collection [69, 70]. All datasets we consider are in the public domain, e.g., UCI [6]. |
| Dataset Splits | Yes | In our case, we have a fixed ρ which allows easy tuning of τ given a validation dataset, similar to other confidence-based rejection approaches, e.g., tuning a threshold for the margin of a classifier [7]. All evaluation uses 5-fold cross-validation. (A hypothetical threshold-tuning sketch appears after the table.) |
| Hardware Specification | Yes | All implementations use PyTorch and training was done on a p3.2xlarge AWS instance. |
| Software Dependencies | No | All implementations use PyTorch and training was done on a p3.2xlarge AWS instance. No specific version numbers for software dependencies are provided. |
| Experiment Setup | Yes | For our tests, we fix λ = 1. Throughout our evaluation, we assume that a neural network (NN) model without rejection is accessible for all (applicable) approaches...For our density ratio rejectors, we utilize the log-loss, practical considerations in Section 4.3, and Algorithm 1...Each is trained with 50 equidistant costs c, τ ∈ [0, 0.5), except on OCTMNIST which uses 10 equidistant costs...All training utilizes the Adam [41] optimizer. Specific network architectures, batch sizes, epochs, and learning rates are detailed in Appendix Q.I. (A rough training-sweep sketch appears after the table.) |
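
The rejection rule referenced in the Pseudocode row can be pictured as thresholding a learned density-ratio estimate. The sketch below is a hypothetical illustration of such a thresholded rule, assuming a `ratio_model` that maps a batch of inputs to scalar ratio estimates; it is not a transcription of the paper's Algorithm 1.

```python
import torch

def reject_via_density_ratio(ratio_model, x, tau):
    """Abstain when the learned density-ratio estimate falls below tau.

    `ratio_model` is a hypothetical torch module mapping a batch of inputs
    to scalar ratio estimates; this is a generic thresholded-rejection
    sketch, not a transcription of the paper's Algorithm 1."""
    with torch.no_grad():
        ratios = ratio_model(x).squeeze(-1)  # one estimated ratio per example
    return ratios < tau  # boolean mask: True means reject (abstain)
```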
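
For the Dataset Splits row, the threshold τ is tuned on a validation split. The helper below is a hypothetical sketch that picks τ by minimizing a generic rejection-augmented risk (error rate on accepted points plus a fixed cost c per rejected point) over equidistant candidate values in [0, 0.5); the paper's exact tuning objective lives in the released code.

```python
import numpy as np

def tune_tau(ratios_val, err_val, cost_c,
             taus=np.linspace(0.0, 0.5, 50, endpoint=False)):
    """Pick tau minimizing validation risk: error rate on accepted points
    plus cost_c per rejected point.

    ratios_val: per-example density-ratio estimates on the validation fold.
    err_val:    boolean array, True where the base classifier is wrong.
    Hypothetical helper; the paper's exact tuning objective is in its code."""
    best_tau, best_risk = None, np.inf
    for tau in taus:
        accept = ratios_val >= tau
        risk = np.mean(err_val & accept) + cost_c * np.mean(~accept)
        if risk < best_risk:
            best_tau, best_risk = tau, risk
    return best_tau
```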
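
For the Experiment Setup row, the outline below sketches one way the reported sweep could look: one rejector trained per cost value with the Adam optimizer. The loss used here is a generic cost-sensitive surrogate kept only so the example runs end to end; it is not the paper's density-ratio objective with log-loss. `make_model` and the loader contract are assumptions for illustration.

```python
import torch

def train_rejectors_over_costs(make_model, loader, epochs=10, lr=1e-3,
                               costs=tuple(0.5 * i / 50 for i in range(50))):
    """Train one rejector per rejection cost c with the Adam optimizer.

    The loss is a generic cost-sensitive surrogate (expected base-classifier
    error on accepted points plus c per rejected point), used only to keep
    the sketch self-contained; it is not the paper's objective. `make_model`
    is an assumed factory returning a fresh rejector network, and `loader`
    is assumed to yield (inputs, err) pairs where err is 1.0 when the fixed
    base classifier mispredicts the input."""
    models = {}
    for c in costs:
        model = make_model()
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        for _ in range(epochs):
            for x, err in loader:
                opt.zero_grad()
                p_accept = torch.sigmoid(model(x)).squeeze(-1)
                loss = (p_accept * err + (1.0 - p_accept) * c).mean()
                loss.backward()
                opt.step()
        models[c] = model
    return models
```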