Task-Agnostic Undesirable Feature Deactivation Using Out-of-Distribution Data
Authors: Dongmin Park, Hwanjun Song, Minseok Kim, Jae-Gil Lee
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To show the task-agnostic nature of TAUFE, we rigorously validate its performance on three tasks (classification, regression, and a mix of them) on the CIFAR-10, CIFAR-100, ImageNet, CUB200, and CAR datasets. The results demonstrate that TAUFE consistently outperforms the state-of-the-art method as well as the baselines without regularization. |
| Researcher Affiliation | Collaboration | Dongmin Park1, Hwanjun Song2, Minseok Kim1, Jae-Gil Lee1 (1 KAIST, 2 NAVER AI Lab), Republic of Korea |
| Pseudocode | Yes | Algorithm 1 (TAUFE) describes the overall procedure, which is self-explanatory. A minimal sketch of one training step is given after this table. |
| Open Source Code | Yes | For reproducibility, we provide the source code at https://github.com/kaist-dmlab/TAUFE. |
| Open Datasets | Yes | We choose CIFAR-10, CIFAR-100 [23], and ImageNet [24] for the target in-distribution data. For the CIFAR datasets, two out-of-distribution datasets are carefully mixed for evaluation: LSUN [25], ... and SVHN [26], ... A large-scale collection of place scene images with 365 classes, Places365 [27], is also used as another OOD dataset for ImageNet-10. (A data-loading sketch follows this table.) |
| Dataset Splits | No | The paper mentions a grid search for hyperparameters, implying the use of a validation set, but does not specify the exact split percentages or sample counts for training, validation, and test sets. For example: 'The value of λ is set to be 0.1 and 0.01 for CIFARs and ImageNet-10, respectively, where the best values are obtained via a grid search.' |
| Hardware Specification | Yes | All methods are implemented with PyTorch 1.8.0 and executed using four NVIDIA Tesla V100 GPUs. |
| Software Dependencies | Yes | All methods are implemented with PyTorch 1.8.0 and executed using four NVIDIA Tesla V100 GPUs. |
| Experiment Setup | Yes | For the CIFAR datasets, ResNet-18 [28] is trained from scratch for 200 epochs using SGD with a momentum of 0.9, a batch size of 64, and a weight decay of 0.0005. To support the original resolution, we drop the first pooling layer and change the first convolution layer to a kernel size of 3, a stride size of 1, and a padding size of 1. An initial learning rate of 0.1 is decayed by a factor of 10 at the 100th and 150th epochs, following the same configuration as OAT [8]. (A model and optimizer sketch follows this table.) |
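
Since the table only quotes Algorithm 1 by name, the following is a minimal PyTorch sketch of one TAUFE training step. It assumes, in line with the paper's feature-deactivation idea, that the regularizer pulls the penultimate-layer activations of OOD samples toward the zero vector; the helper name `taufe_step` and a `model` that returns `(logits, features)` are illustrative assumptions, not the authors' released code (see the GitHub link above for that).

```python
import torch.nn.functional as F

def taufe_step(model, x_in, y_in, x_ood, lam, optimizer):
    """One TAUFE-style training step (sketch).

    Assumes `model(x)` returns (logits, penultimate_features).
    """
    optimizer.zero_grad()
    # Task loss on in-distribution data (classification shown here).
    logits, _ = model(x_in)
    task_loss = F.cross_entropy(logits, y_in)
    # Regularizer: push penultimate activations of OOD data toward zero.
    _, feat_ood = model(x_ood)
    reg_loss = feat_ood.pow(2).sum(dim=1).mean()
    # Combined objective; per the quoted text, lam is 0.1 for the CIFAR
    # experiments and 0.01 for ImageNet-10.
    loss = task_loss + lam * reg_loss
    loss.backward()
    optimizer.step()
    return loss.item()
```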
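
To make the dataset choices concrete, here is a sketch of loading CIFAR-10 as the in-distribution data and SVHN as one of the paper's OOD sources, using standard torchvision loaders. The batch size of 64 follows the quoted setup; the transform and the `./data` root path are placeholder assumptions.

```python
import torch
import torchvision
import torchvision.transforms as T

transform = T.ToTensor()  # placeholder; the paper's augmentation is not quoted

# In-distribution target data: CIFAR-10 at its native 32x32 resolution.
cifar10 = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=transform)

# One of the OOD sources mixed in for the CIFAR experiments: SVHN.
svhn = torchvision.datasets.SVHN(
    root="./data", split="train", download=True, transform=transform)

in_loader = torch.utils.data.DataLoader(cifar10, batch_size=64, shuffle=True)
ood_loader = torch.utils.data.DataLoader(svhn, batch_size=64, shuffle=True)
```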
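
The experiment-setup row specifies the model modification and optimizer fully, so the sketch below reconstructs it: ResNet-18 with its stem adapted to 32x32 inputs (3x3 convolution, stride 1, padding 1, first pooling layer dropped), SGD with momentum 0.9 and weight decay 0.0005, and an initial learning rate of 0.1 decayed by a factor of 10 at epochs 100 and 150. Starting from `torchvision.models.resnet18` is an assumption; the paper trains from scratch for 200 epochs.

```python
import torch
import torch.nn as nn
import torchvision

# ResNet-18 adapted for CIFAR, per the quoted setup: replace the 7x7
# stride-2 stem with a 3x3 stride-1 conv and drop the first pooling layer.
model = torchvision.models.resnet18(num_classes=10)
model.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1, bias=False)
model.maxpool = nn.Identity()

# SGD with momentum 0.9, weight decay 0.0005, initial learning rate 0.1.
optimizer = torch.optim.SGD(
    model.parameters(), lr=0.1, momentum=0.9, weight_decay=0.0005)

# Decay the learning rate by a factor of 10 at the 100th and 150th epochs;
# call scheduler.step() once per epoch over the 200-epoch run.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[100, 150], gamma=0.1)
```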