Deep Submodular Peripteral Networks
Authors: Gantavya Bhatt, Arnav Das, Jeff A. Bilmes
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | From Section 5 (Experiments): "We evaluate the effectiveness of our framework by training a DSPN to emulate a costly target oracle FL function. We assert that DSPN training is deemed successful if the subsets recovered by maximizing the learnt DSPN are (1) assigned high values by the target function (§5.1); and (2) are usable for a real downstream task such as training a predictive model (§5.2)." |
| Researcher Affiliation | Academia | University of Washington, Seattle, WA 98195 |
| Pseudocode | No | The paper describes methods textually and with diagrams but does not include explicit pseudocode or algorithm blocks. |
| Open Source Code | No | We plan to release the codebase in a separate GitHub repository under the name bhattg/Deep Submodular Peripteral Networks. |
| Open Datasets | Yes | Datasets: We consider four image classification datasets: Imagenette, Imagewoof, CIFAR100, and Imagenet100. |
| Dataset Splits | Yes | During the evaluation, both the learned set functions and the target oracle use a held-out ground set V′ of images, that is, V′ ∩ V = ∅. We search over β ∈ {0.01, 0.1, 0.5} and τ ∈ {1, 5, 10} and report the normalized target evaluation for Imagenet100 in Table 4 and Table 5. We select which hyperparameter configuration to use based on the normalized FL evaluation, which is described in Figure 4. |
| Hardware Specification | Yes | We train every DSPN on 2 NVIDIA A100 GPUs with a cyclic learning rate scheduler [94], using the Adam optimizer [47]. (A minimal sketch of this optimizer/scheduler setup follows the table.) |
| Software Dependencies | No | The paper mentions the 'Adam optimizer' and 'PyTorch [78]' but does not provide version numbers for these software dependencies, which are needed for reproducibility. |
| Experiment Setup | Yes | We list the hyperparameters we use for the peripteral loss in Table 2 and Table 3. We observed that β and τ are the most salient hyperparameters, since they control the curvature and hinge of the loss, respectively. (A sketch of the corresponding β/τ sweep follows the table.) |
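The hardware row names Adam with a cyclic learning-rate schedule but gives no further specifics. Below is a minimal PyTorch sketch of that configuration; the model architecture, base/max learning rates, and step size are placeholder assumptions, since none of them are reported in the excerpt.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a DSPN; the actual architecture is not given here.
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 1))

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # Adam [47]; LR assumed
# Cyclic learning-rate schedule [94]; base_lr/max_lr/step_size_up are assumptions.
scheduler = torch.optim.lr_scheduler.CyclicLR(
    optimizer, base_lr=1e-5, max_lr=1e-3, step_size_up=2000,
    cycle_momentum=False,  # required for Adam, which has no `momentum` argument
)

for step in range(10):            # training loop reduced to a stub
    x = torch.randn(32, 128)      # placeholder batch of embeddings
    loss = model(x).mean()        # placeholder for the peripteral loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()              # advance the cyclic schedule each step
```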
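Similarly, the β/τ sweep and its selection rule (pick the configuration with the best normalized FL evaluation) can be summarized in a short sketch. `train_dspn` and `normalized_fl_eval` are hypothetical helpers standing in for the paper's training routine and evaluation; only the grid values {0.01, 0.1, 0.5} and {1, 5, 10} come from the excerpt.

```python
from itertools import product

def select_hyperparameters(train_dspn, normalized_fl_eval):
    """Grid search over the peripteral-loss hyperparameters beta and tau."""
    betas = [0.01, 0.1, 0.5]  # controls the curvature of the loss
    taus = [1, 5, 10]         # controls the hinge of the loss
    best = None
    for beta, tau in product(betas, taus):
        dspn = train_dspn(beta=beta, tau=tau)
        # Evaluate on the held-out ground set V' (disjoint from V).
        score = normalized_fl_eval(dspn)
        if best is None or score > best[0]:
            best = (score, beta, tau)
    return best  # (best score, beta, tau)
```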