DIVA: Dataset Derivative of a Learning Task
Authors: Yonatan Dukler, Alessandro Achille, Giovanni Paolini, Avinash Ravichandran, Marzia Polito, Stefano Soatto
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To illustrate the flexibility of DIVA, we report experiments on sample auto-curation tasks such as outlier rejection, dataset extension, and automatic aggregation of multi-modal data. |
| Researcher Affiliation | Collaboration | Yonatan Dukler (1,2), Alessandro Achille (1), Giovanni Paolini (1), Avinash Ravichandran (1), Marzia Polito (1), Stefano Soatto (1). (1) Amazon Web Services, {aachille, paoling, ravinash, mpolito, soattos}@amazon.com; (2) Department of Mathematics, University of California, Los Angeles, ydukler@math.ucla.edu |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | No | The paper does not provide an explicit statement about the release of source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | For our experiments we use the CUB-200 (Welinder et al., 2010), FGVC-Aircraft (Maji et al., 2013), Stanford Cars (Krause et al., 2013), Caltech-256 (Griffin et al., 2007), Oxford Flowers 102 (Nilsback & Zisserman, 2008), MIT-67 Indoor (Quattoni & Torralba, 2009), Street View House Numbers (Netzer et al., 2011), and Oxford Pets (Parkhi et al., 2012) visual recognition and classification datasets. (A hedged loading sketch for several of these datasets follows the table.) |
| Dataset Splits | Yes | For Ren et al. (2018), we set aside 20% of the training samples as validation for the reweight step, but use all samples for the final training (results reported in parentheses in the paper's table). We can, of course, compute a validation loss using a separate validation set. However, as we will see in Section 3.3, we can also use a leave-one-out cross-validation loss directly on the training set, without any requirement of a separate validation set. (A sketch of the 80/20 hold-out follows the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions using specific models and pre-trained networks but does not provide specific software dependencies with version numbers (e.g., programming language, libraries, or frameworks with their respective versions) required for replication. |
| Experiment Setup | Yes | In all experiments, we use the network as a fixed feature extractor, and train a linear classifier on top of the network features using the weighted L2 loss of eq. (5), optimizing the sample weights with DIVA. ... We apply only 1-3 gradient optimization steps with a relatively large learning rate (≈ 0.1). This early stopping both regularizes the solution and decreases the wall-clock time required by the method. We initialize the per-sample weights to 1 for all samples. (A sketch of this pipeline follows the table.) |
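Since the paper does not name a data-loading framework, the following is a minimal sketch, assuming torchvision, of how several of the benchmarks listed in the Open Datasets row could be loaded. CUB-200 and MIT-67 Indoor are not bundled with torchvision and would need custom loaders; the preprocessing pipeline is an assumption, not the paper's configuration.

```python
# Hypothetical loading of several of the paper's public datasets via
# torchvision. The paper does not state which framework or transforms it used.
from torchvision import datasets, transforms

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])

# Benchmarks from the paper that ship with torchvision (CUB-200 and
# MIT-67 Indoor do not, and would require custom Dataset classes).
benchmarks = {
    "aircraft": datasets.FGVCAircraft("data/", split="train", download=True,
                                      transform=preprocess),
    "cars": datasets.StanfordCars("data/", split="train",  # download URL may
                                  transform=preprocess),   # need manual setup
    "caltech256": datasets.Caltech256("data/", download=True,
                                      transform=preprocess),
    "flowers102": datasets.Flowers102("data/", split="train", download=True,
                                      transform=preprocess),
    "svhn": datasets.SVHN("data/", split="train", download=True,
                          transform=preprocess),
    "pets": datasets.OxfordIIITPet("data/", split="trainval", download=True,
                                   transform=preprocess),
}
```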
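For the Dataset Splits row, here is a minimal sketch of the 20% validation hold-out the paper describes for the Ren et al. (2018) reweighting baseline. The use of torch.utils.data.random_split, the helper name reweight_split, and the seed are assumptions for illustration, not the paper's code.

```python
import torch
from torch.utils.data import TensorDataset, random_split

def reweight_split(dataset, val_frac=0.2, seed=0):
    """Hold out val_frac of the training samples for the reweighting step."""
    n_val = int(len(dataset) * val_frac)
    return random_split(
        dataset, [len(dataset) - n_val, n_val],
        generator=torch.Generator().manual_seed(seed),
    )

# Toy stand-in for one of the paper's datasets.
toy = TensorDataset(torch.randn(100, 512), torch.randint(0, 10, (100,)))
train_set, val_set = reweight_split(toy)  # 80 / 20 samples
```

The paper notes that all samples are still used for the final training run; the hold-out only drives the reweighting step.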
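For the Experiment Setup row, here is a minimal sketch of the described pipeline: features from a frozen backbone, a linear classifier fit in closed form under a weighted L2 (ridge) loss standing in for eq. (5), and 1-3 gradient steps with learning rate 0.1 on the per-sample weights, initialized to 1. The symbol lam, the ridge strength eps, and the use of a separate validation loss (the paper also supports a leave-one-out loss on the training set, Section 3.3) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def fit_linear(X, Y, lam, eps=1e-3):
    """Closed-form weighted ridge solution; differentiable in lam."""
    d = X.shape[1]
    A = X.T @ (lam[:, None] * X) + eps * torch.eye(d)
    return torch.linalg.solve(A, X.T @ (lam[:, None] * Y))

# Toy stand-ins for frozen-backbone features and one-hot labels.
X = torch.randn(200, 64)
Y = F.one_hot(torch.randint(0, 5, (200,)), num_classes=5).float()
Xv = torch.randn(50, 64)
Yv = F.one_hot(torch.randint(0, 5, (50,)), num_classes=5).float()

lam = torch.ones(X.shape[0], requires_grad=True)  # lam_i = 1 at init
opt = torch.optim.SGD([lam], lr=0.1)              # "relatively large" step

for _ in range(3):                                # early stopping: 1-3 steps
    W = fit_linear(X, Y, lam)                     # classifier as fn of lam
    val_loss = ((Xv @ W - Yv) ** 2).mean()        # validation L2 loss
    opt.zero_grad()
    val_loss.backward()                           # gradient w.r.t. weights
    opt.step()
```

The closed-form solve keeps the classifier a differentiable function of the sample weights, so backpropagating the validation loss yields a gradient with respect to the dataset weights, which is the role the dataset derivative plays in the paper.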