Learning multiple visual domains with residual adapters
Authors: Sylvestre-Alvise Rebuffi, Hakan Bilen, Andrea Vedaldi
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our proposed architectures are thoroughly evaluated empirically (section 5). To this end, our second contribution is to introduce the visual decathlon challenge (fig. 1 and section 4), a new benchmark for multiple-domain learning in image recognition. |
| Researcher Affiliation | Academia | Sylvestre-Alvise Rebuffi¹, Hakan Bilen¹˒², Andrea Vedaldi¹ — ¹Visual Geometry Group, University of Oxford ({srebuffi,hbilen,vedaldi}@robots.ox.ac.uk); ²School of Informatics, University of Edinburgh |
| Pseudocode | No | The paper describes architectural components and processes but does not include any formal pseudocode or algorithm blocks. |
| Open Source Code | No | We are planning to make the data and an evaluation server public soon. |
| Open Datasets | Yes | The decathlon challenge combines ten well-known datasets from multiple visual domains: FGVC-Aircraft Benchmark [24]...CIFAR100 [19]...Daimler Mono Pedestrian Classification Benchmark (DPed) [26]...Describable Texture Dataset (DTD) [7]...The German Traffic Sign Recognition (GTSR) Benchmark [36]...Flowers102 [28]...ILSVRC12 (ImNet) [32]...Omniglot [20]...The Street View House Numbers (SVHN) [27]...UCF101 [35]. |
| Dataset Splits | Yes | For each dataset, we specify a training, validation and test subsets. ... For the rest, we use 60%, 20% and 20% of the data for training, validation, and test respectively. For the ILSVRC12, since the test labels are not available, we use the original validation subset as the test subset and randomly sample a new validation set from their training split. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments (e.g., GPU/CPU models, memory specifications). |
| Software Dependencies | No | The paper mentions using 'ResNets' but does not specify any software libraries or frameworks with version numbers (e.g., PyTorch, TensorFlow, scikit-learn versions). |
| Experiment Setup | Yes | In all experiments we choose to use the powerful ResNets [13] as base architectures...we chose the ResNet28 model [40] which consists of three blocks of four residual units. Each residual unit contains 3×3 convolutional, BN and ReLU modules (fig. 2). The network accepts 64×64 images as input, downscales the spatial dimensions by two at each block and ends with a global average pooling and a classifier layer followed by a softmax. We set the number of filters to 64, 128, 256 for these blocks respectively. Each network is optimized to minimize its cross-entropy loss with stochastic gradient descent. The network is run for 80 epochs and the initial learning rate of 0.1 is lowered to 0.01 and then 0.001 gradually. |
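The Experiment Setup row pins down the optimizer schedule (SGD for 80 epochs, learning rate 0.1 lowered to 0.01 and then 0.001) but not the epochs at which the drops occur. A minimal sketch of such a step-decay schedule follows; the milestone epochs (40, 60) and the decay factor of 0.1 are assumptions for illustration, not values stated in the paper.

```python
def step_lr(epoch, base_lr=0.1, milestones=(40, 60), gamma=0.1):
    """Step-decay learning-rate schedule.

    Starts at base_lr and multiplies by gamma at each milestone epoch.
    The milestones (40, 60) are assumed here; the paper only says the
    rate is lowered from 0.1 to 0.01 and then 0.001 over 80 epochs.
    """
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr

# Over the 80-epoch run, the schedule visits exactly the three rates
# quoted in the setup: 0.1, 0.01, 0.001.
rates = sorted({round(step_lr(e), 6) for e in range(80)}, reverse=True)
```

In a training loop this would be applied by resetting the optimizer's learning rate at the start of each epoch before iterating over mini-batches.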