A Simple Cache Model for Image Recognition

Authors: Emin Orhan

NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We used deep ResNet [5] and DenseNet [7] models trained on the CIFAR-10, CIFAR-100, and ImageNet (ILSVRC2012) datasets. Standard data augmentation was used to double the training set size for the CIFAR-10 and CIFAR-100 datasets. For CIFAR-10 and CIFAR-100, we used the training set to train the models, the validation set to optimize the hyper-parameters of the cache models and finally reported the error rates on the test set. For the ImageNet dataset, we took a pre-trained ResNet50 model, split the validation set into two, used the first half to optimize the hyper-parameters of the cache models and reported the error rates on the second half. We compared the performance of the baseline models with the performance of two types of cache model. Table 1 shows the error rates of the models with or without a cache component on the three datasets. |
| Researcher Affiliation | Academia | Emin Orhan (aeminorhan@gmail.com), Baylor College of Medicine & New York University |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. Figure 1 shows a schematic illustration of the model rather than an algorithm. |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating that source code for the described methodology is publicly available. |
| Open Datasets | Yes | As our baseline models, we used deep ResNet [5] and DenseNet [7] models trained on the CIFAR-10, CIFAR-100, and ImageNet (ILSVRC2012) datasets. Standard data augmentation was used to double the training set size for the CIFAR-10 and CIFAR-100 datasets. For CIFAR-10 and CIFAR-100, we used all items in the (augmented) training set to generate the keys stored in the cache (90K items in total). |
| Dataset Splits | Yes | For CIFAR-10 and CIFAR-100, we used the training set to train the models, the validation set to optimize the hyper-parameters of the cache models and finally reported the error rates on the test set. For the ImageNet dataset, we took a pre-trained ResNet50 model, split the validation set into two, used the first half to optimize the hyper-parameters of the cache models and reported the error rates on the second half. |
| Hardware Specification | No | The paper mentions a computational constraint (using the entire ImageNet training set to generate cache keys was not feasible) but does not specify the hardware used for the experiments. |
| Software Dependencies | No | The paper mentions that pre-trained models were used, but it does not list the software frameworks, libraries, or version numbers needed to reproduce the experiments. |
| Experiment Setup | Yes | The model thus has only two hyper-parameters, i.e. θ and λ, which we optimize through a simple grid-search procedure on held-out validation data (we search over the ranges 10 ≤ θ ≤ 90 and 0.1 ≤ λ ≤ 0.9). For the CIFAR-10 and CIFAR-100 datasets, we used all items in the (augmented) training set to generate the keys stored in the cache (90K items in total). For the ImageNet dataset, using the entire training set was not computationally feasible, hence we used a random subset of 275K items (275 items per class) from the training set to generate the keys. |
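
Since the paper releases no code, the following Python sketch illustrates how a cache component of this kind is typically wired up, assuming a continuous key-value cache in the style of Grave et al.: keys are feature vectors of stored training items, values are their class labels, θ acts as an inverse temperature on the similarity kernel, and λ mixes the cache distribution with the network's softmax. The kernel and the mixing rule are assumptions for illustration, not the paper's verified implementation.

```python
# Minimal sketch of a cache prediction step (assumed form, see note above).
import numpy as np

def cache_predict(query, keys, labels, p_model, theta=50.0, lam=0.5):
    """Mix a network's softmax output with a cache built from training items.

    query   : (d,) feature vector of the test item
    keys    : (n, d) feature vectors of the cached training items
    labels  : (n,) integer class labels of the cached items
    p_model : (c,) softmax output of the baseline network for the test item
    """
    # Cosine-style similarities, sharpened by the inverse temperature theta.
    q = query / np.linalg.norm(query)
    k = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    sims = np.exp(theta * (k @ q))                 # (n,)
    # Cache distribution: similarity mass accumulated per class, normalized.
    p_cache = np.zeros_like(p_model)
    np.add.at(p_cache, labels, sims)
    p_cache /= p_cache.sum()
    # Convex mixture of cache and model predictions, weighted by lam.
    return lam * p_cache + (1.0 - lam) * p_model

# Toy usage with random stand-in features (90K items, as for CIFAR).
rng = np.random.default_rng(0)
keys = rng.normal(size=(90_000, 64))
labels = rng.integers(0, 10, size=90_000)
p_model = np.full(10, 0.1)
print(cache_predict(rng.normal(size=64), keys, labels, p_model))
```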
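
The ImageNet cache keys come from a class-balanced random subset (275 items per class, 275K in total). A short sketch of that sampling step follows; the `labels` array is a stand-in, since loading the real dataset is out of scope here.

```python
# Class-balanced sampling of cache items: 275 per class from ~1.28M images.
import numpy as np

rng = np.random.default_rng(0)
labels = rng.integers(0, 1000, size=1_281_167)   # stand-in for real labels

subset = np.concatenate([
    rng.choice(np.flatnonzero(labels == c), size=275, replace=False)
    for c in range(1000)
])
assert subset.size == 275_000   # 275 items per class -> 275K cache keys
```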
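
Finally, the quoted hyper-parameter protocol (split the held-out data in two, grid-search over 10 ≤ θ ≤ 90 and 0.1 ≤ λ ≤ 0.9 on the first half, report the error rate on the second half) can be sketched as below. The grid resolution and the `evaluate()` stub are illustrative assumptions.

```python
# Grid search over (theta, lam) on one validation half, report on the other.
import numpy as np

rng = np.random.default_rng(0)

def evaluate(theta, lam, idx):
    """Hypothetical stand-in: error rate of the cache model with
    hyper-parameters (theta, lam) on the items indexed by idx."""
    return float(rng.random())  # replace with a real model evaluation

n_val = 50_000                  # e.g. the ImageNet validation set
perm = rng.permutation(n_val)
tune_idx, test_idx = perm[: n_val // 2], perm[n_val // 2 :]

best = None
for theta in np.linspace(10, 90, 9):        # 10 <= theta <= 90
    for lam in np.linspace(0.1, 0.9, 9):    # 0.1 <= lam <= 0.9
        err = evaluate(theta, lam, tune_idx)
        if best is None or err < best[0]:
            best = (err, theta, lam)

_, theta_star, lam_star = best
print("reported error:", evaluate(theta_star, lam_star, test_idx))
```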