A Simple Cache Model for Image Recognition
Authors: Emin Orhan
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We used deep ResNet [5] and DenseNet [7] models trained on the CIFAR-10, CIFAR-100, and ImageNet (ILSVRC2012) datasets. Standard data augmentation was used to double the training set size for the CIFAR-10 and CIFAR-100 datasets. For CIFAR-10 and CIFAR-100, we used the training set to train the models, the validation set to optimize the hyper-parameters of the cache models and finally reported the error rates on the test set. For the ImageNet dataset, we took a pre-trained ResNet50 model, split the validation set into two, used the first half to optimize the hyper-parameters of the cache models and reported the error rates on the second half. We compared the performance of the baseline models with the performance of two types of cache model. Table 1 shows the error rates of the models with or without a cache component on the three datasets. |
| Researcher Affiliation | Academia | Emin Orhan (aeminorhan@gmail.com), Baylor College of Medicine & New York University |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks; Figure 1 presents a graphical illustration rather than an algorithm. |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating that source code for the described methodology is publicly available. |
| Open Datasets | Yes | As our baseline models, we used deep ResNet [5] and DenseNet [7] models trained on the CIFAR-10, CIFAR-100, and ImageNet (ILSVRC2012) datasets. Standard data augmentation was used to double the training set size for the CIFAR-10 and CIFAR-100 datasets. For CIFAR-10 and CIFAR-100, we used all items in the (augmented) training set to generate the keys stored in the cache (90K items in total). |
| Dataset Splits | Yes | For CIFAR-10 and CIFAR-100, we used the training set to train the models, the validation set to optimize the hyper-parameters of the cache models and finally reported the error rates on the test set. For the ImageNet dataset, we took a pre-trained ResNet50 model, split the validation set into two, used the first half to optimize the hyper-parameters of the cache models and reported the error rates on the second half. |
| Hardware Specification | No | The paper mentions a computational feasibility constraint (using the entire ImageNet training set to build the cache was not feasible) but does not specify any hardware details such as GPU or CPU models or memory. |
| Software Dependencies | No | The paper mentions that pre-trained baseline models were used, but it does not name specific software frameworks, libraries, or version numbers. |
| Experiment Setup | Yes | The model thus has only two hyper-parameters, i.e. θ and λ, which we optimize through a simple grid-search procedure on held-out validation data (we search over the ranges 10 ≤ θ ≤ 90 and 0.1 ≤ λ ≤ 0.9). For the CIFAR-10 and CIFAR-100 datasets, we used all items in the (augmented) training set to generate the keys stored in the cache (90K items in total). For the ImageNet dataset, using the entire training set was not computationally feasible, hence we used a random subset of 275K items (275 items per class) from the training set to generate the keys. |
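
The rows above indicate that the cache component has only two hyper-parameters, θ and λ, but the review does not reproduce the underlying equations. The snippet below is a minimal sketch of how such a cache could produce class probabilities, assuming a continuous-cache style formulation: a θ-sharpened softmax over similarities between a query's embedding and the cached keys, mixed with the baseline network's softmax output via λ. The function names, dot-product similarity, and normalisation choices are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def cache_predictions(query_feats, cache_keys, cache_labels, n_classes, theta):
    """Class probabilities from the cache: a theta-sharpened softmax over
    similarities between query embeddings and the stored keys, with each
    key's weight routed to its label.

    query_feats : (n_query, d) embeddings of test items (assumed normalised)
    cache_keys  : (n_cache, d) embeddings of the stored training items
    cache_labels: (n_cache,)   integer labels of the stored items
    """
    sims = query_feats @ cache_keys.T                          # (n_query, n_cache)
    logits = theta * (sims - sims.max(axis=1, keepdims=True))  # stabilised logits
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)              # softmax over cache items
    one_hot = np.eye(n_classes)[cache_labels]                  # (n_cache, n_classes)
    return weights @ one_hot                                    # (n_query, n_classes)

def combined_predictions(p_model, p_cache, lam):
    """Mix the baseline network's softmax output with the cache distribution."""
    return lam * p_cache + (1.0 - lam) * p_model
```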
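
The Dataset Splits row states that, for ImageNet, the validation set was split into two halves, one for tuning the cache hyper-parameters and one for reporting error rates. A minimal sketch of such a split follows; the shuffling and random seed are assumptions, since the paper only says the validation set was split into two.

```python
import numpy as np

def split_validation(n_val, seed=0):
    """Split validation indices into a tuning half and a reporting half.
    The permutation and seed are assumptions; the paper does not specify
    how the two halves were formed."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_val)
    half = n_val // 2
    return idx[:half], idx[half:]
```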
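
The Experiment Setup row says θ and λ were tuned by a simple grid search over 10 ≤ θ ≤ 90 and 0.1 ≤ λ ≤ 0.9 on held-out validation data. The sketch below reuses the hypothetical `cache_predictions` and `combined_predictions` helpers above; the grid step sizes (10 for θ, 0.1 for λ) are assumptions, as the paper does not state them.

```python
import itertools
import numpy as np

def grid_search_cache_hparams(p_model_val, val_feats, val_labels,
                              cache_keys, cache_labels, n_classes):
    """Pick (theta, lam) minimising validation error.
    Grid step sizes are assumptions; only the ranges come from the paper."""
    best_theta, best_lam, best_err = None, None, np.inf
    for theta, lam in itertools.product(range(10, 91, 10),
                                        np.arange(0.1, 1.0, 0.1)):
        p_cache = cache_predictions(val_feats, cache_keys, cache_labels,
                                    n_classes, theta)
        preds = combined_predictions(p_model_val, p_cache, lam).argmax(axis=1)
        err = (preds != val_labels).mean()
        if err < best_err:
            best_theta, best_lam, best_err = theta, lam, err
    return best_theta, best_lam, best_err
```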