Fast and Compact Bilinear Pooling by Shifted Random Maclaurin
Authors: Tan Yu, Xiaoyun Li, Ping Li (pp. 3243-3251)
AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Systematic experiments conducted on four public datasets demonstrate the effectiveness and efficiency of the proposed FCBN. |
| Researcher Affiliation | Industry | Tan Yu, Xiaoyun Li, Ping Li, Cognitive Computing Lab, Baidu Research, 10900 NE 8th St., Bellevue, WA 98004, USA, {tanyuuynat, lixiaoyun996, pingli98}@gmail.com |
| Pseudocode | Yes | Algorithm 1 Random Maclaurin (RM) ... Algorithm 2 Tensor Sketch (TS) ... Algorithm 3 Shifted Random Maclaurin (SRM) (a minimal RM sketch is given after the table) |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository. |
| Open Datasets | Yes | In fine-grained recognition task, we test on Caltech-UCSD birds (CUB) dataset (Welinder et al. 2010) and FGVC-Aircraft Benchmark (Aircraft). ... In scene recognition task, we test on MIT-scene dataset (MIT) (Quattoni and Torralba 2009)... In texture recognition task, we test on Describable Texture (DTD) (Cimpoi et al. 2014)... |
| Dataset Splits | No | CUB contains 5,994 training images and 5,794 testing images from 200 categories. Aircraft contains 6,667 training images and 3,333 testing images from 100 categories. The MIT-scene dataset (MIT) (Quattoni and Torralba 2009) contains 4,014 training images and 1,339 testing images from 67 classes. DTD (Cimpoi et al. 2014) contains 1,880 training images and 3,760 testing images from 47 classes. No explicit validation split information is provided. |
| Hardware Specification | No | The paper states, 'We implement all methods in the same server,' but does not provide any specific hardware details such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions using a 'VGG-16 network' and 'autograd tool in existing deep learning frameworks' but does not specify any software names with version numbers. |
| Experiment Setup | Yes | On all experiments, we use a 448 × 448 input image size and obtain a 28 × 28 × 512 feature map. The training is conducted in two stages. In the first stage, all layers except the last fully-connected (FC) layer are fixed and only the weights of the last FC layer are updated. The batch size is 32 and the initial learning rate is set to 1. After 30 epochs, we drop the learning rate by a factor of 10 every 10 epochs until the 60-th epoch. In the second stage, we update the weights of all layers. The batch size is set to 32 and the initial learning rate is set to 1 × 10^-2. After 30 epochs, the learning rate is decreased to 1 × 10^-3 and the training process finishes at 40 epochs. (A schematic of this two-stage schedule is sketched below the table.) |
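
The Pseudocode row lists Algorithms 1–3 (RM, TS, SRM) without reproducing them. As a point of reference, below is a minimal NumPy sketch of the standard degree-2 Random Maclaurin map (Algorithm 1), assuming Rademacher (±1) projections as in Kar and Karnick's construction; it is not the paper's Shifted Random Maclaurin (Algorithm 3), whose shift term is not reproduced here.

```python
import numpy as np

def random_maclaurin(X, D, seed=0):
    """Degree-2 Random Maclaurin feature map (cf. Algorithm 1, RM).

    X : (n, d) array of local descriptors.
    D : target projected dimension.
    Returns Z of shape (n, D) such that, in expectation,
    Z @ Z.T approximates the degree-2 polynomial (bilinear) kernel (X @ X.T) ** 2.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Two independent Rademacher (+/-1) projection matrices.
    W1 = rng.choice([-1.0, 1.0], size=(d, D))
    W2 = rng.choice([-1.0, 1.0], size=(d, D))
    # Each feature is the product of two random linear projections,
    # scaled so the kernel estimate is unbiased.
    return (X @ W1) * (X @ W2) / np.sqrt(D)

# Example: sketch 28 * 28 = 784 conv descriptors of dimension 512 down to D = 4096.
Z = random_maclaurin(np.random.randn(784, 512), D=4096)
```

In compact bilinear pooling, such per-location features would typically be sum-pooled over the 28 × 28 spatial grid to form the image descriptor.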
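
The Experiment Setup row quotes a two-stage fine-tuning schedule. The sketch below mirrors those quoted hyperparameters in PyTorch; the `backbone` and `classifier` modules are hypothetical stand-ins (not the paper's VGG-16 bilinear model), and the choice of SGD is an assumption, since the optimizer is not quoted in the table.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the paper's VGG-16 backbone and bilinear classifier head.
backbone = nn.Sequential(nn.Conv2d(3, 512, 3, padding=1),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten())
classifier = nn.Linear(512, 200)  # e.g. 200 classes for CUB

# Stage 1: freeze everything except the last FC layer; batch size 32,
# initial lr = 1, dropped by 10x every 10 epochs after epoch 30, 60 epochs total.
for p in backbone.parameters():
    p.requires_grad = False
opt1 = torch.optim.SGD(classifier.parameters(), lr=1.0)
sched1 = torch.optim.lr_scheduler.MultiStepLR(opt1, milestones=[30, 40, 50], gamma=0.1)

# Stage 2: update all layers; batch size 32, initial lr = 1e-2,
# decayed to 1e-3 after epoch 30, training ends at epoch 40.
for p in backbone.parameters():
    p.requires_grad = True
opt2 = torch.optim.SGD(list(backbone.parameters()) + list(classifier.parameters()), lr=1e-2)
sched2 = torch.optim.lr_scheduler.MultiStepLR(opt2, milestones=[30], gamma=0.1)
```

Each scheduler's `step()` would be called once per epoch inside the corresponding training loop.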