Group-Wise Dynamic Dropout Based on Latent Semantic Variations

Authors: Zhiwei Ke, Zhiwei Wen, Weicheng Xie, Yi Wang, Linlin Shen

AAAI 2020, pp. 11229-11236 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | The proposed approach was evaluated in comparison with the baseline and several state-of-the-art adaptive dropouts over four public datasets of Fashion-MNIST, CIFAR-10, CIFAR-100 and SVHN. |
| Researcher Affiliation | Academia | 1 Computer Vision Institute, Shenzhen University, Shenzhen, China; 2 Dongguan University of Technology, Dongguan, China; 3 Shenzhen Institute of Artificial Intelligence and Robotics for Society, PR China; 4 Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen University, PR China |
| Pseudocode | Yes | Algorithm 1 outlines the main steps of our feature density estimation. (A hedged sketch of such grid-based density estimation follows this table.) |
| Open Source Code | No | The paper does not include an unambiguous statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | The Fashion-MNIST (FM) (Xiao, Rasul, and Vollgraf 2017) is a dataset of Zalando's article images consisting of 60k training samples and 10k testing samples. [...] The CIFAR-10 (C10) (Krizhevsky and Hinton 2009) dataset contains 60k color images belonging to 10 classes [...]. CIFAR-100 (C100) (Krizhevsky and Hinton 2009) is a database with 100 classes and also has 50k training samples and 10k testing samples. The Street View House Numbers (SVHN) (Netzer et al. 2011) dataset contains 73,257 training samples and 26,032 testing samples. (A loading sketch for these datasets follows this table.) |
| Dataset Splits | No | The paper provides training and testing split sizes (e.g., 60k training and 10k testing for Fashion-MNIST) but does not explicitly mention a separate validation split or how one was derived from the training set, if used. |
| Hardware Specification | Yes | We run our experiments with ResNet-18 (He et al. 2016) of 512 neurons in the last FC layer on a 4-kernel Nvidia TITAN GPU card. |
| Software Dependencies | No | The paper mentions software components like the "SGD optimizer" but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | The batch size and learning rate are 64 and 0.01, respectively. [...] The projected features are partitioned into 25, 5^2 = 25 and 3^3 = 27 equally-spaced grids in the 1D, 2D and 3D projection spaces, respectively. [...] In particular, we normalized the image intensity from Fashion-MNIST with 0-1 normalization and that from CIFAR-100 with z-score normalization for image processing. Taking FGSM for example, in the L∞ norm, the normalized perturbation intensities ε = 0.03, 0.06, 0.12 correspond to changing 8, 16, 32 pixels for the Fashion-MNIST images, while ε = 0.015, 0.03, 0.06 correspond to changing 1, 2, 4 pixels for the CIFAR-100 images, respectively. The iteration step in BIM and PGD is set to 10 with a step size of ε/10. (A hedged sketch of this attack and training configuration follows the table.) |
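
The Pseudocode row cites Algorithm 1 for feature density estimation, and the Experiment Setup row reports that projected features are partitioned into equally-spaced grids (e.g., 5^2 = 25 cells in 2D). Below is a minimal sketch of grid-based density estimation under those constraints, assuming a PCA-style projection; the function name `grid_density` and all implementation details are illustrative, not the authors' Algorithm 1.

```python
import numpy as np


def grid_density(features, n_dims=2, bins_per_dim=5):
    """Estimate feature density over equally-spaced grid cells in a
    low-dimensional projection (5^2 = 25 cells for the 2D case)."""
    # Center the features and project onto the top principal directions.
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    projected = centered @ vt[:n_dims].T  # shape: (n_samples, n_dims)

    # Equally-spaced bin edges along each projected dimension.
    edges = [
        np.linspace(projected[:, d].min(), projected[:, d].max(), bins_per_dim + 1)
        for d in range(n_dims)
    ]

    # Count samples per grid cell and normalize to a density estimate.
    counts, _ = np.histogramdd(projected, bins=edges)
    return counts / counts.sum()


# Example: density over 512-D features, matching the 512-neuron FC layer
# mentioned in the Hardware Specification row (random data for illustration).
density = grid_density(np.random.randn(1000, 512))
print(density.shape)  # (5, 5)
```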
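
All four datasets in the Open Datasets row are available through torchvision, which makes the data side of a reproduction straightforward. Here is a minimal loading sketch mirroring the normalization described in the Experiment Setup row (0-1 scaling for Fashion-MNIST, z-score for CIFAR-100); the CIFAR-100 mean/std values are commonly used channel statistics, not numbers taken from the paper.

```python
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader

# ToTensor() already performs 0-1 scaling of pixel intensities.
fm_transform = transforms.ToTensor()
# Z-score normalization for CIFAR-100 (assumed channel statistics).
c100_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.5071, 0.4865, 0.4409),
                         std=(0.2673, 0.2564, 0.2762)),
])

fm = torchvision.datasets.FashionMNIST("data", train=True, download=True,
                                       transform=fm_transform)
c10 = torchvision.datasets.CIFAR10("data", train=True, download=True,
                                   transform=transforms.ToTensor())
c100 = torchvision.datasets.CIFAR100("data", train=True, download=True,
                                     transform=c100_transform)
svhn = torchvision.datasets.SVHN("data", split="train", download=True,
                                 transform=transforms.ToTensor())

# Batch size 64, per the Experiment Setup row.
loader = DataLoader(c100, batch_size=64, shuffle=True)
```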
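
The Experiment Setup row also reports SGD with a learning rate of 0.01 and FGSM/BIM/PGD attacks in the L∞ norm, the iterative attacks using 10 steps of size ε/10. The sketch below expresses that configuration in PyTorch for inputs scaled to [0, 1]; the model is a placeholder and none of this code comes from the authors' implementation.

```python
import torch
import torch.nn.functional as F
import torchvision

model = torchvision.models.resnet18(num_classes=10)       # placeholder backbone
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # lr per the setup row


def fgsm(model, x, y, eps):
    """Single-step L-infinity attack, e.g. eps in {0.03, 0.06, 0.12}."""
    x = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()


def bim(model, x, y, eps, steps=10):
    """Iterative attack (BIM/PGD-style): 10 steps of size eps/10."""
    x = x.clone().detach()
    x_adv = x.clone()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv.detach() + (eps / steps) * grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)  # project into the eps-ball
        x_adv = x_adv.clamp(0, 1).detach()
    return x_adv
```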