MEAL: Multi-Model Ensemble via Adversarial Learning

Authors: Zhiqiang Shen, Zhankui He, Xiangyang Xue (pp. 4886-4893)

AAAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments on CIFAR-10/100, SVHN and ImageNet datasets demonstrate the effectiveness of our MEAL method.
Researcher Affiliation Academia 1Shanghai Key Laboratory of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai, China 2Beckman Institute, University of Illinois at Urbana-Champaign, IL, USA 3School of Data Science, Fudan University, Shanghai, China
Pseudocode Yes Algorithm 1 Multi-Model Ensemble via Adversarial Learning (MEAL).
Open Source Code No The paper does not contain an explicit statement offering access to the source code for the methodology described, nor does it provide a link to a code repository.
Open Datasets Yes Extensive experiments on CIFAR-10/100, SVHN and ImageNet datasets demonstrate the effectiveness of our MEAL method. (Krizhevsky 2009), (Netzer et al. 2011), (Deng et al. 2009)
Dataset Splits Yes The Street View House Number (SVHN) dataset (Netzer et al. 2011) consists of 32x32 colored digit images...Following previous works (Goodfellow et al. 2013; Huang et al. 2016; 2017a; Liu et al. 2017), we split a subset of 6,000 images for validation, and train on the remaining images without data augmentation. The ILSVRC 2012 classification dataset (Deng et al. 2009) consists of 1000 classes, with a number of 1.2 million training images and 50,000 validation images.
Hardware Specification No The paper mentions the use of the PyTorch platform but does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies No The paper states, 'We implement our method on the PyTorch (Paszke et al. 2017) platform,' but does not specify the version number of PyTorch or other software dependencies.
Experiment Setup Yes Our whole framework is trained end-to-end by the following objective function: L = α·L_Sim + β·L_GAN, where α and β are trade-off weights. We set them as 1 in our experiments by cross validation. We also use weighted coefficients to balance the contributions of different blocks. For 3-block networks, we use [0.01, 0.05, 1], and [0.001, 0.01, 0.05, 0.1, 1] for 5-block ones. We use a probability of 0.2 for dropping nodes during training. On CIFAR datasets, the standard training budget is 300 epochs. It appears that more than 400 epochs is the optimal choice, and our model will fully converge at about 500 epochs.
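The quoted objective combines a similarity loss and an adversarial loss, each aggregated over network blocks with the per-block coefficients listed above. A minimal sketch of that weighted combination is below; the function name `meal_loss` and the use of plain scalar losses are illustrative assumptions, not the authors' released code.

```python
# Hedged sketch of the MEAL objective L = alpha * L_Sim + beta * L_GAN,
# with per-block weighting as described in the experiment setup.
# The paper sets alpha = beta = 1 by cross validation and uses block
# weights [0.01, 0.05, 1] for 3-block networks.

def meal_loss(sim_losses, gan_losses, block_weights, alpha=1.0, beta=1.0):
    """Combine per-block similarity and adversarial losses.

    sim_losses, gan_losses: per-block scalar losses (one entry per block).
    block_weights: coefficients balancing block contributions.
    alpha, beta: trade-off weights between the two loss terms.
    """
    l_sim = sum(w * l for w, l in zip(block_weights, sim_losses))
    l_gan = sum(w * l for w, l in zip(block_weights, gan_losses))
    return alpha * l_sim + beta * l_gan

# Example with dummy per-block loss values for a 3-block network:
sim = [0.5, 0.4, 0.3]
gan = [0.2, 0.1, 0.05]
loss = meal_loss(sim, gan, block_weights=[0.01, 0.05, 1])
```

In a real PyTorch training loop the same arithmetic would be applied to scalar loss tensors, so the combined `loss` remains differentiable for the end-to-end training the paper describes.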