Aggregated Learning: A Vector-Quantization Approach to Learning Neural Network Classifiers
Authors: Masoumeh Soflaei, Hongyu Guo, Ali Al-Bashabsheh, Yongyi Mao, Richong Zhang
AAAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conducted extensive experiments, applying AgrLearn to the current art of deep learning architectures for image and text classification. Our experimental results suggest that AgrLearn brings significant gain in classification accuracy. |
| Researcher Affiliation | Academia | 1University of Ottawa, Ottawa, Canada, 2National Research Council Canada, 3Beijing Advanced Institution on Big Data and Brain Computing, Beihang University, Beijing, China |
| Pseudocode | Yes | Algorithm 1 Training in n-fold AgrLearn (see the aggregation sketch after the table) |
| Open Source Code | Yes | Our implementation of AgrLearn is available at https://github.com/SITE5039/AgrLearn |
| Open Datasets | Yes | Experiments are conducted on the CIFAR-10, CIFAR-100 datasets with two widely used deep network architectures, namely ResNet...using two benchmark sentence-classification datasets, Movie Review (Pang and Lee 2005) and Subjectivity (Pang and Lee 2004). |
| Dataset Splits | No | The paper specifies training and test splits (e.g., '50,000 training images, 10,000 test images' for CIFAR-10; '10% of random examples in each dataset for testing and the rest for training' for text classification) but does not explicitly mention or detail a separate validation split set aside for hyperparameter tuning. |
| Hardware Specification | No | The paper does not provide specific details regarding the hardware used for running experiments, such as GPU models, CPU specifications, or memory. |
| Software Dependencies | No | The paper mentions frameworks like ResNet, Wide ResNet, CNN, LSTM, and references TensorFlow (Abadi et al. 2016) and specific implementations (Liu 2017, Zagoruyko and Komodakis 2016a), but it does not specify any version numbers for these software components or libraries. |
| Experiment Setup | Yes | We use mini-batched backprop for 400 epochs with exactly the same hyper-parameter settings without dropout. Specifically, weight decay is 10^-4, and each mini-batch contains 64 aggregated training examples. The learning rate for the main network is set to 0.1 initially and decays by a factor of 10 after 100, 150, and 250 epochs. (See the optimizer sketch after the table.) |
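The record only names "Algorithm 1 Training in n-fold AgrLearn" without reproducing it. The following is a hypothetical sketch of what one training step of n-fold aggregated learning could look like, assuming that n examples are aggregated by channel-wise concatenation and that the model exposes n classification heads (one per aggregated example); the function and argument names here are illustrative, not the authors' code.

```python
# Hypothetical sketch of one n-fold AgrLearn training step.
# Assumptions (not stated in this record): inputs are aggregated by
# concatenation along the channel axis, and the model returns logits
# of shape (B, n, num_classes), one prediction per aggregated example.
import torch
import torch.nn.functional as F

def agrlearn_step(model, optimizer, batch_x, batch_y, n=2):
    """batch_x: (B*n, C, H, W) images; batch_y: (B*n,) integer labels."""
    B = batch_x.size(0) // n
    # Aggregate n examples into one input, e.g. by channel concatenation.
    x_agg = batch_x.view(B, n, *batch_x.shape[1:]).flatten(1, 2)  # (B, n*C, H, W)
    y_agg = batch_y.view(B, n)                                    # (B, n)
    logits = model(x_agg)  # assumed shape: (B, n, num_classes)
    # Average the per-example cross-entropy losses over the n heads.
    loss = sum(F.cross_entropy(logits[:, i], y_agg[:, i]) for i in range(n)) / n
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```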
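For the quoted experiment setup, a minimal PyTorch-style sketch of the optimizer and learning-rate schedule is given below. The learning rate (0.1), weight decay (10^-4), and decay milestones (epochs 100, 150, 250 over 400 epochs) come from the quoted text; the choice of SGD and momentum=0.9 are assumptions, since the excerpt does not name the optimizer.

```python
# Minimal sketch of the optimizer/schedule quoted in "Experiment Setup".
# SGD and momentum=0.9 are assumptions; lr, weight decay, and milestones
# are taken from the quoted paper text.
import torch

def make_optimizer_and_scheduler(model: torch.nn.Module):
    optimizer = torch.optim.SGD(
        model.parameters(),
        lr=0.1,             # initial learning rate from the quoted setup
        momentum=0.9,       # assumption: not stated in the excerpt
        weight_decay=1e-4,  # weight decay from the quoted setup
    )
    # Decay the learning rate by a factor of 10 after epochs 100, 150, 250.
    scheduler = torch.optim.lr_scheduler.MultiStepLR(
        optimizer, milestones=[100, 150, 250], gamma=0.1
    )
    return optimizer, scheduler
```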