Dataset Augmentation in Feature Space
Authors: Terrance DeVries, Graham W. Taylor
ICLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In all experiments, we trained an LSTM-based sequence autoencoder in order to learn a feature space from the available training examples. The results of our tests are summarized in Table 1. (A sketch of such an autoencoder follows the table.) |
| Researcher Affiliation | Academia | Terrance DeVries and Graham W. Taylor, School of Engineering, University of Guelph, Guelph, ON N1G 2W1, Canada. {terrance,gwtaylor}@uoguelph.ca |
| Pseudocode | No | The paper describes algorithmic steps and equations (e.g., Equation 1, 2, 3) but does not present them in a structured pseudocode or algorithm block. |
| Open Source Code | No | The paper does not contain any statement about making its source code publicly available or provide a link to a code repository. |
| Open Datasets | Yes | The UJI Pen Characters dataset (v2) contains 11,640 instances of 97 different characters handwritten by 60 participants (Llorens et al., 2008). For our first quantitative test we use the Arabic Digits dataset (Lichman, 2013), which contains 8,800 samples of time series mel-frequency cepstrum coefficients (MFCCs) extracted from audio clips of spoken Arabic digits. [...] URL http://archive.ics.uci.edu/ml. Our second quantitative test was conducted on the Australian Sign Language Signs dataset (AUSLAN). AUSLAN was produced by Kadous (2002). The final time series dataset we considered was the UCF Kinect action recognition dataset (Ellis et al., 2013). In our experiments we consider two commonly used small-scale image datasets: MNIST and CIFAR-10. |
| Dataset Splits | Yes | The autoencoders were trained using Adam (Kingma & Ba, 2015) with an initial learning rate of 0.001, which was reduced by half whenever no improvement was observed in the validation set for 10 epochs. To evaluate our data augmentation techniques we used the official train/test split and trained ten models with different random weight initializations. For evaluation, we perform cross validation with 5 folds, as is common practice for the AUSLAN dataset. To compare to previous results, we used 4-fold cross validation. MNIST consists of 28×28 greyscale images containing handwritten digits from 0 to 9. There are 60,000 training images and 10,000 test images in the official split. CIFAR-10 consists of 32×32 colour images containing objects in ten generic object categories. This dataset is typically split into 50,000 training and 10,000 test images. |
| Hardware Specification | No | No specific hardware details (like GPU/CPU models, memory, or detailed computer specifications) used for running experiments were mentioned in the paper. |
| Software Dependencies | No | The paper mentions 'Adam' as the optimizer and 'LSTM' layers as network components, but does not provide specific version numbers for any software dependencies or libraries. |
| Experiment Setup | Yes | Each hidden layer, including the context vector, had the same number of hidden units and a dropout probability of p = 0.2. The autoencoders were trained using Adam (Kingma & Ba, 2015) with an initial learning rate of 0.001, which was reduced by half whenever no improvement was observed in the validation set for 10 epochs. For each sample in the dataset we found the 10 nearest in-class neighbours by searching in feature space. Both MLP and SA use the same number of hidden units in each layer: 256 per layer for MNIST and 1024 per layer for CIFAR-10. ... For both extrapolation experiments we use three nearest neighbours per sample and γ = 0.5 when generating new data. (Sketches of this training recipe and of the extrapolation step follow the table.) |
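
The Research Type row quotes the paper's central architecture: an LSTM-based sequence autoencoder whose context vector defines the feature space in which augmentation is performed. Below is a minimal sketch of such an encoder-decoder. The use of PyTorch, the class name `SequenceAutoencoder`, and the layer sizes are our assumptions for illustration; the paper does not release code or name a framework.

```python
import torch
import torch.nn as nn

class SequenceAutoencoder(nn.Module):
    """Minimal LSTM sequence autoencoder: encode a sequence into a fixed-length
    context vector, then decode the sequence back from that vector."""

    def __init__(self, input_dim, hidden_dim, dropout=0.2):
        super().__init__()
        self.encoder = nn.LSTM(input_dim, hidden_dim, batch_first=True)
        self.decoder = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        self.output = nn.Linear(hidden_dim, input_dim)
        self.dropout = nn.Dropout(dropout)  # p = 0.2, as quoted above

    def encode(self, x):
        # x: (batch, time, input_dim) -> context: (batch, hidden_dim)
        _, (h_n, _) = self.encoder(x)
        return self.dropout(h_n[-1])

    def forward(self, x):
        context = self.encode(x)
        # Feed the context vector to the decoder at every time step.
        repeated = context.unsqueeze(1).repeat(1, x.size(1), 1)
        decoded, _ = self.decoder(repeated)
        return self.output(decoded)
```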
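
The Dataset Splits and Experiment Setup rows both quote the optimization recipe: Adam with an initial learning rate of 0.001, halved whenever the validation set shows no improvement for 10 epochs. The sketch below reproduces that recipe under the same PyTorch assumption; `ReduceLROnPlateau` matches the quoted schedule but is our choice, and the random tensors merely stand in for the real time-series data (the dimensionality is illustrative).

```python
import torch
from torch import nn, optim

# Reuses the SequenceAutoencoder sketch above; 13 input channels is illustrative.
model = SequenceAutoencoder(input_dim=13, hidden_dim=256)
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
# Halve the learning rate after 10 epochs without validation improvement.
scheduler = optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=10)

# Toy stand-ins for a real train/validation split (batch, time, channels).
train_x = torch.randn(64, 40, 13)
val_x = torch.randn(16, 40, 13)

for epoch in range(100):
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(train_x), train_x)   # reconstruct the input sequence
    loss.backward()
    optimizer.step()

    model.eval()
    with torch.no_grad():
        val_loss = criterion(model(val_x), val_x)
    scheduler.step(val_loss)
```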
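
The Experiment Setup row also quotes the augmentation step itself: for each sample, find its nearest in-class neighbours in feature space and extrapolate between context vectors, generating c' = (c_j - c_k)γ + c_j with γ = 0.5 and three neighbours per sample. A minimal sketch follows; the scikit-learn neighbour search and the function name `extrapolate_in_class` are our assumptions.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def extrapolate_in_class(contexts, k=3, gamma=0.5):
    """contexts: (n_samples, dim) context vectors belonging to a single class.
    Returns k synthetic vectors per sample via feature-space extrapolation."""
    nn_index = NearestNeighbors(n_neighbors=k + 1).fit(contexts)
    _, idx = nn_index.kneighbors(contexts)       # idx[:, 0] is the sample itself
    new_vectors = []
    for j, neighbours in enumerate(idx):
        c_j = contexts[j]
        for n in neighbours[1:]:                 # skip the self-match
            c_k = contexts[n]
            new_vectors.append((c_j - c_k) * gamma + c_j)
    return np.stack(new_vectors)

# Example: 100 samples of one class with 256-dimensional context vectors.
synthetic = extrapolate_in_class(np.random.randn(100, 256), k=3, gamma=0.5)
```

The synthetic context vectors can then be decoded back to input space by the trained decoder, or consumed directly by a classifier operating in feature space, to augment the original training set.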