Anchor Data Augmentation

Authors: Nora Schneider, Shirin Goshtasbpour, Fernando Perez-Cruz

NeurIPS 2023

Reproducibility assessment (variable, result, and supporting LLM response):
Research Type: Experimental. "Section 4 reports empirical evidence that our approach can improve predictions, especially in over-parameterized settings."
Researcher Affiliation: Academia. "Computer Science Department, ETH Zurich, Zurich, Switzerland; Swiss Data Science Center, Zurich, Switzerland. nschneide@student.ethz.ch, shirin.goshtasbpour@inf.ethz.ch, fernando.perezcruz@sdsc.ethz.ch"
Pseudocode: Yes. Section 3.3 gives Algorithm 1 (ADA: minibatch generation):

1: Input: $L$ training data points $(X, Y)$; prior distribution for $\gamma$: $p(\gamma)$; $L \times q$ binary matrix $A$ with a one per row indicating the clustering assignment for each sample
2: Output: $(\tilde{X}_{\gamma,A}, \tilde{Y}_{\gamma,A})$
3: Sample $\gamma$ from $p(\gamma)$
4: Projection matrix: $\Pi_A \leftarrow A (A^\top A)^{-1} A^\top$
5: for each row $i$ of $X$ do
6:   $\tilde{X}^{(i)}_{\gamma,A} \leftarrow \frac{X^{(i)} + (\sqrt{\gamma}-1)\,(\Pi_A)^{(i)} X}{1 + (\sqrt{\gamma}-1)\sum_j (\Pi_A)^{(ij)}}$
7:   $\tilde{Y}^{(i)}_{\gamma,A} \leftarrow \frac{Y^{(i)} + (\sqrt{\gamma}-1)\,(\Pi_A)^{(i)} Y}{1 + (\sqrt{\gamma}-1)\sum_j (\Pi_A)^{(ij)}}$
8: end for
9: return $(\tilde{X}_{\gamma,A}, \tilde{Y}_{\gamma,A})$
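As a concrete illustration, here is a minimal NumPy sketch of Algorithm 1. The function name, the uniform sampling of $\gamma$ over a finite grid, and the integer cluster labels standing in for the one-hot matrix $A$ are our assumptions, not the authors' released implementation (see the repository linked below for that).

```python
# Minimal NumPy sketch of Algorithm 1; assumptions noted in the lead-in above.
import numpy as np

def ada_minibatch(X, Y, clusters, gamma_grid, rng=None):
    """Return one anchor-augmented copy (X_tilde, Y_tilde) of a minibatch.

    X: (L, d) features, Y: (L,) targets, clusters: (L,) integer cluster ids.
    """
    rng = np.random.default_rng(rng)
    clusters = np.asarray(clusters)
    L, q = X.shape[0], clusters.max() + 1

    # One-hot cluster-assignment matrix A (L x q, a single 1 per row).
    A = np.zeros((L, q))
    A[np.arange(L), clusters] = 1.0

    # Projection onto the column space of A: Pi_A = A (A^T A)^{-1} A^T.
    # For a one-hot A this amounts to averaging within each cluster.
    Pi = A @ np.linalg.pinv(A.T @ A) @ A.T

    gamma = rng.choice(np.asarray(gamma_grid))  # sample gamma from its prior
    c = np.sqrt(gamma) - 1.0

    # Row-wise normalizer 1 + (sqrt(gamma) - 1) * sum_j Pi[i, j] (lines 6-7).
    denom = 1.0 + c * Pi.sum(axis=1)
    X_tilde = (X + c * (Pi @ X)) / denom[:, None]
    Y_tilde = (Y + c * (Pi @ Y)) / denom
    return X_tilde, Y_tilde
```

Since each row of $\Pi_A$ sums to 1 here, the update scales deviations from the cluster mean by $1/\sqrt{\gamma}$: $\gamma > 1$ contracts samples toward their cluster means, $\gamma < 1$ expands them, and $\gamma = 1$ leaves the batch unchanged.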
Open Source Code: Yes. "Our Python implementation of ADA is available at: https://github.com/noraschneider/anchordataaugmentation/"
Open Datasets: Yes. "Data: We use the California housing dataset [19] and the Boston housing dataset [14]." "Data: We use four of the five in-distribution datasets used in [49]. The validation and test data are expected to follow the same distribution as the training data. Airfoil Self-Noise (Airfoil) and NO2 [24] are both tabular datasets, whereas Exchange-Rate and Electricity [27] are time series datasets."
Dataset Splits: Yes. "The validation set has 100,000 samples." "We divide the datasets into train, validation, and test data randomly, as the authors of C-Mixup did."
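As an illustration of such a split, the following sketch uses scikit-learn; the 60/20/20 fractions and the synthetic stand-in data are assumptions, since the paper defers the exact fractions to the C-Mixup setup.

```python
# Hypothetical random train/validation/test split in the spirit of the
# C-Mixup protocol; the 60/20/20 ratio is an assumed placeholder, not a
# fraction stated in the paper.
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X, y = rng.normal(size=(1000, 8)), rng.normal(size=1000)  # stand-in data

X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, train_size=0.6, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=0)
```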
Hardware Specification: No. The paper does not report the hardware used for its experiments, such as specific GPU/CPU models, processor types, or machine specifications.
Software Dependencies: No. The paper mentions "Our Python implementation of ADA" but does not specify version numbers for Python or for any libraries (e.g., PyTorch, TensorFlow, scikit-learn) required to replicate the experiments.
Experiment Setup: Yes. "For the ADA and Local-Mixup experiments, we use hyperparameter tuning and grid search to find the optimal training parameters (batch size, learning rate, and number of epochs), Local-Mixup parameters (distance threshold $\epsilon$), and ADA parameters (number of clusters, range of $\gamma$, and whether to use manifold augmentation). To be precise, we define $k \in \{2, 4, 6, 8, 10\}$ and specify $\beta_i = 1 + \frac{1}{k/2}\,i$ (with $i \in \{1, \ldots, k/2\}$) and $\gamma \in \{\frac{1}{\beta_{k/2}}, \frac{1}{\beta_{k/2-1}}, \ldots, \frac{1}{\beta_1}, 1, \beta_1, \ldots, \beta_{k/2-1}, \beta_{k/2}\}$. $A$ is constructed using k-means clustering with $q = 8$."
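For concreteness, the sketch below builds the $\gamma$ grid and the cluster labels used to form $A$; the helper names and the use of scikit-learn's KMeans are our assumptions rather than code from the authors' repository.

```python
# Illustrative construction of the ADA hyperparameters described above.
import numpy as np
from sklearn.cluster import KMeans

def make_gamma_grid(k):
    # beta_i = 1 + i / (k/2) for i = 1, ..., k/2, then a grid symmetric
    # around 1: {1/beta_{k/2}, ..., 1/beta_1, 1, beta_1, ..., beta_{k/2}}.
    betas = 1.0 + np.arange(1, k // 2 + 1) / (k / 2)
    return np.concatenate([1.0 / betas[::-1], [1.0], betas])

def cluster_labels(X, q=8, seed=0):
    # k-means with q = 8 clusters, as in the paper; returns integer labels
    # that index the rows of the one-hot assignment matrix A.
    return KMeans(n_clusters=q, random_state=seed, n_init=10).fit_predict(X)

# Example: k = 4 gives betas (1.5, 2.0) and the symmetric gamma grid
# [0.5, 0.667, 1.0, 1.5, 2.0].
print(make_gamma_grid(4))
```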