Anchor Data Augmentation
Authors: Nora Schneider, Shirin Goshtasbpour, Fernando Perez-Cruz
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 4 reports empirical evidence that our approach can improve predictions, especially in over-parameterized settings. |
| Researcher Affiliation | Academia | ¹Computer Science Department, ETH Zurich, Zurich, Switzerland; ²Swiss Data Science Center, Zurich, Switzerland. nschneide@student.ethz.ch, shirin.goshtasbpour@inf.ethz.ch, fernando.perezcruz@sdsc.ethz.ch |
| Pseudocode | Yes | 3.3 Algorithm — Algorithm 1 (ADA: Minibatch generation). 1: Input: $L$ training data points $(X, Y)$; prior distribution for $\gamma$: $p(\gamma)$; $L \times q$ binary matrix $A$ with a one per row indicating the clustering assignment for each sample. 2: Output: $(\tilde{X}, \tilde{Y})$. 3: Sample $\gamma$ from $p(\gamma)$. 4: Projection matrix: $\Pi_A \leftarrow A(A^\top A)^{-1} A^\top$. 5: for $i = 0$ to rows of $X$ do 6: $\tilde{X}^{(i)}_{\gamma,A} \leftarrow \frac{X^{(i)} + (\sqrt{\gamma}-1)\,(\Pi_A)^{(i)} X}{1 + (\sqrt{\gamma}-1)\sum_j (\Pi_A)^{(ij)}}$, 7: $\tilde{Y}^{(i)}_{\gamma,A} \leftarrow \frac{Y^{(i)} + (\sqrt{\gamma}-1)\,(\Pi_A)^{(i)} Y}{1 + (\sqrt{\gamma}-1)\sum_j (\Pi_A)^{(ij)}}$. 8: end for. 9: return $(\tilde{X}_{\gamma,A}, \tilde{Y}_{\gamma,A})$. (A runnable Python sketch of this procedure appears after the table.) |
| Open Source Code | Yes | Our Python implementation of ADA is available at: https://github.com/noraschneider/anchordataaugmentation/ |
| Open Datasets | Yes | Data: We use the California housing dataset [19] and the Boston housing dataset [14]. Data: We use four of the five in-distribution datasets used in [49]. The validation and test data are expected to follow the same distribution as the training data. Airfoil Self-Noise (Airfoil) and NO2 [24] are both tabular datasets, whereas Exchange-Rate and Electricity [27] are time series datasets. |
| Dataset Splits | Yes | The validation set has 100,000 samples. We divide the datasets into train, validation, and test data randomly, as the authors of C-Mixup did. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or detailed computer specifications used for running its experiments. |
| Software Dependencies | No | The paper mentions 'Our Python implementation of ADA' but does not specify version numbers for Python or any specific software libraries or dependencies (e.g., PyTorch, TensorFlow, scikit-learn) required to replicate the experiment. |
| Experiment Setup | Yes | For the ADA and Local-Mixup experiments, we use hyperparameter tuning and grid search to find the optimal training parameters (batch size, learning rate, and number of epochs), Local-Mixup parameters (distance threshold), and ADA parameters (number of clusters, range of $\gamma$, and whether to use manifold augmentation). To be precise, we define $k \in \{2, 4, 6, 8, 10\}$ and specify $\beta_i = 1 + \frac{i}{k/2}$ (with $i \in \{1, \ldots, k/2\}$) and $\gamma \in \{\frac{1}{\beta_{k/2}}, \ldots, \frac{1}{\beta_1}, 1, \beta_1, \ldots, \beta_{k/2}\}$. $A$ is constructed using k-means clustering with $q = 8$. (A sketch of this $\gamma$ grid construction also follows the table.) |
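
For concreteness, here is a minimal NumPy sketch of Algorithm 1 as quoted in the Pseudocode row. This is not the authors' released implementation (that is linked in the Open Source Code row); the function name `ada_minibatch` and the dense-matrix formulation are our own assumptions.

```python
import numpy as np

def ada_minibatch(X, Y, A, gamma):
    """ADA minibatch generation (Algorithm 1), minimal sketch.

    X: (L, d) features; Y: (L,) targets;
    A: (L, q) binary cluster-assignment matrix (one 1 per row);
    gamma: anchor strength sampled from the prior p(gamma).
    """
    # Projection onto the column space of A: Pi_A = A (A^T A)^{-1} A^T.
    # With one-hot rows, A^T A is diagonal (the cluster sizes), so it
    # is invertible as long as every cluster is non-empty.
    Pi_A = A @ np.linalg.inv(A.T @ A) @ A.T
    c = np.sqrt(gamma) - 1.0
    # Row-wise normalizer: 1 + (sqrt(gamma) - 1) * sum_j (Pi_A)_{ij}.
    denom = 1.0 + c * Pi_A.sum(axis=1)
    X_aug = (X + c * (Pi_A @ X)) / denom[:, None]
    Y_aug = (Y + c * (Pi_A @ Y)) / denom
    return X_aug, Y_aug
```

Constructing `A` with k-means (q = 8, as in the Experiment Setup row) could then look like:

```python
from sklearn.cluster import KMeans

labels = KMeans(n_clusters=8, n_init=10).fit_predict(X)  # q = 8 clusters
A = np.eye(8)[labels]                                     # one-hot assignments
X_aug, Y_aug = ada_minibatch(X, Y, A, gamma=2.0)
```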
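
The $\gamma$ grid from the Experiment Setup row can be made concrete as well. The sketch below assumes the reconstructed grid $\{\frac{1}{\beta_{k/2}}, \ldots, \frac{1}{\beta_1}, 1, \beta_1, \ldots, \beta_{k/2}\}$ with even $k$; `gamma_grid` is a hypothetical helper, not from the paper's code.

```python
import numpy as np

def gamma_grid(k):
    """Candidate anchor strengths for ADA's grid search (assumes k even).

    beta_i = 1 + i / (k/2) for i = 1, ..., k/2; the grid is
    {1/beta_{k/2}, ..., 1/beta_1, 1, beta_1, ..., beta_{k/2}}.
    """
    i = np.arange(1, k // 2 + 1)
    beta = 1.0 + i / (k // 2)
    return np.concatenate([1.0 / beta[::-1], [1.0], beta])

print(gamma_grid(4))  # [0.5, 0.667, 1.0, 1.5, 2.0]; k is tuned over {2, 4, 6, 8, 10}
```

Since $\sqrt{\gamma} - 1 = 0$ at $\gamma = 1$, the grid always contains the identity augmentation that leaves the minibatch unchanged.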