Self-training Avoids Using Spurious Features Under Domain Shift
Authors: Yining Chen, Colin Wei, Ananya Kumar, Tengyu Ma
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We verify our theory for spurious domain shift tasks on semi-synthetic Celeb-A and MNIST datasets. |
| Researcher Affiliation | Academia | Yining Chen, Colin Wei, Ananya Kumar, Tengyu Ma; Department of Computer Science, Stanford University; {cynnjjs, colinwei, ananya1, tengyuma}@stanford.edu |
| Pseudocode | No | The paper refers to "Algorithm 2.3" and describes the self-training update rules, but these appear as equations within the text rather than as a formally labeled "Algorithm" or "Pseudocode" block. A hedged sketch of such an update appears after this table. |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating that the source code for the methodology described is publicly available. |
| Open Datasets | Yes | We run simulations on semi-synthetic colored MNIST [19] and celeb A [21] datasets to verify the insights from our theory and show that they apply to multi-layer neural networks and datasets where the spurious features are not necessarily a subset of the input coordinates (Section 5). A hypothetical sketch of a spurious-color construction of this kind also follows this table. |
| Dataset Splits | No | The paper mentions training on source data and evaluating on held-out target samples, but it does not provide specific percentages or counts for training, validation, or test splits. There is no explicit mention of a validation set being used for hyperparameter tuning or early stopping. |
| Hardware Specification | No | The paper does not provide any specific details regarding the hardware used for running experiments, such as GPU or CPU models, or cloud computing specifications. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions, or other libraries). |
| Experiment Setup | No | The paper mentions training a "3-layer feed-forward network" and initializing entropy minimization, but it does not specify concrete setup details such as hyperparameter values (e.g., learning rate, batch size, number of epochs) or optimizer settings. The sketch after this table uses assumed values for all of these. |
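
Since the paper presents its self-training update only as in-text equations and reports no hyperparameters, the following is a minimal, hypothetical sketch of entropy-minimization self-training on a 3-layer feed-forward network. The hidden widths, optimizer, learning rate, and the choice of PyTorch are all assumptions, not details taken from the paper.

```python
# Hypothetical sketch (not the paper's released code): entropy-minimization
# self-training on unlabeled target-domain data, starting from a model that
# was first trained on the source domain. All hyperparameter values below
# are assumed, since the paper does not report them.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeedForward(nn.Module):
    """3-layer feed-forward classifier (hidden sizes are assumptions)."""
    def __init__(self, in_dim: int, hidden: int = 256, num_classes: int = 2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, x):
        return self.net(x)

def entropy_minimization_step(model, optimizer, target_batch):
    """One self-training step: minimize prediction entropy on unlabeled
    target-domain inputs."""
    logits = model(target_batch)
    probs = F.softmax(logits, dim=1)
    # Average Shannon entropy of the model's predictions on the target batch.
    entropy = -(probs * torch.log(probs + 1e-8)).sum(dim=1).mean()
    optimizer.zero_grad()
    entropy.backward()
    optimizer.step()
    return entropy.item()

# Assumed usage: initialize from a source-trained model, then iterate
# entropy minimization over batches of unlabeled target samples.
# model = FeedForward(in_dim=2 * 28 * 28)  # e.g., flattened 2-channel MNIST
# optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
# for x_target in target_loader:
#     entropy_minimization_step(model, optimizer, x_target)
```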
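
The Open Datasets row quotes the paper's use of a semi-synthetic colored MNIST in which a spurious feature (digit color) is predictive on the source domain but not on the target domain. The sketch below shows one common way such a split is constructed; the binarized labels, the two-channel coloring scheme, and the correlation strengths are assumptions and may not match the paper's exact construction.

```python
# Hypothetical sketch (not the paper's exact construction): a semi-synthetic
# colored-MNIST split where digit color is a spurious feature correlated with
# the label on the source domain but uninformative on the target domain.
# Labels are assumed to be binarized (e.g., digit < 5 -> 0, digit >= 5 -> 1),
# and `p_spurious` is an assumed value.
import numpy as np

def colorize(images, labels, p_spurious, rng):
    """Place each grayscale image into a 'red' or 'green' channel.

    With probability `p_spurious` the channel agrees with the binary label,
    so a classifier can exploit color instead of digit shape.
    """
    n = images.shape[0]
    agree = rng.random(n) < p_spurious
    colors = np.where(agree, labels, 1 - labels)  # channel 0 or 1
    colored = np.zeros((n, 2) + images.shape[1:], dtype=images.dtype)
    colored[np.arange(n), colors] = images
    return colored

# Assumed usage: color is highly predictive on the source domain (0.9) and
# uncorrelated with the label on the target domain (0.5), so a model that
# relies on the spurious color feature fails under the domain shift.
# rng = np.random.default_rng(0)
# x_source = colorize(train_images, train_labels, p_spurious=0.9, rng=rng)
# x_target = colorize(test_images, test_labels, p_spurious=0.5, rng=rng)
```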