Self-training Avoids Using Spurious Features Under Domain Shift
Authors: Yining Chen, Colin Wei, Ananya Kumar, Tengyu Ma
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We verify our theory for spurious domain shift tasks on semi-synthetic Celeb-A and MNIST datasets. |
| Researcher Affiliation | Academia | Yining Chen, Colin Wei, Ananya Kumar, Tengyu Ma; Department of Computer Science, Stanford University; {cynnjjs, colinwei, ananya1, tengyuma}@stanford.edu |
| Pseudocode | No | The paper refers to "Algorithm 2.3" and describes the self-training update rules, but these appear as equations within the text rather than as a formally labeled "Algorithm" or "Pseudocode" block. A hedged sketch of such an update appears after this table. |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating that the source code for the methodology described is publicly available. |
| Open Datasets | Yes | We run simulations on semi-synthetic colored MNIST [19] and celeb A [21] datasets to verify the insights from our theory and show that they apply to multi-layer neural networks and datasets where the spurious features are not necessarily a subset of the input coordinates (Section 5). A hypothetical sketch of a spurious-color construction of this kind also follows this table. |
| Dataset Splits | No | The paper mentions training on source data and evaluating on held-out target samples, but it does not provide specific percentages or counts for training, validation, or test splits. There is no explicit mention of a validation set being used for hyperparameter tuning or early stopping. |
| Hardware Specification | No | The paper does not provide any specific details regarding the hardware used for running experiments, such as GPU or CPU models, or cloud computing specifications. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions, or other libraries). |
| Experiment Setup | No | The paper mentions training a "3-layer feed-forward network" and initializing entropy minimization, but it does not specify concrete setup details such as hyperparameter values (e.g., learning rate, batch size, number of epochs) or optimizer settings. The sketch after this table uses assumed values for all of these. |
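
Since the paper presents its self-training update only as in-text equations and reports no hyperparameters, the following is a minimal, hypothetical sketch of entropy-minimization self-training on a 3-layer feed-forward network. The hidden widths, optimizer, learning rate, and the choice of PyTorch are all assumptions, not details taken from the paper.

```python
# Hypothetical sketch (not the paper's released code): entropy-minimization
# self-training on unlabeled target-domain data, starting from a model that
# was first trained on the source domain. All hyperparameter values below
# are assumed, since the paper does not report them.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeedForward(nn.Module):
    """3-layer feed-forward classifier (hidden sizes are assumptions)."""
    def __init__(self, in_dim: int, hidden: int = 256, num_classes: int = 2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, x):
        return self.net(x)

def entropy_minimization_step(model, optimizer, target_batch):
    """One self-training step: minimize prediction entropy on unlabeled
    target-domain inputs."""
    logits = model(target_batch)
    probs = F.softmax(logits, dim=1)
    # Average Shannon entropy of the model's predictions on the target batch.
    entropy = -(probs * torch.log(probs + 1e-8)).sum(dim=1).mean()
    optimizer.zero_grad()
    entropy.backward()
    optimizer.step()
    return entropy.item()

# Assumed usage: initialize from a source-trained model, then iterate
# entropy minimization over batches of unlabeled target samples.
# model = FeedForward(in_dim=2 * 28 * 28)  # e.g., flattened 2-channel MNIST
# optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
# for x_target in target_loader:
#     entropy_minimization_step(model, optimizer, x_target)
```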
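
The Open Datasets row quotes the paper's use of a semi-synthetic colored MNIST in which a spurious feature (digit color) is predictive on the source domain but not on the target domain. The sketch below shows one common way such a split is constructed; the binarized labels, the two-channel coloring scheme, and the correlation strengths are assumptions and may not match the paper's exact construction.

```python
# Hypothetical sketch (not the paper's exact construction): a semi-synthetic
# colored-MNIST split where digit color is a spurious feature correlated with
# the label on the source domain but uninformative on the target domain.
# Labels are assumed to be binarized (e.g., digit < 5 -> 0, digit >= 5 -> 1),
# and `p_spurious` is an assumed value.
import numpy as np

def colorize(images, labels, p_spurious, rng):
    """Place each grayscale image into a 'red' or 'green' channel.

    With probability `p_spurious` the channel agrees with the binary label,
    so a classifier can exploit color instead of digit shape.
    """
    n = images.shape[0]
    agree = rng.random(n) < p_spurious
    colors = np.where(agree, labels, 1 - labels)  # channel 0 or 1
    colored = np.zeros((n, 2) + images.shape[1:], dtype=images.dtype)
    colored[np.arange(n), colors] = images
    return colored

# Assumed usage: color is highly predictive on the source domain (0.9) and
# uncorrelated with the label on the target domain (0.5), so a model that
# relies on the spurious color feature fails under the domain shift.
# rng = np.random.default_rng(0)
# x_source = colorize(train_images, train_labels, p_spurious=0.9, rng=rng)
# x_target = colorize(test_images, test_labels, p_spurious=0.5, rng=rng)
```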