Better Supervisory Signals by Observing Learning Paths
Authors: Yi Ren, Shangmin Guo, Danica J. Sutherland
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To further support this hypothesis, we conduct experiments on a synthetic Gaussian problem (Figure 1(a); details in Appendix C), where we can easily calculate p(y | x) for each sample. |
| Researcher Affiliation | Academia | Yi Ren UBC renyi.joshua@gmail.com Shangmin Guo University of Edinburgh s.guo@ed.ac.uk Danica J. Sutherland UBC and Amii dsuth@cs.ubc.ca |
| Pseudocode | Yes | Algorithm 1: Filter-KD. |
| Open Source Code | Yes | Code, including the experiments producing the figures and a Filter-KD implementation, is available at https://github.com/Joshua-Ren/better_supervisory_signal. |
| Open Datasets | Yes | The CIFAR10H dataset (Peterson et al., 2019) is one attempt at a different p_tar, using multiple human annotators to estimate p_tar. ... We visualize the learning path of data points while training a ResNet18 (He et al., 2016) on CIFAR10 ... The experiments are conducted on CIFAR (Figure 7) and Tiny ImageNet (Table 1) |
| Dataset Splits | Yes | We early-stop the student's training in all settings. ... ESKD uses a teacher stopped early based on validation accuracy. ... Check the early stopping criterion with the help of a validation set. ... make a train/valid/test split with ratio [0.05, 0.05, 0.9] |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory, or cloud instance types used for experiments. It only mentions general computing resources like 'WestGrid, and Compute Canada'. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions or specific library versions). |
| Experiment Setup | Yes | We train an MLP with 3 hidden layers, each with 128 hidden units and ReLU activations. ... we visualize the learning path of data points while training a ResNet18 (He et al., 2016) on CIFAR10 for 200 epochs. ... we focus on self-distillation and a fixed temperature τ = 1 ... α controls the cut-off frequency of low-pass filter (0.05 here). |
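The Filter-KD setup quoted above (self-distillation, τ = 1, α = 0.05 as a low-pass cut-off) amounts to exponentially averaging the model's softmax outputs across epochs and using the smoothed probabilities as soft targets. Below is a minimal NumPy sketch of that filtering step only; the EMA form `s ← (1 − α)·s + α·p` and the function name are assumptions for illustration, not the authors' exact implementation (see their repository linked in the table for that).

```python
import numpy as np

def ema_soft_labels(pred_history, alpha=0.05):
    """Low-pass filter a sequence of per-epoch softmax predictions.

    Each step blends the newest prediction into a running average:
        s <- (1 - alpha) * s + alpha * p
    A small alpha (0.05 in the paper's setting) gives a low cut-off
    frequency, smoothing out epoch-to-epoch oscillations in the
    model's predictions for a given sample.
    """
    s = pred_history[0].copy()
    for p in pred_history[1:]:
        s = (1.0 - alpha) * s + alpha * p
    return s

# Toy example: one 3-class sample whose per-epoch predictions oscillate
# between two distributions; the filtered soft label settles in between.
preds = [np.array([0.9, 0.05, 0.05]) if t % 2 == 0
         else np.array([0.2, 0.7, 0.1]) for t in range(50)]
soft = ema_soft_labels(preds, alpha=0.05)
```

Because each update is a convex combination of probability vectors, the filtered label remains a valid distribution, which is what lets it serve directly as a supervisory signal.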