Learning to Teach with Dynamic Loss Functions

Authors: Lijun Wu, Fei Tian, Yingce Xia, Yang Fan, Tao Qin, Lai Jian-Huang, Tie-Yan Liu

NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Extensive experiments on real world tasks including image classification and neural machine translation demonstrate that our method significantly improves the quality of various student models." |
| Researcher Affiliation | Collaboration | 1 Sun Yat-sen University, Guangzhou, China; 2 Microsoft Research, Beijing, China; 3 University of Science and Technology of China, Hefei, China |
| Pseudocode | Yes | Algorithm 1: Training Teacher Model µθ (see the sketch below this table) |
| Open Source Code | No | The paper does not provide a link to, or an explicit statement about releasing, the source code for the method. |
| Open Datasets | Yes | "We choose three widely adopted datasets: the MNIST, CIFAR-10 and CIFAR-100 datasets." |
| Dataset Splits | No | The paper mentions a "dev set" and a "development dataset" but does not give the specific percentages or counts for the training, validation, and test splits needed for reproduction. |
| Hardware Specification | No | The paper does not report the hardware used for the experiments (e.g., GPU/CPU models, memory). |
| Software Dependencies | No | The paper mentions the Adam [26] optimizer but does not list software dependencies with version numbers (e.g., Python or TensorFlow/PyTorch versions). |
| Experiment Setup | Yes | "The teacher models are optimized with Adam [26] and the detailed setting is in Appendix." |
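For context on the Pseudocode row: Algorithm 1 in the paper trains a teacher model µθ whose output parameterizes the loss function used to train the student, with the teacher updated according to the student's performance on a development set. The sketch below is a hypothetical, heavily simplified illustration of that bilevel structure, not the authors' code: it assumes a PyTorch-style setup, the names `teacher`, `dynamic_loss`, and `student_forward` are made up for illustration, and a single unrolled student step stands in for the paper's reverse-mode differentiation through many student updates.

```python
# Hypothetical sketch of teacher-model training with a dynamic loss
# (not the authors' implementation; a one-step unrolled simplification).
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Toy tensors standing in for a training batch and a development batch.
x_train, y_train = torch.randn(32, 10), torch.randint(0, 3, (32,))
x_dev, y_dev = torch.randn(32, 10), torch.randint(0, 3, (32,))

# Student: a linear classifier kept as explicit tensors so the inner
# update stays differentiable with respect to the teacher parameters.
w = torch.zeros(10, 3, requires_grad=True)
b = torch.zeros(3, requires_grad=True)

# Teacher: produces per-class coefficients that reshape the student loss.
teacher = torch.nn.Linear(1, 3)
teacher_opt = torch.optim.Adam(teacher.parameters(), lr=1e-2)

def student_forward(x, w, b):
    return x @ w + b

def dynamic_loss(logits, y, coeffs):
    # Class-weighted cross entropy: one simple way a teacher-set
    # parameter vector can define a "dynamic" loss function.
    per_example = F.cross_entropy(logits, y, reduction="none")
    return (coeffs.softmax(dim=-1)[y] * per_example).mean()

inner_lr = 0.1
for step in range(100):
    coeffs = teacher(torch.ones(1, 1)).squeeze(0)

    # Inner step: student update under the teacher-defined loss;
    # create_graph=True keeps the update differentiable w.r.t. the teacher.
    train_loss = dynamic_loss(student_forward(x_train, w, b), y_train, coeffs)
    gw, gb = torch.autograd.grad(train_loss, (w, b), create_graph=True)
    w_new, b_new = w - inner_lr * gw, b - inner_lr * gb

    # Outer step: dev-set loss of the updated student drives the teacher.
    dev_loss = F.cross_entropy(student_forward(x_dev, w_new, b_new), y_dev)
    teacher_opt.zero_grad()
    dev_loss.backward()
    teacher_opt.step()

    # Commit the student update, detached from the teacher's graph.
    w = w_new.detach().requires_grad_(True)
    b = b_new.detach().requires_grad_(True)
```

The key design point the sketch tries to convey is that the development-set loss is evaluated on the student parameters produced by the teacher-defined training loss, so its gradient can flow back into the teacher; the paper achieves this over full student training trajectories rather than the single step shown here.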