Auxiliary Modality Learning with Generalized Curriculum Distillation
Authors: Yu Shen, Xijun Wang, Peng Gao, Ming Lin
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also analyze the conditions under which AML works well from the optimization and data distribution perspectives. To guide various choices to achieve optimal performance using AML, we propose a novel method to assist in choosing the best auxiliary modality and estimating an upper bound performance before executing AML. In addition, we propose a new AML method using generalized curriculum distillation to enable more effective curriculum learning. Our method achieves the best performance compared to other SOTA methods. |
| Researcher Affiliation | Academia | 1Department of Computer Science, University of Maryland, College Park, Maryland, USA. Correspondence to: Yu Shen <yushen@umd.edu>. |
| Pseudocode | Yes | Algorithm 1 SAMD Training Paradigm. Input: training data from main modality I_M, training data from auxiliary modality I_A (chosen by method in Sec. 5.1). Output: student network weights θ_stu. Initialisation: training round number t, epoch number in each round k, loss correlation β, network weights θ_stu and θ_tea. for r = 1 to t do: reset teacher weights with student weights; for e = 1 to k do: feed I_M and I_A into teacher, update teacher weights θ_tea with Eq. 2; end for; for e = 1 to k do: feed I_M and I_A into student, and feed I_M into student, update student weights θ_stu with Eq. 1 and loss 3; end for; end for. (A runnable sketch of this loop follows the table.) |
| Open Source Code | No | Since there's no open-source code, we reimplement the original work, then apply our method to it. |
| Open Datasets | Yes | We use the Audi dataset (Geyer et al., 2020) and the Honda dataset (Ramanishka et al., 2018) in this experiment. Also, we use depth map (Type 1), frequency image (Type 2), and attention image (Type 3) as auxiliary modalities. |
| Dataset Splits | Yes | For the dataset, we use the CARLA (Dosovitskiy et al., 2017) simulator for training and testing, specifically CARLA 0.9.10, which includes 8 publicly available towns. We use 7 towns for training and hold out Town05 for evaluation, as in (Prakash et al., 2021). |
| Hardware Specification | Yes | All experiments are conducted using one Intel(R) Xeon(TM) W-2123 CPU, two Nvidia GTX 1080 GPUs, and 32 GB RAM. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers, such as particular deep learning frameworks or libraries. |
| Experiment Setup | Yes | We use the SGD optimizer with learning rate 0.001 and batch size 128 for training. The number of epochs is 2,000. The loss correlation β is set to different values for different knowledge distillation methods, following (Tian et al., 2020). We pick the epoch number in each round, k = 5, based on an ablation study over k = 1, 2, 5, 20. We set the round number n = 400 for the Audi dataset and n = 40 for the Honda dataset. (See the example invocation after the table.) |
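
The pseudocode row above flattens Algorithm 1 into a single run of text. The sketch below is a minimal, hedged PyTorch-style reading of the SAMD loop: each round, the teacher is reset from the student and trained for k epochs on both modalities, then the student, fed only the main modality, is trained for k epochs with a task term plus a β-weighted distillation term against the teacher. The names `samd_train`, `task_loss`, and `distill_loss`, the optional-auxiliary forward signature `model(x_main, x_aux=None)`, and the exact wiring of Eq. 1, Eq. 2, and loss 3 are assumptions made for illustration, not the authors' code (none is released).

```python
# A hypothetical sketch of Algorithm 1 (SAMD), not the authors' implementation.
# Assumes models whose forward pass is model(x_main, x_aux=None) and a loader
# yielding (main-modality batch, auxiliary-modality batch, label).
import copy
import torch
import torch.nn.functional as F

def samd_train(student, loader, rounds, epochs_per_round, beta, lr=0.001,
               task_loss=F.mse_loss, distill_loss=F.mse_loss):
    """Alternate teacher refresh and student distillation, per Algorithm 1."""
    for _ in range(rounds):                                    # t rounds
        teacher = copy.deepcopy(student)                       # reset teacher with student weights
        t_opt = torch.optim.SGD(teacher.parameters(), lr=lr)
        for _ in range(epochs_per_round):                      # k teacher epochs
            for x_main, x_aux, y in loader:
                loss = task_loss(teacher(x_main, x_aux), y)    # stands in for Eq. 2
                t_opt.zero_grad(); loss.backward(); t_opt.step()
        s_opt = torch.optim.SGD(student.parameters(), lr=lr)
        for _ in range(epochs_per_round):                      # k student epochs
            for x_main, x_aux, y in loader:
                with torch.no_grad():
                    t_out = teacher(x_main, x_aux)             # teacher sees both modalities
                s_out = student(x_main)                        # student sees only the main modality
                # task term (stand-in for Eq. 1) + beta-weighted distillation term (loss 3)
                loss = task_loss(s_out, y) + beta * distill_loss(s_out, t_out)
                s_opt.zero_grad(); loss.backward(); s_opt.step()
    return student.state_dict()                                # θ_stu
```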
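
To connect the experiment-setup row to the sketch, here is an example invocation using the reported hyperparameters for the Audi dataset. `model` and `audi_loader` are hypothetical placeholders, and the β value shown is illustrative, since the paper sets β per knowledge-distillation method following Tian et al. (2020).

```python
# Hypothetical call with the reported Audi-dataset settings.
weights = samd_train(
    student=model,
    loader=audi_loader,    # batches of (I_M, I_A, label), batch size 128
    rounds=400,            # n = 400 for Audi; n = 40 for Honda
    epochs_per_round=5,    # k = 5, chosen from an ablation over {1, 2, 5, 20}
    beta=1.0,              # illustrative; set per KD method (Tian et al., 2020)
    lr=0.001,              # SGD learning rate
)
```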