Hierarchical Gaussian Mixture based Task Generative Model for Robust Meta-Learning

Authors: Yizhou Zhang, Jingchao Ni, Wei Cheng, Zhengzhang Chen, Liang Tong, Haifeng Chen, Yan Liu

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on benchmark datasets indicate the effectiveness of our method for both sample classification and novel task detection.
Researcher Affiliation | Collaboration | Yizhou Zhang¹, Jingchao Ni², Wei Cheng³, Zhengzhang Chen³, Liang Tong⁴, Haifeng Chen³, Yan Liu¹. ¹University of Southern California; ²AWS AI Labs; ³NEC Laboratories America; ⁴Stellar Cyber Inc. ¹{zhangyiz,yanliu.cs}@usc.edu; ²nijingchao@gmail.com; ³{weicheng,zchen,haifeng}@nec-labs.com; ⁴ltong@stellarcyber.ai
Pseudocode | Yes | Algorithm 1: Hierarchical Gaussian Mixture based Task Generative Model (HTGM). (An illustrative sketch of such a generative process is given after this table.)
Open Source Code | No | The paper does not provide an explicit statement about open-source code release or a link to a code repository.
Open Datasets | Yes | The first dataset is the Plain-Multi benchmark [52]. It includes four fine-grained image classification datasets, i.e., CUB-200-2011 (Bird), Describable Textures Dataset (Texture), FGVC of Aircraft (Aircraft), and FGVCx-Fungi (Fungi). The second dataset is the Art-Multi benchmark [53]... Moreover, we used the MiniImageNet dataset [47] to evaluate the case of uni-component distribution of tasks, which is discussed in Appendix D.6.
Dataset Splits | Yes | Both benchmarks were divided into the meta-training, meta-validation, and meta-test sets by following their corresponding papers.
Hardware Specification | Yes | We evaluated and trained all of the models on an RTX 6000 GPU with 24 GB memory.
Software Dependencies | No | The paper mentions the "Adam optimizer" and "ResNet-12" (implying common deep learning frameworks such as PyTorch or TensorFlow), but it does not specify versions for any key software components or libraries required to replicate the experiments.
Experiment Setup | Yes | For training, the Adam optimizer was used. Each batch contains 4 tasks. Each model was trained with 20000 episodes. The learning rate of the metric-based methods was 1e-3. The learning rates for the inner- and outer-loops of the optimization-based methods were 1e-3 and 1e-4, respectively. The weight decay was 1e-4. For HTGM, we set σ = 1.0, σ = 0.1, α = 0.5 (0.9) for 1-shot (5-shot) tasks. The number of mixture components r varies w.r.t. different datasets, and was grid-searched within {2, 4, 8, 16, 32}. (An illustrative configuration sketch follows below.)
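Since Algorithm 1 (HTGM) is only named in the Pseudocode row, the following is a minimal sketch of the general idea of a hierarchical Gaussian mixture over tasks: sample a mixture component, then class means around it, then sample embeddings around each class mean. Every name here (sample_episode, sigma_class, sigma_sample, the 64-dimensional embedding space) and the NumPy implementation are illustrative assumptions, not the authors' Algorithm 1 or released code.

```python
# Illustrative sketch only: hierarchical Gaussian mixture task generation
# (mixture component -> class means -> sample embeddings) in an assumed
# embedding space. Not the authors' Algorithm 1.
import numpy as np

rng = np.random.default_rng(0)

def sample_episode(component_means, mixture_weights, n_way=5, k_shot=1,
                   sigma_class=1.0, sigma_sample=0.1, dim=64):
    """Sample one N-way K-shot episode from a hierarchical Gaussian mixture."""
    # 1) Pick a mixture component (a "theme" of tasks).
    z = rng.choice(len(mixture_weights), p=mixture_weights)
    # 2) Sample per-class means around the chosen component's mean.
    class_means = component_means[z] + sigma_class * rng.standard_normal((n_way, dim))
    # 3) Sample embeddings for each class around its class mean.
    x = class_means[:, None, :] + sigma_sample * rng.standard_normal((n_way, k_shot, dim))
    y = np.repeat(np.arange(n_way), k_shot)
    return x.reshape(n_way * k_shot, dim), y, z

# Example usage: r = 4 mixture components in a 64-dimensional embedding space.
r, dim = 4, 64
component_means = rng.standard_normal((r, dim))
weights = np.full(r, 1.0 / r)
episode_x, episode_y, component = sample_episode(component_means, weights)
```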
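The Experiment Setup row reports concrete optimizer settings (Adam, 4 tasks per batch, 20000 episodes, learning rate 1e-3, weight decay 1e-4). Below is a hedged PyTorch-style sketch of how those settings could be wired together for a metric-based method; the model, features, and labels are placeholders (the paper's backbone is ResNet-12), and the separate inner-/outer-loop rates used by optimization-based baselines are not shown.

```python
# Sketch of the reported optimizer settings, assuming PyTorch. The model and
# data below are placeholders, not the paper's released code.
import torch
import torch.nn.functional as F

# Placeholder standing in for the ResNet-12 backbone plus classifier head.
model = torch.nn.Linear(64, 5)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)

num_episodes, tasks_per_batch = 20000, 4
for episode in range(num_episodes):
    optimizer.zero_grad()
    batch_loss = 0.0
    for _ in range(tasks_per_batch):
        # Placeholder features and labels for one 5-way task.
        feats = torch.randn(5, 64)
        labels = torch.randint(0, 5, (5,))
        batch_loss = batch_loss + F.cross_entropy(model(feats), labels)
    (batch_loss / tasks_per_batch).backward()
    optimizer.step()
```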