Learning Compact Features via In-Training Representation Alignment

Authors: Xin Li, Xiangrui Li, Deng Pan, Yao Qiang, Dongxiao Zhu

AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Finally, we conduct large-scale experiments on both image and text classifications to demonstrate its superior performance to the strong baselines. |
| Researcher Affiliation | Academia | Department of Computer Science, Wayne State University, Detroit, Michigan 48202, USA. {xinlee, xiangruili, pan.deng, yao, dzhu}@wayne.edu |
| Pseudocode | No | The paper does not contain any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access to source code for the methodology described, nor does it state that code will be released. |
| Open Datasets | Yes | In this Section, we extensively evaluate the ITRA performance using benchmark datasets on both image classification (i.e., KMNIST (Clanuwat et al. 2018), FMNIST (Xiao, Rasul, and Vollgraf 2017), CIFAR10, CIFAR100 (Krizhevsky and Hinton 2009), STL10 (Coates, Ng, and Lee 2011) and ImageNet (Deng et al. 2009)) and text classification (i.e., AG's News, Amazon Reviews, Yahoo Answers and Yelp Reviews) tasks. |
| Dataset Splits | No | The paper mentions 'training data' and 'testing data' but does not explicitly provide details about training/validation/test dataset splits, percentages, or methodology. |
| Hardware Specification | Yes | The training time of ResNet-101 (CIFAR100) and ResNet-50 (ImageNet) using ITRA are 7.5% and 3.9% more than baseline on an RTX 3090 GPU. |
| Software Dependencies | No | The paper mentions software like the Huggingface transformers library, but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | The optimal hyperparameter value λ for each method is also reported. Results on other tuning parameter values as well as experimental details are provided in supplementary materials. ... For batch size, Table 3 demonstrates that the increase of batch size indeed helps the reduction of variation in feature learning (lower CE loss)... |
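
The Experiment Setup row refers to a tuning parameter λ and to batch size affecting the variation in feature learning. As a rough illustration of how a λ-weighted in-training alignment term is commonly combined with the cross-entropy loss, here is a minimal PyTorch sketch. The backbone, the mean-feature alignment surrogate, and all names (`SmallNet`, `train_step`, `lam`) are assumptions for illustration only, not the paper's actual ITRA objective or code.

```python
# Minimal sketch: cross-entropy plus a lambda-weighted alignment penalty.
# Assumptions: PyTorch; a toy backbone split into a feature extractor and a
# linear head; the alignment term is a simple mean-feature match between two
# halves of the mini-batch (an illustrative surrogate, not the paper's kernel).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SmallNet(nn.Module):
    def __init__(self, in_dim=784, feat_dim=128, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(nn.Linear(in_dim, feat_dim), nn.ReLU())
        self.head = nn.Linear(feat_dim, n_classes)

    def forward(self, x):
        z = self.features(x)       # learned feature representation
        return z, self.head(z)     # features and class logits

def train_step(model, x, y, optimizer, lam=0.1):
    """One training step: CE loss plus lam * alignment penalty, where the
    penalty pulls the mean features of two mini-batch halves together."""
    model.train()
    z, logits = model(x)
    ce = F.cross_entropy(logits, y)
    half = x.size(0) // 2
    align = (z[:half].mean(0) - z[half:].mean(0)).pow(2).sum()
    loss = ce + lam * align
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return ce.item(), align.item()

# Usage on random data; larger batches give lower-variance feature statistics
# per step, consistent with the paper's observation that increasing batch size
# reduces variation in feature learning.
model = SmallNet()
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x = torch.randn(64, 784)
y = torch.randint(0, 10, (64,))
ce_loss, align_loss = train_step(model, x, y, opt, lam=0.1)
```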