Neural Complexity Measures

Authors: Yoonho Lee, Juho Lee, Sung Ju Hwang, Eunho Yang, Seungjin Choi

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We propose Neural Complexity (NC), a meta-learning framework for predicting generalization. Our model learns a scalar complexity measure through interactions with many heterogeneous tasks in a data-driven way. The trained NC model can be added to the standard training loss to regularize any task learner in a standard supervised learning scenario. We contrast NC's approach against existing manually-designed complexity measures and other meta-learning models, and we validate NC's performance on multiple regression and classification tasks.
Researcher Affiliation | Collaboration | AITRICS, Seoul, South Korea; KAIST, Daejeon, South Korea; BARO AI, Seoul, South Korea
Pseudocode | Yes | We show NC's training loop in Figure 2 and also provide a detailed description in Algorithms 1 and 2. Algorithm 1: Task Learning. Algorithm 2: Meta-Learning.
Open Source Code | No | The paper does not provide an explicit statement or link indicating that its code is open-sourced.
Open Datasets | Yes | We consider five different datasets: three MNIST variants (MNIST [17], FMNIST [35], KMNIST [3]), for which the learner was a 1-layer MLP with 500 units, and SVHN [23] along with CIFAR-10 [14]
Dataset Splits | Yes | Given one large dataset S = {z_1, . . . , z_M}, we randomly split S into disjoint training and validation sets. For each task with this random split, the task learner uses the train set to train h, and the meta-learner evaluates L_T computed with the validation set as its target.
Hardware Specification | No | The paper does not explicitly describe the specific hardware used for its experiments.
Software Dependencies | No | The paper mentions general software components but does not provide specific version numbers for software dependencies.
Experiment Setup | Yes | During meta-training, the layer size, activation, number of layers, learning rate, and number of steps were all fixed to (40, ReLU, 2, 0.01, 16), respectively. ... Due to space constraints, we describe detailed hyperparameters and NC architectures in the appendix.
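
To make the Pseudocode, Dataset Splits, and Experiment Setup rows above concrete, here is a minimal PyTorch-style sketch of one meta-training step, not the authors' code: the names (`make_task_learner`, `nc_model`, `meta_step`), the input/output dimensions, and especially the two-number summary fed to NC are hypothetical placeholders; the paper's actual NC architecture and input representation are described in its appendix.

```python
# Minimal sketch of one meta-training step (Algorithms 1 and 2); hypothetical names.
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_task_learner(in_dim=1, out_dim=1):
    # Task learner fixed during meta-training per the Experiment Setup row:
    # 2 hidden layers of 40 units with ReLU; input/output sizes are assumptions.
    return nn.Sequential(
        nn.Linear(in_dim, 40), nn.ReLU(),
        nn.Linear(40, 40), nn.ReLU(),
        nn.Linear(40, out_dim),
    )

# Placeholder NC network; the paper's NC architecture is detailed in the appendix.
nc_model = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 1))
meta_opt = torch.optim.Adam(nc_model.parameters(), lr=1e-3)

def meta_step(x, y):
    """One meta-training step on a single regression task (x, y)."""
    # Dataset Splits row: randomly split the task data into disjoint train/validation sets.
    perm = torch.randperm(len(x))
    tr, va = perm[: len(x) // 2], perm[len(x) // 2:]

    # Algorithm 1 (Task Learning): train h on the train split
    # (16 SGD steps at learning rate 0.01, per the Experiment Setup row).
    h = make_task_learner()
    opt = torch.optim.SGD(h.parameters(), lr=0.01)
    for _ in range(16):
        opt.zero_grad()
        F.mse_loss(h(x[tr]), y[tr]).backward()
        opt.step()

    # Algorithm 2 (Meta-Learning): NC sees a summary of the trained learner and is
    # regressed toward a validation-set target. The two-number summary below is a
    # crude placeholder, not the paper's input representation.
    with torch.no_grad():
        train_loss = F.mse_loss(h(x[tr]), y[tr])
        val_loss = F.mse_loss(h(x[va]), y[va])
        feats = torch.stack([train_loss, h(x[tr]).var()]).view(1, 2)
    meta_opt.zero_grad()
    F.mse_loss(nc_model(feats).squeeze(), val_loss).backward()
    meta_opt.step()
```

A driver loop would repeatedly sample heterogeneous tasks and call `meta_step` on each, matching the abstract's description of learning the complexity measure "through interactions with many heterogeneous tasks."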
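
The Research Type row also states that the trained NC model "can be added to the standard training loss to regularize any task learner." Continuing the hypothetical names from the sketch above, a hedged illustration of that usage might look as follows; the penalty weight `lam` and the differentiable feature summary are assumptions, not details taken from the paper.

```python
def nc_penalty(h, x, y):
    # Placeholder summary of the learner's fit, kept differentiable w.r.t. h's
    # parameters so the penalty can regularize them; not the paper's representation.
    fit_loss = F.mse_loss(h(x), y)
    return nc_model(torch.stack([fit_loss, h(x).var()]).view(1, 2)).squeeze()

def regularized_step(h, opt, x, y, lam=0.1):
    # Standard supervised step with the NC prediction added as a penalty term.
    # Only `opt` (the learner's optimizer) steps; nc_model's weights stay fixed here.
    opt.zero_grad()
    loss = F.mse_loss(h(x), y) + lam * nc_penalty(h, x, y)
    loss.backward()
    opt.step()
```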