Meta-learning from Tasks with Heterogeneous Attribute Spaces

Authors: Tomoharu Iwata, Atsutoshi Kumagai

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In our experiments with synthetic datasets and 59 datasets in OpenML, we demonstrate that our proposed method can predict the responses given a few labeled instances in new tasks after being trained with tasks with heterogeneous attribute spaces.
Researcher Affiliation | Industry | Tomoharu Iwata (NTT Communication Science Laboratories, tomoharu.iwata.gy@hco.ntt.co.jp); Atsutoshi Kumagai (NTT Software Innovation Center, atsutoshi.kumagai.ht@hco.ntt.co.jp)
Pseudocode | Yes | Algorithm 1, Training procedure of our model. RandomSample(S, N) generates a set of N elements chosen uniformly at random from set S without replacement.
Input: datasets from tasks with heterogeneous attribute spaces {D_d}_{d=1}^{D}, number of support instances N_S, number of query instances N_Q, batch size B
Output: trained model parameters Φ
while End condition is satisfied do
  Initialize loss, J ← 0
  Select task indices for a mini-batch, M ← RandomSample({1, ..., D}, B)
  for d ∈ M do
    Generate support set, S ← RandomSample(D_d, N_S)
    Generate query set, Q ← RandomSample(D_d, N_Q)
    Calculate loss by Eq. (7), J ← J + E(Q | S; Φ), and its gradients
  end for
  Update model parameters Φ using loss J and its gradient
end while
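The quoted procedure corresponds to a standard episodic training loop. Below is a minimal PyTorch sketch of that loop; the `model(support, query)` call standing in for the query loss E(Q | S; Φ) of Eq. (7), the `datasets` list of per-task (X, y) tensors, and the fixed step budget are assumptions made here for illustration, not the authors' released code.

```python
# Minimal sketch of Algorithm 1. Assumptions: `model(support, query)` returns the
# query loss E(Q | S; Phi) of Eq. (7), and `datasets` is a list of per-task
# (X, y) tensors whose attribute dimensionality may differ across tasks.
import random
import torch

def train(model, datasets, n_support=5, n_query=27, batch_size=256,
          n_steps=10_000, lr=1e-3):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(n_steps):                                      # stand-in end condition
        optimizer.zero_grad()
        loss = torch.zeros(())                                    # J <- 0
        tasks = random.sample(range(len(datasets)), batch_size)   # mini-batch of task indices
        for d in tasks:
            X, y = datasets[d]
            idx = torch.randperm(len(X))[: n_support + n_query]   # instances without replacement
            s_idx, q_idx = idx[:n_support], idx[n_support:]
            support, query = (X[s_idx], y[s_idx]), (X[q_idx], y[q_idx])
            loss = loss + model(support, query)                   # accumulate E(Q | S; Phi)
        loss.backward()                                           # gradient of the summed loss
        optimizer.step()                                          # update Phi
    return model
```

The defaults above follow the synthetic setting (N_S = 5, N_Q = 27, B = 256); for the OpenML experiments the paper uses N_S = 3 and N_Q = 29.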
Open Source Code | No | The paper states 'We implemented the proposed method with PyTorch [21]', but it does not provide a specific link or an explicit statement about releasing its source code.
Open Datasets | Yes | Using a Python API for OpenML [6], we obtained datasets based on the following conditions: the number of instances was between 10 and 300, the number of attributes was between 2 and 30, and all the attributes were numerical values. Then we omitted datasets that had the same number of instances and the same number of attributes, and we obtained 59 datasets in total. The number of instances and attributes for each dataset is shown in the supplemental material. We normalized the values for each attribute with a mean of zero and a variance of one. The last attribute was used as the response for each dataset. We randomly split the 59 tasks into 37 training, 5 validation, and 17 target tasks.
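As a rough illustration of how this selection could be scripted, the sketch below uses the openml Python package; the quality column names, the 'active' status filter, and the interpretation of the de-duplication rule are assumptions about that API and about the paper's exact procedure.

```python
# Hedged sketch of the OpenML dataset selection described above. Column names
# such as 'NumberOfInstances' are qualities exposed by the openml package and
# may vary across versions; the drop_duplicates call approximates the paper's
# omission of datasets with the same number of instances and attributes.
import openml

meta = openml.datasets.list_datasets(output_format="dataframe")
mask = (
    (meta["status"] == "active")
    & meta["NumberOfInstances"].between(10, 300)
    & meta["NumberOfFeatures"].between(2, 30)
    & (meta["NumberOfSymbolicFeatures"] == 0)      # all attributes numerical
)
candidates = meta[mask].drop_duplicates(subset=["NumberOfInstances", "NumberOfFeatures"])

tasks = []
for did in candidates["did"]:
    X, _, _, _ = openml.datasets.get_dataset(did).get_data(dataset_format="dataframe")
    X = X.select_dtypes("number")
    X = (X - X.mean()) / X.std()                   # zero mean, unit variance per attribute
    tasks.append((X.iloc[:, :-1].to_numpy(), X.iloc[:, -1].to_numpy()))  # last attribute = response
```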
Dataset Splits | Yes | For the synthetic experiments: We generated 10,000 training, 30 validation, and 300 target tasks. The number of support instances was N_S = 5, and the number of query instances was N_Q = 27. For the OpenML experiments: We randomly split the 59 tasks into 37 training, 5 validation, and 17 target tasks. The number of support instances was N_S = 3, and the number of query instances was N_Q = 29.
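The task-level split for the OpenML experiments is just a random partition of the 59 tasks; a minimal sketch follows (the random seed is an arbitrary choice, not given in the paper).

```python
# Random 37/5/17 split of the 59 OpenML tasks; the seed is arbitrary.
import numpy as np

rng = np.random.default_rng(0)
perm = rng.permutation(59)
train_ids, valid_ids, target_ids = perm[:37], perm[37:42], perm[42:]
```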
Hardware Specification | No | The paper mentions 'Training computational time in hours... on computers with 2.60GHz CPUs', which gives a clock speed but no specific CPU model, GPU, or other detailed hardware specification.
Software Dependencies | No | The paper states 'We implemented the proposed method with PyTorch [21]' but does not give a version number for PyTorch or any other software dependency.
Experiment Setup | Yes | We used three-layered feed-forward neural networks with 32 hidden units for all neural networks. The parameters were shared between the following pairs of neural networks: (f^v, f^c), (g^v, g^c), (f̄^v, f̄^c), (ḡ^v, ḡ^c), and ... The number of units at the output layer for f_y was one, and it was 32 for the other neural networks. We used the rectified linear unit, ReLU(x) = max(0, x), for the activation. We optimized using Adam [14] with learning rate 10^-3 and dropout rate 0.1. The validation data were used for early stopping. The batch size was B = 256.
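For concreteness, a hedged PyTorch sketch of the quoted settings follows: three-layer feed-forward networks with 32 hidden units, ReLU activations, dropout 0.1, and Adam with learning rate 1e-3. The input size, the placement of dropout, and the helper name `make_net` are assumptions for illustration, not the authors' architecture.

```python
# Sketch of the reported setup: three-layer feed-forward nets, 32 hidden units,
# ReLU, dropout 0.1, Adam with learning rate 1e-3. Dropout placement and the
# input dimensionality are assumptions.
import torch.nn as nn
import torch.optim as optim

def make_net(in_dim, out_dim=32, hidden=32, p_drop=0.1):
    return nn.Sequential(
        nn.Linear(in_dim, hidden), nn.ReLU(), nn.Dropout(p_drop),
        nn.Linear(hidden, hidden), nn.ReLU(), nn.Dropout(p_drop),
        nn.Linear(hidden, out_dim),
    )

f_y = make_net(in_dim=32, out_dim=1)   # f_y has a single output unit; the other nets output 32
optimizer = optim.Adam(f_y.parameters(), lr=1e-3)
```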