PLATINUM: Semi-Supervised Model Agnostic Meta-Learning using Submodular Mutual Information
Authors: Changbin Li, Suraj Kothawade, Feng Chen, Rishabh Iyer
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our method in various settings on the miniImageNet, tieredImageNet, and CIFAR-FS datasets. Our experiments show that PLATINUM outperforms MAML and semi-supervised approaches like pseudo-labeling for semi-supervised FSC, especially for small ratios of labeled to unlabeled samples. |
| Researcher Affiliation | Academia | University of Texas at Dallas. Correspondence to: Suraj Kothawade <suraj.kothawade@utd.edu>, Changbin Li <changbin.li@utd.edu>. |
| Pseudocode | Yes | Algorithm 1 PLATINUM (Meta-Training) |
| Open Source Code | Yes | The PyTorch implementation is available at https://github.com/Hugo101/PLATINUM. |
| Open Datasets | Yes | We conduct experiments on three datasets: miniImageNet (Vinyals et al., 2016), tieredImageNet (Ren et al., 2018), and CIFAR-FS (Bertinetto et al., 2018). |
| Dataset Splits | Yes | Following the disjoint class split from (Ravi & Larochelle, 2017), we split it into 64 classes for training, 16 for validation, and 20 for test. Similarly, tieredImageNet is a larger dataset, consisting of 608 classes, each with 768–1300 images. Classes are split into 351 for training, 97 for validation, and 160 for test (Ren et al., 2018). (A split-constant sketch follows the table.) |
| Hardware Specification | Yes | We use an NVIDIA RTX A6000 GPU for our experiments. |
| Software Dependencies | No | The paper mentions a "PyTorch implementation" but does not specify exact version numbers for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | We implement image classification experiments in 5-way, 1-shot (5-shot) settings. Concretely, all examples of each class are first randomly divided into a labeled portion (from which S and Q are sampled) and an unlabeled portion (from which U is sampled) based on a predefined labeled ratio ρ, where ρ is the ratio of the number of data points in the labeled portion to the total number of data points in the class. Then, each task is sampled to contain 1 (5) data points per class in the support set S and 15 (15) data points per class in the query set Q. For the unlabeled set U, we sample 50 (50) data points for each class. To select a subset for semi-supervision using SMI functions, we use a budget Bin = 25 (25) for the inner loop and a budget Bout = 50 (50) for the outer loop. Note that we perform a per-class selection to assign pseudo-labels using the SMI functions, which leads to budgets of 5 and 10 data points per class for the inner and outer loops, respectively. For our experiments in Tab. 3, Tab. 4 and Tab. 5, we use a labeled set ratio ρ = 0.01. However, we also compare with a number of other ρ values (see Tab. 6 and Tab. 7). For our experiments with OOD classes in the unlabeled set (Tab. 4), we use 5 distractor classes with 50 data points for each class. To make a fair comparison, we apply the same 4-layer CONV backbone architecture given in (Vinyals et al., 2016; Finn et al., 2017) for our model and all baselines. ... All step sizes (α, β) are chosen from {0.0001, 0.001, 0.01, 0.1}. The batch size (number of tasks per iteration) is chosen from {1, 2, 4}. The number of iterations is chosen from {10,000, 20,000, 30,000, 40,000, 60,000}. The selected best values are: inner-loop learning rate α = 0.01, meta-parameter step size (outer learning rate) β = 0.0001; the number of iterations for all experiments is set to 60,000 (600 epochs, each with 100 iterations). (Configuration and selection sketches follow the table.) |
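The class splits quoted in the Dataset Splits row are fixed constants, so they can be pinned down in a few lines. This is a minimal sketch; the dictionary name `CLASS_SPLITS` is illustrative and not taken from the released repository.

```python
# Disjoint class splits reported in the paper (number of classes per split).
# CLASS_SPLITS is a hypothetical name used here only for illustration.
CLASS_SPLITS = {
    "miniImageNet":   {"train": 64,  "val": 16, "test": 20},   # Ravi & Larochelle (2017)
    "tieredImageNet": {"train": 351, "val": 97, "test": 160},  # Ren et al. (2018)
}

# Sanity checks: the splits cover all 100 / 608 classes of each dataset.
assert sum(CLASS_SPLITS["miniImageNet"].values()) == 100
assert sum(CLASS_SPLITS["tieredImageNet"].values()) == 608
```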
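The Experiment Setup row packs the episodic sampling scheme and the selected hyperparameters into one paragraph. The sketch below restates it as a configuration object plus a task sampler, assuming per-class labeled/unlabeled pools produced by the ρ split. All names (`EpisodeConfig`, `sample_episode`) and the pool-based data layout are assumptions for illustration and do not come from the released code.

```python
import random
from dataclasses import dataclass

@dataclass
class EpisodeConfig:
    n_way: int = 5               # classes per task
    k_shot: int = 1              # support points per class (1-shot; use 5 for 5-shot)
    n_query: int = 15            # query points per class
    n_unlabeled: int = 50        # unlabeled points sampled per class
    budget_inner: int = 25       # SMI selection budget in the inner loop (5 per class)
    budget_outer: int = 50       # SMI selection budget in the outer loop (10 per class)
    labeled_ratio: float = 0.01  # rho used for Tables 3-5
    inner_lr: float = 0.01       # alpha, chosen from {0.0001, 0.001, 0.01, 0.1}
    outer_lr: float = 0.0001     # beta (meta step size)
    iterations: int = 60_000     # 600 epochs x 100 iterations

def sample_episode(labeled_pool, unlabeled_pool, cfg, rng=None):
    """Sample one N-way task: support S, query Q (labeled) and unlabeled U.

    `labeled_pool` / `unlabeled_pool` map class id -> example indices and are
    assumed to come from the per-class rho split described in the setup.
    """
    rng = rng or random.Random()
    classes = rng.sample(sorted(labeled_pool), cfg.n_way)
    support, query, unlabeled = {}, {}, {}
    for c in classes:
        picks = rng.sample(labeled_pool[c], cfg.k_shot + cfg.n_query)
        support[c] = picks[:cfg.k_shot]
        query[c] = picks[cfg.k_shot:]
        unlabeled[c] = rng.sample(unlabeled_pool[c], cfg.n_unlabeled)
    return support, query, unlabeled
```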
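The same row notes that pseudo-labels are assigned through a per-class SMI selection under the stated budgets. Below is a self-contained sketch of one such selection using a facility-location-style SMI gain scored greedily; the paper considers several SMI functions and the released code is authoritative, so the function names, the dot-product similarity, and the `eta` trade-off here are all assumptions for illustration.

```python
import numpy as np

def smi_greedy(sim, budget, eta=1.0):
    """Greedily pick `budget` candidates maximizing a facility-location-style SMI gain.

    sim: (num_unlabeled, num_labeled_of_class) similarity matrix between unlabeled
         candidates and the labeled points of a single class.
    Returns indices of the selected candidates.
    """
    num_u, num_q = sim.shape
    selected, remaining = [], set(range(num_u))
    cur_max = np.zeros(num_q)  # best coverage of each labeled point so far
    for _ in range(min(budget, num_u)):
        best_i, best_gain = None, -np.inf
        for i in remaining:
            # Marginal gain: improvement in coverage of the class's labeled points,
            # plus a relevance term weighted by eta.
            gain = (np.maximum(sim[i], cur_max) - cur_max).sum() + eta * sim[i].max()
            if gain > best_gain:
                best_i, best_gain = i, gain
        selected.append(best_i)
        remaining.remove(best_i)
        cur_max = np.maximum(cur_max, sim[best_i])
    return selected

def per_class_selection(unlabeled_feats, class_feats, total_budget):
    """Split the total budget evenly across classes (25 -> 5 per class, 50 -> 10)
    and pseudo-label each selected point with the class it was selected for."""
    per_class = total_budget // len(class_feats)
    return {
        c: smi_greedy(unlabeled_feats @ feats.T, per_class)  # dot-product similarity (assumption)
        for c, feats in class_feats.items()
    }
```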