Attentional Constellation Nets for Few-Shot Learning
Authors: Weijian Xu, Yifan Xu, Huaijin Wang, Zhuowen Tu
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our approach attains a significant improvement over the existing methods in few-shot learning on the CIFAR-FS, FC100, and mini-ImageNet benchmarks. [...] We demonstrate the effectiveness of our approach on standard few-shot benchmarks, including FC100 (Oreshkin et al., 2018), CIFAR-FS (Bertinetto et al., 2018) and mini-ImageNet (Vinyals et al., 2016) by showing a significant improvement over the existing methods. An ablation study also demonstrates that the effectiveness of ConstellationNet is not achieved by simply increasing the model complexity [...] 5 EXPERIMENT 5.1 DATASETS [...] 5.3 RESULTS ON STANDARD BENCHMARKS Tables 1 and 2 summarize the results of the few-shot classification tasks on CIFAR-FS, FC100, and mini-ImageNet, respectively. Our method shows a notable improvement over several strong baselines in various settings. |
| Researcher Affiliation | Collaboration | Weijian Xu¹, Yifan Xu¹, Huaijin Wang¹ & Zhuowen Tu¹,² — ¹University of California San Diego, ²Amazon Web Services; {wex041,yix081,huw011,ztu}@ucsd.edu |
| Pseudocode | Yes | Inspired by Sculley (2010), we design a mini-batch soft k-means algorithm to cluster the cell features approximately: Initialization. Randomly initialize global cluster centers $V = \{v_1, v_2, \ldots, v_K\}$ and a counter $s = (s_1, s_2, \ldots, s_K) = 0$. Cluster Assignment. In the forward step, given input cell features $U = \{u_1, u_2, \ldots, u_n\}$, we compute the distance vector $d_i = (d_{i1}, d_{i2}, \ldots, d_{iK})$ between input cell feature $u_i$ and all cluster centers $V$. We then compute the soft assignment $m_{ik} \in \mathbb{R}$ and generate the current mini-batch centers $\hat{v}_k$: $d_{ik} = \|u_i - v_k\|_2^2$, $m_{ik} = \frac{e^{-\beta d_{ik}}}{\sum_j e^{-\beta d_{ij}}}$, $\hat{v}_k = \frac{\sum_i m_{ik} u_i}{\sum_i m_{ik}}$ (4). Centroid Movement. We formulate a count update $\Delta s = \sum_i m_i$ by summing all assignment maps $m_i = (m_{i1}, m_{i2}, \ldots, m_{iK})$. The current mini-batch centers $\hat{v}_k$ are then updated to the global centers $v_k$ with a momentum coefficient $\eta$: $v_k \leftarrow (1-\eta)\,v_k + \eta\,\hat{v}_k$, $\eta = \frac{\lambda\,\Delta s_k}{s_k + \Delta s_k}$ (5). Counter Update. Counter $s$ is updated and distance vectors $\{d_i\}$ are reshaped and returned: $s \leftarrow s + \Delta s$ (6). (A runnable sketch of this update follows the table.) |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a link to a code repository. |
| Open Datasets | Yes | We adopt three standard benchmark datasets that are widely used in few-shot learning, CIFAR-FS dataset (Bertinetto et al., 2018), FC100 dataset (Oreshkin et al., 2018), and mini-ImageNet dataset (Vinyals et al., 2016). Details about dataset settings in few-shot learning are in Appendix A.2. [...] The CIFAR-FS dataset (Bertinetto et al., 2018) is a few-shot classification benchmark containing 100 classes from CIFAR-100 (Krizhevsky et al., 2009). [...] The FC100 dataset (Oreshkin et al., 2018) is another benchmark based on CIFAR-100 [...] The mini-ImageNet dataset (Vinyals et al., 2016) is a common benchmark for few-shot classification containing 100 classes from ILSVRC2012 (Deng et al., 2009). |
| Dataset Splits | Yes | The CIFAR-FS dataset (Bertinetto et al., 2018) is a few-shot classification benchmark containing 100 classes from CIFAR-100 (Krizhevsky et al., 2009). The classes are randomly split into 64, 16 and 20 classes as meta-training, meta-validation and meta-testing set respectively. [...] The mini-ImageNet dataset (Vinyals et al., 2016) is a common benchmark for few-shot classification containing 100 classes from ILSVRC2012 (Deng et al., 2009). The classes are randomly split into 64, 16 and 20 classes as meta-training, meta-validation and meta-testing set respectively. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using an 'SGD optimizer' and following 'implementation in Lee et al. (2019)' but does not specify any software libraries or frameworks with version numbers (e.g., PyTorch, TensorFlow, specific Python versions). |
| Experiment Setup | Yes | Optimization Settings. We follow the implementation in Lee et al. (2019), and use an SGD optimizer with an initial learning rate of 1, and set momentum to 0.9 and the weight decay rate to $5 \times 10^{-4}$. The learning rate reduces to 0.06, 0.012, and 0.0024 at epochs 20, 40 and 50. The inverse temperature $\beta$ is set to 100.0 in the cluster assignment step, and $\lambda$ is set to 1.0 in the centroid movement step. (A minimal sketch of this optimizer and schedule follows the table.) |
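
The mini-batch soft k-means update quoted in the Pseudocode row can be written compactly. The sketch below is our own illustration, assuming a PyTorch implementation (the paper does not name a framework); the function name, tensor shapes, and the small epsilon terms added for numerical stability are assumptions, not details from the paper.

```python
import torch

def minibatch_soft_kmeans_step(u, v, s, beta=100.0, lam=1.0):
    """One forward step of the quoted mini-batch soft k-means (sketch).

    u: (n, c) cell features in the current mini-batch
    v: (K, c) global cluster centers
    s: (K,)  running counter of soft assignments
    beta, lam: inverse temperature and momentum scale from the quoted settings
    """
    # Cluster assignment: squared distances d_ik = ||u_i - v_k||_2^2 and
    # soft assignments m_ik = exp(-beta d_ik) / sum_j exp(-beta d_ij)
    d = torch.cdist(u, v, p=2) ** 2            # (n, K)
    m = torch.softmax(-beta * d, dim=1)        # (n, K)

    # Current mini-batch centers: weighted mean of features per cluster
    v_hat = (m.t() @ u) / (m.sum(dim=0, keepdim=True).t() + 1e-8)  # (K, c)

    # Centroid movement with per-cluster momentum eta = lam * ds_k / (s_k + ds_k)
    ds = m.sum(dim=0)                          # (K,) count update
    eta = lam * ds / (s + ds + 1e-8)
    v_new = (1 - eta).unsqueeze(1) * v + eta.unsqueeze(1) * v_hat

    # Counter update; distances are returned for the downstream attention module
    s_new = s + ds
    return d, m, v_new, s_new
```

With $\beta = 100.0$ and $\lambda = 1.0$ as quoted, each forward pass both assigns cell features to clusters and nudges the global centers toward the current mini-batch centers.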
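
Similarly, the optimization settings in the Experiment Setup row translate to a standard SGD configuration with a piecewise-constant learning-rate schedule. The snippet below is a minimal sketch under the same PyTorch assumption; the dummy model, the total epoch count, and the loop structure are placeholders, with only the learning rates, momentum, weight decay, and milestone epochs taken from the quoted text.

```python
import torch

# Placeholder model standing in for the backbone plus Constellation modules.
model = torch.nn.Linear(8, 8)

# Quoted settings: SGD, initial learning rate 1, momentum 0.9, weight decay 5e-4.
optimizer = torch.optim.SGD(model.parameters(), lr=1.0,
                            momentum=0.9, weight_decay=5e-4)

def lr_at_epoch(epoch):
    """Piecewise-constant schedule from the quoted milestones (epochs 20, 40, 50)."""
    if epoch < 20:
        return 1.0
    if epoch < 40:
        return 0.06
    if epoch < 50:
        return 0.012
    return 0.0024

num_epochs = 60  # assumed total; the quoted text only gives the milestone epochs
for epoch in range(num_epochs):
    for group in optimizer.param_groups:
        group['lr'] = lr_at_epoch(epoch)
    # ... run one epoch of episodic training here ...
```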