Learning from Label Proportions with Prototypical Contrastive Clustering

Authors: Laura Elena Cué La Rosa, Dário Augusto Borges Oliveira

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We experimented with our method on two widely used image classification benchmarks and report a new state-of-the-art LLP performance, achieving results close to fully supervised methods.
Researcher Affiliation | Academia | (1) Electrical Engineering Department, Pontifical Catholic University of Rio de Janeiro, Brazil; (2) Data Science in Earth Observation, Technical University of Munich (TUM), Germany
Pseudocode | Yes | Algorithm 1: LLP-Co training loop using two views (a hedged sketch of such a two-view step is given after the table).
Open Source Code | No | The paper states, 'We implemented our method upon the SwAV (Caron et al. 2020) algorithm that is released under the Creative Commons Attribution-NonCommercial 4.0 International, introducing the cluster size constraint into the Sinkhorn-Knopp,' but it does not provide a link or an explicit statement that their own LLP-Co implementation is open source or publicly available (a hedged sketch of a size-constrained Sinkhorn-Knopp step follows the table).
Open Datasets | Yes | The paper experiments on two standard image classification benchmarks, CIFAR-10 and CIFAR-100 (Krizhevsky, Nair, and Hinton 2012), and states: 'CIFAR-10 and CIFAR-100 datasets are released under the MIT licenses.'
Dataset Splits | No | The paper mentions creating 'training bags' from the 'training set' and evaluating on a 'test set', but it does not explicitly specify a validation split or a detailed train/validation/test partitioning beyond the inherent split of the CIFAR datasets.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU models, CPU types, or cloud computing specifications.
Software Dependencies | No | The paper mentions using a ResNet18 backbone architecture and implementing the method on top of the SwAV (Caron et al. 2020) algorithm, but it does not provide version numbers for any software dependencies such as the deep learning framework, Python, or CUDA.
Experiment Setup | Yes | We used a ResNet18 as backbone architecture followed by a projection head that projects the output of the ResNet18 to a 1024-D space. All the experimented models were trained using stochastic gradient descent (SGD), with a weight decay of 1×10^-6 and an initial learning rate of 0.1. We warmed up the learning rate during five epochs and then used the cosine learning rate decay (Loshchilov and Hutter 2016) with a final value of 0.0001. As in (Caron et al. 2020), the softmax temperature τ was set to 0.1, and the prototypes were frozen during the first epoch. All our models were trained for 500 epochs. (A sketch of this recipe follows the table.)
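
The Pseudocode row refers to Algorithm 1, an LLP-Co training loop over two augmented views. As a reading aid only, here is a minimal sketch of what a SwAV-style two-view "swapped prediction" step could look like; `model`, `prototypes`, `sinkhorn`, and all shapes are assumptions for illustration, not the authors' released code.

```python
# Hypothetical sketch of a two-view training step in the spirit of SwAV /
# Algorithm 1: each view's features are assigned to prototypes, and each
# view then predicts the other view's assignment.
import torch
import torch.nn.functional as F

def two_view_step(model, prototypes, view_a, view_b, sinkhorn, temperature=0.1):
    """One two-view training step on a batch.

    model      : backbone + projection head returning embeddings of shape (B, D)
    prototypes : (K, D) tensor of prototype vectors (K clusters / classes)
    sinkhorn   : callable producing soft assignments under cluster-size constraints
    """
    z_a = F.normalize(model(view_a), dim=1)          # (B, D), L2-normalized
    z_b = F.normalize(model(view_b), dim=1)

    # Similarity of each embedding to each prototype.
    scores_a = z_a @ prototypes.t()                  # (B, K)
    scores_b = z_b @ prototypes.t()

    # Soft cluster assignments, computed without gradients, with the bag-level
    # proportion constraint enforced inside `sinkhorn` (see the next sketch).
    with torch.no_grad():
        q_a = sinkhorn(scores_a)
        q_b = sinkhorn(scores_b)

    # Swapped prediction: view A predicts B's assignment and vice versa.
    p_a = F.log_softmax(scores_a / temperature, dim=1)
    p_b = F.log_softmax(scores_b / temperature, dim=1)
    loss = -0.5 * ((q_b * p_a).sum(dim=1).mean() + (q_a * p_b).sum(dim=1).mean())
    return loss
```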
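The Open Source Code row quotes the paper's statement that a cluster size constraint is introduced into the Sinkhorn-Knopp step. A hedged sketch follows, assuming the constraint is expressed as a target number of samples per cluster derived from a bag's label proportions; the paper's exact formulation may differ.

```python
# Hypothetical sketch of Sinkhorn-Knopp with per-cluster size constraints.
# `cluster_sizes` stands in for the bag's label proportions scaled to the
# batch size; this is an assumption for illustration, not the authors' code.
import torch

@torch.no_grad()
def constrained_sinkhorn(scores, cluster_sizes, n_iters=3, epsilon=0.05):
    """scores        : (B, K) prototype scores for one bag/batch
       cluster_sizes : (K,) expected number of samples per cluster (sums to B)
    """
    B, K = scores.shape
    q = torch.exp(scores / epsilon).t()                       # (K, B)
    q /= q.sum()                                              # total mass 1
    row_target = cluster_sizes.to(q) / B                      # desired cluster masses
    for _ in range(n_iters):
        # Enforce the cluster-size marginal given by the bag's label proportions.
        q *= (row_target / (q.sum(dim=1) + 1e-8)).unsqueeze(1)
        # Enforce a uniform per-sample marginal.
        q *= (1.0 / B) / (q.sum(dim=0, keepdim=True) + 1e-8)
    return (q * B).t()                                        # (B, K), rows ~sum to 1
```

With a uniform `cluster_sizes` (B/K per cluster) this reduces to SwAV's original equipartition assignment, while per-bag label proportions multiplied by the bag size would give the size constraint the quoted sentence describes.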
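The Experiment Setup row lists concrete hyperparameters. Below is a minimal sketch of that optimization recipe in PyTorch, assuming a torchvision ResNet18 whose final layer stands in for the 1024-D projection head; the authors' actual head and schedule implementation may differ.

```python
# Sketch of the reported recipe: SGD, weight decay 1e-6, lr 0.1 with a
# 5-epoch warmup followed by cosine decay to 0.0001, trained for 500 epochs.
import math
import torch
import torchvision

# Placeholder for "ResNet18 + projection head to a 1024-D space".
backbone = torchvision.models.resnet18(num_classes=1024)
optimizer = torch.optim.SGD(backbone.parameters(), lr=0.1, weight_decay=1e-6)

EPOCHS, WARMUP, BASE_LR, FINAL_LR = 500, 5, 0.1, 0.0001

def lr_at(epoch):
    """Linear warmup for 5 epochs, then cosine decay from 0.1 to 0.0001."""
    if epoch < WARMUP:
        return BASE_LR * (epoch + 1) / WARMUP
    t = (epoch - WARMUP) / (EPOCHS - WARMUP)
    return FINAL_LR + 0.5 * (BASE_LR - FINAL_LR) * (1 + math.cos(math.pi * t))

for epoch in range(EPOCHS):
    for group in optimizer.param_groups:
        group["lr"] = lr_at(epoch)
    # ... one pass over the bagged training data would go here ...
```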