Exploiting weakly supervised visual patterns to learn from partial annotations
Authors: Kaustav Kundu, Erhan Bas, Michael Lam, Hao Chen, Davide Modolo, Joseph Tighe
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We study the effectiveness of our approach across several multi-label computer vision benchmarks, such as CIFAR100 [31], MSCOCO panoptic segmentation [27], Open Images [32] and LVIS [16] datasets. Our approach can outperform baselines by a margin of 2-10% across all the datasets on mean average precision (mAP) and mean F1 metrics. |
| Researcher Affiliation | Industry | Kaustav Kundu, Erhan Bas, Michael Lam, Hao Chen, Davide Modolo, Joseph Tighe. Amazon Web Services. {kaustavk,erhanbas,michlam,hxen,dmodolo,tighej}@amazon.com |
| Pseudocode | No | The paper describes its methods verbally and mathematically but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper states "Our code will be made open source upon acceptance" and "We release all code to reproduce the results presented in this paper," but no direct link or specific access information (e.g., in supplementary material) is provided in the paper. |
| Open Datasets | Yes | We use CIFAR100 [31], MS COCO detection [35] and MS COCO panoptic segmentation [27] datasets for the synthetically generated partially annotated datasets. For studying the performance on realistic partially annotated datasets, we use Open Images [32] and LVIS [16] datasets. |
| Dataset Splits | Yes | We randomly split the original training set into 116K training and 2K validation images, such that the training and validation sets have the same label distribution. It has 1.6M training, 33K validation and 113K test images with 497 label categories. (A hedged sketch of such a label-preserving split follows the table.) |
| Hardware Specification | Yes | It takes <1 epoch training time (~15 min. on a single V100 GPU). |
| Software Dependencies | No | The paper mentions using specific optimizers (SGD, Adam) and network architectures (ResNeXt101), but does not provide specific version numbers for software libraries or frameworks (e.g., TensorFlow, PyTorch versions). |
| Experiment Setup | Yes | The networks have been trained using the SGD optimizer with an initial learning rate of 0.001, momentum of 0.9 and weight decay of 0.0005. The learning rate is decreased by a factor of 10 at the end of the 10th and 20th epochs. We use a batch size of 24 with an input image dimension of 224×224. The networks are trained for 36 epochs. (A minimal sketch of this configuration appears below the table.) |
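The paper reports only the outcome of its label-preserving split (116K training / 2K validation with matched label distributions), not the tooling. Below is a minimal sketch of one common way to produce such a split, using scikit-learn's stratified `train_test_split` on hypothetical single-label data; the image counts mirror the quote, and the multi-label caveat is noted in the comments.

```python
# Hedged sketch of a label-preserving train/validation split. The tooling is an
# assumption; the paper only reports the 116K/2K outcome, not the method.
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
image_ids = np.arange(118_000)               # hypothetical ~118K-image training pool
labels = rng.integers(0, 100, size=118_000)  # hypothetical per-image class labels

train_ids, val_ids = train_test_split(
    image_ids,
    test_size=2_000,     # 2K validation images, matching the quoted split
    stratify=labels,     # keep the label distribution the same in both sets
    random_state=0,
)

# Note: for genuinely multi-label data (as in MS COCO), exact stratification
# needs an iterative scheme such as scikit-multilearn's IterativeStratification;
# the single-label stratify= argument above is a simplification.
```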
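The Experiment Setup row reads directly as a training configuration, so here is a minimal PyTorch sketch of those hyperparameters. The ResNeXt101 variant, dataset, and loss are assumptions (the paper names the architecture family, but this report does not fix the exact variant); the optimizer settings, schedule, batch size, input size, and epoch count come from the quoted text.

```python
# Hedged sketch of the reported training configuration in PyTorch. Only the
# hyperparameters are from the paper; model variant, data, and loss are stand-ins.
import torch
import torchvision.models as models
import torchvision.transforms as T

model = models.resnext101_32x8d(num_classes=100)  # exact variant is an assumption

optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.001,            # initial learning rate
    momentum=0.9,
    weight_decay=0.0005,
)

# Learning rate drops by 10x at the end of epochs 10 and 20.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[10, 20], gamma=0.1
)

transform = T.Compose([T.Resize((224, 224)), T.ToTensor()])  # 224x224 inputs
# loader = torch.utils.data.DataLoader(dataset, batch_size=24, shuffle=True)

for epoch in range(36):  # trained for 36 epochs
    # ... one training pass over the loader with a multi-label loss ...
    scheduler.step()
```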