Probabilistic Attention for Interactive Segmentation

Authors: Prasad Gabbur, Manjot Bilkhu, Javier Movellan

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we report the results of using probabilistic attention at various stages of a deep interactive semantic segmentation network. Specifically, we use it within the BoTNet50 backbone [49] in place of standard attention and also as part of a self-attention based classification head at the network output. We quantify model performance (mean IoU relative to ground truth) as a function of the number of clicks [6] on two widely used public benchmarks for this task: GrabCut [46] and Berkeley [39]. (See the first sketch below the table.)
Researcher Affiliation | Industry | Prasad Gabbur (Apple, pgabbur@apple.com), Manjot Bilkhu (Apple, mbilkhu@apple.com), Javier Movellan (Apple, movellan@apple.com)
Pseudocode | No | The paper presents mathematical equations and descriptions of methods, but it does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | A PyTorch layer implementation of our probabilistic attention model is available here: https://github.com/apple/ml-probabilistic-attention.
Open Datasets | Yes | Using standard benchmarks, we observe that key adaptation boosts model performance (∼10% mIoU) in the low feedback regime and value propagation improves model responsiveness in the high feedback regime. A PyTorch layer implementation of our probabilistic attention model is available here: https://github.com/apple/ml-probabilistic-attention. (...) on two widely used public benchmarks for this task: GrabCut [46] and Berkeley [39].
Dataset Splits | No | The paper mentions training on LVIS and fine-tuning on SBD, and testing on GrabCut and Berkeley datasets following 'standard protocols', but it does not explicitly provide specific train/validation/test split percentages or sample counts in the main text.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory used for running the experiments.
Software Dependencies | No | The paper mentions 'PyTorch' as the implementation framework, but it does not specify any version numbers for PyTorch or other software dependencies.
Experiment Setup | Yes | Our models are trained on the LVIS [21] dataset at a resolution of 256 pixels. (...) The degree of adaptation is controlled by the prior precision parameter θξ, with lower values leading to a higher degree of adaptation due to the lower weight on the prior keys. (...) We use two different values of the precision prior, 0.001 and 0 (...) The local context size for Axial attention modules is chosen to be 64 pixels. (...) We use 1 (BP1) and 5 (BP5) iterations of value propagation at the CSA layer and test on the GrabCut (left) and Berkeley (right) datasets. (See the second sketch below the table.)
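
To make the quoted method concrete, the paper's probabilistic view casts attention as posterior inference in a Gaussian mixture: keys act as component means and attention weights become posterior responsibilities. The sketch below illustrates this view in PyTorch; the distance-based softmax, the shared variance tau, and the function name are assumptions made for illustration, not the repository's actual layer API.

    import torch
    import torch.nn.functional as F

    def probabilistic_attention(queries, keys, values, tau=1.0):
        # Treat each key as the mean of an isotropic Gaussian component.
        # The attention weight of key j for query i is its posterior
        # responsibility p(j | q_i); with equal mixing weights and shared
        # variance tau this is a softmax over negative squared distances,
        # a distance-based analogue of dot-product attention.
        d2 = torch.cdist(queries, keys, p=2).pow(2)  # (b, nq, nk) squared distances
        resp = F.softmax(-d2 / (2.0 * tau), dim=-1)  # posterior responsibilities
        return resp @ values                         # responsibility-weighted values

If queries and keys are L2-normalized, the squared distance differs from the dot product only by a constant, so this reduces to ordinary softmax attention with temperature tau.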
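
The role of the precision prior θξ quoted in the experiment setup can likewise be sketched as a conjugate Gaussian update: evidence gathered from user feedback pulls the posterior key means away from their priors, and θξ sets how strongly the prior keys resist that pull. The tensor names, the responsibility-weighted averaging, and the epsilon guard below are illustrative assumptions, not the paper's verbatim equations.

    import torch

    def adapt_keys(prior_keys, feedback, resp, theta_xi=0.001):
        # prior_keys: (b, k, d) prior key means.
        # feedback:   (b, n, d) features at user-clicked locations.
        # resp:       (b, n, k) soft assignments of feedback points to keys.
        # The posterior key mean blends the prior key with the evidence mean,
        # weighted by the prior precision theta_xi: theta_xi -> 0 gives
        # maximal adaptation (cf. the quoted settings 0.001 and 0), while
        # large theta_xi leaves the prior keys essentially unchanged.
        evidence = resp.transpose(1, 2) @ feedback  # (b, k, d) weighted sums
        counts = resp.sum(dim=1).unsqueeze(-1)      # (b, k, 1) soft counts
        # Epsilon guards theta_xi = 0 when a key receives no evidence.
        return (theta_xi * prior_keys + evidence) / (theta_xi + counts + 1e-8)

On this reading, the quoted BP1/BP5 settings would iterate an analogous evidence-weighted update on the value vectors one or five times at the CSA layer.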