Probabilistic Attention for Interactive Segmentation
Authors: Prasad Gabbur, Manjot Bilkhu, Javier Movellan
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we report the results of using probabilistic attention at various stages of a deep interactive semantic segmentation network. Specifically, we use it within the BoTNet50 backbone [49] in place of standard attention and also as part of a self-attention based classification head at the network output. We quantify model performance (mean IoU relative to ground truth) as a function of the number of clicks [6] on two widely used public benchmarks for this task: GrabCut [46] and Berkeley [39]. |
| Researcher Affiliation | Industry | Prasad Gabbur (Apple, pgabbur@apple.com); Manjot Bilkhu (Apple, mbilkhu@apple.com); Javier Movellan (Apple, movellan@apple.com) |
| Pseudocode | No | The paper presents mathematical equations and descriptions of methods, but it does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | A PyTorch layer implementation of our probabilistic attention model is available here: https://github.com/apple/ml-probabilistic-attention. |
| Open Datasets | Yes | Using standard benchmarks, we observe that key adaptation boosts model performance (∼10% mIoU) in the low feedback regime and value propagation improves model responsiveness in the high feedback regime. A PyTorch layer implementation of our probabilistic attention model is available here: https://github.com/apple/ml-probabilistic-attention. (...) on two widely used public benchmarks for this task: GrabCut [46] and Berkeley [39]. |
| Dataset Splits | No | The paper mentions training on LVIS and fine-tuning on SBD, and testing on the GrabCut and Berkeley datasets following 'standard protocols', but it does not explicitly provide specific train/validation/test split percentages or sample counts in the main text. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory used for running the experiments. |
| Software Dependencies | No | The paper mentions 'PyTorch' as the implementation framework, but it does not specify version numbers for PyTorch or other software dependencies. |
| Experiment Setup | Yes | Our models are trained on the LVIS [21] dataset at a resolution of 256 pixels. (...) The degree of adaptation is controlled by the prior precision parameter θ_ξ, with lower values leading to a higher degree of adaptation due to the lower weight on the prior keys. (...) We use two different values of the precision prior, 0.001 and 0 (...) The local context size for Axial attention modules is chosen to be 64 pixels. (...) We use 1 (BP1) and 5 (BP5) iterations of value propagation at the CSA layer and test on the GrabCut (left) and Berkeley (right) datasets. |
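
The quoted setup describes key adaptation as a prior-weighted update whose strength is set by the precision parameter θ_ξ. As a rough illustration of that mechanism, the PyTorch sketch below adapts the keys toward a responsibility-weighted mean of the queries via an EM-style iteration; the function name `probabilistic_attention` and the parameters `prior_precision` and `em_iters` are hypothetical stand-ins for θ_ξ and the number of update iterations, and this is not the authors' implementation (their actual layer is in the linked repository):

```python
import torch

def probabilistic_attention(q, k, v, prior_precision=0.001, em_iters=1):
    """Illustrative sketch of attention with EM-style key adaptation.

    q: (B, Nq, D) queries; k, v: (B, Nk, D) prior keys / values.
    `prior_precision` plays the role of the paper's theta_xi: lower
    values give the prior keys less weight, i.e. more adaptation.
    The exact update in the paper/repo may differ from this sketch.
    """
    d = q.shape[-1]
    k_adapted = k
    for _ in range(em_iters):
        # E-step: responsibility of each key (mixture component) per query.
        logits = q @ k_adapted.transpose(-2, -1) / d ** 0.5  # (B, Nq, Nk)
        resp = logits.softmax(dim=-1)
        # M-step (MAP with a Gaussian prior on the keys): convex combination
        # of the prior keys and the responsibility-weighted mean of queries.
        resp_sum = resp.sum(dim=1, keepdim=True).transpose(-2, -1)  # (B, Nk, 1)
        q_mean = resp.transpose(-2, -1) @ q                         # (B, Nk, D)
        k_adapted = (prior_precision * k + q_mean) / (prior_precision + resp_sum)
    # Standard attention readout using the adapted keys.
    attn = (q @ k_adapted.transpose(-2, -1) / d ** 0.5).softmax(dim=-1)
    return attn @ v

# Example: 2 images, 16 query tokens, 8 key/value tokens, feature dim 32.
q = torch.randn(2, 16, 32)
k = torch.randn(2, 8, 32)
v = torch.randn(2, 8, 32)
out = probabilistic_attention(q, k, v, prior_precision=0.001)
print(out.shape)  # torch.Size([2, 16, 32])
```

Consistent with the quoted description, a small `prior_precision` such as 0.001 (or 0) lets the adapted keys be dominated by the query statistics, while large values pin them to the prior keys; treat the exact M-step above as an assumption about the general idea rather than the paper's precise update.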