TETRIS: Towards Exploring the Robustness of Interactive Segmentation
Authors: Andrey Moskalenko, Vlad Shakhuro, Anna Vorontsova, Anton Konushin, Anton Antonov, Alexander Krapukhin, Denis Shepelev, Konstantin Soshin
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this work, we conduct a real user study to investigate real user clicking patterns. This study reveals that the intuitive assumption made in the common evaluation strategy may not hold. As a result, interactive segmentation models may show high scores in the standard benchmarks, but it does not imply that they would perform well in a real world scenario. To assess the applicability of interactive segmentation methods, we propose a novel evaluation strategy providing a more comprehensive analysis of a model's performance. To this end, we propose a methodology for finding extreme user inputs by a direct optimization in a white-box adversarial attack on the interactive segmentation model. Based on the performance with such adversarial user inputs, we assess the robustness of interactive segmentation models w.r.t. click positions. Besides, we introduce a novel benchmark for measuring the robustness of interactive segmentation, and report the results of an extensive evaluation of dozens of models. |
| Researcher Affiliation | Industry | Samsung Research {andrey.mos, v.shakhuro, a.vorontsova, a.konushin, ant.antonov, a.krapukhin, d.shepelev, k.soshin}@samsung.com |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions evaluating other methods from their open-source codebases, but it does not provide concrete access to, or an explicit statement about releasing, the source code for its own described methodology. |
| Open Datasets | Yes | We conduct our study on most commonly used datasets as well as TETRIS, which contains images of a significantly higher resolution. ... Benchmarking Interactive Segmentation. GrabCut (Rother, Kolmogorov, and Blake 2004) was the first interactive segmentation dataset. Then, the Berkeley (Martin et al. 2001) segmentation dataset was adapted for interactive segmentation (McGuinness and O'Connor 2010). ... PASCAL VOC 2012 (Everingham et al. 2012) and COCO (Lin et al. 2014) segmentation datasets; ... DAVIS (Perazzi et al. 2016a) and SBD (Hariharan et al. 2011) (labeled with boundaries) datasets for interactive segmentation, using the same click generation strategy. ... Overall, we validate 23 checkpoints on the 5 interactive segmentation datasets: GrabCut, Berkeley, DAVIS, and COCO-MVal, as well as on our novel TETRIS dataset. |
| Dataset Splits | No | The paper evaluates existing models on various datasets but does not provide specific details on the training, validation, or test dataset splits used for these evaluations. |
| Hardware Specification | Yes | Brute-force for a single 1024×1024 image takes over 4 hours to process using a single NVIDIA Tesla V100, which encourages us to seek a faster approach for robustness evaluation, described below. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers, such as library names (e.g., PyTorch, TensorFlow) and their versions. |
| Experiment Setup | Yes | We run no more than 10 optimization iterations to restrict the number of calculations. The gradient updates are calculated with an Adam optimizer (Kingma and Ba 2014). To compare models with different input resolution fairly, we linearly scale the learning rate by an input size factor: sqrt(H*W)/1024, where H, W denote image height and width in pixels, respectively. For minimizing and maximizing trajectories, the first iteration is selected with the baseline strategy and the consecutive clicks are placed greedily one by one. ... The total loss is a weighted sum of a Dice (Dice 1945) loss and an interaction location loss, scaled by 1000. |
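The quoted setup can be sketched in a few lines. This is a minimal illustration, not the authors' code: the helper names are hypothetical, the learning-rate factor assumes the parenthesization sqrt(H*W)/1024 (so a 1024×1024 input keeps the base rate), and the ×1000 weight is assumed to apply to the interaction-location term of the loss.

```python
import math

def scaled_lr(base_lr: float, height: int, width: int) -> float:
    # Hypothetical helper: linearly scale the learning rate by the
    # input-size factor sqrt(H*W)/1024, so that a 1024x1024 input
    # uses exactly the base learning rate.
    return base_lr * math.sqrt(height * width) / 1024

def total_loss(dice_loss: float, location_loss: float) -> float:
    # Weighted sum from the quoted setup, assuming the factor of
    # 1000 weights the interaction-location term.
    return dice_loss + 1000.0 * location_loss
```

For example, a model taking 512×512 crops would run with half the base learning rate under this scaling, while a 2048×2048 model would run with twice the base rate.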