Contour-based Interactive Segmentation
Authors: Polina Popenova, Danil Galeev, Anna Vorontsova, Anton Konushin
IJCAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the proposed method on the standard segmentation benchmarks, our novel User Contours dataset, and its subset User Contours-G containing difficult segmentation cases. Through experiments, we demonstrate that a single contour provides the same accuracy as multiple clicks, thus reducing the required amount of user interactions. |
| Researcher Affiliation | Industry | Polina Popenova, Danil Galeev, Anna Vorontsova and Anton Konushin, Samsung Research, {p.popenova, d.galeev, a.vorontsova, a.konushin}@samsung.com |
| Pseudocode | No | The paper describes the 'Contour Generation' algorithm in numbered steps, but does not present it as formal pseudocode or an algorithm block. |
| Open Source Code | No | The paper does not provide an explicit statement or link for open-source code for the methodology described. |
| Open Datasets | Yes | Specifically, we use Semantic Boundaries Dataset, or SBD [Hariharan et al., 2011], and the combination of LVIS [Gupta et al., 2019] and COCO [Lin et al., 2014] for training. |
| Dataset Splits | No | The paper mentions using a 'test+validation split of Open Images [Kuznetsova et al., 2020] (about 100k samples)' in an ablation study but does not provide specific details on how this or other datasets are split into train/validation/test sets for reproducibility. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., CPU, GPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions using 'Adam' for optimization but does not provide specific version numbers for any software dependencies (e.g., libraries, frameworks). |
| Experiment Setup | Yes | Input images are resized to 320px × 480px. During training, we randomly crop and rescale images, use horizontal flip, and apply random jittering of brightness, contrast, and RGB values. The models are trained for 140 epochs using Adam [Kingma and Ba, 2014] with β1 = 0.9, β2 = 0.999 and ε = 10⁻⁸. The learning rate is initialized with 5 × 10⁻⁴ and reduced by a factor of 10 at epochs 119 and 133. In a study of training data, we fine-tune our models for 10 epochs. We use stochastic weight averaging [Izmailov et al., 2018], aggregating the weights at every second epoch starting from the fourth epoch. During fine-tuning, we set the learning rate to 1 × 10⁻⁵ for the backbone and 1 × 10⁻⁴ for the rest of the network, and reduce it by a factor of 10 at epochs 8 and 9. |
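
The quoted setup fixes enough hyperparameters that the training schedule can be sketched. Below is a minimal PyTorch sketch, not the authors' code (none is released): `ContourSegModel` and `train_one_epoch` are hypothetical stubs, and the reading that stochastic weight averaging applies during the 10-epoch fine-tuning stage is an assumption based on the sentence order of the excerpt; only the hyperparameters themselves are taken from the paper.

```python
import torch
import torch.nn as nn
from torch.optim.swa_utils import AveragedModel


class ContourSegModel(nn.Module):
    """Hypothetical stand-in for the paper's network (architecture not quoted here)."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Conv2d(3, 16, 3, padding=1)  # placeholder backbone
        self.head = nn.Conv2d(16, 1, 1)                 # placeholder prediction head

    def forward(self, x):
        return self.head(self.backbone(x))


def train_one_epoch(model, optimizer):
    """Placeholder for one pass over the training data (loop body not specified)."""


model = ContourSegModel()

# Main training: 140 epochs, Adam(β1=0.9, β2=0.999, ε=1e-8), lr 5e-4,
# reduced by a factor of 10 at epochs 119 and 133.
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4,
                             betas=(0.9, 0.999), eps=1e-8)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[119, 133], gamma=0.1)
for epoch in range(140):
    train_one_epoch(model, optimizer)
    scheduler.step()

# Fine-tuning: 10 epochs, backbone lr 1e-5, rest of the network 1e-4,
# both reduced by a factor of 10 at epochs 8 and 9.
backbone_ids = {id(p) for p in model.backbone.parameters()}
head_params = [p for p in model.parameters() if id(p) not in backbone_ids]
ft_optimizer = torch.optim.Adam(
    [{"params": list(model.backbone.parameters()), "lr": 1e-5},
     {"params": head_params, "lr": 1e-4}],
    betas=(0.9, 0.999), eps=1e-8)
ft_scheduler = torch.optim.lr_scheduler.MultiStepLR(
    ft_optimizer, milestones=[8, 9], gamma=0.1)

# SWA: aggregate weights at every second epoch starting from the fourth
# (1-indexed epochs 4, 6, 8, 10).
swa_model = AveragedModel(model)
for epoch in range(1, 11):
    train_one_epoch(model, ft_optimizer)
    ft_scheduler.step()
    if epoch >= 4 and epoch % 2 == 0:
        swa_model.update_parameters(model)

# Before evaluation, batch-norm statistics of `swa_model` would normally be
# recomputed with torch.optim.swa_utils.update_bn(train_loader, swa_model).
```

`MultiStepLR` reproduces the "reduce by a factor of 10 at fixed epochs" schedule directly, and the two parameter groups in the fine-tuning optimizer mirror the separate backbone and head learning rates reported in the paper.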