MOVE: Unsupervised Movable Object Segmentation and Detection
Authors: Adam Bielski, Paolo Favaro
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We introduce MOVE, a novel method to segment objects without any form of supervision. MOVE exploits the fact that foreground objects can be shifted locally relative to their initial position and result in realistic (undistorted) new images. This property allows us to train a segmentation model on a dataset of images without annotation and to achieve state of the art (SotA) performance on several evaluation datasets for unsupervised salient object detection and segmentation. In unsupervised single object discovery, MOVE gives an average CorLoc improvement of 7.2% over the SotA, and in unsupervised class-agnostic object detection it gives a relative AP improvement of 53% on average. |
| Researcher Affiliation | Academia | Adam Bielski, University of Bern, adam.bielski@unibe.ch; Paolo Favaro, University of Bern, paolo.favaro@unibe.ch |
| Pseudocode | No | The paper describes the method using text and diagrams (Figures 2 and 3) but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | No | We plan on sharing the code and the instructions at a later time. |
| Open Datasets | Yes | We train our main model using the train split of the DUTS dataset (DUTS-TR) [30], containing 10,553 images of scenes and objects of varying sizes and appearances. We emphasize that we only use the images without the corresponding ground truth. For comparison, we evaluate our approach on three saliency detection datasets: the test set of DUTS (5,019 images), DUT-OMRON [29] (5,168 images) and ECSSD [31] (1,000 images). |
| Dataset Splits | Yes | We train our main model using the train split of the DUTS dataset (DUTS-TR) [30], containing 10,553 images of scenes and objects of varying sizes and appearances. ... We perform ablation experiments on the validation split (500 images) of HKU-IS [58] to validate the relative importance of the components of our segmentation approach. |
| Hardware Specification | Yes | We train our model for 80 epochs with a batch size of 32 on a single NVIDIA GeForce 3090Ti GPU with 24GB of memory. |
| Software Dependencies | No | The paper states 'We implemented our experiments in PyTorch [28]' but does not provide specific version numbers for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | We set the minimum mask area θ_min = 0.05, the minimum loss coefficient λ_min = 100 and we linearly ramp up the binarization loss coefficient λ_bin from 0 to 12.5 over the first 2500 segmenter iterations. We use the shift range = 1/8. We train the segmenter by alternatively minimizing the discriminator loss and the segmenter losses. Both are trained with a learning rate of 0.0002 and an Adam [27] optimizer with betas = (0, 0.99) for the discriminator and (0.9, 0.95) for the segmenter. We train our model for 80 epochs with a batch size of 32... |
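
The core idea quoted in the Research Type row, shifting the predicted foreground relative to the background to produce a new, still-realistic image, can be illustrated with a minimal PyTorch sketch. This is not the authors' released code: the function name, the wrap-around shift via `torch.roll`, and the use of the unmasked image as a stand-in for an inpainted background are all assumptions made here for illustration.

```python
import torch

def shift_composite(image, mask, max_shift=1 / 8):
    """Composite the masked foreground onto the background at a small
    random offset. Minimal sketch only: the paper's method additionally
    inpaints the background region vacated by the foreground."""
    b, _, h, w = image.shape
    # Random per-image shift within +/- max_shift of the image size.
    dx = torch.randint(int(-max_shift * w), int(max_shift * w) + 1, (b,))
    dy = torch.randint(int(-max_shift * h), int(max_shift * h) + 1, (b,))

    fg = image * mask
    shifted_fg = torch.zeros_like(fg)
    shifted_mask = torch.zeros_like(mask)
    for i in range(b):
        # torch.roll wraps around at the borders; a full implementation
        # would translate and clip instead.
        shifted_fg[i] = torch.roll(fg[i], shifts=(int(dy[i]), int(dx[i])), dims=(1, 2))
        shifted_mask[i] = torch.roll(mask[i], shifts=(int(dy[i]), int(dx[i])), dims=(1, 2))

    background = image * (1 - mask)  # placeholder for an inpainted background
    return shifted_mask * shifted_fg + (1 - shifted_mask) * background
```

A discriminator trained to distinguish real images from such composites provides the training signal for the segmenter, which is what allows the model to learn without annotations.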
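The optimizer settings and the λ_bin ramp-up quoted in the Experiment Setup row can likewise be written as a short configuration sketch. The placeholder modules, the `binarization_coefficient` helper, and the exact form of the minimum-mask-area penalty are assumptions; only the learning rate, betas, ramp schedule, θ_min, and λ_min values come from the quoted text.

```python
import torch
import torch.nn as nn

# Placeholder modules standing in for the segmenter and discriminator
# architectures described in the paper (their definitions are not given here).
segmenter = nn.Conv2d(3, 1, kernel_size=3, padding=1)
discriminator = nn.Conv2d(3, 1, kernel_size=3, padding=1)

# Both networks use Adam with lr = 0.0002; the betas differ as reported:
# (0, 0.99) for the discriminator, (0.9, 0.95) for the segmenter.
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4, betas=(0.0, 0.99))
opt_s = torch.optim.Adam(segmenter.parameters(), lr=2e-4, betas=(0.9, 0.95))

def binarization_coefficient(step, max_value=12.5, ramp_steps=2500):
    """Linearly ramp lambda_bin from 0 to max_value over the first
    ramp_steps segmenter iterations, then hold it constant."""
    return max_value * min(step / ramp_steps, 1.0)

def min_area_penalty(mask, theta_min=0.05, lam_min=100.0):
    """Hypothetical form of the minimum-mask-area penalty: penalize masks
    whose mean area falls below theta_min, weighted by lam_min."""
    area = mask.mean(dim=(1, 2, 3))
    return lam_min * torch.relu(theta_min - area).mean()
```

Training alternates between a discriminator update with `opt_d` and a segmenter update with `opt_s`, with `binarization_coefficient(step)` weighting the binarization loss at each segmenter iteration.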