Associating Objects with Transformers for Video Object Segmentation
Authors: Zongxin Yang, Yunchao Wei, Yi Yang
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments on both multi-object and single-object benchmarks to examine AOT variant networks with different complexities. |
| Researcher Affiliation | Collaboration | Zongxin Yang (1, 2), Yunchao Wei (3, 4), Yi Yang (1); (1) CCAI, College of Computer Science and Technology, Zhejiang University; (2) Baidu Research; (3) Institute of Information Science, Beijing Jiaotong University; (4) Beijing Key Laboratory of Advanced Information Science and Network |
| Pseudocode | No | The paper describes the proposed methods and their components (e.g., LSTT block structure in Fig. 2c) but does not include formal pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide a direct link to open-source code or an explicit statement about its release. |
| Open Datasets | Yes | We evaluate AOT on popular multi-object benchmarks, YouTube-VOS [48] and DAVIS 2017 [31], and single-object benchmark, DAVIS 2016 [30]. |
| Dataset Splits | Yes | YouTube-VOS contains 3471 videos in the training split with 65 categories and 474/507 videos in the validation 2018/2019 split with additional 26 unseen categories. |
| Hardware Specification | No | The paper discusses performance metrics like FPS but does not provide specific details on the hardware (e.g., GPU models, CPU types) used for experiments. |
| Software Dependencies | No | AOT performs well with PaddlePaddle [1] and PyTorch [28]. (No version numbers are provided for reproducibility.) |
| Experiment Setup | Yes | The spatial neighborhood size λ is set to 15, and the number of identification vectors, M, is set to 10, which is consistent with the maximum object number in the benchmarks [48, 31]. The hyper-parameters of these variants are: (1) AOT-Tiny: L = 1, m = {1}; (2) AOT-Small: L = 2, m = {1}; (3) AOT-Base: L = 3, m = {1}; (4) AOT-Large: L = 3, m = {1, 1 + δ, 1 + 2δ, 1 + 3δ, ...}. |
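
The hyper-parameters quoted in the Experiment Setup row can be collected into a small configuration table. The sketch below is illustrative only and is not taken from the authors' code: the class `AOTVariant`, the function `memory_frames`, and all field names are hypothetical. It assumes L is the number of LSTT blocks and m is the set of 1-based memory frame indices. The value of δ is not stated in the row above, so it is left as a required argument, and stopping the sequence before the current frame is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List

# Values quoted in the Experiment Setup row.
SPATIAL_NEIGHBORHOOD_SIZE = 15    # lambda
NUM_IDENTIFICATION_VECTORS = 10   # M


@dataclass(frozen=True)
class AOTVariant:
    """Hypothetical container for one AOT variant's hyper-parameters."""
    name: str
    num_blocks: int        # L, assumed to count LSTT blocks
    growing_memory: bool   # False: m = {1}; True: m = {1, 1 + d, 1 + 2d, ...}


VARIANTS: Dict[str, AOTVariant] = {
    v.name: v
    for v in (
        AOTVariant("AOT-Tiny", 1, False),
        AOTVariant("AOT-Small", 2, False),
        AOTVariant("AOT-Base", 3, False),
        AOTVariant("AOT-Large", 3, True),
    )
}


def memory_frames(variant: AOTVariant, current_frame: int, delta: int) -> List[int]:
    """Return the 1-based frame indices in m that precede current_frame.

    delta is not given in the table, so the caller must supply it.
    Truncating the sequence at the current frame is an assumption.
    """
    if delta < 1:
        raise ValueError("delta must be a positive integer")
    if current_frame <= 1:
        return []
    if not variant.growing_memory:
        return [1]
    return list(range(1, current_frame, delta))
```

For example, `memory_frames(VARIANTS["AOT-Large"], current_frame=10, delta=3)` yields frames 1, 4, and 7, while the three smaller variants always yield only frame 1. Here `delta=3` is an arbitrary illustrative value, not the paper's setting.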