Polar Relative Positional Encoding for Video-Language Segmentation
Authors: Ke Ning, Lingxi Xie, Fei Wu, Qi Tian
IJCAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our method outperforms previous best method by a large margin of 11.4% absolute improvement in terms of mAP on the challenging A2D Sentences dataset. Our method also achieves competitive performances on the J-HMDB Sentences dataset. We evaluate our approach on two challenging datasets: A2D Sentences and J-HMDB Sentences. |
| Researcher Affiliation | Collaboration | Ke Ning¹, Lingxi Xie², Fei Wu¹ and Qi Tian²; ¹Zhejiang University, ²Huawei Noah's Ark Lab |
| Pseudocode | No | The paper describes its methods but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link regarding the availability of open-source code for the described methodology. |
| Open Datasets | Yes | A2D Sentences dataset is an extended version of the A2D dataset [Xu et al., 2015]. J-HMDB Sentences is an extension of the J-HMDB dataset [Jhuang et al., 2013]. |
| Dataset Splits | No | The paper specifies training and testing video counts (e.g., '3,036 training videos and 746 testing videos' for A2D Sentences) but does not explicitly provide details for a separate validation split. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper states 'We use TensorFlow to implement our model.' but does not provide specific version numbers for TensorFlow or any other software dependencies. |
| Experiment Setup | Yes | Implementation Details. We use TensorFlow to implement our model. p is set to 3 in our experiments. The learning rate is 0.0005. We use a stack of 8 256×256 RGB frames as the video input for a balanced performance and speed. (A configuration sketch based on these values follows the table.) |
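
The experiment-setup row above reports only a few concrete values (p = 3, learning rate 0.0005, clips of 8 RGB frames at 256×256) and names TensorFlow without a version. The snippet below is a minimal sketch of how those reported values could be gathered into a TensorFlow 2 configuration; the batch size, the Adam optimizer, and the placeholder input tensor are illustrative assumptions, not details taken from the paper.

```python
# Minimal configuration sketch built from the values reported in the paper.
# Only p, the learning rate, the clip length, and the frame resolution come
# from the source; the batch size and the Adam optimizer are assumptions.
import tensorflow as tf

CONFIG = {
    "p": 3,                 # reported: p is set to 3
    "learning_rate": 5e-4,  # reported: learning rate 0.0005
    "clip_length": 8,       # reported: stack of 8 RGB frames
    "frame_height": 256,    # reported: 256x256 input resolution
    "frame_width": 256,
    "batch_size": 4,        # assumption: not stated in the paper
}

# Placeholder video clip tensor: (batch, time, height, width, channels).
video_clip = tf.zeros(
    (CONFIG["batch_size"], CONFIG["clip_length"],
     CONFIG["frame_height"], CONFIG["frame_width"], 3),
    dtype=tf.float32,
)

# Assumed optimizer; the paper does not say which optimizer was used.
optimizer = tf.keras.optimizers.Adam(learning_rate=CONFIG["learning_rate"])
```

Because the paper reports neither hardware nor library versions, the sketch pins nothing beyond the numbers quoted in the table.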