TD²-Net: Toward Denoising and Debiasing for Video Scene Graph Generation
Authors: Xin Lin, Chong Shi, Yibing Zhan, Zuopeng Yang, Yaqi Wu, Dacheng Tao
AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Systematic experimental results demonstrate the superiority of our proposed TD²-Net over existing state-of-the-art approaches on Action Genome databases. |
| Researcher Affiliation | Collaboration | Xin Lin¹*, Chong Shi¹, Yibing Zhan², Zuopeng Yang¹*, Yaqi Wu¹, Dacheng Tao³ (¹ Guangzhou University; ² JD Explore Academy; ³ The University of Sydney) |
| Pseudocode | No | The paper describes its methods in detail using natural language and mathematical equations, but it does not include a dedicated 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing its source code or a link to a code repository for the methodology described. |
| Open Datasets | Yes | Our experiments are conducted on the AG dataset (Ji et al. 2020), which is the benchmark dataset of dynamic scene graph generation. |
| Dataset Splits | No | The paper mentions training for 10 epochs and using a batch size of 1, and evaluates on the AG dataset under 'With Constraints' and 'No Constraints' settings. However, it does not explicitly provide specific training/validation/test dataset split percentages or sample counts. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU or CPU models, memory, or cloud instance types. |
| Software Dependencies | No | The paper mentions using Faster R-CNN with a ResNet-101 backbone and the AdamW optimizer, but it does not specify version numbers for these software components or any other libraries. |
| Experiment Setup | Yes | During training, we utilize the AdamW optimizer (Loshchilov and Hutter 2017) with an initial learning rate of 1e-5 and a batch size of 1. The model is trained for 10 epochs. Additionally, we apply gradient clipping, restricting the gradients to a maximum norm of 5. In Eq. (3) and Eq. (4), we set parameters M = N = 3. |
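
The experiment-setup row above maps onto a few lines of training code. Below is a minimal sketch, assuming PyTorch, of the reported configuration (AdamW, initial learning rate 1e-5, batch size 1, 10 epochs, gradient clipping at a maximum norm of 5). The model and data are hypothetical placeholders, since no source code is released for TD²-Net.

```python
import torch
import torch.nn as nn
from torch.optim import AdamW

# Hypothetical stand-in for TD²-Net; the real architecture is not public.
model = nn.Linear(16, 1)

# AdamW with the reported initial learning rate of 1e-5.
optimizer = AdamW(model.parameters(), lr=1e-5)

num_epochs = 10  # the paper trains for 10 epochs
for epoch in range(num_epochs):
    # Batch size of 1, as reported; random tensors stand in for
    # Action Genome video frames.
    for _ in range(4):
        x = torch.randn(1, 16)
        loss = model(x).pow(2).mean()  # placeholder objective
        optimizer.zero_grad()
        loss.backward()
        # Gradient clipping to a maximum norm of 5, as reported.
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=5.0)
        optimizer.step()
```

The M = N = 3 setting applies to Eq. (3) and Eq. (4) of the paper and is not reflected in this sketch, since those equations are method-specific and not reproduced here.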