Video Scene Graph Generation from Single-Frame Weak Supervision

Authors: Siqi Chen, Jun Xiao, Long Chen

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive ablations and results on the benchmark Action Genome have demonstrated the effectiveness of our PLA."
Researcher Affiliation | Academia | Zhejiang University, The Hong Kong University of Science and Technology
Pseudocode | No | The paper describes its methodology in natural language and mathematical formulas, but does not include structured pseudocode or algorithm blocks.
Open Source Code | Yes | "Codes are available at: https://github.com/zjucsq/PLA."
Open Datasets | Yes | "We evaluated our method PLA on the challenging VidSGG benchmark: Action Genome (AG) (Ji et al., 2020)."
Dataset Splits | No | The paper states it uses "the official splits as fully-supervised work, i.e., 7,464 videos for training, 1,737 videos for testing", but does not explicitly mention a separate validation split or how one was used, if it existed.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU models, CPU types, or memory) used for its experiments; it only mentions the backbone architecture of a pre-trained detector.
Software Dependencies | No | The paper mentions specific models and frameworks used (e.g., VinVL, STTran), but does not provide version numbers for software dependencies such as programming languages, libraries, or operating systems.
Experiment Setup | Yes | "For the model-free teacher, we set the IoU matching threshold η = 0.5. In FPP module, we set the initial learning rate 1e-3. We used STTran as the student model, and followed the same training settings (e.g., learning rate and batch size) of the original paper."
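
The IoU matching threshold η = 0.5 quoted above determines when a detected box is paired with a reference box for label assignment by the model-free teacher. A minimal sketch of such thresholded IoU matching, assuming [x1, y1, x2, y2] box coordinates and a greedy best-match strategy (both are illustrative assumptions, not the paper's actual implementation):

```python
# Illustrative sketch of IoU-threshold matching with eta = 0.5, as quoted
# from the paper's experiment setup. The box format [x1, y1, x2, y2] and
# the greedy matching strategy are assumptions for this example.

def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def match_boxes(detections, references, eta=0.5):
    """Greedily match each detection to its highest-IoU reference box,
    keeping only pairs whose IoU clears the threshold eta."""
    matches = []
    for i, det in enumerate(detections):
        best_j, best_iou = -1, eta
        for j, ref in enumerate(references):
            score = iou(det, ref)
            if score >= best_iou:
                best_j, best_iou = j, score
        if best_j >= 0:
            matches.append((i, best_j, best_iou))
    return matches
```

With η = 0.5, a detection inherits a reference box's label only if the two boxes overlap by at least half of their combined area, which filters out weakly localized detections before they propagate labels.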