Adaptive Visual Scene Understanding: Incremental Scene Graph Generation

Authors: Naitik Khandelwal, Xiao Liu, Mengmi Zhang

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experimental results not only highlight the challenges of directly combining existing continual learning methods with SGG backbones but also demonstrate the effectiveness of our proposed approach, enhancing CSEGG efficiency while simultaneously preserving privacy and memory usage. All data and source code are publicly available here.
Researcher Affiliation | Academia | (1) College of Computing and Data Science, Nanyang Technological University (NTU), Singapore; (2) Deep Neuro Cognition Lab, Agency for Science, Technology and Research (A*STAR), Singapore
Pseudocode | No | The paper describes methods and processes but does not include any clearly labeled pseudocode blocks or algorithms.
Open Source Code | Yes | All data and source code are publicly available here.
Open Datasets | Yes | Thus, we re-structure the Visual Genome dataset [25] and establish a novel and comprehensive CSEGG benchmark, where AI models are deployed to dynamic scenes where new objects and new relationships are introduced.
Dataset Splits | Yes | In CSEGG, to cater to the three continual learning scenarios below, we re-organize the Visual Genome [25] dataset and follow its standard image splits for training, validation, and test sets specified in [72].
Hardware Specification | Yes | All models are trained on 4 A5000 GPUs.
Software Dependencies | No | The paper mentions software components like the Stable Diffusion model [58] and the Adam optimizer, and uses implementations from [32] and [68], but does not provide specific version numbers for these or other key software dependencies.
Experiment Setup | Yes | For SGTR in Fig. S3 (a), the approach uniquely formulates the task as a bipartite graph construction problem. ... a batch size of 32 is used. All methods are optimized using the Adam optimizer with a base learning rate of 1 × 10^-4 and a weight decay of 1 × 10^-4. Object detection training is conducted only in the S2 and S3 scenarios. Each task in S2 is trained for 100 epochs, while each task in S3 is trained for 50 epochs. ... SGG (Scene Graph Generation) Training: In this stage, the entire SGTR model is fine-tuned while keeping the 2D-CNN feature extractor frozen. A batch size of 24 is employed, and the Adam optimizer is used with a base learning rate of 8 × 10^-5. In S1 and S3, each model is trained for 50 epochs per task, while in S2, 80 epochs per task are used. (A minimal optimizer sketch follows this table.)
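
The quoted hyperparameters map directly onto a standard two-stage training setup. Below is a minimal PyTorch sketch of the two optimizer configurations described in the row above; it is not the authors' released code. The attribute name model.backbone (standing in for the frozen 2D-CNN feature extractor) is a hypothetical placeholder, and the SGG-stage weight decay is left at Adam's default because the quote does not specify it.

import torch

def make_detection_optimizer(model: torch.nn.Module) -> torch.optim.Adam:
    # Object detection stage (scenarios S2 and S3): batch size 32,
    # Adam with base learning rate 1e-4 and weight decay 1e-4.
    return torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-4)

def make_sgg_optimizer(model: torch.nn.Module) -> torch.optim.Adam:
    # SGG fine-tuning stage: freeze the 2D-CNN feature extractor and
    # fine-tune the rest of SGTR with Adam at base learning rate 8e-5
    # (batch size 24). `model.backbone` is a hypothetical attribute name.
    for p in model.backbone.parameters():
        p.requires_grad = False
    trainable = [p for p in model.parameters() if p.requires_grad]
    # Weight decay for this stage is unspecified in the quote,
    # so Adam's default of 0 is kept.
    return torch.optim.Adam(trainable, lr=8e-5)

Epoch budgets then follow the quoted schedule: 100 epochs per task in S2 and 50 in S3 for object detection; 50 epochs per task in S1 and S3 and 80 in S2 for SGG training.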