Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Towards Flexible Visual Relationship Segmentation
Authors: Fangrui Zhu, Jianwei Yang, Huaizu Jiang
NeurIPS 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical validation across various datasets demonstrates that our framework outperforms existing models in standard, promptable, and open-vocabulary tasks, e.g., +1.9 mAP on HICO-DET, +11.4 Acc on VRD, +4.7 mAP on unseen HICO-DET. |
| Researcher Affiliation | Collaboration | Fangrui Zhu¹, Jianwei Yang², Huaizu Jiang¹; ¹Northeastern University, ²Microsoft Research |
| Pseudocode | No | The paper describes the model architecture and operations in text and diagrams, but does not include a formal pseudocode or algorithm block. |
| Open Source Code | No | The paper includes a project page link (https://neu-vi.github.io/FleVRS) but no explicit statement about open-sourcing the code for the described methodology or a direct link to a code repository. |
| Open Datasets | Yes | For HOI segmentation, we utilize two public benchmarks: HICO-DET [4] and V-COCO [18]. ... For panoptic SGG, we use the PSG dataset [85], sourced from COCO and VG [37] intersections... |
| Dataset Splits | No | The paper provides training and testing splits for the datasets (e.g., '44,329 images (35,801 training, 8,528 testing)' for HICO-DET) but does not explicitly mention a separate validation split or its size. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions software components like 'AdamW', 'CLIP', 'SAM', 'Focal-T/L', 'DaViT-B/L' but does not specify version numbers for these or other software dependencies. |
| Experiment Setup | Yes | During training, we set the input image to be 640 × 640, with batch size of 64. We optimize our network with AdamW [54] with a weight decay of 10^-4. We train all models for 30 epochs with an initial learning rate of 10^-4 decreased by 10 times at the 20th epoch. ... The loss weights λb, λd, λc and λgrd (superscript omitted) are set to 1, 1, 2, and 2. |
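
The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration sketch. This is an illustration only, not the authors' code: the names `CONFIG`, `learning_rate`, and `total_loss` are hypothetical, the loss-weight keys simply reuse the subscripts of λb, λd, λc and λgrd, and the sketch assumes 0-indexed epochs with the learning rate dropping from epoch index 20 onward (the paper's exact indexing is not stated here).

```python
# Hyperparameters as quoted in the Experiment Setup row.
# All names below are hypothetical and chosen for illustration.
CONFIG = {
    "image_size": (640, 640),
    "batch_size": 64,
    "optimizer": "AdamW",
    "weight_decay": 1e-4,
    "epochs": 30,
    "base_lr": 1e-4,
    "lr_drop_epoch": 20,    # assumption: 0-indexed epoch at which the drop starts
    "lr_drop_factor": 0.1,  # "decreased by 10 times"
    # Keys mirror the subscripts of the quoted loss weights.
    "loss_weights": {"b": 1.0, "d": 1.0, "c": 2.0, "grd": 2.0},
}


def learning_rate(epoch, cfg=CONFIG):
    """Step schedule: base rate before the drop epoch, ten times smaller after."""
    if epoch >= cfg["lr_drop_epoch"]:
        return cfg["base_lr"] * cfg["lr_drop_factor"]
    return cfg["base_lr"]


def total_loss(terms, cfg=CONFIG):
    """Weighted sum of individual loss terms, keyed by the same subscripts."""
    weights = cfg["loss_weights"]
    return sum(weights[name] * value for name, value in terms.items())


if __name__ == "__main__":
    for epoch in (0, 19, 20, 29):
        print(epoch, learning_rate(epoch))
    print(total_loss({"b": 0.5, "d": 0.25, "c": 1.0, "grd": 1.0}))
```

Details the report marks as missing (hardware, software versions, a validation split) are deliberately left out of the sketch rather than guessed.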