Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Few-Shot Learning from Augmented Label-Uncertain Queries in Bongard-HOI
Authors: Qinqian Lei, Bo Wang, Robby T. Tan
AAAI 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that our method sets a new state-of-the-art (SOTA) performance by achieving 68.74% accuracy on the Bongard-HOI benchmark, a significant improvement over the existing SOTA of 66.59%. In our evaluation on HICO-FS, a more general few-shot recognition dataset, our method achieves 73.27% accuracy, outperforming the previous SOTA of 71.20% in the 5-way 5-shot task. |
| Researcher Affiliation | Collaboration | Qinqian Lei1, Bo Wang2, Robby T. Tan1; 1 National University of Singapore, 2 Ctrs Vision (emails redacted: EMAIL, EMAIL, EMAIL) |
| Pseudocode | No | The paper describes its methods in detail and provides mathematical equations, but it does not include any pseudocode blocks or algorithm figures. |
| Open Source Code | No | The paper does not contain any explicit statements about open-sourcing code or provide links to a code repository. |
| Open Datasets | Yes | We conduct our experiments on the Bongard-HOI benchmark (Jiang et al. 2022). The training set of Bongard HOI has 23041 instances and 116 positive HOI classes. Each instance contains 14 images, including 6 positive, 6 negative, and 2 query images. We use the average prediction accuracy as the metric, following the Bongard-HOI benchmark. ... we additionally assess its performance on the HICO-FS dataset for few-shot HOI recognition (Ji et al. 2020). |
| Dataset Splits | No | The paper defines the training and test sets and their characteristics, but it does not explicitly specify a validation set split (e.g., percentages or counts) or its use for hyperparameter tuning. |
| Hardware Specification | Yes | All experiments are run on 4 A5000 GPUs with 24G GPU memory. |
| Software Dependencies | No | The paper mentions software components and models (e.g., DEKR model, CLIP image encoder, Unicontrol model, standard SGD optimizer) but does not provide specific version numbers for these software dependencies (e.g., Python 3.x, PyTorch 1.x, CUDA 1x.x). |
| Experiment Setup | Yes | All experiments are run on 4 A5000 GPUs with 24G GPU memory, with batch size 4 and a total of 5 training epochs. The optimizer is standard SGD with a learning rate of 0.001 and weight decay of 5e-4. The hyper-parameter λ = 0.2 in Equation 8. The loss weights in Equation 2 are set as γ = 0.3, ξ = 0.03. We choose the teacher confidence score threshold as 0.9. |
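The hyperparameters extracted in the Experiment Setup row can be collected into a single configuration sketch. This is a minimal illustration assuming a PyTorch-style training pipeline; the names `CONFIG`, `combined_loss`, and `filter_pseudo_labels` are hypothetical and not from the paper, which only reports the values themselves.

```python
# Hedged sketch of the reported training configuration (values from the paper's
# Experiment Setup row; function/variable names are illustrative assumptions).

CONFIG = {
    "gpus": "4x A5000 (24G)",
    "batch_size": 4,
    "epochs": 5,
    "optimizer": "SGD",          # standard SGD per the paper
    "lr": 1e-3,
    "weight_decay": 5e-4,
    "lambda_eq8": 0.2,           # λ in the paper's Equation 8
    "gamma_eq2": 0.3,            # γ loss weight in Equation 2
    "xi_eq2": 0.03,              # ξ loss weight in Equation 2
    "teacher_conf_threshold": 0.9,
}

def combined_loss(main, aux1, aux2,
                  gamma=CONFIG["gamma_eq2"], xi=CONFIG["xi_eq2"]):
    """Weighted loss sum using the reported Equation 2 weights (sketch only)."""
    return main + gamma * aux1 + xi * aux2

def filter_pseudo_labels(teacher_scores,
                         threshold=CONFIG["teacher_conf_threshold"]):
    """Keep indices whose teacher confidence meets the reported 0.9 threshold."""
    return [i for i, s in enumerate(teacher_scores) if s >= threshold]
```

For example, `filter_pseudo_labels([0.95, 0.5, 0.9])` retains only the predictions at indices 0 and 2, matching the 0.9 confidence cutoff the paper reports for its teacher model.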