Interactive Visual Task Learning for Robots
Authors: Weiwei Gu, Anant Sah, Nakul Gopalan
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present two sets of results. First, we compare Hi-Viscont with the baseline model (FALCON) on visual question answering (VQA) in three domains. Second, we conduct a human-subjects experiment in which users teach our robot visual tasks in situ. Our framework achieves a 33% improvement on the success rate metric and a 19% improvement in object-level accuracy compared to the baseline model. |
| Researcher Affiliation | Academia | School of Computing and Augmented Intelligence, Arizona State University; {weiweigu, asah4, ng}@asu.edu |
| Pseudocode | No | The paper describes the methods in natural language and mathematical formulas but does not provide any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions an "associated webpage" for instruction videos and manuals, but does not provide a link to, or an explicit statement about, open-source code for the methodology. |
| Open Datasets | Yes | We first present experimental results on VQA tasks for three domains: the CUB-200-2011 dataset, a custom house-construction domain with building blocks, and a custom zoo domain with terrestrial and aquatic animals. CUB-200-2011 is a well-known public dataset. |
| Dataset Splits | No | The paper mentions "validation questions" in the context of training for gradient flow, but it does not provide explicit training/validation/test dataset splits (e.g., specific percentages or sample counts per split) needed for reproducibility (see the split sketch after this table). |
| Hardware Specification | No | The paper specifies the robot arm (a Franka Emika Research 3 arm) and cameras (Intel RealSense D435 depth cameras) used for the robotic setup, but does not provide details on the computing hardware (e.g., specific GPU or CPU models) used for training or running the models. |
| Software Dependencies | No | The paper mentions using "SAM (Segment Anything Model)" and a "pretrained BERT-base model" but does not provide specific version numbers for any software libraries, frameworks, or programming languages (see the dependency sketch after this table). |
| Experiment Setup | No | The paper states "Both concept net models are trained with the same split of concepts and the same training data for the same number of steps" and mentions "validation questions" during training, but it does not provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed system-level training settings in the main text; a more detailed description is deferred to the appendix. |
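For context on what the missing Dataset Splits entry would need, here is a minimal sketch of an explicit, seeded split specification. The 70/10/20 ratios and the seed are illustrative assumptions, not values from the paper; only the CUB-200-2011 image count is a known property of that dataset.

```python
# Illustrative sketch: the split ratios and seed below are assumptions,
# not values reported in the paper.
import random

def make_splits(sample_ids, train=0.7, val=0.1, test=0.2, seed=42):
    """Partition sample IDs into reproducible train/val/test subsets."""
    assert abs(train + val + test - 1.0) < 1e-9
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)  # fixed seed -> same split every run
    n_train = int(train * len(ids))
    n_val = int(val * len(ids))
    return {
        "train": ids[:n_train],
        "val": ids[n_train:n_train + n_val],
        "test": ids[n_train + n_val:],
    }

splits = make_splits(range(11788))  # CUB-200-2011 contains 11,788 images
print({name: len(ids) for name, ids in splits.items()})
```

Reporting splits at this level of detail (ratios, counts, and seed) is what would allow the VQA experiments to be re-run on identical data.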
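Similarly, the Software Dependencies entry looks for pinned versions of the components the paper names (SAM and a pretrained BERT-base model). The sketch below shows one way such pins could be recorded; the checkpoint name and any printed versions are assumptions for illustration, since the paper does not state them.

```python
# Hypothetical dependency record: nothing here reflects versions the
# authors actually used; it only shows the shape of a reproducible pin.
import torch
import transformers
from transformers import BertModel, BertTokenizer

# Record the exact library versions alongside the experiment.
print(f"torch=={torch.__version__}")
print(f"transformers=={transformers.__version__}")

# "bert-base-uncased" is an assumed checkpoint name; the paper only
# says "a pretrained BERT-base model".
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

# SAM is distributed via the facebookresearch/segment-anything repo;
# pinning the installed commit hash would make that dependency exact.
```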