Grounding the Meaning of Words through Vision and Interactive Gameplay
Authors: Natalie Parde, Adam Hair, Michalis Papakostas, Konstantinos Tsiakas, Maria Dagioglou, Vangelis Karkaletsis, Rodney D. Nielsen
IJCAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that I Spy is an effective approach for teaching robots how to model new concepts using representations comprised of visual attributes. The results from 255 test games show that the system was able to correctly determine which object the human had in mind 67% of the time. Furthermore, a model evaluation showed that the system correctly understood the visual representations of its learned concepts with an average of 65% accuracy. |
| Researcher Affiliation | Academia | (1) Department of Computer Science and Engineering, University of North Texas; (2) Department of Computer Science and Engineering, University of Texas at Arlington; (3) Institute of Informatics and Telecommunications, N.C.S.R. Demokritos |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks that are clearly labeled or formatted as code procedures. |
| Open Source Code | No | The paper does not include an unambiguous statement that the authors are releasing the code for the work described in this paper, nor does it provide a direct link to a source-code repository. |
| Open Datasets | No | To create initial concept models for the simulated games, the initial learning phase was conducted for 17 objects... Descriptions for these objects were acquired via Amazon Mechanical Turk (AMT)... To obtain human responses for the simulated games, a list was generated... AMT workers were provided one pre-captured gamespace configuration's images... The paper describes creating its own dataset and does not provide concrete access information (link, DOI, repository, or formal citation with author/year) for a publicly available or open dataset. |
| Dataset Splits | No | The 510 sets of answers were first divided into two groups (training and testing). After completing the initial learning phase for each object, the 15 training sets of answers for each object... The remaining 15 test sets of answers for each object were used for the actual simulations. The paper specifies a training and testing split but does not explicitly mention a separate validation set or its details. A sketch of the per-object split arithmetic follows the table. |
| Hardware Specification | Yes | The robot platform is NAO V4, a humanoid robot created by Aldebaran Robotics (www.aldebaran.com), running the NAOqi v1.14.5 operating system. |
| Software Dependencies | Yes | The robot platform is NAO V4, a humanoid robot created by Aldebaran Robotics (www.aldebaran.com), running the NAOqi v1.14.5 operating system. The JNAOqi SDK is used to integrate the robot with Java for motion control and image capture operations. Model training is performed using the Python scikit-learn [Pedregosa et al., 2011] library. Image segmentation is performed using the OpenCV [Bradski, 2000] Watershed Algorithm. Part-of-speech tags are acquired using the Stanford Part-Of-Speech Tagger [Toutanova et al., 2003]... The question is then constructed using SimpleNLG [Gatt and Reiter, 2009]. A watershed segmentation sketch follows the table. |
| Experiment Setup | Yes | Initial Learning Phase. In the initial learning phase, the robot begins with an empty knowledge base. To learn about an object, the robot captures a series of images of the object from different angles and distances... Gaming Phase. In the gaming phase, the robot is placed in front of a set of objects. The robot captures images of the gamespace... GMMs are constructed with two components, and the Expectation-Maximization (EM) algorithm is used for parameter estimation (component weights, means, and covariances). Models are retrained following every game to reflect new information from "yes" and "no" player responses and visual features extracted from game photos. A minimal GMM training sketch follows the table. |
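
To make the dataset split concrete: the 510 answer sets correspond to 17 objects with 30 AMT answer sets each, divided 15/15 per object into training and test groups. Below is a minimal Python sketch of that arithmetic; the identifiers and the random shuffle are illustrative assumptions, since the paper does not state how individual sets were assigned to each group.

```python
import random

# 17 objects x 30 AMT answer sets each = 510 sets total (per the paper).
objects = [f"object_{i}" for i in range(17)]  # placeholder object names
answer_sets = {o: [f"{o}_answers_{j}" for j in range(30)] for o in objects}

train, test = {}, {}
for obj, sets in answer_sets.items():
    random.shuffle(sets)      # assignment method is an assumption
    train[obj] = sets[:15]    # 15 training answer sets per object
    test[obj] = sets[15:]     # 15 held-out sets used in the simulated games

total = sum(len(v) for v in train.values()) + sum(len(v) for v in test.values())
assert total == 510  # matches the count reported in the paper
```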
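The paper states only that segmentation uses the OpenCV watershed algorithm. The sketch below shows the standard OpenCV marker-based watershed pipeline as one plausible instantiation; the input filename, kernel size, and distance-transform threshold are illustrative assumptions, not parameters reported by the authors.

```python
import cv2
import numpy as np

img = cv2.imread("gamespace.jpg")  # hypothetical gamespace photo
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, thresh = cv2.threshold(gray, 0, 255,
                          cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# Clean up noise, then derive sure-background and sure-foreground regions.
kernel = np.ones((3, 3), np.uint8)
opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=2)
sure_bg = cv2.dilate(opening, kernel, iterations=3)
dist = cv2.distanceTransform(opening, cv2.DIST_L2, 5)
_, sure_fg = cv2.threshold(dist, 0.7 * dist.max(), 255, cv2.THRESH_BINARY)
sure_fg = np.uint8(sure_fg)
unknown = cv2.subtract(sure_bg, sure_fg)

# Seed markers: background becomes 1, object seeds 2..n, unknown stays 0.
_, markers = cv2.connectedComponents(sure_fg)
markers = markers + 1
markers[unknown == 255] = 0

# Watershed labels each pixel with its region; -1 marks region boundaries.
markers = cv2.watershed(img, markers)
img[markers == -1] = (0, 0, 255)  # draw object boundaries in red
cv2.imwrite("segmented.jpg", img)
```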
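The experiment setup describes two-component GMMs fit with EM and retrained after every game. Below is a minimal sketch using the current scikit-learn GaussianMixture API (the 2015 paper would have used an older sklearn.mixture interface); the feature matrix, its dimensionality, and the retraining data are placeholders for the visual attributes the system actually extracts.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
features = rng.random((200, 8))  # placeholder visual-attribute vectors

# Two-component GMM; fit() runs EM to estimate the component weights,
# means, and covariances, as described in the paper.
gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
gmm.fit(features)

# After each game, retrain on the old features plus those extracted from
# the new game photos (retraining schedule per the paper; refitting from
# scratch rather than warm-starting is an assumption here).
new_features = rng.random((20, 8))
gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
gmm.fit(np.vstack([features, new_features]))
print(gmm.weights_, gmm.means_.shape)
```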