Mirroring without Overimitation: Learning Functionally Equivalent Manipulation Actions

Authors: Hangxin Liu, Chi Zhang, Yixin Zhu, Chenfanfu Jiang, Song-Chun Zhu

AAAI 2019, pp. 8025-8033 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility assessment: each entry below gives the variable, the assessed result, and the supporting LLM response.
Research Type: Experimental
"In the experiment, we demonstrate the proposed approach by teaching a real Baxter robot a complex manipulation task involving haptic feedback: opening medicine bottles. Figure 7a-b shows the Q-learning results, with a discount factor of 0.99, a reward of +1 for success, -1 for failure, and 0 for all others. We use ε-greedy exploration with exponential decay to obtain the state-force associations. Table 1 shows the success rate."
Researcher Affiliation: Academia
"Hangxin Liu (1), Chi Zhang (1,2), Yixin Zhu (1,2), Chenfanfu Jiang (3), Song-Chun Zhu (1,2). Affiliations: (1) UCLA Center for Vision, Cognition, Learning and Autonomy; (2) International Center for AI and Robot Autonomy (CARA); (3) UPenn Computer and Information Science Department. Emails: {hx.liu,chi.zhang,yixin.zhu}@ucla.edu, cffjiang@seas.upenn.edu, sczhu@stat.ucla.edu"
Pseudocode: No
The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code: No
The paper mentions using an "open-sourced tactile glove (Liu et al. 2017)" but does not provide access to source code for the methodology described in this paper.
Open Datasets: No
"The hand pose and force data is collected using an open-sourced tactile glove (Liu et al. 2017)... The data of 10 human manipulation sequences is collected." The paper describes its own collected dataset but provides no access details (link, citation for the dataset itself, or repository) for public availability.
Dataset Splits: No
"Figure 7a shows the cumulative reward during each training episode in red, and the average cumulative reward during evaluation in blue." Although the paper mentions "evaluation", it gives no specifics on training/validation/test splits, such as percentages or sample counts.
Hardware Specification: Yes
"We exercise the proposed framework in a robot platform with a dual-armed 7-DoF Baxter robot mounted on a Dataspeed mobility base. The robot is equipped with a ReFlex TakkTile gripper on the right wrist and a Robotiq S85 parallel gripper on the left."
Software Dependencies: No
"The entire system runs on ROS, and the arm motion is planned by MoveIt!." The paper names these software components (ROS, MoveIt!) but does not give version numbers for them. A minimal MoveIt! usage sketch appears after this list.
Experiment Setup: Yes
"Figure 7a-b shows the Q-learning results, with a discount factor of 0.99, a reward of +1 for success, -1 for failure, and 0 for all others. We use ε-greedy exploration with exponential decay to obtain the state-force associations." A sketch of this Q-learning setup is given below.
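For a concrete picture of the reported setup, here is a minimal sketch of tabular Q-learning with ε-greedy exploration and exponential decay, using the stated discount factor (0.99) and reward scheme (+1 success, -1 failure, 0 otherwise). The environment interface, learning rate, and decay constants are illustrative assumptions; the paper does not publish code.

```python
import random
from collections import defaultdict

# Values taken from the paper's reported setup; ALPHA, EPS_* and the
# environment interface below are assumptions for illustration.
GAMMA = 0.99        # discount factor (reported)
ALPHA = 0.1         # learning rate (assumed; not reported)
EPS_START = 1.0     # initial exploration rate (assumed)
EPS_DECAY = 0.995   # exponential decay per episode (assumed)
EPS_MIN = 0.05      # exploration floor (assumed)

def reward(outcome):
    """Reported reward scheme: +1 for success, -1 for failure, 0 otherwise."""
    return {"success": 1.0, "failure": -1.0}.get(outcome, 0.0)

def q_learning(env, n_episodes=500):
    """Tabular Q-learning with epsilon-greedy exploration.

    `env` is a hypothetical interface: reset() -> state,
    step(action) -> (next_state, outcome, done), and `env.actions`
    listing the discrete actions available in every state.
    """
    q = defaultdict(float)  # maps (state, action) -> value
    eps = EPS_START
    for _ in range(n_episodes):
        state, done = env.reset(), False
        while not done:
            # epsilon-greedy action selection
            if random.random() < eps:
                action = random.choice(env.actions)
            else:
                action = max(env.actions, key=lambda a: q[(state, a)])
            next_state, outcome, done = env.step(action)
            # one-step Q-learning update
            best_next = max(q[(next_state, a)] for a in env.actions)
            target = reward(outcome) + GAMMA * best_next
            q[(state, action)] += ALPHA * (target - q[(state, action)])
            state = next_state
        eps = max(EPS_MIN, eps * EPS_DECAY)  # exponential decay per episode
    return q
```

Reading off the greedy policy, argmax over actions of Q(s, a) per state, would then yield the kind of state-force associations the paper evaluates in Table 1.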
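Because the review flags ROS and MoveIt! without versions, the following sketch shows how arm motion is typically commanded through MoveIt!'s standard Python interface (moveit_commander). The planning group name "left_arm" and the target pose are assumptions; the paper does not describe its planning code.

```python
import rospy
import moveit_commander
from geometry_msgs.msg import Pose

# Standard MoveIt! Python setup; the group name "left_arm" is an
# assumed Baxter planning group, not taken from the paper.
moveit_commander.roscpp_initialize([])
rospy.init_node("baxter_motion_demo", anonymous=True)
group = moveit_commander.MoveGroupCommander("left_arm")

# Illustrative end-effector target pose (values are placeholders).
target = Pose()
target.position.x, target.position.y, target.position.z = 0.6, 0.3, 0.2
target.orientation.w = 1.0

group.set_pose_target(target)
group.go(wait=True)          # plan and execute in one call
group.stop()                 # ensure no residual motion
group.clear_pose_targets()
```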