Deictic Image Mapping: An Abstraction for Learning Pose Invariant Manipulation Policies

Authors: Robert Platt, Colin Kohler, Marcus Gualtieri (pp. 8042-8049)

AAAI 2019

Reproducibility Variable | Result | LLM Response

Research Type: Experimental. "Finally, we report on a series of experiments that evaluates the approach both in simulation and in hardware. The results show that the method can solve a variety of challenging manipulation problems with training times less than two hours on a standard desktop GPU system. We evaluate on the block alignment problem shown on the left side of Figure 4 where the agent must grasp one block and place it in alignment with the other. Figure 4 shows the results. DQN performance is shown in red while deictic image mapping performance is shown in blue."

Researcher Affiliation: Academia. "College of Computer and Information Science, Northeastern University, 360 Huntington Ave, Boston, MA 02115, USA. {rplatt,ckohler,mgualti}@ccs.neu.edu"

Pseudocode: No. The paper describes algorithms and formulations (e.g., Equation 2, Equation 3), but it does not include a dedicated, clearly labeled pseudocode or algorithm block.

Open Source Code: No. The paper does not provide any explicit statement about releasing source code, nor a link to a code repository for the described methodology.

Open Datasets: No. The paper describes its experimental setup, including the simulation environment (OpenRAVE), and mentions that policies were trained in simulation, but it does not provide access information (link, DOI, or formal citation) for any specific dataset used for training.

Dataset Splits: No. The paper describes training with a curriculum and mentions a "buffer size of 10k, a batch size of 10" and an "ϵ-greedy DQN", but it does not specify explicit train/validation/test splits; for example, it does not state percentages or counts for validation data.

Hardware Specification: Yes. "The full training curriculum executes in approximately 1.5 hours on a standard Intel Core i7-4790K running one NVIDIA 1080 graphics card. We used a UR5 equipped with a Robotiq two finger gripper, as shown in Figure 6a."

Software Dependencies: No. The paper mentions software such as TensorFlow and OpenRAVE (Diankov and Kuffner 2008), and specific algorithms such as DQN with dueling networks (Wang et al. 2015) and the Adam optimizer, but it does not provide version numbers for any of these software dependencies.

Experiment Setup: Yes. "In this experiment, we used a standard ϵ-greedy DQN with dueling networks (Wang et al. 2015), no prioritized replay (Schaul et al. 2015), a buffer size of 10k, a batch size of 10, and an episode length of 10 steps. Epsilon decreased linearly from 100% to 10% over the training session. The neural network has two convolutional+relu+pooling layers of 16 and 32 units respectively with a stride of 3 followed by one fully connected layer with 48 units. We use the Adam optimizer with a learning rate of 0.0003."
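The reported hyperparameters and the linear epsilon schedule can be captured in a short plain-Python sketch. This is a minimal illustration only: the total number of training steps is an assumption (the excerpt does not state it), and the config dict is a hypothetical grouping of the values quoted above, not code from the paper.

```python
# Hyperparameters quoted in the paper's experiment setup (grouping is ours).
CONFIG = {
    "buffer_size": 10_000,    # replay buffer size ("10k")
    "batch_size": 10,
    "episode_length": 10,     # steps per episode
    "learning_rate": 0.0003,  # Adam optimizer
    "eps_start": 1.0,         # epsilon starts at 100%
    "eps_end": 0.1,           # and decays linearly to 10%
}

def epsilon(step: int, total_steps: int,
            eps_start: float = 1.0, eps_end: float = 0.1) -> float:
    """Linear decay from eps_start to eps_end over total_steps.

    Clamps at eps_end once training exceeds total_steps.
    """
    frac = min(max(step, 0) / total_steps, 1.0)
    return eps_start + frac * (eps_end - eps_start)
```

For example, with an assumed 200k-step session, `epsilon(0, 200_000)` returns 1.0 and `epsilon(200_000, 200_000)` returns roughly 0.1, matching the "100% to 10%" schedule described in the paper.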