RT-Trajectory: Robotic Task Generalization via Hindsight Trajectory Sketches
Authors: Jiayuan Gu, Sean Kirmani, Paul Wohlhart, Yao Lu, Montserrat Gonzalez Arenas, Kanishka Rao, Wenhao Yu, Chuyuan Fu, Keerthana Gopalakrishnan, Zhuo Xu, Priya Sundaresan, Peng Xu, Hao Su, Karol Hausman, Chelsea Finn, Quan Vuong, Ted Xiao
ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate RT-Trajectory at scale on a variety of real-world robotic tasks, and find that RT-Trajectory is able to perform a wider range of tasks compared to language-conditioned and goal-conditioned policies, when provided the same training data. Our real robot experiments aim to study the following questions: |
| Researcher Affiliation | Collaboration | 1Google DeepMind, 2University of California San Diego, 3Stanford University, 4Intrinsic |
| Pseudocode | No | The paper describes procedures in text but does not provide any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | Evaluation videos can be found at https://rt-trajectory.github.io/. No explicit statement about providing open-source code for the methodology was found. |
| Open Datasets | Yes | We use the RT-1 (Brohan et al., 2023b) demonstration dataset for training. |
| Dataset Splits | No | The paper refers to a training dataset and unseen skills for evaluation but does not provide specific percentages or counts for training, validation, and test splits, nor does it refer to predefined standard splits for reproducibility. |
| Hardware Specification | No | The paper describes the robot hardware ("mobile manipulator robot from Everyday Robots", "7 degree-of-freedom arm", "two-fingered gripper", "mobile base") but does not specify the computational hardware (e.g., GPU models, CPU types) used for training or inference of the models. |
| Software Dependencies | No | The paper mentions "Mediapipe" and "OpenAI" but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | We then learn a policy π represented by a Transformer (Vaswani et al., 2017) using Behavior Cloning (Pomerleau, 1988) following the RT-1 framework (Brohan et al., 2023b), by minimizing the log-likelihood of predicted actions a_t given the input image and trajectory sketch. To support trajectory conditioning, we modify the RT-1 architecture as follows. The trajectory sketch is concatenated with each RGB image along the feature dimension in the input sequence (a history of 6 images), which is processed by the image tokenizer (an ImageNet-pretrained EfficientNet-B3). For the additional input channels to the image tokenizer, we initialize the new weights in the first convolution layer with all zeros. Since the language instruction is not used, we remove the FiLM layers used in the original RT-1. |
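
The Experiment Setup row describes the one architectural change needed for trajectory conditioning: the trajectory sketch is concatenated channel-wise with each RGB frame, and the extra input channels of the image tokenizer's first convolution are zero-initialized so the pretrained RGB weights are preserved at initialization. The paper's RT-1 code is not released, so the following is only a minimal PyTorch sketch of that conditioning scheme; the helper name `expand_first_conv`, the stand-in stem layer, and the channel counts are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (PyTorch, not the authors' RT-1 code) of channel-wise
# trajectory conditioning with zero-initialized new input weights.
import torch
import torch.nn as nn


def expand_first_conv(conv: nn.Conv2d, extra_in_channels: int) -> nn.Conv2d:
    """Return a copy of `conv` that accepts `extra_in_channels` additional
    input channels, with the kernel weights for the new channels set to zero
    so the layer behaves like the pretrained one at initialization."""
    new_conv = nn.Conv2d(
        conv.in_channels + extra_in_channels,
        conv.out_channels,
        kernel_size=conv.kernel_size,
        stride=conv.stride,
        padding=conv.padding,
        bias=conv.bias is not None,
    )
    with torch.no_grad():
        new_conv.weight.zero_()                               # new channels start at zero
        new_conv.weight[:, : conv.in_channels] = conv.weight  # keep pretrained RGB weights
        if conv.bias is not None:
            new_conv.bias.copy_(conv.bias)
    return new_conv


# Example: a stem convolution pretrained on 3-channel RGB (a stand-in for the
# EfficientNet-B3 tokenizer's first layer), extended to also take a 3-channel
# rendered trajectory sketch, i.e. 6 input channels in total.
pretrained_stem = nn.Conv2d(3, 40, kernel_size=3, stride=2, padding=1)
conditioned_stem = expand_first_conv(pretrained_stem, extra_in_channels=3)

rgb = torch.randn(1, 3, 300, 300)      # camera frame
sketch = torch.randn(1, 3, 300, 300)   # rendered trajectory sketch
tokens = conditioned_stem(torch.cat([rgb, sketch], dim=1))
print(tokens.shape)
```

Zero-initializing the new weights means the conditioned tokenizer initially ignores the sketch channels and reproduces the pretrained features, letting training gradually learn how to use the trajectory signal; this matches the paper's stated initialization choice, while everything else in the snippet is a simplified stand-in.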