InfoCon: Concept Discovery with Generative and Discriminative Informativeness

Authors: Ruizhe Liu, Qian Luo, Yanchao Yang

ICLR 2024

Reproducibility assessment — each variable is listed with its result and the supporting LLM response:
Research Type: Experimental
Evidence: "3 EXPERIMENTS", "We evaluate the effectiveness of InfoCon and the derived key states on four robot manipulation tasks from ManiSkill2 (Gu et al., 2023), an extended version of ManiSkill (Mu et al., 2021): P&P Cube and its two key states (Grasp, End). Stack Cube and its three key states (Grasp A, A on B, End). Turn Faucet and its two key states (Contacted, End). Peg Insertion and its three key states (Grasp, Align, End).", "Table 1: Success rate (%) of different methods."
Researcher Affiliation: Academia
Evidence: "1 HKU Musketeers Foundation Institute of Data Science, The University of Hong Kong; 2 School of Electronics Engineering and Computer Science, Peking University; 3 Department of Electrical and Electronic Engineering, The University of Hong Kong"
Pseudocode: Yes
Evidence: "Algorithm 1 InfoCon"
Open Source Code: Yes
Evidence: "Our code is available at: https://zrllrz.github.io/InfoCon/"
Open Datasets: Yes
Evidence: "We evaluate the effectiveness of InfoCon and the derived key states on four robot manipulation tasks from ManiSkill2 (Gu et al., 2023), an extended version of ManiSkill (Mu et al., 2021):"
Dataset Splits: Yes
Evidence: "Specifically, we collect 500 training trajectories for each task, 100 evaluation trajectories for P&P Cube and Stack Cube, 100 evaluation trajectories for seen faucets, 400 trajectories for unseen faucets, and 400 evaluation trajectories for Peg Insertion."
Hardware Specification: No
Evidence: The paper does not provide specific hardware details such as GPU models, CPU specifications, or memory amounts used for running experiments.
Software Dependencies: No
Evidence: The paper does not provide specific version numbers for software dependencies or libraries used in the implementation.
Experiment Setup: Yes
Evidence: "The number of concepts is fixed, with a maximum of 10 manipulation concepts for all tasks. The temperature τ in Eq. 5 and Fig. 7 is 0.1. A in Eq. 12 is 0.2. The size of the hidden features output by the Transformers and of the concept features {α_k}_{k=1}^{K} is 128. When training, the coefficients λ and λ_rec in Eq. 18 and Eq. 10 are 0.001 and 0.1, and we defer optimization of L_gen until half of the training iterations are done. We pretrain InfoCon according to Eq. 18 for 1×10^4 iterations with base learning rate 1.0×10^-4. Then we train InfoCon for each task for 1.6×10^6 iterations based on the pretrained model with base learning rate 1.0×10^-4. After labeling the original data with key states using the trained InfoCon models, we train our CoTPC policies for 1.8×10^6 iterations with base learning rate 5.0×10^-4. For all three training stages, we use the AdamW optimizer and a warm-up cosine annealing scheduler, which linearly increases the learning rate from 0.1 of the base learning rate to the base learning rate over 1,000 iterations, and then decreases it from the base learning rate back to 0.1 of the base learning rate. The weight decay is always 1.0×10^-3, and the batch size is 256. In practice, we use only a segment of 60 states (along with actions) for every item (trajectory) in the batch."
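As a concrete illustration of the reported training schedule, below is a minimal sketch of the warm-up cosine annealing scheduler paired with AdamW, assuming a PyTorch-style implementation. The function name warmup_cosine_lr and the stand-in model are hypothetical; only the hyperparameter values (base learning rates, warm-up length, weight decay, batch size, iteration counts) come from the quote above.

# Minimal sketch of the described three-stage schedule (assumed PyTorch setup).
import math
import torch

BASE_LR = 1.0e-4          # InfoCon pretraining and per-task training
POLICY_LR = 5.0e-4        # CoTPC policy training
WEIGHT_DECAY = 1.0e-3
BATCH_SIZE = 256
WARMUP_ITERS = 1_000
PRETRAIN_ITERS = 10_000       # 1e4 iterations (Eq. 18 pretraining)
INFOCON_ITERS = 1_600_000     # 1.6e6 iterations per task
POLICY_ITERS = 1_800_000      # 1.8e6 iterations for the CoTPC policies

def warmup_cosine_lr(it, total_iters, base_lr):
    # Linear warm-up from 0.1*base_lr to base_lr over WARMUP_ITERS,
    # then cosine annealing back down to 0.1*base_lr.
    floor = 0.1 * base_lr
    if it < WARMUP_ITERS:
        return floor + (base_lr - floor) * it / WARMUP_ITERS
    progress = (it - WARMUP_ITERS) / max(1, total_iters - WARMUP_ITERS)
    return floor + 0.5 * (base_lr - floor) * (1.0 + math.cos(math.pi * progress))

model = torch.nn.Linear(128, 128)  # hypothetical stand-in for the actual model
optimizer = torch.optim.AdamW(model.parameters(), lr=BASE_LR,
                              weight_decay=WEIGHT_DECAY)
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer,
    # LambdaLR scales the initial lr, so divide by BASE_LR to get a multiplier.
    lr_lambda=lambda it: warmup_cosine_lr(it, INFOCON_ITERS, BASE_LR) / BASE_LR,
)

Calling scheduler.step() once per iteration traces the warm-up/annealing profile described in the quote; the same construction with POLICY_LR and POLICY_ITERS would cover the policy-training stage.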