Policy Pre-training for Autonomous Driving via Self-supervised Geometric Modeling

Authors: Penghao Wu, Li Chen, Hongyang Li, Xiaosong Jia, Junchi Yan, Yu Qiao

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments covering a wide span of challenging scenarios have demonstrated the superiority of our proposed approach, where improvements range from 2% to even over 100% with very limited data.
Researcher Affiliation | Collaboration | ¹Shanghai AI Laboratory, ²University of California at San Diego, ³MoE Key Lab of Artificial Intelligence, Shanghai Jiao Tong University
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code: https://github.com/OpenDriveLab/PPGeo
Open Datasets | Yes | All pre-training experiments are conducted on the hours-long unlabeled YouTube driving videos (Zhang et al., 2022b). CARLA (Dosovitskiy et al., 2017). nuScenes (Caesar et al., 2020).
Dataset Splits | Yes | We use the official train-val split for training and evaluation. We use different sizes of training data (from 4K to 40K) following Zhang et al. (2022b) to evaluate the generalization ability of pre-trained visual encoders when labeled data is limited, and conduct the closed-loop evaluation. (A data-budget sketch follows this table.)
Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments.
Software Dependencies | No | The paper mentions software components and optimizers (e.g., Adam, AdamW, Torchvision) but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | For the first stage of the PPGeo pipeline, we train the model for 30 epochs with the Adam (Kingma & Ba, 2015) optimizer and a learning rate of 10⁻⁴, which drops to 10⁻⁵ after 25 epochs. For the second stage, the encoder is trained for 20 epochs using the AdamW (Loshchilov & Hutter, 2017) optimizer. A cyclic learning rate scheduler is applied with the learning rate ranging from 10⁻⁶ to 10⁻⁴. The batch size for both stages is 128. We use data augmentations including ColorJitter, RandomGrayscale, and GaussianBlur. (A configuration sketch follows this table.)
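
As a concrete illustration of the experiment-setup row, the quoted optimizer, scheduler, and augmentation settings map onto PyTorch/torchvision roughly as below. This is a minimal sketch, not the authors' code: the network placeholders, augmentation strengths, and the cyclic step size are assumptions; only the optimizer types, learning rates, epoch counts, and batch size come from the paper.

```python
import torch.nn as nn
from torch.optim import Adam, AdamW
from torch.optim.lr_scheduler import MultiStepLR, CyclicLR
from torchvision import transforms

BATCH_SIZE = 128  # used in both stages, per the paper

# Augmentations named in the paper (torchvision transforms);
# the jitter strengths, probability, and kernel size are assumptions.
augment = transforms.Compose([
    transforms.ColorJitter(0.4, 0.4, 0.4, 0.1),
    transforms.RandomGrayscale(p=0.2),
    transforms.GaussianBlur(kernel_size=5),
    transforms.ToTensor(),
])

# Tiny placeholders standing in for the actual networks.
stage1_model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU())    # stage-1 depth/pose nets
stage2_encoder = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU())  # stage-2 visual encoder

# Stage 1: 30 epochs, Adam at 1e-4, dropping to 1e-5 after epoch 25.
opt1 = Adam(stage1_model.parameters(), lr=1e-4)
sched1 = MultiStepLR(opt1, milestones=[25], gamma=0.1)  # step once per epoch

# Stage 2: 20 epochs, AdamW with a cyclic lr between 1e-6 and 1e-4.
# cycle_momentum=False is required for Adam-family optimizers;
# step_size_up (in iterations) is an assumption.
opt2 = AdamW(stage2_encoder.parameters(), lr=1e-6)
sched2 = CyclicLR(opt2, base_lr=1e-6, max_lr=1e-4,
                  step_size_up=2000, cycle_momentum=False)
```

In a training loop, `sched1` would be stepped per epoch and `sched2` per iteration, which is the usual convention for MultiStepLR and CyclicLR respectively.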
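
The dataset-splits row describes fine-tuning with labeled budgets from 4K to 40K samples. One way to realize such a budget, sketched here as an assumption rather than the exact protocol of Zhang et al. (2022b), is a random subset of the full training set:

```python
import torch
from torch.utils.data import Dataset, Subset

def limited_train_set(dataset: Dataset, budget: int, seed: int = 0) -> Subset:
    """Randomly keep `budget` labeled samples from the full training set.

    Sketch only: the sampling scheme and seed are assumptions.
    """
    g = torch.Generator().manual_seed(seed)
    indices = torch.randperm(len(dataset), generator=g)[:budget].tolist()
    return Subset(dataset, indices)

# Evaluate pre-trained encoders under increasing labeled-data budgets,
# e.g. budgets = [4_000, 10_000, 20_000, 40_000]  # "4K to 40K" from the paper
```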