Reward Finetuning for Faster and More Accurate Unsupervised Object Discovery

Authors: Katie Luo, Zhenzhen Liu, Xiangyu Chen, Yurong You, Sagie Benaim, Cheng Perng Phoo, Mark Campbell, Wen Sun, Bharath Hariharan, Kilian Q. Weinberger

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Empirically, we demonstrate that our approach is not only more accurate, but also orders of magnitude faster to train compared to prior works on object discovery."
Researcher Affiliation | Academia | "1 Cornell University, Ithaca, NY; 2 The Hebrew University of Jerusalem"
Pseudocode | Yes | "Algorithm 1 Reward-Incentivized Finetuning"
Open Source Code | Yes | "Code is available at https://github.com/katieluo88/DRIFT."
Open Datasets | Yes | "We experimented with two different datasets: Lyft Level 5 Perception dataset [24] and Ithaca-365 dataset [12]."
Dataset Splits | No | The paper specifies train and test splits (e.g., "11,873 train scenes and 4,901 test scenes" for Lyft, and "57,107 scenes for training and 1,644 for testing" for Ithaca365), but does not explicitly specify a separate validation split.
Hardware Specification | Yes | "We train DRIFT on four NVIDIA RTX A6000 GPUs, with batch size 10 per GPU."
Software Dependencies | No | The paper mentions using "Point RCNN [40]" and "OpenPCDet [42]" but does not specify version numbers for these software components.
Experiment Setup | Yes | "We train DRIFT with 120 epochs in Lyft and 30 epochs in Ithaca365 as the default setting... We use λ_shape = 1, λ_align = 1, λ_dyn = 0.001 and λ_bg = 0.001. We use µ_scale = 0.8 and σ_scale = 0.2 for the alignment reward."
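For reference, the training details quoted in the Hardware Specification and Experiment Setup rows can be collected into a single configuration. The sketch below is a minimal illustration in Python; the dictionary layout, key names, and the `effective_batch_size` helper are hypothetical and do not correspond to the actual OpenPCDet or DRIFT configuration schema.

```python
# Minimal sketch of the DRIFT training setup quoted above.
# NOTE: key names and structure are hypothetical illustrations,
# not the real OpenPCDet / DRIFT config format.

DRIFT_TRAIN_CONFIG = {
    "hardware": {
        "gpus": 4,                  # four NVIDIA RTX A6000 GPUs
        "batch_size_per_gpu": 10,   # batch size 10 per GPU
    },
    "epochs": {
        "lyft": 120,                # default epochs on Lyft Level 5
        "ithaca365": 30,            # default epochs on Ithaca365
    },
    "reward_weights": {             # λ weights from the Experiment Setup row
        "lambda_shape": 1.0,
        "lambda_align": 1.0,
        "lambda_dyn": 0.001,
        "lambda_bg": 0.001,
    },
    "alignment_reward": {           # µ_scale and σ_scale for the alignment reward
        "mu_scale": 0.8,
        "sigma_scale": 0.2,
    },
}


def effective_batch_size(cfg: dict) -> int:
    """Total batch size across all GPUs (hypothetical helper)."""
    hw = cfg["hardware"]
    return hw["gpus"] * hw["batch_size_per_gpu"]


if __name__ == "__main__":
    # With 4 GPUs at batch size 10 each, the effective batch size is 40.
    print("Effective batch size:", effective_batch_size(DRIFT_TRAIN_CONFIG))
```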