ESC: Exploration with Soft Commonsense Constraints for Zero-shot Object Navigation

Authors: Kaiwen Zhou, Kaizhi Zheng, Connor Pryor, Yilin Shen, Hongxia Jin, Lise Getoor, Xin Eric Wang

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on MP3D (Chang et al., 2017), HM3D (Ramakrishnan et al., 2021), and RoboTHOR (Deitke et al., 2020) benchmarks show that our ESC method improves significantly over baselines, and achieves new state-of-the-art results for zero-shot object navigation (e.g., 288% relative Success Rate improvement over CoW (Gadre et al., 2022) on MP3D).
Researcher Affiliation | Collaboration | 1) University of California, Santa Cruz; 2) Samsung Research America.
Pseudocode | Yes | Algorithm 1 (Navigation algorithm)
Open Source Code | No | The paper does not include an unambiguous statement indicating the release of the source code for the described methodology, nor does it provide a direct link to a code repository. The text does not contain phrases like 'We release our code', 'The source code for our method is available at', or similar declarations of public code availability.
Open Datasets | Yes | MP3D (Chang et al., 2017) is used in Habitat ObjectNav challenges, containing 2195 validation episodes on 11 validation environments with 21 goal object categories. HM3D (Ramakrishnan et al., 2021) is used in the Habitat 2022 ObjectNav challenge, containing 2000 validation episodes on 20 validation environments with 6 goal object categories. RoboTHOR (Deitke et al., 2020) is used in the RoboTHOR 2020 and 2021 ObjectNav challenges, containing 1800 validation episodes on 15 validation environments with 12 goal object categories.
Dataset Splits | Yes | MP3D (Chang et al., 2017) is used in Habitat ObjectNav challenges, containing 2195 validation episodes on 11 validation environments with 21 goal object categories. HM3D (Ramakrishnan et al., 2021) is used in the Habitat 2022 ObjectNav challenge, containing 2000 validation episodes on 20 validation environments with 6 goal object categories. RoboTHOR (Deitke et al., 2020) is used in the RoboTHOR 2020 and 2021 ObjectNav challenges, containing 1800 validation episodes on 15 validation environments with 12 goal object categories.
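The validation-split statistics quoted above can be collected into a small reference table for sanity-checking an evaluation harness. This is an illustrative sketch; the dictionary name and field keys are our own, not from the paper's (unreleased) code:

```python
# Validation-split statistics for the three ObjectNav benchmarks,
# as reported in the paper (names and keys are illustrative).
BENCHMARKS = {
    "MP3D":     {"episodes": 2195, "environments": 11, "goal_categories": 21},
    "HM3D":     {"episodes": 2000, "environments": 20, "goal_categories": 6},
    "RoboTHOR": {"episodes": 1800, "environments": 15, "goal_categories": 12},
}

# Total evaluation workload across all three benchmarks.
total_episodes = sum(b["episodes"] for b in BENCHMARKS.values())
print(total_episodes)  # 5995
```

A harness can assert these counts against the loaded episode files to catch split mismatches before running a full evaluation.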
Hardware Specification | No | The paper provides details about the simulated agent's configuration and sensor setup (e.g., 'The agent has a height of 0.88m, with a radius of 0.18m. The agent receives 640 x 480 RGB-D egocentric views from a camera with 79° HFoV placed 0.88m from the ground.'), but it does not specify the actual hardware (e.g., GPU models, CPU models, or cloud computing instances) used to run the experiments or train the models.
Software Dependencies | No | The paper mentions using specific pre-trained models like 'GLIP-L (Li* et al., 2022)', 'DeBERTa v3 (He et al., 2021)', and 'ChatGPT (Ouyang et al., 2022)', but it does not list specific software dependencies with their version numbers (e.g., Python 3.x, PyTorch 1.x, CUDA 11.x) that are required to replicate the experiments.
Experiment Setup | Yes | There are several hyper-parameters in the ESC method. For the distance threshold df for selecting the closest frontier to explore, we use df = 1.6m in all the experiments. For the threshold do determining whether a frontier is near an object, we fix do = 1.6m. For the threshold dr determining whether a frontier is in a room, we fix dr = 0.6m. We applied a weight of 1.0 for all PSL rules when only one of commonsense reasoning (object or room) was utilized. Moreover, we double the weight for the shortest distance rule in Eq. 6 to 2.0 when both levels of commonsense reasoning are employed.
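Since the paper releases no code, a reproduction would need to gather these thresholds and rule weights into a single configuration. A minimal sketch, assuming the quoted values; the class and field names are hypothetical:

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class ESCConfig:
    """Hyper-parameters quoted from the ESC paper (field names are our own)."""
    d_frontier: float = 1.6          # df: threshold for selecting the closest frontier (m)
    d_object: float = 1.6            # do: frontier counts as "near an object" within this radius (m)
    d_room: float = 0.6              # dr: frontier counts as "in a room" within this radius (m)
    psl_rule_weight: float = 1.0     # weight applied to PSL rules (single-level reasoning)
    shortest_distance_weight: float = 1.0  # weight of the shortest-distance rule (Eq. 6)

def config_for(both_levels: bool) -> ESCConfig:
    # When both object- and room-level commonsense reasoning are employed,
    # the shortest-distance rule's weight is doubled to 2.0.
    return replace(ESCConfig(), shortest_distance_weight=2.0 if both_levels else 1.0)

print(config_for(both_levels=True).shortest_distance_weight)  # 2.0
```

A frozen dataclass makes the swept values explicit and prevents silent mutation between the single-level and two-level reasoning settings.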