
VoroNav: Voronoi-based Zero-shot Object Navigation with Large Language Model

Authors: Pengying Wu, Yao Mu, Bingxian Wu, Yi Hou, Ji Ma, Shanghang Zhang, Chang Liu

ICML 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive evaluation on HM3D and HSSD validates VoroNav surpasses existing benchmarks in both success rate and exploration efficiency (absolute improvement: +2.8% Success and +3.7% SPL on HM3D, +2.6% Success and +3.8% SPL on HSSD).
Researcher Affiliation	Collaboration	1 Department of Advanced Manufacturing and Robotics, College of Engineering, Peking University, Beijing, China. 2 The University of Hong Kong. 3 OpenGVLab, Shanghai AI Laboratory. 4 National Key Laboratory for Multimedia Information Processing, School of Computer Science, Peking University, Beijing, China.
Pseudocode	Yes	Algorithm 1 Navigation Process of VoroNav; Algorithm 2 Look Around
Open Source Code	Yes	Project page: https://voro-nav.github.io
Open Datasets	Yes	The HM3D dataset provides 20 high-fidelity reconstructions of entire buildings and contains 2K validation episodes for object navigation tasks. The HSSD dataset provides 40 high-quality synthetic scenes and contains 1.2K validation episodes for object navigation.
Dataset Splits	Yes	The HM3D dataset provides 20 high-fidelity reconstructions of entire buildings and contains 2K validation episodes for object navigation tasks. The HSSD dataset provides 40 high-quality synthetic scenes and contains 1.2K validation episodes for object navigation.
Hardware Specification	Yes	These experimental results were obtained using a computer equipped with a 13th-generation Intel Core i7-13700KF CPU and an Nvidia RTX 4070 GPU with 12GB of memory.
Software Dependencies	No	The paper mentions models like Grounded-SAM, BLIP, and GPT-3.5 with their corresponding citations, but does not specify software dependencies like Python, PyTorch, or CUDA versions.
Experiment Setup	Yes	The agent's action space is {Stop, Move Forward, Turn Left, Turn Right, Look Up, Look Down}, with a discrete movement increment of 0.25m and discrete rotations of 30°.
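
The action space reported under Experiment Setup can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the names `Action`, `FORWARD_STEP_M`, `TURN_ANGLE_DEG`, and `step_pose` are hypothetical, and the planar pose convention (heading in degrees, counterclockwise, 0 along +x) is an assumption. Only the six actions, the 0.25 m step, and the 30° rotation come from the paper's stated setup. Look Up and Look Down leave the planar pose unchanged here because the tilt handling is not described in this entry.

```python
import math
from enum import Enum

# Values taken from the reported experiment setup.
FORWARD_STEP_M = 0.25   # discrete movement increment, in meters
TURN_ANGLE_DEG = 30.0   # discrete rotation, in degrees


class Action(Enum):
    """The six discrete actions listed in the setup (names are illustrative)."""
    STOP = 0
    MOVE_FORWARD = 1
    TURN_LEFT = 2
    TURN_RIGHT = 3
    LOOK_UP = 4
    LOOK_DOWN = 5


def step_pose(x, y, heading_deg, action):
    """Hypothetical planar pose update for one action.

    Assumes heading is measured counterclockwise in degrees with 0 along +x.
    STOP, LOOK_UP and LOOK_DOWN do not change the planar pose in this sketch.
    """
    if action is Action.MOVE_FORWARD:
        rad = math.radians(heading_deg)
        return (x + FORWARD_STEP_M * math.cos(rad),
                y + FORWARD_STEP_M * math.sin(rad),
                heading_deg)
    if action is Action.TURN_LEFT:
        return x, y, (heading_deg + TURN_ANGLE_DEG) % 360.0
    if action is Action.TURN_RIGHT:
        return x, y, (heading_deg - TURN_ANGLE_DEG) % 360.0
    return x, y, heading_deg
```

With a 30° rotation, twelve consecutive turns in the same direction bring the agent back to its starting heading, so a full look-around takes twelve turn actions under these assumptions.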