Trajectory Diffusion for ObjectGoal Navigation
Authors: Xinyao Yu, Sixian Zhang, Xinhang Song, Xiaorong Qin, Shuqiang Jiang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on the Gibson and MP3D datasets demonstrate that the generated trajectories effectively guide the agent, resulting in more accurate and efficient navigation. |
| Researcher Affiliation | Academia | 1 Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences (CAS), Beijing; 2 University of Chinese Academy of Sciences, Beijing; 3 Institute of Intelligent Computing Technology, CAS, Suzhou |
| Pseudocode | No | The paper describes the implementation of T-Diff and its components (e.g., in Figure 2 and Section 4.2), but it does not include a dedicated pseudocode block or an algorithm section labeled as such. |
| Open Source Code | Yes | The code is available at https://github.com/sx-zhang/T-diff.git. |
| Open Datasets | Yes | We evaluate the performance of our model on standard ObjectNav datasets, including Gibson [47] and Matterport3D (MP3D) [3], in the Habitat simulator. |
| Dataset Splits | Yes | For Gibson, we use 25 train / 5 val scenes from the tiny-split, following the settings of [31], with 1000 validation episodes containing 6 target object categories. For MP3D, we utilize 56 train / 11 val scenes, with 2195 validation episodes containing 21 target object categories. |
| Hardware Specification | No | The paper discusses computational complexity in Section A.2 (FLOPs) but does not provide specific details on the hardware used for running the experiments, such as GPU/CPU models, memory, or specific cloud instances. |
| Software Dependencies | No | The paper mentions software components like DiT, ResNet-18, and AdamW optimizer but does not provide specific version numbers for these or other software dependencies required for replication. |
| Experiment Setup | Yes | For the training of trajectory diffusion model... The semantic maps are resized to 224 × 224. ... Training is performed using the AdamW optimizer [19, 24] with a base learning rate of 1e-4, warmed up for 1000 steps using linear warmup and a cosine schedule. After the warmup steps, the learning rate for the diffusion model is decayed by a factor of 1e-3, and the learning rate of the semantic map encoder is decayed by a factor of 1e-6. Each model is trained for 200 epochs. ... The maximum noise schedule τmax is set to 100. The length of the predicted trajectory k = 32 and the selected kg-th point is set to 28. ... The agent's turn angle is fixed at 30 degrees and each Forward step covers 25 cm. The maximum timestep limit is set to 500 during navigation and t_T-diff is set to 5. |
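The reported schedule (linear warmup for 1000 steps into a cosine decay, with the final learning rate scaled by a decay factor) can be sketched as below. This is a hypothetical reconstruction, not the authors' code: the `total_steps` value and the reading of "decayed by a factor of 1e-3" as the floor multiplier of the cosine schedule are assumptions.

```python
import math

def lr_at_step(step, base_lr=1e-4, warmup_steps=1000,
               total_steps=200_000, min_lr_factor=1e-3):
    """Linear warmup then cosine decay toward base_lr * min_lr_factor.

    Assumed reconstruction of the paper's schedule; total_steps and the
    role of min_lr_factor (1e-3 for the diffusion model, 1e-6 for the
    semantic map encoder per the excerpt) are illustrative choices.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / warmup_steps
    # Cosine decay from base_lr down to base_lr * min_lr_factor.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return base_lr * (min_lr_factor + (1.0 - min_lr_factor) * cosine)
```

In practice such a schedule is typically attached to AdamW via a per-step `LambdaLR`-style multiplier; the encoder would use the same shape with `min_lr_factor=1e-6`.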