Exploring the Task Cooperation in Multi-goal Visual Navigation
Authors: Yuechen Wu, Zhenhuan Rao, Wei Zhang, Shijian Lu, Weizhi Lu, Zheng-Jun Zha
IJCAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive results on the interactive platform AI2-THOR demonstrate that the proposed method converges faster than state-of-the-art methods while producing more direct routes to navigate to the goal. |
| Researcher Affiliation | Academia | Yuechen Wu¹, Zhenhuan Rao¹, Wei Zhang¹, Shijian Lu², Weizhi Lu¹ and Zheng-Jun Zha³. ¹School of Control Science and Engineering, Shandong University; ²School of Computer Science and Engineering, Nanyang Technological University; ³School of Information Science and Technology, University of Science and Technology of China |
| Pseudocode | Yes | Algorithm 1: Multi-goal Co-learning (MgCl) |
| Open Source Code | No | The video demonstration is available at: https://youtube.com/channel/UCtpTMOsctt3yPzXqeJMD3w/videos. This link points to a video demonstration, not the source code for the methodology. |
| Open Datasets | No | The paper uses AI2-THOR as an interactive environment, but it does not state that a specific training dataset is publicly available, nor does it provide concrete access information for such a dataset. |
| Dataset Splits | No | The paper describes evaluation procedures but does not specify explicit training/validation/test dataset splits. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU model, CPU type) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | In the implementation, we set the discount factor γ = 0.99, the RMSProp decay factor α = 0.99, the exploration rate ϵ = 0.1, and the entropy regularization term β = 0.01. Besides, we used 16 threads and performed updates after every 5 actions (i.e., t_max = 5). To relieve the bias behaviour, all the goals and scenes were trained in turn in each thread in the experiments. A hedged configuration sketch based on these values appears below the table. |
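
The hyperparameters reported in the Experiment Setup row map naturally onto a small configuration object. The sketch below is a minimal illustration, assuming Python and A3C-style actor-learner threads; the class name `MgClConfig`, its field names, the `round_robin_assignments` helper, and the placeholder goal/scene labels are assumptions for illustration and are not taken from the authors' code.

```python
from dataclasses import dataclass
from itertools import cycle

# Hedged sketch of the training configuration reported in the paper's
# Experiment Setup row. All identifiers below are illustrative assumptions.

@dataclass
class MgClConfig:
    gamma: float = 0.99          # discount factor
    rmsprop_alpha: float = 0.99  # RMSProp decay factor
    epsilon: float = 0.1         # exploration rate
    entropy_beta: float = 0.01   # entropy regularization weight
    num_threads: int = 16        # parallel actor-learner threads
    t_max: int = 5               # perform an update after every 5 actions


def round_robin_assignments(goals, scenes, num_threads):
    """Illustrative helper: cycle through goal/scene pairs so each thread
    trains the goals and scenes in turn (to relieve bias behaviour)."""
    pairs = cycle((g, s) for s in scenes for g in goals)
    return [next(pairs) for _ in range(num_threads)]


if __name__ == "__main__":
    cfg = MgClConfig()
    print(cfg)
    # Placeholder goal and scene names, not the paper's actual targets.
    print(round_robin_assignments(["goal_a", "goal_b"],
                                  ["scene_1", "scene_2"],
                                  cfg.num_threads))
```

The round-robin helper only reflects the sentence about training all goals and scenes in turn within each thread; the paper itself does not specify how that scheduling was implemented.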