reproducibilityindex.ai

HAZARD Challenge: Embodied Decision Making in Dynamically Changing Environments

Authors: Qinhong Zhou, Sunli Chen, Yisong Wang, Haozhe Xu, Weihua Du, Hongxin Zhang, Yilun Du, Joshua B. Tenenbaum, Chuang Gan

ICLR 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	5 EXPERIMENTS, 5.1 EXPERIMENTAL SETUP, 5.2 BASELINES, 5.3 EXPERIMENTAL RESULTS, Table 1: The rescued value rate (Value), averaged rescue step (Step), and averaged damaged rate (Damage) of the proposed LLM pipeline (LLM) and all baseline methods.
Researcher Affiliation	Collaboration	Qinhong Zhou¹, Sunli Chen², Yisong Wang³, Haozhe Xu³, Weihua Du², Hongxin Zhang¹, Yilun Du⁴, Joshua B. Tenenbaum⁴, Chuang Gan¹,⁵; ¹University of Massachusetts Amherst, ²Institute for Interdisciplinary Information Sciences, Tsinghua University, ³Peking University, ⁴MIT, ⁵MIT-IBM Watson AI Lab
Pseudocode	No	The paper describes algorithms like A* and MCTS within the text, but it does not provide any explicitly labeled pseudocode or algorithm blocks.
Open Source Code	Yes	For readers interested in reproducing the experimental results presented in this paper, we have made our experiments accessible via a Github repository, available at https://github.com/UMass-Foundation-Model/HAZARD.
Open Datasets	Yes	HAZARD is available at https://vis-www.cs.umass.edu/hazard/. To create the dataset for HAZARD, we choose 4 distinct indoor rooms for the fire and flood tasks, and 4 outdoor regions for the wind task.
Dataset Splits	No	The paper states a 'train-set split ratio of 3:1' but does not explicitly mention a separate validation split or its details.
Hardware Specification	Yes	We run most of our experiments on an Intel i9-9900k CPU and RTX2080-Super GPU Desktop.
Software Dependencies	No	The paper mentions the 'OpenMMLab detection framework' and 'Mask-RCNN' but does not provide specific version numbers for these software components.
Experiment Setup	Yes	We use max tokens of 512, temperature of 0.7, top p of 1.0 as hyper-parameters during inference. We use the PPO algorithm with learning rate 2.5 × 10⁻⁴ and train for 10⁵ steps.
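
For anyone re-running the experiments, the settings quoted in the Experiment Setup row could be recorded in a small configuration object. The following is only a sketch: the class and field names are hypothetical and do not come from the HAZARD repository, and the PPO values assume the quoted sentence means a learning rate of 2.5 × 10⁻⁴ and 10⁵ training steps.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class LLMSamplingConfig:
    # Inference settings quoted in the Experiment Setup row.
    # Field names are illustrative, not taken from the HAZARD code.
    max_tokens: int = 512
    temperature: float = 0.7
    top_p: float = 1.0


@dataclass(frozen=True)
class PPOConfig:
    # Assumed reading of the quoted sentence: 2.5 x 10^-4 and 10^5 steps.
    # Field names are illustrative, not taken from the HAZARD code.
    learning_rate: float = 2.5e-4
    total_steps: int = 10**5


def as_record() -> dict:
    """Flatten both configs into one dict, e.g. for logging next to results."""
    return {"llm": asdict(LLMSamplingConfig()), "ppo": asdict(PPOConfig())}


if __name__ == "__main__":
    print(as_record())
```

Keeping the values in frozen dataclasses makes accidental changes between runs harder, and the flattened record can be stored with each result so the settings used are always traceable.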