Resisting Stochastic Risks in Diffusion Planners with the Trajectory Aggregation Tree

Authors: Lang Feng, Pengjie Gu, Bo An, Gang Pan

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide both theoretical analysis and empirical evidence to support TAT's effectiveness. Our results highlight its remarkable ability to resist the risk from unreliable trajectories, guarantee the performance boost of diffusion planners in 100% of tasks, and exhibit an appreciable tolerance margin for sample quality, thereby enabling planning with a more than 3× acceleration. In this section, we present the empirical evaluations of the proposed TAT across a range of decision-making tasks in offline control settings.
Researcher Affiliation | Collaboration | 1. Zhejiang University, China; 2. Nanyang Technological University, Singapore; 3. Skywork AI, Singapore; 4. State Key Laboratory of Brain-Machine Intelligence, China.
Pseudocode | Yes | Pseudocode of closed-loop planning with TAT in a single episode is given in Algorithm 1, where lines 11-18 correspond to the merging, lines 20-23 correspond to the expanding, line 25 corresponds to the acting, and line 27 corresponds to the pruning.
Open Source Code | Yes | Source code is available at https://github.com/langfengQ/tree-diffusion-planner.
Open Datasets | Yes | We evaluate TAT on the Maze2D environments (Fu et al., 2020) to show its effectiveness in minimizing the artifact risks in the original Diffuser. The Kuka block stacking suite (Janner et al., 2022) is designed for evaluating algorithms' test-time flexibility. Finally, we evaluate our method on MuJoCo tasks using the D4RL offline locomotion suite (Fu et al., 2020).
Dataset Splits | No | The paper states that it uses pre-trained models or retrains them with the original hyperparameters from the cited works, but it does not explicitly provide the training/validation/test splits needed to fully reproduce its data partitioning.
Hardware Specification | Yes | All experiments are run on an NVIDIA GeForce RTX 3080 GPU.
Software Dependencies | No | The paper mentions general architectural components such as the U-Net architecture, group normalization, and the Mish activation function, but it does not specify version numbers or other crucial software dependencies (e.g., Python or PyTorch/TensorFlow versions).
Experiment Setup | Yes | Regarding the additional hyperparameters of TAT, we set λ = 0.98 and 1 − α = 0.0005. The batch size of sampled trajectories is set to 128. Regarding the hyperparameters of TAT, we set λ = 0.98 and 1 − α = 0.002. The batch size of trajectories is set to 64. For the hyperparameters of TAT, we set λ = 0.98, with 1 − α = 0.005 for Walker2d and Hopper and 1 − α = 0.002 for HalfCheetah. We found that we could reduce the sampling steps for many tasks through warm-start planning (e.g., from the default 20 to 10 in the Hopper tasks). Since Diffuser and RGG generate 64 trajectories by default at each planning step, we use the first 32 (half) for TAT construction.
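
To make the merging/expanding/acting/pruning loop described in the Pseudocode row more concrete, below is a minimal, hypothetical Python sketch of closed-loop planning with a trajectory aggregation tree. The TreeNode structure, the state_key discretization, the roles assigned to lam and resolution, and the diffusion_planner.sample / extract_action calls are illustrative assumptions only; this is not the paper's Algorithm 1 nor the API of the released code.

```python
# Hypothetical sketch of closed-loop planning with a trajectory aggregation
# tree. All names and the aggregation rule are assumptions for illustration.
import numpy as np


class TreeNode:
    def __init__(self, state):
        self.state = state      # representative state of the aggregated node
        self.weight = 0.0       # accumulated (discounted) visitation weight
        self.children = {}      # discretized-state key -> TreeNode


def state_key(state, resolution=0.05):
    """Discretize a state so nearby trajectory states merge into one node
    (an illustrative aggregation rule, not the paper's criterion)."""
    return tuple(np.round(np.asarray(state) / resolution).astype(int))


def merge_and_expand(root, trajectories, lam=0.98):
    """Merge each sampled trajectory into the tree, creating (expanding) new
    nodes as needed and accumulating geometrically discounted weights."""
    for traj in trajectories:
        node, discount = root, 1.0
        for state in traj[1:]:                 # traj[0] is the current state
            key = state_key(state)
            child = node.children.get(key)
            if child is None:                  # expanding: new node
                child = TreeNode(state)
                node.children[key] = child
            child.weight += discount           # merging: reinforce existing node
            discount *= lam
            node = child


def plan_episode(env, diffusion_planner, n_trajectories=32, max_steps=1000):
    """Closed-loop planning in one episode: sample, merge/expand, act, prune."""
    obs = env.reset()
    root, total_reward = TreeNode(obs), 0.0
    for _ in range(max_steps):
        trajs = diffusion_planner.sample(obs, n_trajectories)       # assumed API
        merge_and_expand(root, trajs)                                # merging + expanding
        best = max(root.children.values(), key=lambda c: c.weight)  # acting
        action = diffusion_planner.extract_action(obs, best.state)  # assumed API
        obs, reward, done, _ = env.step(action)
        total_reward += reward
        root = best                            # pruning: keep only the chosen subtree
        if done:
            break
    return total_reward
```

Under the settings quoted in the Experiment Setup row, lam would be 0.98 and n_trajectories 32 (half of the 64 trajectories Diffuser/RGG sample per step); the resolution default above is arbitrary and not taken from the paper.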