AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners

Authors: Zhixuan Liang, Yao Mu, Mingyu Ding, Fei Ni, Masayoshi Tomizuka, Ping Luo

ICML 2023

Reproducibility Variable — Result — LLM Response

Research Type — Experimental
"Empirical experiments on two benchmark environments and two carefully designed unseen tasks in KUKA industrial robot arm and Maze2D environments demonstrate the effectiveness of AdaptDiffuser."

Researcher Affiliation — Academia
1) Department of Computer Science, The University of Hong Kong, Hong Kong SAR; 2) University of California, Berkeley, USA; 3) College of Intelligence and Computing, Tianjin University, Tianjin, China; 4) Shanghai AI Laboratory, Shanghai, China.

Pseudocode — No
The paper contains no explicitly labeled "Pseudocode" or "Algorithm" blocks, nor does it present structured, code-like steps for a method or procedure.

Open Source Code — No
The abstract states that "more visualization results and demo videos could be found on our project page," which might host code, but the paper provides no unambiguous statement or direct link to the source code for the AdaptDiffuser method itself. It references only the official implementations of third-party baselines (IQL and Diffuser).

Open Datasets — Yes
Maze2D (Fu et al., 2020) is a navigation task in which a 2D agent must traverse from a randomly designated location to a fixed goal location, where a reward of 1 is given... MuJoCo (Todorov et al., 2012) is a physics engine that allows real-time simulation of complex mechanical systems. It has three typical tasks: Hopper, HalfCheetah, and Walker2d. Each task has four dataset types for testing an algorithm: medium, random, medium-replay, and medium-expert.

Dataset Splits — No
The paper mentions using D4RL datasets and training parameters but does not explicitly state the training/validation/test splits (e.g., percentages or exact sample counts) or the methodology for creating them. It implicitly relies on the standard usage of D4RL datasets without providing explicit details for reproduction.

Hardware Specification — Yes
"All these data are tested with one NVIDIA RTX 3090 GPU."

Software Dependencies — No
The paper mentions a temporal U-Net architecture and the Adam optimizer, but it does not specify concrete version numbers for ancillary software dependencies such as the programming language (e.g., Python), deep learning framework (e.g., PyTorch, TensorFlow), or other relevant libraries.

Experiment Setup — Yes
The diffusion model is trained using the Adam optimizer (Kingma & Ba, 2015) with a learning rate of 2×10⁻⁴ and a batch size of 32. The diffusion model is trained for 1M steps on MuJoCo locomotion tasks, 2M on Maze2D tasks, and 0.7M on KUKA robot arm tasks. The planning horizon T is 32 in all locomotion tasks, 128 for KUKA pick-and-place, 128 in Maze2D-UMaze, 192 in Maze2D-Medium, and 384 in Maze2D-Large. The number of diffusion steps K is 100 for all locomotion tasks, 1000 for KUKA robot arm tasks, 64 for Maze2D-UMaze, 128 for Maze2D-Medium, and 256 for Maze2D-Large.
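The hyperparameters quoted in the Experiment Setup row can be collected into a single configuration sketch. This is an illustrative reconstruction, not the authors' code: the key names (`horizon`, `n_diffusion_steps`, `train_steps`) and the dictionary layout are assumptions; only the numeric values come from the paper.

```python
# Hedged sketch: per-task training settings as reported in the paper.
# The structure and key names here are hypothetical, not the authors' code.

TRAIN_DEFAULTS = {
    "optimizer": "Adam",       # Kingma & Ba, 2015
    "learning_rate": 2e-4,
    "batch_size": 32,
}

# (training steps, planning horizon T, diffusion steps K) per task family
TASK_SETTINGS = {
    "mujoco-locomotion":  (1_000_000, 32, 100),
    "kuka-pick-and-place": (700_000, 128, 1000),
    "maze2d-umaze":       (2_000_000, 128, 64),
    "maze2d-medium":      (2_000_000, 192, 128),
    "maze2d-large":       (2_000_000, 384, 256),
}

def settings_for(task: str) -> dict:
    """Merge the shared defaults with the per-task schedule."""
    steps, horizon, k = TASK_SETTINGS[task]
    return {**TRAIN_DEFAULTS,
            "train_steps": steps,
            "horizon": horizon,
            "n_diffusion_steps": k}

print(settings_for("maze2d-large"))
```

A reproduction attempt could start from such a table and vary only the unreported details (framework versions, seeds, evaluation protocol) flagged elsewhere in this assessment.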
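The Open Datasets row implies a grid of twelve MuJoCo locomotion datasets (three tasks × four variants). A minimal sketch enumerating the corresponding D4RL dataset identifiers follows; the `-v2` version suffix is an assumption based on common D4RL releases and is not stated in the paper.

```python
# Hedged sketch: D4RL dataset names implied by the Open Datasets row.
# Tasks and variants are from the paper; the version suffix is assumed.
TASKS = ["hopper", "halfcheetah", "walker2d"]
VARIANTS = ["random", "medium", "medium-replay", "medium-expert"]

def d4rl_names(version: str = "v2") -> list[str]:
    """Build 'task-variant-version' identifiers for each combination."""
    return [f"{task}-{variant}-{version}"
            for task in TASKS for variant in VARIANTS]

print(d4rl_names())
```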