Efficient Planning with Latent Diffusion

Authors: Wenhao Li

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Comprehensive numerical results on low-dimensional locomotion control tasks reveal that Latent Diffuser exhibits competitive performance against robust baselines and outperforms them on tasks of greater dimensionality. Our main contributions encompass: ... 3) Numerical experiments exhibit the competitive performance of Latent Diffuser and its applicability across a range of low- and high-dimensional continuous control tasks.
Researcher Affiliation | Academia | Wenhao Li, School of Software Engineering, Tongji University, Shanghai 201804, China; liwenhao@cuhk.edu.cn
Pseudocode | Yes (see the planning-loop sketch after the table) | Algorithm 1 Latent Diffuser: Efficient Planning with Latent Diffusion
Open Source Code | Yes | The forthcoming release of the complete source code will be subject to the Creative Commons Attribution 4.0 License (CC BY), with the exception of the gym locomotion control, Adroit, and Ant Maze datasets, which will retain their respective licensing arrangements.
Open Datasets | Yes (see the loading example after the table) | The empirical evaluation encompasses three task categories derived from D4RL (Fu et al., 2020): namely, Gym locomotion control, Adroit, and Ant Maze.
Dataset Splits | No | The paper states "Each task undergoes assessment with a total of 5 distinct training seeds, evaluated over a span of 20 episodes," but it does not specify a training/validation/test split or any cross-validation procedure. Training uses an "offline dataset D" without a documented split.
Hardware Specification | Yes | The computational infrastructure consists of dual servers, each possessing 256 GB of system memory, as well as a pair of NVIDIA GeForce RTX 3090 graphics processing units equipped with 24 GB of video memory.
Software Dependencies | No | The paper mentions the Adam optimizer and refers to a "GPT-2 style Transformer" and a "temporal U-Net" architecture, but it does not provide version numbers for software dependencies such as Python, PyTorch, TensorFlow, or any other libraries used.
Experiment Setup | Yes (see the training-step sketch after the table) | The action decoder is trained employing the Adam optimizer, featuring a learning rate of 2e-4 and batch size of 32 across 2e6 training steps. The reward and return decoder are also trained utilizing the Adam optimizer, with a learning rate of 2e-4 and batch size of 64 spanning 1e6 training steps. ... We employ the Adam optimization algorithm, utilizing a learning rate of 2 × 10^-4, a batch size of 32, and performing 2 × 10^6 training iterations. The probability, denoted by p, of excluding conditioning information s_1 is set to 0.25, and K = 100 diffusion steps are executed.
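
The Pseudocode row points to Algorithm 1, which this report does not reproduce. The following is a minimal, hypothetical sketch of the kind of receding-horizon planning loop a latent diffusion planner performs (sample a latent plan from noise, denoise it over K = 100 steps conditioned on the current state s_1, then decode the first action). Every class, function, dimension, and the simplified denoising update below are placeholders, not the authors' implementation.

```python
import torch
import torch.nn as nn

K = 100           # diffusion steps reported in the paper
LATENT_DIM = 16   # assumed latent size
HORIZON = 32      # assumed number of latent tokens per plan
STATE_DIM = 17    # assumed observation dimension (e.g. a MuJoCo locomotion task)
ACTION_DIM = 6    # assumed action dimension

class DenoiserStub(nn.Module):
    """Stand-in for the temporal U-Net that predicts noise on latent plans."""
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(LATENT_DIM + STATE_DIM + 1, LATENT_DIM)

    def forward(self, z, s1, t):
        t_feat = (t.float() / K).view(1, 1, 1).expand(z.shape[0], z.shape[1], 1)
        s_feat = s1.unsqueeze(1).expand(-1, z.shape[1], -1)
        return self.net(torch.cat([z, s_feat, t_feat], dim=-1))

@torch.no_grad()
def plan(denoiser, decode_action, s1):
    """Denoise a latent plan conditioned on the current state s1 and decode
    the first action (receding-horizon control)."""
    z = torch.randn(1, HORIZON, LATENT_DIM)      # start from pure noise
    for k in reversed(range(K)):                 # K reverse diffusion steps
        eps = denoiser(z, s1, torch.tensor(k))   # predicted noise at step k
        z = z - eps / K                          # schematic update, not the exact DDPM rule
    return decode_action(z[:, 0], s1)            # decode the first latent into an action

# Usage with stub components:
denoiser = DenoiserStub()
decode_action = lambda z0, s1: torch.tanh(z0[..., :ACTION_DIM])  # placeholder action decoder
action = plan(denoiser, decode_action, s1=torch.zeros(1, STATE_DIM))
print(action.shape)  # torch.Size([1, 6])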
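
For the Open Datasets and Dataset Splits rows: D4RL distributes each task as a single offline dataset with no predefined train/validation/test split, which is consistent with the report's finding. The snippet below is only a generic illustration of how D4RL tasks are loaded and scored; the environment name, the random policy, and the way seeds are applied are assumptions (in the paper, the 5 seeds refer to training runs), not the authors' evaluation code.

```python
import gym
import d4rl  # importing d4rl registers the offline-RL environments with gym

env = gym.make("hopper-medium-v2")      # one of the Gym locomotion tasks (example choice)
dataset = d4rl.qlearning_dataset(env)   # observations / actions / next_observations / rewards / terminals
print({key: value.shape for key, value in dataset.items()})

def evaluate(policy, env, n_seeds=5, n_episodes=20):
    """Simplified stand-in for the quoted protocol of 5 seeds x 20 episodes."""
    scores = []
    for seed in range(n_seeds):
        env.seed(seed)
        for _ in range(n_episodes):
            obs, done, ep_return = env.reset(), False, 0.0
            while not done:
                obs, reward, done, _ = env.step(policy(obs))
                ep_return += reward
            scores.append(env.get_normalized_score(ep_return))  # D4RL normalized return
    return sum(scores) / len(scores)

random_policy = lambda obs: env.action_space.sample()
print(evaluate(random_policy, env))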
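
The Experiment Setup row lists concrete hyperparameters (Adam, learning rate 2e-4, batch size 32, 2e6 steps, K = 100 diffusion steps, conditioning-dropout probability p = 0.25). The sketch below shows how such a configuration is commonly wired up for a conditional latent diffusion model with classifier-free-guidance dropout; the noise-prediction network, latent/state dimensions, noise schedule, and the random placeholder data are assumptions, not the paper's temporal U-Net or training code.

```python
import torch
import torch.nn as nn

K, P_UNCOND = 100, 0.25                       # diffusion steps and conditioning-dropout probability
LR, BATCH, STEPS = 2e-4, 32, int(2e6)         # reported optimizer settings
LATENT_DIM, STATE_DIM, HORIZON = 16, 17, 32   # assumed sizes

class NoisePredictor(nn.Module):
    """Toy stand-in for the temporal U-Net noise predictor."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM + STATE_DIM + 1, 128), nn.ReLU(),
            nn.Linear(128, LATENT_DIM),
        )

    def forward(self, z_t, s1, t):
        t_feat = (t.float() / K).view(-1, 1, 1).expand(-1, z_t.shape[1], 1)
        s_feat = s1.unsqueeze(1).expand(-1, z_t.shape[1], -1)
        return self.net(torch.cat([z_t, s_feat, t_feat], dim=-1))

model = NoisePredictor()
optimizer = torch.optim.Adam(model.parameters(), lr=LR)
betas = torch.linspace(1e-4, 2e-2, K)            # assumed linear noise schedule
alpha_bar = torch.cumprod(1.0 - betas, dim=0)

for step in range(10):  # 10 iterations shown; the paper reports 2e6 (see STEPS)
    z0 = torch.randn(BATCH, HORIZON, LATENT_DIM)  # placeholder latent plans from an encoder
    s1 = torch.randn(BATCH, STATE_DIM)            # placeholder conditioning states
    # Drop the conditioning state with probability p = 0.25 (classifier-free guidance).
    keep = (torch.rand(BATCH, 1) > P_UNCOND).float()
    s1 = s1 * keep
    t = torch.randint(0, K, (BATCH,))
    noise = torch.randn_like(z0)
    a_bar = alpha_bar[t].view(-1, 1, 1)
    z_t = a_bar.sqrt() * z0 + (1.0 - a_bar).sqrt() * noise   # forward diffusion
    loss = ((model(z_t, s1, t) - noise) ** 2).mean()          # epsilon-prediction loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()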