Multi-Person 3D Motion Prediction with Multi-Range Transformers

Authors: Jiashun Wang, Huazhe Xu, Medhini Narasimhan, Xiaolong Wang

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We perform our experiments on multiple datasets including CMU-Mocap [1], MuPoTS-3D [48] and 3DPW [64] for multi-person motion prediction in 3D (with 2-3 persons).
Researcher Affiliation | Academia | Jiashun Wang (UC San Diego), Huazhe Xu (UC Berkeley), Medhini Narasimhan (UC Berkeley), Xiaolong Wang (UC San Diego); jiw077@ucsd.edu, {huazhe_xu,medhini}@berkeley.edu, xiw012@eng.ucsd.edu
Pseudocode | No | The paper describes the network architecture and its components in detail with mathematical equations, but it does not include any pseudocode or algorithm blocks.
Open Source Code | Yes | A project page with code is available at https://jiashunwang.github.io/MRT/.
Open Datasets | Yes | We perform our experiments on multiple datasets. The CMU-Mocap [1], MuPoTS-3D [48] and 3DPW [64] datasets are collected using cameras with pose estimation and optimization.
Dataset Splits | No | The paper describes using CMU-Mocap as training data and sampling a test set, but it does not explicitly describe a validation split.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions using Adam as the optimizer but does not specify any software libraries or frameworks with version numbers (e.g., PyTorch, TensorFlow, or CUDA versions).
Experiment Setup | Yes | In our experiments, we give 1 second of history motion (k = 15 time steps) as input and recursively predict the future 3 seconds (45 time steps) as described in Sec. 3.3. We use L = 3 alternating layers with 8 heads in each Transformer. We use Adam [32] as the optimizer for our networks. During training, we set 3 × 10⁻⁴ as the learning rate for predictor P and 5 × 10⁻⁴ as the learning rate for discriminator D. We set λ_rec = 1 and λ_adv = 5 × 10⁻⁴. For experiments with 2-3 persons, we set a batch size of 32, and for scenes with more people, we set a batch size of 8.
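The experiment setup above can be collected into a small configuration sketch. This is a minimal, framework-agnostic illustration, assuming the usual weighted sum of a reconstruction and an adversarial loss; the names (`TrainConfig`, `combined_loss`) are hypothetical and not taken from the authors' released code.

```python
# Hypothetical sketch of the hyperparameters reported in the paper's
# experiment setup; names and structure are illustrative, not the authors'.
from dataclasses import dataclass


@dataclass
class TrainConfig:
    history_steps: int = 15          # 1 s of input motion (k = 15)
    future_steps: int = 45           # 3 s predicted recursively
    num_layers: int = 3              # L = 3 alternating Transformer layers
    num_heads: int = 8               # attention heads per Transformer
    lr_predictor: float = 3e-4       # Adam learning rate for predictor P
    lr_discriminator: float = 5e-4   # Adam learning rate for discriminator D
    lambda_rec: float = 1.0          # weight on the reconstruction loss
    lambda_adv: float = 5e-4         # weight on the adversarial loss
    batch_size_small: int = 32       # scenes with 2-3 persons
    batch_size_large: int = 8        # scenes with more persons


def combined_loss(rec_loss: float, adv_loss: float, cfg: TrainConfig) -> float:
    """Total predictor loss: lambda_rec * L_rec + lambda_adv * L_adv
    (assumed weighted-sum form)."""
    return cfg.lambda_rec * rec_loss + cfg.lambda_adv * adv_loss


cfg = TrainConfig()
total = combined_loss(rec_loss=2.0, adv_loss=10.0, cfg=cfg)
```

With λ_adv = 5 × 10⁻⁴, the adversarial term contributes only a small regularizing fraction of the total loss relative to the reconstruction term, which matches the scale of the weights reported in the paper.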