Spatial-Related Sensors Matters: 3D Human Motion Reconstruction Assisted with Textual Semantics
Authors: Xueyuan Yang, Chao Yao, Xiaojuan Ban
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate our proposed approach achieves significant improvements in multiple metrics compared to existing methods. Notably, with textual supervision, our method not only differentiates between ambiguous actions such as sitting and standing but also produces more precise and natural motion. |
| Researcher Affiliation | Academia | Xueyuan Yang (1,2), Chao Yao (1,2)*, Xiaojuan Ban (1,2,3,4)*. (1) Beijing Advanced Innovation Center for Materials Genome Engineering, Beijing 100083, China; (2) University of Science and Technology Beijing, Beijing 100083, China; (3) Key Laboratory of Intelligent Bionic Unmanned Systems, Ministry of Education, Beijing 100083, China; (4) Institute of Materials Intelligent Technology, Liaoning Academy of Materials, Shenyang 110004, China. m202210673@xs.ustb.edu.cn, {yaochao, banxj}@ustb.edu.cn |
| Pseudocode | No | The paper describes its modules and processes through text and diagrams (e.g., Figure 2) but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing its source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We utilized the Babel dataset (Punnakkal et al. 2021) for semantic annotations... For the DIP-IMU dataset (Huang et al. 2018)... The AMASS dataset... Totalcapture dataset (Trumble et al. 2017)... |
| Dataset Splits | No | The paper assigns specific subjects to evaluation (Subjects 9 and 10 from DIP-IMU) and the rest to training, but it does not specify a distinct validation split or its size for hyperparameter tuning or early stopping, as is common in machine learning contexts. (A subject-level split sketch appears after the table.) |
| Hardware Specification | Yes | The entire training and evaluation regimen was conducted on a system equipped with 1 Intel(R) Xeon(R) Silver 4110 CPU and 1 NVIDIA GeForce RTX 2080 Ti GPU. |
| Software Dependencies | Yes | Our model was developed using PyTorch 1.13.0, further accelerated by CUDA 11.6. (See the environment sketch after the table.) |
| Experiment Setup | Yes | Our model configuration sets the input sequence length T at 80 frames, with a window and shifted size of 20 and 10 frames, respectively, and a threshold M of 15. The training process, utilizing a batch size of 40, incorporates the Adam optimizer (Kingma and Ba 2017) initialized with a learning rate of 2e-5. To balance the magnitude of the loss, we set λ and α to 1, β to 10, δ to 0.1, and γ to 0.01. (See the training-configuration sketch after the table.) |
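
The paper's only reported partition is subject-level: Subjects 9 and 10 of DIP-IMU for evaluation, the rest for training, with no validation split. A minimal sketch of that split is below; the directory layout (`s_*/*.pkl`) and file format are assumptions, not details from the paper.

```python
# Minimal sketch of the DIP-IMU subject-level split described in the paper:
# Subjects 9 and 10 are held out for evaluation, the rest are used for
# training. The file layout below is an assumption for illustration only.
from pathlib import Path

EVAL_SUBJECTS = {"s_09", "s_10"}  # held-out subjects (paper: Subjects 9 and 10)

def split_dip_imu(root: str):
    """Partition DIP-IMU sequence files by subject directory name."""
    train_files, eval_files = [], []
    for seq in sorted(Path(root).glob("s_*/*.pkl")):  # assumed layout
        (eval_files if seq.parent.name in EVAL_SUBJECTS else train_files).append(seq)
    return train_files, eval_files
```

Note that this yields only train and evaluation sets, which matches the "No" finding above: nothing is reserved for validation.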
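
The paper pins PyTorch 1.13.0 with CUDA 11.6. One way to reproduce that environment is a pip requirements file; the use of the official PyTorch cu116 wheel index is an assumption (a conda environment would work equally well).

```text
# requirements.txt sketch pinning the reported versions
# (assumption: pip with the official PyTorch cu116 wheel index)
--extra-index-url https://download.pytorch.org/whl/cu116
torch==1.13.0+cu116
```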
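
To make the reported experiment setup concrete, here is a hedged training-configuration sketch. Only the hyperparameter values (T = 80, window 20, shift 10, M = 15, batch size 40, Adam at 2e-5, and the loss weights) come from the paper; the placeholder network and the loss term names are assumptions, since the paper does not release code.

```python
# Hedged sketch of the reported training configuration. The placeholder
# network and loss term names are assumptions; only the numeric values
# are taken from the paper's experiment setup.
import torch
import torch.nn as nn

SEQ_LEN = 80            # input sequence length T (frames)
WINDOW, SHIFT = 20, 10  # window and shifted size (frames)
THRESHOLD_M = 15        # threshold M
BATCH_SIZE = 40
LR = 2e-5
LAMBDA, ALPHA, BETA, DELTA, GAMMA = 1.0, 1.0, 10.0, 0.1, 0.01

# Placeholder network: the paper's architecture is not reproduced here.
model = nn.Linear(72, 72)
optimizer = torch.optim.Adam(model.parameters(), lr=LR)

def total_loss(terms: dict) -> torch.Tensor:
    """Weighted sum of loss terms. The term keys are hypothetical;
    only the weights (lambda, alpha, beta, delta, gamma) are reported."""
    return (LAMBDA * terms["term_a"] + ALPHA * terms["term_b"]
            + BETA * terms["term_c"] + DELTA * terms["term_d"]
            + GAMMA * terms["term_e"])
```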