Recognizing Actions in 3D Using Action-Snippets and Activated Simplices

Authors: Chunyu Wang, John Flynn, Yizhou Wang, Alan Yuille

AAAI 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We first evaluate how well activated simplices can represent data by computing the distance between a data point and its projection onto the nearest simplex. Then we present action recognition results on three standard benchmark datasets (Li, Zhang, and Liu 2010; Seidenari et al. 2013; Xia, Chen, and Aggarwal 2012). We also provide diagnostic analysis. (A minimal projection sketch follows the table.)
Researcher Affiliation | Academia | (1) Nat'l Eng. Lab. for Video Technology, Cooperative Medianet Innovation Center, Key Lab. of Machine Perception (MoE), Sch'l of EECS, Peking University, Beijing, 100871, China ({wangchunyu, Yizhou.Wang}@pku.edu.cn); (2) Department of Statistics, University of California, Los Angeles (UCLA), USA ({john.flynn, yuille}@stat.ucla.edu).
Pseudocode | No | The paper describes the algorithm and process in prose and mathematical equations but does not include a formally labeled 'Pseudocode' or 'Algorithm' block.
Open Source Code | No | The paper makes no explicit statement about releasing source code for the described methodology and includes no link to a code repository.
Open Datasets | Yes | We conduct experiments on a large human pose dataset, H3.6M (Ionescu et al. 2014), using 11,000 3D poses of 11 actions including 'taking photo', 'smoking', 'purchases', 'discussion', etc. to evaluate our method. The MSR-Action3D dataset (Li, Zhang, and Liu 2010) provides 557 human pose sequences of ten subjects performing 20 actions, recorded with a depth sensor. The Florence dataset (Seidenari et al. 2013) includes nine activities including wave, drink from a bottle, etc. The UTKinect dataset (Xia, Chen, and Aggarwal 2012) was captured using a single stationary Kinect.
Dataset Splits | Yes | We split the 11,000 poses into training and testing subsets, each containing 5,500 poses of the 11 actions. For MSR-Action3D, many works choose five subjects for training and the remaining five for testing, e.g. (Li, Zhang, and Liu 2010), and report the result based on a single split; to make the results more comparable, we experiment with all 252 possible splits and report the average accuracy. For Florence, following the dataset recommendation, we use a leave-one-actor-out protocol: we train the classifier using all the sequences from nine out of ten actors and test on the remaining one. For UTKinect, we use the standard leave-one-sequence-out protocol, where one sequence is used for testing and the remaining are used for training. (A sketch of these protocols follows the table.)
Hardware Specification | No | The acknowledgements mention GPUs ('We thank Xianjie Chen for helping improve the writing, and for support from the following research grants: 973-2015CB351800, the Okawa Foundation Research Grant, NSFC-61272027, NSFC-61231010, NSFC-61527804, NSFC-61421062, NSFC-61210005, ONR grant N00014-15-1-2356 and ARO 62250-CS. We also thank NVIDIA Corporation for donating the GPUs.'), but the paper does not specify exact GPU models or any other hardware details for the experiments.
Software Dependencies | No | The paper mentions methods and frameworks such as sparse coding (Mairal et al. 2009) and k-means initialized by k-means++ (Arthur and Vassilvitskii 2007), but does not list specific software libraries, packages, or programming languages with version numbers required for reproduction.
Experiment Setup | Yes | We set the number of bases for each class to be 40 (by cross-validation). We obtain about 15 activated simplices per class, whose dimensions are five on average; activated simplices achieve a recognition accuracy of 91.40%. We set the number of bases for each class to be 50 (450 in total) by cross-validation. We learn 40 bases and about 20 activated simplices for each action class; the dimension of the simplices is five on average. We also evaluated the influence of the two main parameters in the model, i.e. the number of bases and the number of poses in an action-snippet. (A cross-validation sketch follows the table.)
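
The projection step quoted in the Research Type row reduces to a small constrained least-squares problem: projecting a pose vector x onto the convex hull of a simplex's bases B means solving min_w ||x - Bw||^2 subject to w >= 0 and sum(w) = 1. Below is a minimal Python sketch of that computation, not the authors' code; project_onto_simplex and nearest_simplex_distance are hypothetical helper names, and SciPy's SLSQP solver stands in for whatever solver the paper actually used.

    # Sketch: distance from a data point x to its projection onto one simplex,
    # i.e. the constrained least squares  min_w ||x - Bw||^2
    # s.t. w >= 0, sum(w) = 1, where the columns of B are the simplex's bases.
    import numpy as np
    from scipy.optimize import minimize

    def project_onto_simplex(x, B):
        """Project x onto the convex hull of B's columns; return (point, distance)."""
        k = B.shape[1]
        res = minimize(
            lambda w: 0.5 * np.sum((x - B @ w) ** 2),   # squared residual
            np.full(k, 1.0 / k),                        # start at the barycenter
            jac=lambda w: B.T @ (B @ w - x),
            method="SLSQP",
            bounds=[(0.0, None)] * k,                   # w >= 0
            constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0},  # sum(w) = 1
        )
        p = B @ res.x
        return p, float(np.linalg.norm(x - p))

    def nearest_simplex_distance(x, simplices):
        """Reconstruction error used in the evaluation: distance to nearest simplex."""
        return min(project_onto_simplex(x, B)[1] for B in simplices)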
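The split protocols in the Dataset Splits row can be made concrete with a short sketch. This is illustrative, not the authors' code: evaluate is a hypothetical callback that trains on one subject subset and tests on the other, and the count 252 is simply C(10, 5), the number of ways to pick five of the ten MSR-Action3D subjects for training.

    # Illustrative sketch of the three evaluation protocols described above.
    from itertools import combinations

    subjects = set(range(10))                  # MSR-Action3D: ten subjects
    splits = [(set(tr), subjects - set(tr))
              for tr in combinations(sorted(subjects), 5)]
    assert len(splits) == 252                  # C(10, 5) = 252 cross-subject splits

    def average_accuracy(evaluate):
        """evaluate(train_ids, test_ids) -> accuracy; averaged over all 252 splits."""
        return sum(evaluate(tr, te) for tr, te in splits) / len(splits)

    # Leave-one-actor-out (Florence, ten actors): hold out each actor in turn.
    loao_folds = [(subjects - {a}, {a}) for a in subjects]

    # Leave-one-sequence-out (UTKinect) is the same pattern over sequence indices.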
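Finally, the cross-validated choice of the number of bases per class in the Experiment Setup row amounts to a one-dimensional grid search. A hedged sketch follows, where fit_and_score is a hypothetical stand-in for training the activated-simplices model with k bases per class and returning held-out accuracy.

    # Sketch of the model-selection step; `fit_and_score` is hypothetical: it
    # would train the classifier with k bases per class on a fold's training
    # split and return accuracy on that fold's validation split.
    def select_num_bases(candidates, folds, fit_and_score):
        """Return the basis count with the best mean accuracy across folds."""
        def mean_acc(k):
            return sum(fit_and_score(k, tr, va) for tr, va in folds) / len(folds)
        return max(candidates, key=mean_acc)

    # e.g. scanning a small grid, which the paper resolves to 40 or 50 bases
    # per class depending on the dataset:
    # best_k = select_num_bases([20, 30, 40, 50, 60], folds, fit_and_score)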