Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Towards Omni-Supervised Face Alignment for Large Scale Unlabeled Videos

Authors: Congcong Zhu, Hao Liu* (corresponding author), Zhenhua Yu, Xuehong Sun

Pages: 13090-13097

AAAI 2020 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experimental results demonstrate that our approach surpasses the performance of most fully supervised state-of-the-arts. To justify the effectiveness of the proposed STRRN, we represent folds of experimental results and analysis based on three downloaded large scale video datasets.
Researcher Affiliation	Academia	(1) School of Computer Engineering and Science, Shanghai University, Shanghai, 200444, China; (2) School of Information Engineering, Ningxia University, Yinchuan, 750021, China; (3) Collaborative Innovation Center for Ningxia Big Data and Artificial Intelligence Co-founded by Ningxia Municipality and Ministry of Education, Yinchuan, 750021, China
Pseudocode	Yes	Algorithm 1: Training Procedure of Our STRRN
Open Source Code	No	The paper states 'More details will be made publicly in our release model and source code.' which indicates a future release, not concrete access to source code at the time of publication.
Open Datasets	Yes	Evaluation Datasets: 300VW (Shen et al. 2015): The 300 Videos in the Wild (300VW) Dataset was collected specific for video-based face alignment. YouTube-Face (Wolf, Hassner, and Maoz 2011) and YouTube-Celebrities (Kim et al. 2008): We also leveraged two large scale unlabeled video datasets including YouTube Face (Wolf, Hassner, and Maoz 2011) and YouTube-Celebrities (Kim et al. 2008).
Dataset Splits	No	For the 300VW dataset, the paper states 'we utilized 50 sequences for training and the remaining 64 sequences were used for testing,' but it does not specify a distinct validation split or cross-validation setup.
Hardware Specification	Yes	The whole training procedure processes at about 60ms each frame with a GPU of single NVIDIA GTX 1080 Ti graphic computation card (11G memory). Excluding the time of the face detection part, our model runs at 30 frames per second on one CPU with the Intel(R) Core(TM) i5-6500 CPU@3.20GHz and requires around 2G memory usage for runtime data loading.
Software Dependencies	No	The paper mentions 'Tensorflow' but does not provide specific version numbers for it or any other key software dependencies.
Experiment Setup	Yes	For hyper-parameters in our STRRN, we empirically set the discounted factor λ to 0.4 and the thresholding T to the normalized RMSE 0.02 during generating extra training annotations.
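
The thresholding step quoted in the Experiment Setup row can be illustrated with a minimal sketch. It assumes the threshold T = 0.02 is applied to a normalized RMSE between two landmark estimates for the same frame, and that estimates at or below the threshold are kept as extra training annotations. The function names, the choice of which two estimates are compared, the normalizing distance, and the use of "less than or equal" are all assumptions made for illustration; the quoted text does not specify them, and this is not the authors' implementation.

```python
import math

# Threshold on the normalized RMSE, as quoted from the paper (T = 0.02).
T = 0.02


def normalized_rmse(pred, ref, norm):
    """Root-mean-square point distance between two landmark sets, divided by norm.

    pred, ref: sequences of (x, y) landmark coordinates of equal length.
    norm: a positive normalizing distance (hypothetical choice, e.g. a face-size
          measure; the quoted text does not say which one is used).
    """
    if len(pred) != len(ref) or len(pred) == 0:
        raise ValueError("landmark sets must be non-empty and of equal length")
    if norm <= 0:
        raise ValueError("norm must be positive")
    squared = sum(
        (px - rx) ** 2 + (py - ry) ** 2
        for (px, py), (rx, ry) in zip(pred, ref)
    )
    return math.sqrt(squared / len(pred)) / norm


def keep_annotation(pred, ref, norm, threshold=T):
    """Return True if the two estimates agree closely enough to keep the frame.

    Assumption: agreement is tested with <=; the source does not state this.
    """
    return normalized_rmse(pred, ref, norm) <= threshold
```

For example, two two-point estimates that differ by 0.1 pixels per point, with a normalizing distance of 10, give a normalized RMSE of about 0.01 and would be kept under this sketch, while a 1-pixel difference gives about 0.1 and would be discarded.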