Scene-Centric Joint Parsing of Cross-View Videos

Authors: Hang Qi, Yuanlu Xu, Tao Yuan, Tianfu Wu, Song-Chun Zhu

AAAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Quantitative experiments show that scene-centric predictions in the parse graph outperform view-centric predictions. We evaluate our scene-centric joint-parsing framework in tasks including object detection, multi-object tracking, action recognition, and human attributes recognition.
Researcher Affiliation | Academia | Dept. Computer Science and Statistics, University of California, Los Angeles (UCLA); Dept. Electrical and Computer Engineering, NC State University
Pseudocode | No | The paper describes its inference process and algorithms in detailed prose but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | The paper links to datasets (e.g., bitbucket.org/merayxu/multiview-object-tracking-dataset) and to external tools (Faster RCNN), but it contains no statement or link indicating that the authors' own source code for the described method is released.
Open Datasets | Yes | The CAMPUS dataset (Xu et al. 2016) contains video sequences from four scenes, each captured by four cameras (bitbucket.org/merayxu/multiview-object-tracking-dataset). The TUM Kitchen dataset (Tenorth, Bandouch, and Beetz 2009) is an action recognition dataset (ias.in.tum.de/software/kitchen-activity-data).
Dataset Splits | No | The paper mentions that tuning parameters "can be learned via cross-validation" and refers to "training sequences" for estimating prior distributions, but it does not give specific percentages, sample counts, or a train/validation/test split methodology for the main experiments.
Hardware Specification | No | The paper does not report the hardware used for its experiments (e.g., GPU/CPU models, processor types, or memory amounts); it only mentions general runtime performance.
Software Dependencies | No | The paper mentions using Faster RCNN (Ren et al. 2015), a fully-connected neural network, and an attribute grammar model, but it does not give version numbers for these or for any other key software components, libraries, or solvers used in the experiments.
Experiment Setup | No | The paper describes general aspects of its setup, such as using Faster RCNN for initial object proposals and a fully-connected neural network for action recognition, but it does not report numerical hyperparameters (e.g., learning rate, batch size, number of epochs) or system-level training settings needed for full experimental reproduction.
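
The Experiment Setup entry notes that the paper pairs Faster RCNN object proposals with a fully-connected neural network for action recognition but omits the associated hyperparameters. As a purely illustrative aid, the sketch below shows what such a fully-connected action classifier could look like in PyTorch; the feature dimension, hidden width, number of action classes, and the use of PyTorch itself are assumptions made for this sketch and are not taken from the paper.

```python
# Minimal, hypothetical sketch of a fully-connected action classifier of the
# kind the paper mentions. The paper does NOT report these details; all sizes
# below are placeholder assumptions chosen only for illustration.
import torch
import torch.nn as nn

FEATURE_DIM = 2048   # assumed dimensionality of pooled per-person features
HIDDEN_DIM = 512     # assumed hidden-layer width
NUM_ACTIONS = 10     # placeholder number of action classes

class ActionClassifier(nn.Module):
    """Two-layer fully-connected classifier over detection features."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(FEATURE_DIM, HIDDEN_DIM),
            nn.ReLU(),
            nn.Linear(HIDDEN_DIM, NUM_ACTIONS),
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features: (batch, FEATURE_DIM), e.g. pooled from detected person boxes
        return self.net(features)

if __name__ == "__main__":
    model = ActionClassifier()
    dummy = torch.randn(4, FEATURE_DIM)   # four hypothetical detections
    logits = model(dummy)
    print(logits.shape)                   # torch.Size([4, NUM_ACTIONS])
```

Reproducing the paper's reported numbers would additionally require the training settings (learning rate, batch size, number of epochs) that the assessment above flags as missing.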