Very Important Person Localization in Unconstrained Conditions: A New Benchmark
Authors: Xiao Wang, Zheng Wang, Toshihiko Yamasaki, Wenjun Zeng
AAAI 2021, pp. 2809-2816
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that the JSRII-GNN yields competitive accuracy on NCAA (National Collegiate Athletic Association), MS (Multi-scene), and Unconstrained-7k datasets. Table 3 shows that POINT achieves 97.3% mAP, and our method achieves 97.6%. In addition, we report the CMC curve in Figure 5(a). |
| Researcher Affiliation | Collaboration | Xiao Wang¹, Zheng Wang²,³, Toshihiko Yamasaki²,³, Wenjun Zeng⁴; ¹School of Computer Science and Technology, Wuhan University of Science and Technology; ²Research Institute for an Inclusive Society through Engineering (RIISE), The University of Tokyo; ³Department of Information and Communication Engineering, The University of Tokyo; ⁴Microsoft Research Asia |
| Pseudocode | No | No pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | https://github.com/xiaowang1516/VIPLoc. |
| Open Datasets | Yes | The NCAA Basketball Image Dataset is formed by extracting frames covering basketball match events. The MS dataset contains 2310 images from more than six types of scenes. This dataset includes training and testing subsets. During the dataset collection, we retrieved 50,000 images from the Internet through keyword queries, such as speech, demonstration, interview, sports, military, meeting, etc. We manually identified ten kinds of scenes. The detector in this section is built upon the state-of-the-art object detection framework, i.e., Yolov4 (Bochkovskiy, Wang, and Liao 2020), pre-trained on the COCO dataset (Caesar, Uijlings, and Ferrari 2018) for objects and pre-trained on head data (Shao et al. 2018) for person heads. |
| Dataset Splits | No | The paper mentions "Finally, we split the annotated data into train and test partitions according to the ratio of 1:1." but does not explicitly mention a separate validation split or how it was handled. |
| Hardware Specification | Yes | We implement JSRII-GNN using PyTorch on a machine with CPU i7, GeForce GTX Titan X and 256 GB RAM. |
| Software Dependencies | No | The paper mentions "PyTorch" but does not specify a version number. Other software or frameworks like "Yolov4" and "ResNet-50" are mentioned, but without specific version numbers for reproducible setup. |
| Experiment Setup | Yes | τ is set to 0.3 in this paper. The loss function for VIPLoc is the cross-entropy loss. For each training step, we use 64 positive and 64 negative pairs, for a total of 128 pairs. SGD (stochastic gradient descent) is used to optimize the model. |
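
The Experiment Setup row describes a balanced batch of 64 positive and 64 negative pairs trained with cross-entropy loss and SGD. The following is a minimal, dependency-free sketch of one such training step, not the authors' JSRII-GNN code. The two-class linear scorer, the 4-dimensional pair features, the toy data, the learning rate of 0.1, and every function name here are assumptions made for illustration; the paper's τ threshold is not modeled.

```python
import math
import random


def sample_balanced_pairs(pos_pairs, neg_pairs, n_each=64, rng=None):
    """Draw n_each positive (label 1) and n_each negative (label 0) pairs."""
    rng = rng or random.Random()
    batch = [(x, 1) for x in rng.sample(pos_pairs, n_each)]
    batch += [(x, 0) for x in rng.sample(neg_pairs, n_each)]
    rng.shuffle(batch)
    return batch


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def loss_and_grad(weights, batch):
    """Mean cross-entropy and its gradient for a 2-class linear scorer."""
    dim = len(weights[0])
    grad = [[0.0] * dim for _ in weights]
    loss = 0.0
    for features, label in batch:
        logits = [sum(w * f for w, f in zip(row, features)) for row in weights]
        probs = softmax(logits)
        loss -= math.log(probs[label])
        for c, p in enumerate(probs):
            err = p - (1.0 if c == label else 0.0)  # d(loss)/d(logit_c)
            for j, f in enumerate(features):
                grad[c][j] += err * f
    n = len(batch)
    return loss / n, [[g / n for g in row] for row in grad]


def sgd_step(weights, grad, lr=0.1):
    """Plain SGD update; lr is an assumed value, not taken from the paper."""
    return [[w - lr * g for w, g in zip(wr, gr)] for wr, gr in zip(weights, grad)]


# Toy pair features (hypothetical): positives in [0, 1]^4, negatives in [-1, 0]^4.
rng = random.Random(0)
pos = [[rng.uniform(0.0, 1.0) for _ in range(4)] for _ in range(200)]
neg = [[rng.uniform(-1.0, 0.0) for _ in range(4)] for _ in range(200)]

batch = sample_balanced_pairs(pos, neg, n_each=64, rng=rng)  # 128 pairs in total
weights = [[0.0] * 4 for _ in range(2)]

loss_before, grad = loss_and_grad(weights, batch)
weights = sgd_step(weights, grad)
loss_after, _ = loss_and_grad(weights, batch)
```

With zero-initialized weights the first loss equals ln 2, the cross-entropy of a uniform two-class prediction, and a single small SGD step on the same batch lowers it. In a PyTorch setup such as the one the paper reports, the same roles would typically be filled by a cross-entropy loss module and an SGD optimizer, though the paper does not state its learning rate or other optimizer settings.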