Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
INFP: INdustrial Video Anomaly Detection via Frequency Prioritization
Authors: Qianzi Yu, Kai Zhu, Yang Cao, Yu Kang
IJCAI 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on the benchmark IPAD dataset demonstrate the superiority of our proposed method over the state-of-the-art. |
| Researcher Affiliation | Academia | 1 University of Science and Technology of China, Hefei, China; 2 Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China |
| Pseudocode | No | The paper describes the method using mathematical equations and textual descriptions, but does not include a clearly labeled pseudocode or algorithm block. |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating that the source code for the methodology is openly available. |
| Open Datasets | Yes | IPAD[Liu et al., 2024] is the first video anomaly detection dataset that focuses on industrial scenarios, which contains a total of 597,979 frames, with 430,867 frames allocated for training data and 167,112 frames for the test data. |
| Dataset Splits | Yes | IPAD... contains a total of 597,979 frames, with 430,867 frames allocated for training data and 167,112 frames for the test data. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used to run its experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions using the Adam optimizer but does not provide specific version numbers for any software dependencies or libraries used in the implementation. |
| Experiment Setup | Yes | Each video frame is resized to 224 × 288, and its intensity is normalized to the range [-1, 1] before being fed into the model. The learning rate is set to 2e-4 initially and decreased to 1e-4 at epoch 120. The Adam optimizer is used to train our network. A sequence of five video frames is randomly selected from the training set, with the first four frames serving as input and the fifth frame serving as the ground truth. |
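The experiment-setup row above can be sketched as a minimal preprocessing and scheduling helper. This is an illustrative reconstruction from the quoted description only; the function names and NumPy usage are assumptions, not the authors' code.

```python
import numpy as np

# Target frame size (H x W) stated in the paper's setup; illustrative constant name.
FRAME_SIZE = (224, 288)

def normalize(frame: np.ndarray) -> np.ndarray:
    """Map 8-bit pixel intensities from [0, 255] to [-1, 1], as described."""
    return frame.astype(np.float32) / 127.5 - 1.0

def learning_rate(epoch: int) -> float:
    """LR schedule from the setup: 2e-4 initially, dropped to 1e-4 at epoch 120."""
    return 2e-4 if epoch < 120 else 1e-4

def sample_clip(num_frames: int, clip_len: int = 5, rng=np.random):
    """Randomly pick a 5-frame window: first 4 frames are input,
    the 5th is the prediction ground truth."""
    start = rng.randint(0, num_frames - clip_len + 1)
    indices = list(range(start, start + clip_len))
    return indices[:4], indices[4]
```

For example, `normalize(np.array([0, 255], dtype=np.uint8))` yields `[-1.0, 1.0]`, and `sample_clip(100)` returns four consecutive input indices followed by the index of the target frame.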