Graph-Jigsaw Conditioned Diffusion Model for Skeleton-based Video Anomaly Detection

doi:10.48550/arXiv.2403.12172

Skeleton-based video anomaly detection (SVAD) is a crucial task in computer vision. Accurately identifying abnormal patterns or events enables operators to promptly detect suspicious activities, thereby enhancing safety. Achieving this demands a comprehensive understanding of human motion at both the body and region levels, while also accounting for the wide variation in how a single action can be performed. However, existing studies fail to address these properties simultaneously. This paper introduces a novel, practical, and lightweight framework, the Graph-Jigsaw Conditioned Diffusion Model for Skeleton-based Video Anomaly Detection (GiCiSAD), to overcome the challenges associated with SVAD. GiCiSAD consists of three novel modules: the Graph Attention-based Forecasting module, which captures the spatio-temporal dependencies inherent in the data; the Graph-level Jigsaw Puzzle Maker module, which distinguishes subtle region-level discrepancies between normal and abnormal motions; and the Graph-based Conditional Diffusion model, which generates a wide spectrum of human motions. Extensive experiments on four widely used skeleton-based video datasets show that GiCiSAD outperforms existing methods with significantly fewer training parameters, establishing it as the new state of the art.

Publication: arXiv e-prints

Pub Date: March 2024

DOI: 10.48550/arXiv.2403.12172

arXiv: arXiv:2403.12172

Bibcode: 2024arXiv240312172K

Keywords: Computer Science - Computer Vision and Pattern Recognition; Computer Science - Artificial Intelligence

E-Print: Accepted at the Winter Conference on Applications of Computer Vision (WACV). 17 pages, 6 figures, 6 tables
