MovieCuts: A New Dataset and Benchmark for Cut Type Recognition

doi:10.48550/arXiv.2109.05569

Understanding movies and their structural patterns is a crucial task in decoding the craft of video editing. While previous works have developed tools for general analysis, such as detecting characters or recognizing cinematography properties at the shot level, less effort has been devoted to understanding the most basic video edit, the Cut. This paper introduces the Cut type recognition task, which requires modeling multi-modal information. To ignite research in this new task, we construct a large-scale dataset called MovieCuts, which contains 173,967 video clips labeled with ten cut types defined by professionals in the movie industry. We benchmark a set of audio-visual approaches, including some dealing with the problem's multi-modal nature. Our best model achieves 47.7% mAP, which suggests that the task is challenging and that attaining highly accurate Cut type recognition is an open research problem. Advances in automatic Cut type recognition can unleash new experiences in the video editing industry, such as movie analysis for education, video re-editing, virtual cinematography, machine-assisted trailer generation, and machine-assisted video editing. Our data and code are publicly available at https://github.com/PardoAlejo/MovieCuts.

Publication:

arXiv e-prints

Pub Date:

September 2021

DOI:

10.48550/arXiv.2109.05569

arXiv:

arXiv:2109.05569

Bibcode:

2021arXiv210905569P

Keywords:

Computer Science - Computer Vision and Pattern Recognition

E-Print:

Paper's website: https://www.alejandropardo.net/publication/moviecuts/
