Motion-Based Sign Language Video Summarization using Curvature and Torsion

doi:10.48550/arXiv.2305.16801

An interesting problem in many video-based applications is the generation of short synopses by selecting the most informative frames, a procedure known as video summarization. For sign language videos, the benefits of using the $t$-parameterized counterpart of the curvature of the signer's 2-D wrist trajectory to identify keyframes have recently been reported in the literature. In this paper we extend these ideas by modeling the 3-D hand motion extracted from each frame of the video. To this end, we propose a new informative function based on the $t$-parameterized curvature and torsion of the 3-D trajectory. The method for characterizing video frames as keyframes depends on whether the motion occurs in 2-D or 3-D space. Specifically, in the case of 3-D motion we look for the maxima of the harmonic mean of the curvature and torsion of the target's trajectory; in the planar motion case we seek the maxima of the trajectory's curvature. The proposed 3-D feature is experimentally evaluated on sign language videos using (1) objective measures based on ground-truth keyframe annotations, (2) human-based evaluation of understanding, and (3) gloss classification; the results obtained are promising.
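The 3-D criterion described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the textbook formulas $\kappa = \|r' \times r''\| / \|r'\|^3$ and $\tau = (r' \times r'') \cdot r''' / \|r' \times r''\|^2$, finite-difference derivatives over the frame index, the absolute value of the torsion inside the harmonic mean, and a plain strict-local-maximum test. The function names `curvature_torsion` and `keyframe_candidates` and the `eps` guard are hypothetical, and the paper's exact $t$-parameterized definitions may differ.

```python
import numpy as np


def curvature_torsion(traj, eps=1e-12):
    """Estimate curvature and torsion along a 3-D trajectory.

    traj: (N, 3) array of positions, one row per video frame.
    Derivatives are finite differences over the frame index (an assumption).
    """
    d1 = np.gradient(traj, axis=0)   # first derivative r'
    d2 = np.gradient(d1, axis=0)     # second derivative r''
    d3 = np.gradient(d2, axis=0)     # third derivative r'''
    cross = np.cross(d1, d2)         # r' x r''
    cross_norm = np.linalg.norm(cross, axis=1)
    speed = np.linalg.norm(d1, axis=1)
    kappa = cross_norm / (speed**3 + eps)
    tau = np.einsum("ij,ij->i", cross, d3) / (cross_norm**2 + eps)
    return kappa, tau


def keyframe_candidates(traj, eps=1e-12):
    """Return indices of strict local maxima of the harmonic mean of
    curvature and |torsion|, together with the per-frame score."""
    kappa, tau = curvature_torsion(traj, eps)
    abs_tau = np.abs(tau)  # assumption: the sign of the torsion is ignored
    score = 2.0 * kappa * abs_tau / (kappa + abs_tau + eps)
    # Interior frames whose score exceeds both neighbours.
    is_peak = (score[1:-1] > score[:-2]) & (score[1:-1] > score[2:])
    return np.flatnonzero(is_peak) + 1, score


if __name__ == "__main__":
    # Synthetic trajectory standing in for a tracked hand position.
    t = np.linspace(0.0, 4.0 * np.pi, 400)
    traj = np.stack([np.cos(t), np.sin(2.0 * t), 0.3 * np.sin(3.0 * t)], axis=1)
    frames, _ = keyframe_candidates(traj)
    print(frames)
```

In this sketch a purely planar trajectory has zero torsion, so the harmonic mean vanishes everywhere and no candidates are returned. That is consistent with the abstract's separate treatment of planar motion, where the maxima of the curvature alone are sought.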

Publication:

arXiv e-prints

Pub Date:

May 2023

DOI:

10.48550/arXiv.2305.16801

arXiv:

arXiv:2305.16801

Bibcode:

2023arXiv230516801S

Keywords:

Computer Science - Computer Vision and Pattern Recognition;
Computer Science - Computation and Language;
68T45;
68U10;
I.4.9;
I.5.4;
I.2.7

E-Print:

This work is under consideration at Pattern Recognition Letters for possible publication
