Real-time action recognition based on enhanced motion vector temporal segment network
14 August 2019
Xue Bai, Enqing Chen, Haron Chweya Tinega
Proceedings Volume 11179, Eleventh International Conference on Digital Image Processing (ICDIP 2019); 111791W (2019) https://doi.org/10.1117/12.2540268
Event: Eleventh International Conference on Digital Image Processing (ICDIP 2019), 2019, Guangzhou, China
Abstract
Methods based on the two-stream network currently achieve good performance in action recognition; however, their real-time performance is limited by the high computational cost of optical flow. Temporal Segment Network (TSN), a successful example built on the two-stream network, achieves high recognition performance but cannot run in real time. In this paper, we propose the motion vector TSN (MV-TSN), which introduces motion vectors into temporal segment networks and greatly increases the processing speed of TSN. To address the performance degradation caused by motion vectors lacking fine structure information, we propose a knowledge transfer strategy that initializes the MV-TSN with the fine knowledge learned from optical flow. Experimental results show that the proposed method achieves recognition performance comparable to previous state-of-the-art approaches on UCF-101 and HMDB-51, with a processing speed of 206.2 fps, which is 13 times that of the original TSN.
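The abstract names two mechanisms without giving implementation details: TSN-style processing of a video in temporal segments, and initializing the motion vector network from a network trained on optical flow. The following is a minimal sketch of both ideas, not the authors' implementation. It assumes the usual TSN scheme of sampling one frame per equal-length segment and averaging the per-segment class scores, and it assumes that knowledge transfer can be approximated by copying same-named, same-sized weights. The function names and the dictionary-of-lists weight format are hypothetical and chosen only for illustration.

```python
import random


def sample_segment_indices(num_frames, num_segments, rng=None):
    """Pick one frame index from each of num_segments roughly equal segments."""
    rng = rng or random.Random(0)  # fixed seed so the sketch is reproducible
    indices = []
    for k in range(num_segments):
        start = k * num_frames // num_segments
        # Guarantee a non-empty range even when segments outnumber frames.
        end = max(start + 1, (k + 1) * num_frames // num_segments)
        indices.append(rng.randrange(start, end))
    return indices


def segment_consensus(segment_scores):
    """Average per-segment class scores into one video-level score vector."""
    n = len(segment_scores)
    return [sum(column) / n for column in zip(*segment_scores)]


def init_from_teacher(student_weights, teacher_weights):
    """Return student weights, overwritten by teacher weights where the
    layer name and size match (assumed stand-in for knowledge transfer)."""
    initialized = {}
    for name, values in student_weights.items():
        teacher_values = teacher_weights.get(name)
        if teacher_values is not None and len(teacher_values) == len(values):
            initialized[name] = list(teacher_values)  # copy from optical-flow net
        else:
            initialized[name] = list(values)  # keep the student's own init
    return initialized


if __name__ == "__main__":
    print(sample_segment_indices(num_frames=30, num_segments=3))
    print(segment_consensus([[1.0, 3.0], [3.0, 5.0]]))
    flow_net = {"conv1": [0.5, -0.5], "fc": [1.0, 2.0]}
    mv_net = {"conv1": [0.0, 0.0], "fc": [0.0]}
    print(init_from_teacher(mv_net, flow_net))
```

In this sketch, layers whose sizes differ between the two networks keep their original initialization, which is one plausible way to handle mismatched input or output layers; the paper itself may treat them differently.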
© (2019) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Xue Bai, Enqing Chen, and Haron Chweya Tinega "Real-time action recognition based on enhanced motion vector temporal segment network", Proc. SPIE 11179, Eleventh International Conference on Digital Image Processing (ICDIP 2019), 111791W (14 August 2019); https://doi.org/10.1117/12.2540268
KEYWORDS
Optical flow
RGB color model
Video compression
Video processing
Convolution
Video acceleration
Network architectures