26 June 2023 TEN: temporal excitation network for video action recognition
Dengdi Sun, Zhenhao He, Bin Luo, Zhuanlian Ding
Proceedings Volume 12714, International Conference on Computer Network Security and Software Engineering (CNSSE 2023); 127141T (2023) https://doi.org/10.1117/12.2683421
Event: Third International Conference on Computer Network Security and Software Engineering (CNSSE 2023), 2023, Sanya, China
Abstract
Temporal modeling has attracted the attention of many researchers in the past few years. In this work, we propose a new video architecture, termed Temporal Excitation Network (TEN). The core of TEN is the Temporal Excitation Module (TEM) block, which consists of a Temporal Convolution Module (TCM) and a Temporal Difference Module (TDM). TCM applies a channel-wise convolution to supplement the short-range temporal information. TDM computes feature-level long-range temporal differences and then exploits them to excite motion-sensitive channels. This two-stage modeling scheme can be fused into existing 2D CNNs to model temporal structures flexibly and efficiently. Extensive experiments demonstrate the effectiveness of the proposed TEN on several benchmarks (e.g., UCF101, HMDB51, Something-Something V1 and Jester). On these datasets, TEN achieves high recognition accuracy while maintaining high recognition efficiency.
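The two stages described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the abstract gives no kernel size, stride, pooling, or gating function, so everything below is an assumption chosen for illustration. In particular, the sketch assumes spatial dimensions have already been pooled away (each frame is a list of C channel values), a kernel size of 3 with zero padding for the channel-wise convolution, a mean absolute difference as the motion descriptor, and a sigmoid gate with a residual connection. The function names `temporal_conv` and `temporal_difference_excite` are hypothetical.

```python
import math


def temporal_conv(x, kernels):
    """Channel-wise 1D convolution over time (assumed kernel size 3, zero padding).

    x:       list of T frames, each a list of C channel values.
    kernels: list of C kernels, one per channel, each [w_prev, w_curr, w_next].
    """
    T, C = len(x), len(x[0])
    out = [[0.0] * C for _ in range(T)]
    for t in range(T):
        for c in range(C):
            for k, w in enumerate(kernels[c]):
                s = t + k - 1  # neighbouring frames t-1, t, t+1
                if 0 <= s < T:  # frames outside the clip count as zero
                    out[t][c] += w * x[s][c]
    return out


def temporal_difference_excite(x, stride):
    """Gate channels by feature differences between frames `stride` apart.

    Assumed form: per-channel mean absolute difference -> sigmoid gate ->
    residual excitation x * (1 + gate). Requires 0 < stride < T.
    Returns (excited features, per-channel gates).
    """
    T, C = len(x), len(x[0])
    pairs = [(t, t + stride) for t in range(T - stride)]
    motion = [
        sum(abs(x[b][c] - x[a][c]) for a, b in pairs) / len(pairs)
        for c in range(C)
    ]
    gate = [1.0 / (1.0 + math.exp(-m)) for m in motion]
    excited = [[x[t][c] * (1.0 + gate[c]) for c in range(C)] for t in range(T)]
    return excited, gate


# Toy clip: 4 frames, 2 channels. Channel 0 is static, channel 1 changes over time.
clip = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
short_range = temporal_conv(clip, kernels=[[0.0, 1.0, 0.0], [1.0, 1.0, 1.0]])
excited, gate = temporal_difference_excite(clip, stride=2)
```

In this toy clip the static channel receives the smallest possible gate under the assumed sigmoid, while the changing channel receives a larger one, which is the sense in which temporal differences "excite" motion-sensitive channels. A real module would operate on full feature maps inside a 2D CNN with learned weights.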
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Dengdi Sun, Zhenhao He, Bin Luo, and Zhuanlian Ding "TEN: temporal excitation network for video action recognition", Proc. SPIE 12714, International Conference on Computer Network Security and Software Engineering (CNSSE 2023), 127141T (26 June 2023); https://doi.org/10.1117/12.2683421
KEYWORDS
Video
Action recognition
Convolution
Transmission electron microscopy
Time division multiplexing
Modeling
3D modeling
