Joint misalignment-aware bilateral detection network for human pose estimation in videos
13 May 2024
Qianyun Song, Hao Zhang, Yanan Liu, Shouzheng Sun, Dan Xu
Proceedings Volume 13158, Seventh International Conference on Computer Graphics and Virtuality (ICCGV 2024); 131580C (2024) https://doi.org/10.1117/12.3029389
Event: Seventh International Conference on Computer Graphics and Virtuality (ICCGV24), 2024, Hangzhou, China
Abstract
Existing methods for human pose estimation in videos often rely on a sampling strategy to select the frames on which poses are estimated. Common approaches include uniform sparse sampling and keyframe selection. However, the former samples only fixed positions in the video and therefore omits dynamic information, while the latter incurs high computational costs because it processes every frame. To address these issues, we propose an efficient and effective pose estimation framework named Joint Misalignment-aware Bilateral Detection Network (J-BDNet). Our framework incorporates a Bilateral Dynamic Attention Module (BDA) that uses knowledge distillation for efficiency. BDA detects dynamic information in both the left and right halves of a video segment and uses it to guide the sampling process. Additionally, a bilateral recursive sampling strategy built on BDA extracts more spatiotemporal dependencies from the pose data and reduces computational costs without increasing how often the pose estimator is invoked. Moreover, we improve the robustness of existing denoising networks by randomly exchanging body joint positions in the pose data. Experiments demonstrate that our framework performs well under heavy occlusion, spatial blur, and illumination variations, and that it achieves state-of-the-art performance on the Sub-JHMDB dataset.
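The joint-exchange idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the exchange is performed, so everything below is an assumption made for illustration: the function name `swap_random_joints`, the `num_swaps` parameter, the 2D `(x, y)` joint representation, and the choice to swap whole joint pairs uniformly at random are hypothetical and are not taken from the paper.

```python
import random
from typing import List, Optional, Tuple

Joint = Tuple[float, float]  # assumed (x, y) coordinates of one body joint
Pose = List[Joint]           # one pose = ordered list of joints


def swap_random_joints(
    pose: Pose,
    num_swaps: int = 1,
    rng: Optional[random.Random] = None,
) -> Pose:
    """Return a copy of `pose` with randomly chosen joint pairs exchanged.

    Hypothetical augmentation: each swap picks two distinct joint indices
    and exchanges their coordinates, simulating a misaligned joint
    assignment. The input pose is left unmodified.
    """
    rng = rng or random.Random()
    noisy = list(pose)  # shallow copy; tuples are immutable
    if len(noisy) < 2:
        return noisy    # nothing to exchange
    for _ in range(num_swaps):
        i, j = rng.sample(range(len(noisy)), 2)  # two distinct indices
        noisy[i], noisy[j] = noisy[j], noisy[i]
    return noisy


if __name__ == "__main__":
    # Toy pose with five joints; the coordinates are made up.
    clean = [(0.0, 0.0), (1.0, 2.0), (3.0, 4.0), (5.0, 6.0), (7.0, 8.0)]
    noisy = swap_random_joints(clean, num_swaps=1, rng=random.Random(0))
    print(noisy)
```

Under these assumptions, a denoising network would be trained to map such perturbed poses back to the clean pose, so that it learns to correct joints assigned to the wrong body part.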
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Qianyun Song, Hao Zhang, Yanan Liu, Shouzheng Sun, and Dan Xu "Joint misalignment-aware bilateral detection network for human pose estimation in videos", Proc. SPIE 13158, Seventh International Conference on Computer Graphics and Virtuality (ICCGV 2024), 131580C (13 May 2024); https://doi.org/10.1117/12.3029389
KEYWORDS
Pose estimation
Video
Image segmentation
Education and training
Sampling rates
Transformers
Video processing