Query graph attention for video relation detection

Jian Wang; Haibin Cai

doi:10.1117/12.2681229

8 June 2023

Proceedings Volume 12707, International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2023); 127070S (2023) https://doi.org/10.1117/12.2681229
Event: International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2023), 2023, Changsha, China

Abstract

As a bridge connecting vision and language, visual relations between objects provide a more comprehensive understanding of visual content than objects alone. Most previous works adopt the track-to-detect framework for video visual relation detection (VidVRD), which cannot capture long-term spatio-temporal contexts in different stages and also suffers from inefficiency. In this work, we propose a query-based method for video visual relation detection. Our model exploits graph structure to autoregressively generate relation graphs with spatio-temporal contexts and uses an attentional graph convolutional network to fuse the contexts. Experiments on the ImageNet-VidVRD benchmark dataset demonstrate the accuracy of our method.
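The abstract does not specify the attentional graph convolutional network, so the following is only a minimal sketch of the general idea of attention-weighted context fusion on a graph, not the authors' model. It assumes scaled dot-product attention over each node's neighbors plus a self-loop, with no learned parameters; the names `attention_aggregate`, `features`, and `neighbors` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention_aggregate(features, neighbors):
    """One attention-weighted aggregation step over a graph.

    features:  list of node feature vectors (lists of floats).
    neighbors: dict mapping a node index to a list of neighbor indices.
    Assumption: scores are scaled dot products; a real model would
    apply learned projections before scoring and after fusing.
    """
    d = len(features[0])
    scale = math.sqrt(d)
    updated = []
    for i, h_i in enumerate(features):
        nbrs = [i] + list(neighbors.get(i, []))  # include a self-loop
        weights = softmax([dot(h_i, features[j]) / scale for j in nbrs])
        fused = [
            sum(w * features[j][k] for w, j in zip(weights, nbrs))
            for k in range(d)
        ]
        updated.append(fused)
    return updated


# Toy graph: nodes 0 and 1 each connect to node 2.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = {0: [2], 1: [2], 2: [0, 1]}
fused = attention_aggregate(features, neighbors)
```

Each fused vector is a convex combination of the node's own features and those of its neighbors, which is the sense in which attention "fuses" context here.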

Citation

Jian Wang and Haibin Cai "Query graph attention for video relation detection", Proc. SPIE 12707, International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2023), 127070S (8 June 2023); https://doi.org/10.1117/12.2681229

PROCEEDINGS
9 PAGES

KEYWORDS

Object detection

Video

Visualization

Education and training

Semantics

Transformers

Detection and tracking algorithms
