12 September 2024 Vision language distillation by clustering bitrajectory matching
Jiaming Zhou, Shangjiaqi Hao, Qinghao Zhang
Proceedings Volume 13256, Fourth International Conference on Computer Vision and Pattern Analysis (ICCPA 2024); 1325605 (2024) https://doi.org/10.1117/12.3037820
Event: Fourth International Conference on Computer Vision and Pattern Analysis (ICCPA 2024), 2024, Anshan, China
Abstract
Dataset distillation creates compact datasets that yield training performance similar to that of the full dataset, making it a good choice for reducing data storage and training costs. However, existing distillation methods are generally time-intensive and computationally expensive, especially when applied to vision-language tasks. To address this challenge, we propose the Clustering BiTrajectory Matching method, which accelerates existing distillation techniques by a factor of 8 through two strategies: clustering-based sample selection and a biTrajectory optimization approach. The method achieves good accuracy in a multi-modal setting while requiring fewer computational resources and emphasizing efficiency in pre-training. We evaluate the proposed method on the Flickr8k dataset and show that it achieves better efficiency (fewer iterations to reach the target accuracy) while outperforming other coreset selection methods.
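The abstract does not specify how the clustering-based sample selection works. As a rough illustration only, the sketch below shows one common way such a step could be done: run k-means over per-sample embeddings and keep the sample nearest each centroid. The function names `kmeans` and `select_by_clustering`, the use of plain k-means, and the nearest-to-centroid rule are all assumptions made here for illustration, not the authors' procedure.

```python
import random
from math import dist


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means over a list of equal-length float tuples.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialise centers from k distinct samples
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest current center
            j = min(range(k), key=lambda c: dist(p, centers[c]))
            clusters[j].append(p)
        # Move each center to the mean of its cluster; keep the old center if empty
        centers = [
            tuple(sum(xs) / len(c) for xs in zip(*c)) if c else centers[j]
            for j, c in enumerate(clusters)
        ]
    return centers


def select_by_clustering(embeddings, k, seed=0):
    """Return sorted indices of k samples, one nearest to each k-means center.

    Assumed selection rule (nearest-to-centroid); the paper may differ.
    """
    centers = kmeans(embeddings, k, seed=seed)
    chosen = []
    for c in centers:
        # Pick the closest sample that has not already been selected
        idx = min(
            (i for i in range(len(embeddings)) if i not in chosen),
            key=lambda i: dist(embeddings[i], c),
        )
        chosen.append(idx)
    return sorted(chosen)


# Toy usage: six 2-D "embeddings" forming two well-separated groups
toy = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
selected = select_by_clustering(toy, k=2)
```

In a vision-language setting the embeddings would presumably come from an image or text encoder (or both), and the selected indices would seed the distilled set before any trajectory matching; that pipeline is likewise an assumption here.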
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Jiaming Zhou, Shangjiaqi Hao, and Qinghao Zhang "Vision language distillation by clustering bitrajectory matching", Proc. SPIE 13256, Fourth International Conference on Computer Vision and Pattern Analysis (ICCPA 2024), 1325605 (12 September 2024); https://doi.org/10.1117/12.3037820
KEYWORDS
Data modeling
Visual process modeling
Machine learning
Performance modeling
Active learning
Pattern recognition
Systems modeling
