Paper
9 January 2024 Enhancing audio perception in augmented reality: a dynamic vocal information processing framework
Danqing Zhao, Shuyi Xin, Lechen Liu, Yihan Sun, Anqi Du
Proceedings Volume 12969, International Conference on Algorithm, Imaging Processing, and Machine Vision (AIPMV 2023); 129691Z (2024) https://doi.org/10.1117/12.3014440
Event: International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023), 2023, Qingdao, China
Abstract
The development of the Metaverse has sparked widespread interest among researchers, and many technologies have been derived to improve the user's sense of reality in it. In particular, Extended Reality (XR), an indispensable technology and research direction in the study of the Metaverse, aims to provide a seamless, immersive transition between the virtual world and the real world. However, what is currently lacking is the ability to simultaneously separate, classify, and locate dynamic human voice information in order to enhance human sound perception in complex noise environments. This article proposes a framework that uses an FCNN for separation, algebraic models for localization to obtain estimated distances, and an SVM for classification. The dataset is built to simulate distance-related changes with accurate ground truth labels. The results show that our method can effectively separate, classify, and locate mixed sound data, providing users with comprehensive information about the content, gender, and distance of the speaker in complex sound environments, and enhancing their immersive experience and perception ability. Our innovation lies in the combination of three audio processing technologies, and the proposed framework may inspire future work on related topics.
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Danqing Zhao, Shuyi Xin, Lechen Liu, Yihan Sun, and Anqi Du "Enhancing audio perception in augmented reality: a dynamic vocal information processing framework", Proc. SPIE 12969, International Conference on Algorithm, Imaging Processing, and Machine Vision (AIPMV 2023), 129691Z (9 January 2024); https://doi.org/10.1117/12.3014440