Paper
6 June 2024 Multiview stereo reconstruction based on context-aware transformer
Zhaoxu Tian
Proceedings Volume 13175, International Conference on Computer Network Security and Software Engineering (CNSSE 2024); 131750U (2024) https://doi.org/10.1117/12.3032052
Event: 4th International Conference on Computer Network Security and Software Engineering (CNSSE 2024), 2024, Sanya, China
Abstract
This paper addresses a weakness of existing Multi-View Stereo (MVS) methods: on scenes with repetitive textures and in other complex scenarios, they often produce reconstructions that lack completeness and accuracy. To address this, we introduce Clo-PatchmatchNet, a deep learning network that uses context-aware Transformers. The network first extracts image features with a feature extraction module. These features are passed to a learnable Patchmatch algorithm, which produces an initial depth map; a refinement step then yields the final depth map. The key change is the integration of a context-aware Transformer block, Cloblock, into the feature extraction stage. This lets the network capture both global contextual information and high-frequency local details, which improves feature matching across views. In experiments on the Technical University of Denmark (DTU) dataset, Clo-PatchmatchNet outperforms the baseline PatchmatchNet, with a 2.5% improvement in reconstruction completeness, a 1.2% improvement in accuracy, and a 1.7% improvement overall. Compared with other contemporary methods, the proposed method also performs better in both completeness and overall quality.
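The idea of pairing a global context branch with a high-frequency local branch can be illustrated with a toy sketch. This is not the paper's Cloblock: the function names, the scalar 1D features, the fixed high-pass kernel, and the fusion by summation are all assumptions chosen only to keep the example small and runnable.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def global_branch(feats):
    """Toy single-head self-attention: every position attends to all others.

    Assumption: queries, keys, and values are the raw scalar features,
    with no learned projections.
    """
    out = []
    for q in feats:
        weights = softmax([q * k for k in feats])
        out.append(sum(w * v for w, v in zip(weights, feats)))
    return out


def local_branch(feats, kernel=(-0.5, 1.0, -0.5)):
    """Toy local branch: 3-tap high-pass convolution with zero padding.

    Assumption: a fixed kernel stands in for learned local weights.
    """
    radius = len(kernel) // 2
    padded = [0.0] * radius + list(feats) + [0.0] * radius
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(feats))
    ]


def two_branch_block(feats):
    """Fuse global context and local detail by summation (an assumption)."""
    g = global_branch(feats)
    l = local_branch(feats)
    return [a + b for a, b in zip(g, l)]


if __name__ == "__main__":
    # A flat signal with one sharp spike: the local branch reacts to the
    # spike, while the global branch mixes information from every position.
    print(two_branch_block([1.0, 1.0, 3.0, 1.0, 1.0]))
```

In a real network the two branches would operate on multi-channel 2D feature maps with learned weights; the sketch only shows why the branches are complementary.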
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Zhaoxu Tian "Multiview stereo reconstruction based on context-aware transformer", Proc. SPIE 13175, International Conference on Computer Network Security and Software Engineering (CNSSE 2024), 131750U (6 June 2024); https://doi.org/10.1117/12.3032052
KEYWORDS
Transformers
Depth maps
Feature extraction
Education and training
Image processing
Point clouds
Visualization
