Sparse-view cone-beam CT (CBCT) imaging can effectively reduce radiation dose. Convolution-based end-to-end deep learning methods have been applied to single-view CBCT reconstruction, which minimizes radiation dose and enables fast CBCT imaging. However, these methods ignore the mismatch between the local feature extraction of convolutional neural networks (CNNs) and the global nature of the features in a projection image. To address this issue, we propose a novel deep learning architecture based on the Swin Transformer for single-view CBCT reconstruction. First, a feature extraction module built from Swin Transformer blocks learns features from the single-view projection. A feature transformation module then converts these 2D features into 3D feature tensors. Finally, a generative network produces the 3D volume image. This paper is the first attempt to use the Swin Transformer model for single-view CBCT reconstruction. Experimental results on mouse datasets demonstrate that the proposed model outperforms a convolution-based end-to-end deep learning model in reducing artifacts and preserving image accuracy.
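The abstract does not specify how the feature transformation module maps 2D features to 3D feature tensors. One common choice in single-view reconstruction networks is to regroup the channel axis into a depth axis, and the minimal sketch below illustrates only that assumed variant on plain nested lists. The names `lift_2d_features_to_3d` and `shape_of`, the `depth` parameter, and the toy sizes are hypothetical and are not taken from the paper.

```python
def lift_2d_features_to_3d(features, depth):
    """Regroup a 2D feature map (C x H x W) into a 3D tensor (C//depth x depth x H x W).

    Assumption for illustration only: the channel axis is split into depth slices.
    """
    channels = len(features)
    if depth <= 0 or channels % depth != 0:
        raise ValueError("channel count must be a positive multiple of depth")
    # Each consecutive group of `depth` channels becomes the depth slices
    # of one 3D feature channel.
    return [features[start:start + depth] for start in range(0, channels, depth)]


def shape_of(nested):
    """Return the shape of a regular (non-empty) nested list."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return tuple(shape)


# Toy 2D feature map: 8 channels of 2 x 3 values, each filled with its channel index.
features_2d = [[[c] * 3 for _ in range(2)] for c in range(8)]
features_3d = lift_2d_features_to_3d(features_2d, depth=4)

print(shape_of(features_2d))  # (8, 2, 3)
print(shape_of(features_3d))  # (2, 4, 2, 3)
```

In a tensor library the same regrouping would typically be a single reshape of the channel axis. A generative network would then upsample the resulting 3D features to the final volume.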