Medical image segmentation aims to assign each pixel to a region corresponding to the tissue or organ it depicts. In recent years, owing to the Transformer's outstanding performance in computer vision, various vision Transformers have been applied to this task. However, these models often suffer from the quadratic complexity of self-attention and from limited multi-scale information interaction. In this paper, we propose a novel dual attention and pyramid-aware network, DAPFormer, to address these limitations. It combines efficient attention and channel attention into a dual attention mechanism that captures spatial and inter-channel relationships across the feature dimensions while maintaining computational efficiency. Additionally, we redesign the skip connections with a pyramid-aware module that models cross-scale dependencies and handles complex scale variations. Experiments on multi-organ, cardiac, and skin lesion segmentation datasets demonstrate that DAPFormer outperforms state-of-the-art methods.
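To make the dual attention idea concrete, the following is a minimal sketch of how such a block could be realized, assuming the efficient attention follows the common linear-complexity formulation (softmax applied to queries and keys separately, so no HW x HW attention map is formed) and the channel attention follows a squeeze-and-excitation design. The class names and the `reduction` and `key_dim` parameters are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention: pool spatial dims, reweight channels."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                                  # x: (B, C, H, W)
        w = self.fc(x.mean(dim=(2, 3)))                    # (B, C) channel weights
        return x * w[:, :, None, None]


class EfficientAttention(nn.Module):
    """Linear-complexity spatial attention: normalize queries and keys separately,
    then aggregate a global (C x D) context instead of an HW x HW attention map."""

    def __init__(self, channels: int, key_dim: int = 32):
        super().__init__()
        self.q = nn.Conv2d(channels, key_dim, 1)
        self.k = nn.Conv2d(channels, key_dim, 1)
        self.v = nn.Conv2d(channels, channels, 1)

    def forward(self, x):                                  # x: (B, C, H, W)
        b, c, h, w = x.shape
        q = self.q(x).flatten(2).softmax(dim=1)            # (B, D, HW), softmax over key dim
        k = self.k(x).flatten(2).softmax(dim=2)            # (B, D, HW), softmax over positions
        v = self.v(x).flatten(2)                           # (B, C, HW)
        context = torch.bmm(v, k.transpose(1, 2))          # (B, C, D) global context
        out = torch.bmm(context, q).view(b, c, h, w)       # broadcast context back to positions
        return out + x                                      # residual connection


class DualAttention(nn.Module):
    """Apply efficient spatial attention, then channel attention."""

    def __init__(self, channels: int):
        super().__init__()
        self.spatial = EfficientAttention(channels)
        self.channel = ChannelAttention(channels)

    def forward(self, x):
        return self.channel(self.spatial(x))


if __name__ == "__main__":
    block = DualAttention(channels=64)
    y = block(torch.randn(2, 64, 56, 56))                  # output shape: (2, 64, 56, 56)
    print(y.shape)
```

Because the spatial branch only forms a (C x D) context matrix, its cost scales linearly with the number of pixels, which is the property the abstract refers to as maintaining computational efficiency.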