Many deep learning methods have been proposed for Salient Object Detection (SOD) in natural images; however, they are often ill-suited to remote sensing images because they ignore domain knowledge unique to that setting. For example, satellite images tend to contain more complex contexts than natural images, and many of their salient objects are small, yet existing deep learning based SOD methods designed for natural images make no special provision for these properties. In this paper, we propose a new Transformer-aware Encoder-Decoder Network (TEDNet) that combines a hybrid Convolutional Neural Network-Transformer encoder with a Transformer-enhanced decoder to learn both complex context features from local neighborhoods via convolution and long-range region dependencies via the Transformer for SOD in remote sensing images. Furthermore, we propose a new image-level and pixel-level size-guided loss for small salient object mining to train the proposed TEDNet. Experimental results on a public remote sensing SOD dataset demonstrate the effectiveness and accuracy of the proposed method.
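The core idea of pairing convolution (local neighborhood context) with self-attention (long-range region dependencies) can be illustrated with a minimal NumPy sketch. This is an illustrative assumption about how such a hybrid encoder stage could be built in general, not the authors' actual TEDNet architecture; all function and variable names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # Numerically stable softmax for the attention weights.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def depthwise_conv3x3(x, w):
    # Zero-padded 3x3 depthwise convolution; x: (H, W, C), w: (3, 3, C).
    # Captures context from the local 3x3 neighborhood of each pixel.
    H, W, C = x.shape
    xp = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros_like(x)
    for i in range(3):
        for j in range(3):
            out += xp[i:i + H, j:j + W] * w[i, j]
    return out

def self_attention(tokens, Wq, Wk, Wv):
    # Single-head self-attention over flattened spatial tokens; tokens: (N, C).
    # Every token attends to every other, modeling long-range dependencies.
    q, k, v = tokens @ Wq, tokens @ Wk, tokens @ Wv
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)
    return attn @ v

H, W, C = 8, 8, 16
feat = rng.standard_normal((H, W, C))

# Local branch: convolution over each pixel's 3x3 neighborhood.
local = depthwise_conv3x3(feat, rng.standard_normal((3, 3, C)) * 0.1)

# Global branch: self-attention over all H*W spatial tokens.
glob = self_attention(feat.reshape(H * W, C),
                      *(rng.standard_normal((C, C)) * 0.1 for _ in range(3)))
glob = glob.reshape(H, W, C)

# Fuse the two branches by addition (one common fusion choice in
# hybrid CNN-Transformer designs).
fused = local + glob
print(fused.shape)
```

In a real encoder these branches would use learned weights, multiple heads, and repeated stages; the sketch only shows that the two feature types share the same spatial layout and can be fused directly.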