Boundary information has recently attracted increasing attention as a means of improving semantic segmentation. This paper presents a novel symmetrical network, called BASNet, which contains four components: a pre-trained ResNet-101 backbone, a semantic segmentation branch (SSB), a boundary detection branch (BDB), and an aggregation module (AM). More specifically, the BDB focuses solely on boundary-related information, which it processes with a series of spatial attention blocks (SABs), while a set of global attention blocks (GABs) in the SSB captures more accurate object boundary and semantic information. Finally, the outputs of the SSB and BDB are fed into the AM, which merges their features to boost performance. Extensive experiments show that our method not only predicts object boundaries more accurately but also improves overall semantic segmentation performance.
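The abstract does not specify the internals of an SAB, but a common form of spatial attention pools over channels and gates each spatial location. The sketch below is a minimal, hypothetical simplification in NumPy: a real SAB would typically concatenate the pooled maps and pass them through a learned convolution before the sigmoid, whereas here simple addition stands in for that learned step.

```python
import numpy as np

def spatial_attention(feat):
    """Minimal spatial-attention sketch (hypothetical, not BASNet's exact SAB).

    feat: (C, H, W) feature map.
    Pools across channels, builds a per-location gate, and reweights the map.
    """
    avg_pool = feat.mean(axis=0)                          # (H, W) channel average
    max_pool = feat.max(axis=0)                           # (H, W) channel max
    gate = 1.0 / (1.0 + np.exp(-(avg_pool + max_pool)))   # sigmoid gate in (0, 1)
    return feat * gate[None, :, :]                        # same gate for every channel

feat = np.random.randn(8, 4, 4)
out = spatial_attention(feat)
assert out.shape == feat.shape
```

Because the gate depends only on spatial position, boundary-like locations with strong responses are amplified uniformly across channels, which is the intuition behind using such blocks for boundary processing.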
This paper introduces a lightweight convolutional neural network, called ECDet, for real-time, accurate object detection. In contrast to recent lightweight networks, which rely on pointwise convolution to change the number of feature-map channels, ECDet builds its entire backbone from equal channel blocks. Meanwhile, we deploy depthwise convolution to compress the feature pyramid network (FPN) detection head. Experiments show that ECDet has a model size of only 3.19 M and requires only 3.48B FLOPs for a 416×416 input image. Compared with YOLO Nano, our method improves accuracy by 5% while requiring less computation. Comprehensive experiments demonstrate that our model achieves a promising speed-accuracy trade-off on the PASCAL VOC 2007 dataset.
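The savings from replacing standard convolutions with depthwise(-separable) ones can be made concrete with a FLOP count. The sketch below uses illustrative layer dimensions (a 13×13 map with 256 channels, chosen here for illustration, not taken from the paper) to compare a standard 3×3 convolution against a depthwise 3×3 followed by a 1×1 pointwise projection.

```python
def conv_flops(h, w, k, c_in, c_out):
    # Multiply-accumulates for a standard k x k convolution (stride 1, same padding).
    return h * w * k * k * c_in * c_out

def dw_separable_flops(h, w, k, c_in, c_out):
    # Depthwise k x k convolution (one filter per input channel)
    # plus a 1x1 pointwise projection to c_out channels.
    depthwise = h * w * k * k * c_in
    pointwise = h * w * c_in * c_out
    return depthwise + pointwise

# Hypothetical FPN-head layer dimensions for illustration.
h, w, k, c_in, c_out = 13, 13, 3, 256, 256
std = conv_flops(h, w, k, c_in, c_out)
sep = dw_separable_flops(h, w, k, c_in, c_out)
print(std / sep)  # ≈ 8.7x fewer multiply-accumulates
```

The ratio simplifies to k²·c_out / (k² + c_out), so for a 3×3 kernel with many output channels the separable form approaches a 9× reduction, which is why it is a natural choice for compressing a detection head.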
Occlusion is one of the most challenging problems in visual object tracking. Many discriminative methods have recently been proposed to address it, but for such methods it is difficult to select representative samples for updating the target template. In general, the holistic bounding boxes containing the tracked results are selected as positive samples. However, when the object is occluded, this simple strategy easily introduces noise into the training set and the target template, causing the tracker to drift far from the target. To address this problem, we propose a robust patch-based visual tracker with online representative sample selection. Unlike previous works, we divide the object and the candidates uniformly into several patches and propose a score function that evaluates each patch independently. The average patch score is then used to determine the optimal candidate. Finally, we use the non-negative least squares method to find the representative samples, which are used to update the target template. Experimental results on Object Tracking Benchmark 2013 and on 13 challenging sequences show that the proposed method is robust to occlusion and achieves promising results.
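The patch-scoring step can be sketched as follows. The abstract does not define the score function, so this sketch substitutes a hypothetical one (negative mean absolute difference against the corresponding template patch); the key idea it illustrates is that each patch is scored independently and the average decides the optimal candidate, so an occluded patch only drags down its own score rather than corrupting the whole comparison.

```python
import numpy as np

def patch_scores(candidate, template, grid=4):
    """Score each patch of a candidate against the template independently.

    candidate, template: (H, W) grayscale crops of equal size.
    Uses negative mean absolute difference as a stand-in score function
    (the paper's actual score function is not given in the abstract).
    """
    h, w = candidate.shape
    ph, pw = h // grid, w // grid
    scores = []
    for i in range(grid):
        for j in range(grid):
            c = candidate[i * ph:(i + 1) * ph, j * pw:(j + 1) * pw]
            t = template[i * ph:(i + 1) * ph, j * pw:(j + 1) * pw]
            scores.append(-np.abs(c - t).mean())
    return scores

def candidate_score(candidate, template, grid=4):
    # The average patch score determines the optimal candidate.
    return float(np.mean(patch_scores(candidate, template, grid)))

template = np.random.rand(32, 32)
assert candidate_score(template, template) == 0.0  # perfect match scores highest
```

A full tracker would then feed the best candidates into the non-negative least squares step to pick representative samples for template updating; that stage is omitted here.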