Computer-aided detection (CAD) approaches have shown promising results for early esophageal cancer detection using Volumetric Laser Endomicroscopy (VLE) imagery. However, the relatively slow and computationally costly tissue segmentation employed in these approaches hampers their clinical applicability. In this paper, we propose to reframe the 2D tissue segmentation problem as a 1D tissue boundary detection problem. Instead of using an encoder-decoder architecture, we propose to follow the tissue boundary with a Recurrent Neural Network (RNN), exploiting the spatio-temporal relations within VLE frames. We demonstrate near state-of-the-art performance using 18 times fewer floating-point operations, enabling real-time execution in clinical practice.
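To make the reformulation concrete, the sketch below shows one way such a boundary-following recurrent model could be organized in PyTorch: each A-line (column) of a polar VLE frame is encoded and fed to a GRU that emits a single boundary depth per column. The module name BoundaryRNN and all layer sizes are illustrative assumptions, not the authors' exact architecture.

    import torch
    import torch.nn as nn

    class BoundaryRNN(nn.Module):
        # Illustrative sketch: each column (A-line) of a VLE frame is embedded
        # and fed to a GRU that tracks the tissue boundary depth as a 1D signal.
        def __init__(self, column_height=2048, hidden=128):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Linear(column_height, 256), nn.ReLU(),
                nn.Linear(256, 128), nn.ReLU(),
            )
            self.rnn = nn.GRU(input_size=128, hidden_size=hidden, batch_first=True)
            self.head = nn.Linear(hidden, 1)  # one boundary depth per column

        def forward(self, frame):
            # frame: (batch, width, column_height), i.e. one A-line per time step
            feats = self.encoder(frame)
            out, _ = self.rnn(feats)
            return self.head(out).squeeze(-1)  # (batch, width) boundary positions

    model = BoundaryRNN()
    dummy = torch.randn(1, 4096, 2048)   # one polar VLE frame, columns as the sequence
    boundary = model(dummy)              # 1D boundary estimate per column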
Barrett's Esophagus (BE) is a precursor of esophageal adenocarcinoma, one of the most lethal forms of cancer. Volumetric laser endomicroscopy (VLE) is a relatively new technology used for early detection of abnormal cells in BE by imaging the inner tissue layers of the esophagus. Computer-Aided Detection (CAD) shows great promise in analyzing VLE frames due to the advances in deep learning. However, a full VLE scan produces 1,200 frames of 4,096 × 2,048 pixels, making automated pre-processing necessary to extract the tissue of interest. This paper explores an object detection approach for tissue detection in VLE scans. We show that this can be achieved in real time with very low inference time, using a single-stage object detector such as YOLO. Our best performing model achieves a mean average precision of 98.23% for bounding boxes correctly predicting the tissue of interest. Additionally, we have found that the tiny YOLO with Partial Residual Networks architecture further reduces the inference time by a factor of 10, while sacrificing less than 1% of accuracy. The proposed method not only segments the tissue of interest in real time without any latency, but also achieves this efficiently using limited GPU resources, rendering it attractive for embedded applications. Our paper is the first to introduce object detection as a new approach for VLE-data tissue segmentation and paves the way for real-time VLE-based detection of early cancer in BE.
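As an illustration of how such a detector could be used at inference time, the sketch below loads a fine-tuned single-stage model through the YOLOv5 PyTorch Hub interface and keeps the highest-confidence box as the tissue crop region. The weights file vle_tissue.pt and the frame path are hypothetical, and the authors' Darknet-based tiny YOLO/PRN setup is swapped for YOLOv5 purely for readability.

    import torch

    # Hypothetical fine-tuned weights; YOLOv5 is used here only as a stand-in
    # for the paper's Darknet-based tiny YOLO / PRN configuration.
    model = torch.hub.load('ultralytics/yolov5', 'custom', path='vle_tissue.pt')
    model.conf = 0.25  # confidence threshold

    frame = 'vle_frame.png'          # one 4,096 x 2,048 VLE frame (hypothetical file)
    results = model(frame)
    boxes = results.xyxy[0]          # (N, 6): x1, y1, x2, y2, confidence, class

    if len(boxes):
        # keep the highest-confidence "tissue of interest" box as the crop region
        x1, y1, x2, y2, conf, cls = boxes[boxes[:, 4].argmax()].tolist()
        print(f'tissue ROI: ({x1:.0f}, {y1:.0f})-({x2:.0f}, {y2:.0f}), conf {conf:.2f}')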
Over the past few decades, primarily developed countries have witnessed an increased incidence of esophageal adenocarcinoma (EAC). Screening and surveillance of Barrett’s esophagus (BE), which is known to increase the probability of developing EAC, can significantly improve survival rates. This is because early-stage dysplasia in BE can be treated effectively, while each subsequent stage complicates successful treatment and seriously reduces survival rates. This study proposes a convolutional neural network-based algorithm, which classifies images of BE visualized with White Light Endoscopy (WLE) as either dysplastic or non-dysplastic. To this end, we use only the pixels surrounding the dysplastic region, while excluding the pixels covering the dysplastic region itself. The phenomenon where the diagnosis of a patient can be determined from tissue other than the clearly observable diseased area is termed the field effect. With its potential to identify missed lesions, it may prove to be a helpful innovation in the screening and surveillance process of BE. A statistical significance test indicates the presence of the field effect in WLE when comparing the distribution of the algorithm's classifications of unseen data with the distribution obtained by a random classification.
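A minimal sketch of the masking idea follows, assuming a binary lesion annotation is available per image: the annotated dysplastic region is blanked out so the classifier only sees the surrounding tissue. The ResNet18 classifier, image size, and fill value are placeholders, not the study's exact network.

    import torch
    import torchvision

    def mask_lesion(image, lesion_mask, fill=0.0):
        # image: (3, H, W) float tensor; lesion_mask: (H, W) bool tensor (True = lesion)
        masked = image.clone()
        masked[:, lesion_mask] = fill   # hide the dysplastic region itself
        return masked

    # Hypothetical classifier; the study's exact CNN architecture is not specified here.
    model = torchvision.models.resnet18(num_classes=2)

    image = torch.rand(3, 512, 512)
    lesion = torch.zeros(512, 512, dtype=torch.bool)
    lesion[200:300, 200:300] = True            # placeholder lesion annotation

    logits = model(mask_lesion(image, lesion).unsqueeze(0))
    prob_dysplastic = torch.softmax(logits, dim=1)[0, 1]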
Routine surveillance endoscopies are currently used to detect dysplasia in patients with Barrett's Esophagus (BE). However, most of these procedures are performed by non-expert endoscopists in community hospitals, leading to many missed dysplastic lesions, which can progress into advanced esophageal adenocarcinoma if left untreated [1]. In recent years, several successful algorithms have been proposed for the detection of cancer in BE using high-quality overview images. This work addresses the first steps towards clinical application on endoscopic surveillance videos. Several challenges are identified that occur when moving from image-based to video-based analysis. (1) It is shown that algorithms trained on high-quality overview images do not naively transfer to endoscopic videos due to, e.g., non-informative frames. (2) Video quality is shown to be an important factor in algorithm performance. Specifically, temporal localization performance is highly correlated with video quality. (3) When moving to real-time algorithms, the additional compute necessary to address the challenges in videos will become a burden on the computational budget. However, in addition to challenges, videos also bring new opportunities not available in current image-based methods, such as the inclusion of temporal information. This work shows that a multi-frame approach increases performance compared to a naive single-image method when the above challenges are addressed.
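The sliding-window sketch below illustrates, under simplifying assumptions, how per-frame lesion scores might be aggregated over time while skipping frames flagged as non-informative. The function name, window size, and NaN-based masking are hypothetical choices, not taken from this work.

    import numpy as np

    def smooth_video_scores(frame_scores, informative, window=5):
        """Toy multi-frame aggregation: ignore non-informative frames and average
        the remaining per-frame lesion scores over a sliding temporal window."""
        scores = np.asarray(frame_scores, dtype=float)
        keep = np.asarray(informative, dtype=bool)
        scores[~keep] = np.nan
        out = np.full_like(scores, np.nan)
        half = window // 2
        for t in range(len(scores)):
            win = scores[max(0, t - half): t + half + 1]
            if np.any(~np.isnan(win)):
                out[t] = np.nanmean(win)
        return out

    # Example: five frames, the third flagged as non-informative
    print(smooth_video_scores([0.2, 0.8, 0.9, 0.7, 0.1], [1, 1, 0, 1, 1], window=3))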
Gastroenterologists are estimated to misdiagnose up to 25% of esophageal adenocarcinomas in Barrett's Esophagus patients. This prompts the need for more sensitive and objective tools to aid clinicians with lesion detection. Artificial Intelligence (AI) can make examinations more objective and will therefore help to mitigate the observer dependency. Since these models are trained with good-quality endoscopic video frames to attain high efficacy, high-quality images are also needed for inference. Therefore, we aim to develop a framework that is able to distinguish good image quality by a-priori informativeness classification, which leads to high inference robustness. We show that we can maintain informativeness over the temporal domain using recurrent neural networks, yielding a higher performance on non-informativeness detection compared to classifying individual images. Furthermore, we also find that by using Gradient-weighted Class Activation Mapping (Grad-CAM), we can better localize informativeness within a frame. We have developed a customized ResNet18 feature extractor with 3 classifiers, consisting of a Fully-Connected (FC), a Long Short-Term Memory (LSTM) and a Gated Recurrent Unit (GRU) classifier. Experimental results are based on 4,349 frames from 20 pullback videos of the esophagus. Our results demonstrate that the algorithm achieves comparable performance to the current state of the art. The FC and LSTM classifiers both reach an F1 score of 91%. We found that the LSTM-based Grad-CAMs represent the origin of non-informativeness best, as 85% of the images were found to highlight the correct area.
The benefit of our novel implementation for endoscopic informativeness classification is that it is trained end-to-end, incorporates the spatiotemporal domain in the decision making for robustness, and makes the decisions of the model insightful with the use of Grad-CAMs.
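A compact PyTorch sketch of the described ResNet18-plus-LSTM variant is given below; the hidden size, input resolution, and two-class head are assumptions, and the FC and GRU classifiers would follow the same pattern with the LSTM swapped out.

    import torch
    import torch.nn as nn
    import torchvision

    class InformativenessLSTM(nn.Module):
        # Sketch of the described design: a ResNet18 backbone extracts per-frame
        # features, an LSTM aggregates them over time, and a linear head scores
        # each frame as informative vs. non-informative. Layer sizes are assumed.
        def __init__(self, hidden=256):
            super().__init__()
            backbone = torchvision.models.resnet18(weights=None)
            backbone.fc = nn.Identity()          # 512-d feature vector per frame
            self.backbone = backbone
            self.lstm = nn.LSTM(512, hidden, batch_first=True)
            self.head = nn.Linear(hidden, 2)

        def forward(self, clip):
            # clip: (batch, time, 3, H, W)
            b, t = clip.shape[:2]
            feats = self.backbone(clip.flatten(0, 1)).view(b, t, -1)
            out, _ = self.lstm(feats)
            return self.head(out)                # per-frame logits, (batch, time, 2)

    model = InformativenessLSTM()
    logits = model(torch.rand(1, 8, 3, 224, 224))   # an 8-frame pullback snippet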
Volumetric Laser Endomicroscopy (VLE) is a promising balloon-based imaging technique for detecting early neoplasia in Barrett's Esophagus. Especially Computer-Aided Detection (CAD) techniques show great promise compared to medical doctors, who cannot reliably find disease patterns in the noisy VLE signal. However, an essential pre-processing step for the CAD system is tissue segmentation. At present, tissue is segmented manually, but this does not scale to the entire VLE scan, which consists of 1,200 frames of 4,096 × 2,048 pixels. Furthermore, the current CAD methods cannot use the VLE scans to their full potential, as only a small segment of the esophagus is selected for further processing, while an automated segmentation system results in significantly more available data. This paper explores the possibility of automatically segmenting relevant tissue in VLE scans using FusionNet and a domain-specific loss function. The contribution of this work is threefold. First, we propose a tissue segmentation algorithm for VLE scans. Second, we introduce a weighted ground truth that exploits the signal-to-noise ratio characteristics of the VLE data. Third, we compare our algorithm's segmentation against two additional VLE experts. The results show that our algorithm's annotations are indistinguishable from the expert annotations, and therefore the algorithm can be used as a pre-processing step for further classification of the tissue.
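To illustrate the weighted ground-truth idea, the sketch below down-weights the per-pixel loss for rows deeper in the frame, where the VLE signal-to-noise ratio drops. The exponential decay and the decay rate are illustrative choices, not the paper's exact weighting scheme or loss function.

    import torch
    import torch.nn.functional as F

    def weighted_segmentation_loss(logits, target, depth_decay=2.0):
        """Illustrative weighted loss (not the paper's exact formulation): pixels
        deeper in the VLE frame, where the signal-to-noise ratio drops, contribute
        less to the per-pixel binary cross-entropy."""
        # logits, target: (batch, 1, depth, width); depth axis runs from surface down
        depth = logits.shape[2]
        # weight 1.0 at the surface, decaying towards the noisy deep-tissue rows
        w = torch.exp(-depth_decay * torch.linspace(0, 1, depth, device=logits.device))
        w = w.view(1, 1, depth, 1)
        loss = F.binary_cross_entropy_with_logits(logits, target, reduction='none')
        return (w * loss).sum() / w.expand_as(loss).sum()

    logits = torch.randn(2, 1, 256, 256)
    target = torch.randint(0, 2, (2, 1, 256, 256)).float()
    print(weighted_segmentation_loss(logits, target))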
Volumetric laser endomicroscopy (VLE) is an advanced imaging system offering a promising solution for the detection of early Barrett’s esophagus (BE) neoplasia. BE is a known precursor lesion for esophageal adenocarcinoma and is often missed during regular endoscopic surveillance of BE patients. VLE provides a circumferential scan of near-microscopic resolution of the esophageal wall up to 3-mm depth, yielding a large amount of data that is hard to interpret in real time. In a preliminary study on an automated analysis system for ex vivo VLE scans, novel quantitative image features were developed for two previously identified clinical VLE features predictive of BE neoplasia, showing promising results. This paper proposes a novel quantitative image feature for a missing third clinical VLE feature. The novel gland-based image feature, called “gland statistics” (GS), is compared to several generic image analysis features and the most promising clinically inspired feature, “layer histogram” (LH). All features are evaluated on a clinical, validated data set consisting of 88 non-dysplastic BE and 34 neoplastic in vivo VLE images for eight different widely used machine learning methods. The new clinically inspired feature has, on average, superior classification accuracy (0.84 AUC) compared to the generic image analysis features (0.61 AUC), as well as comparable performance to the LH feature (0.86 AUC). Also, the LH feature achieves superior classification accuracy compared to the generic image analysis features in vivo, confirming previous ex vivo results. Combining the LH and the novel GS features provides even further improvement of the performance (0.88 AUC), showing great promise for the clinical utility of this algorithm to detect early BE neoplasia.
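As a rough illustration of how the LH and GS feature sets could be combined and evaluated, the scikit-learn sketch below concatenates placeholder feature matrices for the 88 non-dysplastic and 34 neoplastic images and reports a cross-validated AUC. The random features, random forest classifier, and fold count are stand-ins, not the study's protocol (which compared eight machine learning methods).

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    # Placeholder feature matrices standing in for the clinically inspired
    # "layer histogram" (LH) and "gland statistics" (GS) features; real features
    # would be computed from the 122 annotated VLE images.
    rng = np.random.default_rng(0)
    n_images = 122                                # 88 non-dysplastic + 34 neoplastic
    lh_features = rng.normal(size=(n_images, 10))
    gs_features = rng.normal(size=(n_images, 6))
    labels = np.array([0] * 88 + [1] * 34)

    combined = np.hstack([lh_features, gs_features])
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    auc = cross_val_score(clf, combined, labels, cv=5, scoring='roc_auc').mean()
    print(f'cross-validated AUC with LH + GS features: {auc:.2f}')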