Crowdsourcing is a concept that encourages people all over the world to generate ground truth for classification data such as images. While frameworks for binary and multi-label classification exist, crowdsourcing of medical image segmentation is covered by only a few works. In this paper, we present a web-based platform that supports scientists of various domains in obtaining segmentations that are close to ground-truth references. The system is composed of frontend, authentication, management, processing, and persistence layers, which are implemented by combining various JavaScript tools, the Django web framework, an asynchronous Celery task, and a PostgreSQL database, respectively. It is deployed on a Kubernetes cluster. A set of image data accompanied by a task instruction can be uploaded. Users can be invited or can subscribe to join. Once users have passed a guided tutorial with pre-segmented example images, segmentations can be obtained from non-experts all over the world. The Simultaneous Truth and Performance Level Estimation (STAPLE) algorithm generates estimated ground truth segmentation masks and continuously evaluates the users' performance in the backend. As a proof of concept, a test study with 75 photographs of human eyes was performed by 44 users. In just a few days, 2,060 segmentation masks with a total of 52,826 vertices along the mask contours were collected.
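To make the STAPLE step more concrete, the following is a minimal sketch of the binary expectation-maximization iteration, assuming all user masks for one image are already rasterized to the same size. It is not the platform's backend code: the function name `staple_binary`, the global foreground prior, and the initial sensitivity and specificity of 0.99 are illustrative assumptions.

```python
import numpy as np

def staple_binary(masks, n_iter=50, tol=1e-6):
    """Sketch of binary STAPLE (hypothetical helper, not the platform's code).

    masks: array-like of shape (R, H, W) with 0/1 entries, one mask per user.
    Returns (foreground probability map, sensitivity per user, specificity per user).
    """
    masks = np.asarray(masks, dtype=float)
    n_raters = masks.shape[0]
    d = masks.reshape(n_raters, -1)        # decisions, shape (R, N)
    prior = d.mean()                       # assumption: one global foreground prior
    p = np.full(n_raters, 0.99)            # assumed initial sensitivities
    q = np.full(n_raters, 0.99)            # assumed initial specificities
    eps = 1e-12                            # guards against log(0) and division by zero
    w = np.zeros(d.shape[1])

    for _ in range(n_iter):
        # E-step: posterior probability that each pixel is foreground,
        # computed in the log domain for numerical stability.
        log_a = np.log(prior + eps) + (
            d * np.log(p + eps)[:, None] + (1 - d) * np.log(1 - p + eps)[:, None]
        ).sum(axis=0)
        log_b = np.log(1 - prior + eps) + (
            (1 - d) * np.log(q + eps)[:, None] + d * np.log(1 - q + eps)[:, None]
        ).sum(axis=0)
        w_new = 1.0 / (1.0 + np.exp(np.clip(log_b - log_a, -700, 700)))

        # M-step: re-estimate each user's sensitivity and specificity.
        p = (d @ w_new) / (w_new.sum() + eps)
        q = ((1 - d) @ (1 - w_new)) / ((1 - w_new).sum() + eps)

        converged = np.max(np.abs(w_new - w)) < tol
        w = w_new
        if converged:
            break

    return w.reshape(masks.shape[1:]), p, q


def demo_masks():
    """Toy data: two accurate users and one who over-segments the top rows."""
    truth = np.zeros((8, 8), dtype=int)
    truth[2:6, 2:6] = 1
    sloppy = truth.copy()
    sloppy[0:2, :] = 1
    return truth, np.stack([truth, truth, sloppy])
```

Thresholding the returned probability map at 0.5 gives an estimated ground truth mask, while the per-user sensitivity and specificity serve as the performance estimates. In the toy data, the over-segmenting user ends up with a lower specificity than the two accurate users.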
The Simultaneous Truth and Performance Level Estimation (STAPLE) algorithm is frequently used in medical image segmentation when no ground truth (GT) is available. In this paper, we investigate the number of inexperienced users required to establish a reliable STAPLE-based GT and the number of vertices users should place for a point-based segmentation. We employ “WeLineation”, a novel web-based system for crowdsourcing segmentations. Within the study, 2,060 masks were delivered by 44 users on 75 different photographic images of the human eye, where users had to segment the sclera. For all masks, GT was estimated using STAPLE. Then, STAPLE was computed using fewer user contributions, and the results were compared to the GT. Requiring an error rate lower than 2%, the same segmentation performance is obtained with 13 experienced and 22 rather inexperienced users. More than 10 vertices should be placed on the delineation contour in order to reach an accuracy greater than 95%. On average, a vertex should be placed every 81 pixels along the segmentation contour. The results indicate that knowledge about the users' performance can reduce the number of segmentation masks per image that are needed to estimate a reliable GT. Therefore, gathering performance parameters of users during a crowdsourcing study and applying this information to the assignment process is recommended. In this way, the cost-effectiveness of a crowdsourcing segmentation study can be improved.
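The vertex-spacing figure can be illustrated with a short sketch that computes the mean distance between consecutive vertices of a closed polygon, which is the contour length divided by the number of vertices. The function name `mean_vertex_spacing` is hypothetical, and measuring spacing along the polygon edges is an assumption about how such a value might be derived, not the authors' stated procedure.

```python
import numpy as np

def mean_vertex_spacing(vertices):
    """Mean edge length of a closed polygon given as (x, y) pixel coordinates."""
    pts = np.asarray(vertices, dtype=float)
    # Edge vectors from each vertex to the next; the last vertex wraps to the first.
    edges = np.roll(pts, -1, axis=0) - pts
    lengths = np.hypot(edges[:, 0], edges[:, 1])
    return lengths.sum() / len(pts)

# Toy contour: a square with 81-pixel sides and one vertex per corner.
square = [(0, 0), (81, 0), (81, 81), (0, 81)]
spacing = mean_vertex_spacing(square)
```

A user-drawn contour with a much larger mean spacing than the recommended value would be a candidate for additional vertices.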