Image-based large language models (LLMs) are AI models that interpret captured images and generate text from the analysis of images or other visual data. Incorporating LLMs into the assessment of water quality, pressure, and environmental conditions can help analyze historical data and predict potential risks and threats in underwater environments. This can improve how autonomous underwater vehicles (AUVs) and remotely operated vehicles (ROVs) intervene during emergencies, where visual data must be interpreted to make informed decisions. Although LLMs are primarily associated with processing and generating text, they can be integrated with images through multimodal learning, in which text and images are combined for tasks that involve both modalities. Deploying such frameworks on the low-power microcontrollers commonly used in monitoring systems is challenging. This research proposes evaluating multimodal tokens to enable edge computing in bio-inspired robots that monitor the underwater environment. The approach breaks large real-time videos down into tokens of text-based instructions associated with descriptions of the images. The mini-robots transmit the collected "tokens" to the nearest AUV or ROV, where the image-based LLM is deployed. We propose to evaluate this image-based LLM on our NVIDIA Jetson Nano-based AUV. In the proposed architecture, the mini-robots move along the length of the water column to capture images of the underwater environment. The proposed model is evaluated on generating text for boat and fish images. The proposed framework, with its integrated image-based tokens, can significantly reduce response time and data traffic in real-time underwater monitoring systems.
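To illustrate why sending tokens rather than video could reduce data traffic, the following is a minimal sketch of a mini-robot packing a short token sequence into a compact message for the nearest AUV or ROV. The wire format, field names, and functions (`pack_tokens`, `unpack_tokens`, `raw_frame_bytes`) are assumptions made for illustration only; the abstract does not specify the tokenization scheme or the transmission protocol.

```python
import struct

# Hypothetical wire format (an assumption, not taken from the paper):
# robot id (uint16), depth in centimetres (uint32), token count (uint16),
# followed by each token id as a uint16. All fields are big-endian.
HEADER = struct.Struct(">HIH")


def pack_tokens(robot_id, depth_cm, token_ids):
    """Serialize one token message that a mini-robot would send upstream."""
    header = HEADER.pack(robot_id, depth_cm, len(token_ids))
    body = struct.pack(f">{len(token_ids)}H", *token_ids)
    return header + body


def unpack_tokens(payload):
    """Recover (robot_id, depth_cm, token_ids) on the AUV/ROV side."""
    robot_id, depth_cm, count = HEADER.unpack_from(payload, 0)
    token_ids = list(struct.unpack_from(f">{count}H", payload, HEADER.size))
    return robot_id, depth_cm, token_ids


def raw_frame_bytes(width, height, channels=3):
    """Size in bytes of one uncompressed 8-bit frame, for comparison."""
    return width * height * channels


if __name__ == "__main__":
    message = pack_tokens(robot_id=7, depth_cm=1250, token_ids=list(range(16)))
    print(len(message), "bytes per token message")
    print(raw_frame_bytes(640, 480), "bytes per raw 640x480 RGB frame")
```

Under these assumptions, a 16-token message occupies 40 bytes, whereas a single uncompressed 640x480 RGB frame occupies 921,600 bytes. The actual saving depends on how many tokens describe each frame and on any video compression that would otherwise be applied.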