Machine learning applied to failure detection
11 October 2023
Wenbin Qi, Jingyi Zhao
Proceedings Volume 12800, Sixth International Conference on Computer Information Science and Application Technology (CISAT 2023); 1280021 (2023) https://doi.org/10.1117/12.3004136
Event: 6th International Conference on Computer Information Science and Application Technology (CISAT 2023), 2023, Hangzhou, China
Abstract
Distributed systems depend heavily on reliable, continuous communication between nodes. The failure detection phase is thus a significant contributor to system unavailability. Previous studies have shown the potential of LSTM-based failure detectors (FDs), but their heavy computation cost remains a problem. We propose a technique that reduces the computation time of LSTM-based FDs by forecasting, that is, predicting multiple timestamps at once instead of making one prediction at a time. Our method achieves better computation time than previous LSTM FDs but falls short of BFD. We discuss the heavy computation cost of LSTM back-propagation and the high-frequency nature of FD tasks, and conclude that LSTM is unlikely to be the best fit for machine-learning-based FD.
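The forecasting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the LSTM is replaced by a hypothetical mean-gap predictor (`mean_gap_forecast`), and the class name, the `horizon`, `margin`, and `window` parameters, and the suspicion rule are all assumptions made for illustration. The only point shown is that a detector which forecasts `horizon` heartbeat arrivals per model call invokes the model far less often than one that predicts a single arrival at a time.

```python
from typing import Callable, List, Sequence


def mean_gap_forecast(arrivals: Sequence[float], horizon: int) -> List[float]:
    """Hypothetical stand-in for the LSTM: extend the mean inter-arrival gap."""
    gaps = [b - a for a, b in zip(arrivals, arrivals[1:])]
    mean_gap = sum(gaps) / len(gaps)
    return [arrivals[-1] + mean_gap * (i + 1) for i in range(horizon)]


class ForecastingDetector:
    """Calls the model once per `horizon` heartbeats, not once per heartbeat."""

    def __init__(
        self,
        forecast: Callable[[Sequence[float], int], List[float]],
        horizon: int,
        margin: float,
        window: int = 10,
    ) -> None:
        self.forecast = forecast
        self.horizon = horizon
        self.margin = margin          # slack added to each predicted arrival
        self.window = window          # number of past arrivals fed to the model
        self.arrivals: List[float] = []
        self.pending: List[float] = []  # predicted arrivals not yet consumed
        self.model_calls = 0

    def on_heartbeat(self, t: float) -> None:
        self.arrivals = (self.arrivals + [t])[-self.window:]
        if self.pending:
            self.pending.pop(0)       # this heartbeat consumes one prediction
        # Refill only when every forecast value has been used up.
        if not self.pending and len(self.arrivals) >= 2:
            self.pending = self.forecast(self.arrivals, self.horizon)
            self.model_calls += 1

    def suspects(self, now: float) -> bool:
        """Suspect the peer once the next predicted arrival plus margin has passed."""
        return bool(self.pending) and now > self.pending[0] + self.margin


def run(horizon: int) -> ForecastingDetector:
    detector = ForecastingDetector(mean_gap_forecast, horizon, margin=0.5)
    for t in range(11):               # heartbeats at t = 0.0, 1.0, ..., 10.0
        detector.on_heartbeat(float(t))
    return detector


one_step = run(horizon=1)
multi_step = run(horizon=5)
print(one_step.model_calls, multi_step.model_calls)
```

With a real LSTM in place of the stand-in, each avoided call would save one forward pass; whether that is enough to be competitive is exactly the question the abstract raises.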
(2023) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Wenbin Qi and Jingyi Zhao "Machine learning applied to failure detection", Proc. SPIE 12800, Sixth International Conference on Computer Information Science and Application Technology (CISAT 2023), 1280021 (11 October 2023); https://doi.org/10.1117/12.3004136
KEYWORDS
Data modeling
Machine learning
Computation time
Systems modeling
Neural networks
Performance modeling
Computer science