The Atacama Large Millimeter/submillimeter Array (ALMA) is a prominent astronomical observatory known for its detailed imaging capabilities. Efficient scheduling of ALMA’s data processing tasks, especially those involving complex pipeline executions, is crucial for maximizing operational productivity. This paper addresses the challenge by developing a predictive model that estimates the runtime of these tasks, enabling more effective scheduling and resource management. Our approach employs the Light Gradient Boosting Machine (LGBM) and Quantile Forest models to predict processing times and quantify uncertainties. The use of these models is innovative, as it not only provides accurate predictions but also offers insights into the variability of processing times. This is particularly beneficial for handling the dynamic nature of the data processing workload at ALMA. We enhance the model’s performance and reliability by incorporating variable scaling and logarithmic transformations. To determine the best model, we comprehensively evaluated seven different machine-learning techniques. Our results show that the LGBM model and quantile estimation outperform traditional methods in predicting task durations. This leads to more efficient scheduling, as it allows the system to account for potential delays and optimize the sequencing of jobs. The quantile approach, in particular, offers a robust method for dealing with the inherent uncertainty in processing times. Our predictive tool has demonstrated a substantial reduction in overall flow time, decreasing it by 5.7%. Further improvements were achieved using stochastic scheduling techniques, which leverage the uncertainty estimates provided by our model. This research highlights the potential of machine learning to significantly enhance the operational efficiency of large-scale observatories like ALMA, providing a scalable and practical solution for managing complex data processing tasks.
|