Time‑Series Retrieval for Grounding Multimodal Language Models in Remaining Useful Life Prediction

Published Jul 3, 2026
Valeriu Dimidov, Raphael Frank

Abstract

Large language models (LLMs) and agentic AI systems are increasingly being explored for domain-specific maintenance and prognostics tasks, raising the question of whether they can effectively support prognostics and health management (PHM). In this paper, we investigate remaining useful life (RUL) estimation with multimodal large language models (MLLMs) grounded through time-series retrieval. We propose a framework in which historically similar degradation segments are retrieved from the training set and, together with the test trajectory, transformed into a visual comparison artifact that is processed by the MLLM through a structured multimodal prompt. The approach is evaluated on the FD001 partition of the C-MAPSS benchmark under repeated experiments comparing retrieval-based inference against a non-retrieval baseline based on random reference selection. The results show that time-series retrieval consistently improves MLLM-based RUL prediction across the evaluated models, yielding lower error and more stable performance. At the same time, the magnitude of the benefit depends on model capacity, indicating that retrieval is most effective when the underlying MLLM is able to exploit the retrieved evidence. Overall, the study shows that time-series RAG is a promising mechanism for improving multimodal prognostic reasoning, while also highlighting the current limitations of MLLM-based RUL estimation in practical PHM settings.
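The retrieval step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the similarity measure, window length, or normalization, so the Euclidean distance on globally standardized windows, the function names `fit_scaler` and `retrieve_similar`, and the toy data below are all assumptions made for illustration. Rendering the retrieved segments into a visual comparison artifact and prompting the MLLM are outside the scope of this sketch.

```python
import numpy as np

def fit_scaler(library):
    """Per-channel mean and std over all training windows (library: N x T x C)."""
    flat = library.reshape(-1, library.shape[-1])
    std = flat.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero on constant channels
    return flat.mean(axis=0), std

def retrieve_similar(query, library, k=3):
    """Return indices and distances of the k training windows closest to the query.

    Assumption of this sketch: similarity is Euclidean distance between
    windows standardized with training-set statistics.
    """
    mean, std = fit_scaler(library)
    q = (query - mean) / std
    lib = (library - mean) / std
    dists = np.sqrt(((lib - q) ** 2).sum(axis=(1, 2)))
    order = np.argsort(dists)[:k]
    return order, dists[order]

# Toy data (hypothetical): four training windows with two synthetic channels
# whose drift grows with a per-unit severity factor.
t = np.linspace(0.0, 1.0, 30)
severities = np.array([0.5, 1.0, 2.0, 4.0])
library = np.stack([np.stack([s * t**2, -s * t], axis=1) for s in severities])
library_rul = np.array([120, 80, 40, 10])  # hypothetical RUL labels per window

# Query window from a "test" unit with severity close to the second window.
query = np.stack([1.1 * t**2, -1.1 * t], axis=1)

idx, dists = retrieve_similar(query, library, k=2)
retrieved_rul = library_rul[idx]  # labels that would accompany the references
print(idx, retrieved_rul)
```

In a pipeline of the kind the abstract outlines, the retrieved windows and their known RUL values would serve as the reference evidence placed alongside the test trajectory, whereas the non-retrieval baseline would pick reference windows at random.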

How to Cite

Dimidov, V., & Frank, R. (2026). Time‑Series Retrieval for Grounding Multimodal Language Models in Remaining Useful Life Prediction. PHM Society European Conference, 9(1), 1–11. https://doi.org/10.36001/phme.2026.v9i1.4969

Keywords

Remaining Useful Life Prediction, Retrieval-Augmented Generation, Multimodal Large Language Models, Time-Series Retrieval, Predictive Maintenance, Prognostics and Health Management

Section
Technical Papers