Transformer-Based Architectures for Machinery Prognostics: A Review
Abstract
Machinery prognostics requires robust modeling of multivariate degradation signals under noise, non-stationarity, variable operating conditions, and limited run-to-failure labels. Transformer-based deep learning architectures have recently attracted strong interest because self-attention can capture long-range temporal dependencies and inter-sensor interactions more directly than purely recurrent or convolutional models. This focused review presents Transformer-based approaches for machinery prognostics, with emphasis on remaining useful life (RUL) estimation and degradation representation learning. The literature is organized using a consistent taxonomy covering PHM task, Transformer backbone, hybridization strategy, and input representation. We also analyze preprocessing choices that strongly influence performance, including windowing, health-indicator construction, tokenization, embedding, and positional encoding. Across benchmark datasets, the reviewed studies frequently report gains from Transformers and hybrid attention models, especially when long temporal context and multivariate dependencies are central. However, improvements are not universal and remain sensitive to evaluation protocol, signal representation, and model complexity. Key open challenges include data efficiency, computational cost, cross-condition generalization, interpretability, and uncertainty quantification. The review concludes by identifying methodological gaps in the current literature and outlining research directions for robust, efficient, and deployable Transformer-based prognostics.
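As a rough illustration of two of the preprocessing steps named in the abstract (windowing and positional encoding), the following minimal sketch splits a multivariate sensor series into overlapping windows and adds the standard sinusoidal positional encoding of the original Transformer. It is not taken from the review or from any study it covers. The function names, the window length, the stride, and the choice to use the raw sensor vector directly as the token (so that the model width equals the number of sensors) are all assumptions made for brevity; published models typically apply a learned embedding first.

```python
import math


def sliding_windows(series, window, stride=1):
    """Split a series (list of per-timestep sensor rows) into overlapping windows."""
    last_start = len(series) - window
    return [series[s:s + window] for s in range(0, last_start + 1, stride)]


def sinusoidal_positional_encoding(length, d_model):
    """Sinusoidal encoding: sine on even dimensions, cosine on odd dimensions."""
    pe = [[0.0] * d_model for _ in range(length)]
    for pos in range(length):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)
    return pe


def add_positions(window, pe):
    """Add the positional encoding to each timestep of one window, element-wise."""
    return [[x + p for x, p in zip(row, pe_row)] for row, pe_row in zip(window, pe)]


# Hypothetical toy data: 6 timesteps, 4 sensors (values are illustrative only).
series = [[0.1 * t + 0.01 * s for s in range(4)] for t in range(6)]

windows = sliding_windows(series, window=4, stride=1)
pe = sinusoidal_positional_encoding(length=4, d_model=4)
tokens = add_positions(windows[0], pe)
```

Each element of `tokens` is one timestep of the first window with position information added; a sequence like this is what a Transformer encoder would take as input under the assumptions above.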
prognostics, transformers, deep learning, PHM

This work is licensed under a Creative Commons Attribution 3.0 Unported License.