A Study on the Impact of Categorical Alarm Data in Power Estimation and Anomaly Detection of Photovoltaic Inverters

Published Jul 3, 2026
Jorge Ruiz Amantegui, Hai-Canh Vu, Phuc Do

Abstract

This study investigates the impact of incorporating categorical inverter data into power-forecasting and anomaly-detection frameworks. Three forecasting models are evaluated on their ability to estimate power output, using a large dataset from a fleet of photovoltaic plants comprising over one hundred inverters and approximately 33.2 MW of installed capacity. The models are a Multi-Layer Perceptron, a Long Short-Term Memory network, and Extreme Gradient Boosting. Two encoding strategies for categorical alarm codes are compared: one-hot encoding and entity embeddings. Anomaly detection is performed by analysing the residuals between predicted and measured power output. By systematically evaluating the integration of categorical inverter data into PV monitoring models, this work addresses an important gap in the literature and provides a foundation for future research on advanced methods for exploiting categorical operational data in photovoltaic systems.
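
As a rough illustration of the two encoding strategies and the residual-based detection named in the abstract, the following minimal Python sketch may help. It is not the authors' implementation: the alarm code names, the embedding values (which would be learned during training in practice), the sample power values, and the k-sigma threshold rule are all assumptions made for this example, since the abstract does not specify them.

```python
from statistics import mean, stdev


def one_hot(codes, vocabulary):
    """Encode each alarm code as a 0/1 vector with one slot per known code."""
    index = {code: i for i, code in enumerate(vocabulary)}
    vectors = []
    for code in codes:
        vec = [0] * len(vocabulary)
        vec[index[code]] = 1
        vectors.append(vec)
    return vectors


def embed(codes, table):
    """Look up a dense vector per alarm code.

    In an entity-embedding model the table is learned jointly with the
    forecaster; here it is a fixed, hypothetical dictionary.
    """
    return [table[code] for code in codes]


def flag_anomalies(measured, predicted, reference_residuals, k=3.0):
    """Flag samples whose residual lies more than k standard deviations
    from the mean residual observed on reference (healthy) data.

    The k-sigma rule is an assumption, not the paper's stated criterion.
    """
    mu = mean(reference_residuals)
    sigma = stdev(reference_residuals)
    residuals = [m - p for m, p in zip(measured, predicted)]
    return [abs(r - mu) > k * sigma for r in residuals]


# Hypothetical alarm vocabulary and observations.
vocabulary = ["OK", "GRID_FAULT", "OVERTEMP"]
codes = ["OK", "OVERTEMP"]
embedding_table = {
    "OK": [0.0, 0.1],
    "GRID_FAULT": [0.9, -0.3],
    "OVERTEMP": [0.7, 0.4],
}

sparse_features = one_hot(codes, vocabulary)
dense_features = embed(codes, embedding_table)

# Hypothetical power values in kW; the last sample shows a large shortfall.
reference_residuals = [0.5, -0.5, 1.0, -1.0, 0.0]
measured = [50.0, 48.0, 20.0]
predicted = [50.5, 47.5, 49.0]
flags = flag_anomalies(measured, predicted, reference_residuals)
```

The one-hot vectors grow with the number of distinct alarm codes, whereas the embedding keeps a fixed, small dimension per code, which is the trade-off the comparison in the abstract is concerned with.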

How to Cite

Ruiz Amantegui, J., Vu, H.-C., & Do, P. (2026). A Study on the Impact of Categorical Alarm Data in Power Estimation and Anomaly Detection of Photovoltaic Inverters. PHM Society European Conference, 9(1), 1–10. https://doi.org/10.36001/phme.2026.v9i1.4856

Keywords

photovoltaic, anomaly detection, predictive maintenance, scada, power forecasting, categorical alarm data, residual analysis, machine learning

Section
Technical Papers