Prediction of Production Line Status for Printed Circuit Boards



Published Jun 29, 2022
Haichuan Tang, Yin Tian, Junyan Dai, Yuan Wang, Jianli Cong, Qi Liu, Xuejun Zhao, Yunxiao Fu


This paper addresses the problem of predicting production line status for printed circuit boards (PCBs). The problem comprises three prediction tasks concerning the PCB production process. First, data exploration reveals several data challenges, including data imbalance, data noise, small sample size, and differences between components. To predict production line status for PCB components from pin-level inspection records, we propose two feature extraction methods that compress pin-level data to the component level. A statistical feature extraction method, which computes descriptive statistics such as the mean, standard deviation, maximum, and minimum over the pins of the same component, is applied to Task 1, while a PinNumber-based feature extraction method, which keeps the original values for 2-pin components, is applied to Task 3. In addition, a neural network model with feeding imbalance control is established for Task 1, and a random forest model is applied to both Task 2 and Task 3. Moreover, a threshold-moving technique is proposed to optimize threshold selection. Finally, the results show that our models achieved F1-scores of 0.44, 0.54, and 0.71 on the test set for the three tasks, respectively.
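The statistical feature extraction and threshold-moving steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record fields `component_id` and `value`, the function names, the use of the population standard deviation, and the choice of F1-score on a held-out set as the criterion for selecting the threshold are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev


def component_features(pin_rows, value_key="value"):
    """Compress pin-level records into one feature dict per component.

    Assumes each row is a dict with a hypothetical 'component_id' key and a
    numeric measurement stored under `value_key`.
    """
    groups = defaultdict(list)
    for row in pin_rows:
        groups[row["component_id"]].append(row[value_key])
    return {
        comp: {
            "mean": mean(vals),
            "std": pstdev(vals),  # population std; the paper may use another estimator
            "max": max(vals),
            "min": min(vals),
        }
        for comp, vals in groups.items()
    }


def f1_score(y_true, y_pred):
    """F1-score for binary labels (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(y_true, scores, candidates=None):
    """Threshold moving: scan candidate cut-offs and keep the one with the
    highest F1-score (assumed selection criterion)."""
    if candidates is None:
        candidates = [i / 100 for i in range(1, 100)]
    best_t, best_f1 = 0.5, -1.0
    for t in candidates:
        preds = [1 if s >= t else 0 for s in scores]
        f1 = f1_score(y_true, preds)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1
```

In a setting like this one, `component_features` would produce the inputs to a classifier, and `best_threshold` would be applied to that classifier's predicted probabilities on validation data rather than on the test set, so that the reported score is not tuned on the data it is measured on.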

How to Cite

Tang, H., Tian, Y., Dai, J., Wang, Y., Cong, J., Liu, Q., Zhao, X., & Fu, Y. (2022). Prediction of Production Line Status for Printed Circuit Boards. PHM Society European Conference, 7(1), 563–570.



Keywords: Printed Circuit Board, Machine Learning, Data Imbalance

Data Challenge Winners