Active Sim-to-Real Gap Reduction for Industrial Inspection via Digital Twin and Embedding Analysis


Published Jul 3, 2026
Huimin Zhuge, Xian Yeow Lee, Gregory Sin, Raheem Ahmed, Lasitha Vidyaratne, Aman Kumar, Ahmed Farahat

Abstract

Simulation-based training is increasingly used in automated industrial inspection, where collecting and annotating real-world inspection data is costly and often impractical. While synthetic data generated from digital twins enables scalable training, models trained solely in simulation suffer from a significant sim-to-real gap under real inspection conditions such as varying lighting, surface properties, and sensor noise. In this work, we propose a data-efficient sim-to-real adaptation framework that combines representative sample selection via k-determinantal point processes (k-DPP) with embedding-level alignment using Kullback–Leibler (KL) divergence. The key idea is to actively identify a small set of representative synthetic samples, acquire the corresponding real images, and align the latent feature representations of each synthetic-real pair while retaining the coverage provided by the larger synthetic dataset. We first train an RF-DETR (Detection Transformer) detector on 550 synthetic inspection images, achieving near-perfect performance in simulation but only 0.2516 mean Average Precision (mAP) on real-world images. Using only 50 paired real images (approximately 10% of the synthetic training set) together with 500 unpaired synthetic images, the proposed method increases real-world mAP from 0.2516 to 0.8853. The k-DPP sampling strategy maximizes the diversity of the selected samples, reducing the risk of bias introduced by limited real-world data, while KL-based embedding alignment further reduces the domain discrepancy between synthetic and real images. The proposed framework provides a lightweight and practical approach for reducing sim-to-real gaps in a representative industrial inspection setting where real data collection is limited.
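
The two ingredients named in the abstract, diversity-driven sample selection and KL-based embedding alignment, can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not specify the kernel, the sampler, or how embeddings are turned into distributions, so everything below is an assumption: the RBF kernel and its `gamma`, a greedy log-determinant maximization used as a deterministic stand-in for true k-DPP sampling, a softmax over feature dimensions, the direction of the KL term, and all function names and toy values.

```python
import math

def rbf_kernel(X, gamma=0.5):
    """Similarity kernel L[i][j] = exp(-gamma * ||x_i - x_j||^2) (assumed choice)."""
    def sqdist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return [[math.exp(-gamma * sqdist(a, b)) for b in X] for a in X]

def greedy_kdpp_select(L, k, tol=1e-12):
    """Greedily pick k indices that approximately maximize det(L[S, S]).

    This is a greedy MAP-style approximation, not exact k-DPP sampling.
    gain[i] tracks the determinant ratio det(L[S+i]) / det(L[S]), updated
    incrementally with Cholesky-style rows.
    """
    n = len(L)
    gain = [L[i][i] for i in range(n)]
    chol = [[] for _ in range(n)]
    selected = []
    for _ in range(min(k, n)):
        candidates = [i for i in range(n) if i not in selected]
        j = max(candidates, key=lambda i: gain[i])
        if gain[j] <= tol:  # remaining items add no new diversity
            break
        selected.append(j)
        root = math.sqrt(gain[j])
        for i in candidates:
            if i == j:
                continue
            dot = sum(a * b for a, b in zip(chol[j], chol[i]))
            e = (L[j][i] - dot) / root
            chol[i].append(e)
            gain[i] -= e * e  # items similar to j lose most of their gain
    return selected

def softmax(v):
    """Turn an embedding into a probability vector (assumed normalization)."""
    m = max(v)
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

# Toy synthetic embeddings: three near-duplicates and two distinct samples.
synthetic_embeddings = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (10.0, 0.0)]
picked = greedy_kdpp_select(rbf_kernel(synthetic_embeddings), k=3)

# Toy paired embeddings for one selected sample (hypothetical values).
p_syn = softmax([2.0, 0.5, -1.0])
p_real = softmax([1.5, 0.7, -0.4])
alignment_loss = kl_divergence(p_real, p_syn)  # direction is an assumption
```

In this toy setup the selection avoids taking more than one of the three near-duplicate embeddings, which is the diversity behavior the abstract attributes to k-DPP sampling. The alignment term is zero only when the paired distributions coincide, so minimizing it pulls the synthetic and real embeddings of a pair together.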

How to Cite

Zhuge, H., Lee, X. Y., Sin, G., Ahmed, R., Vidyaratne, L., Kumar, A., & Farahat, A. (2026). Active Sim-to-Real Gap Reduction for Industrial Inspection via Digital Twin and Embedding Analysis. PHM Society European Conference, 9(1), 1–11. https://doi.org/10.36001/phme.2026.v9i1.5037


Keywords

Sim-to-Real Transfer, Digital Twin, Feature Alignment, Active Data Acquisition

Section
Technical Papers