Active Sim-to-Real Gap Reduction for Industrial Inspection via Digital Twin and Embedding Analysis

Huimin Zhuge; Xian Yeow Lee; Gregory Sin; Raheem Ahmed; Lasitha Vidyaratne; Aman Kumar; Ahmed Farahat

doi:10.36001/phme.2026.v9i1.5037

Published Jul 3, 2026

Author affiliation (all authors): Hitachi America Ltd, Santa Clara, California, USA

Abstract

Simulation-based training is increasingly used in automated industrial inspection, where collecting and annotating real-world inspection data is costly and often impractical. While synthetic data generated from digital twins enables scalable training, models trained solely in simulation suffer from a significant sim-to-real gap under real inspection conditions such as varying lighting, surface properties, and sensor noise. In this work, we propose a data-efficient sim-to-real adaptation framework that combines representative sample selection via k-determinantal point processes (k-DPP) with embedding-level alignment using Kullback–Leibler (KL) divergence. The key idea is to actively identify a small set of representative synthetic samples, acquire the corresponding real images, and align the latent feature representations of each synthetic-real pair while retaining the coverage provided by the larger synthetic dataset. We first train an RF-DETR (Detection Transformer) detector on 550 synthetic inspection images, achieving near-perfect performance in simulation but only 0.2516 mean Average Precision (mAP) on real-world images. Using only 50 paired real images (approximately 10% of the synthetic training set) together with 500 unpaired synthetic images, the proposed method increases real-world mAP from 0.2516 to 0.8853. The k-DPP sampling strategy promotes diversity among the selected samples, reducing the risk of bias introduced by limited real-world data, while KL-based embedding alignment further reduces the domain discrepancy between synthetic and real images. The proposed framework provides a lightweight and practical approach for reducing sim-to-real gaps in a representative industrial inspection setting where real data collection is limited.
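The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the kernel, the sampler, or how the KL term is computed, so everything below is an assumption made for illustration. The selection step uses greedy log-determinant maximization over an RBF kernel, a common deterministic stand-in for drawing a sample from a k-DPP, and the alignment term is the KL divergence between softmax-normalized embedding vectors of paired synthetic and real images. The function names `rbf_kernel`, `greedy_diverse_subset`, and `kl_alignment_loss` are hypothetical.

```python
import numpy as np


def rbf_kernel(X, gamma=1.0, jitter=1e-6):
    """Similarity kernel over embeddings (assumed choice; jitter keeps it positive definite)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-gamma * d2) + jitter * np.eye(X.shape[0])


def greedy_diverse_subset(L, k):
    """Greedily pick k indices that maximize log det of the kernel submatrix.

    This is a deterministic approximation of the mode of a k-DPP,
    not an exact k-DPP sampler.
    """
    selected = []
    for _ in range(k):
        best, best_val = None, -np.inf
        for i in range(L.shape[0]):
            if i in selected:
                continue
            idx = selected + [i]
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            if sign > 0 and logdet > best_val:
                best, best_val = i, logdet
        selected.append(best)
    return selected


def softmax(z):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def kl_alignment_loss(real_emb, syn_emb, eps=1e-12):
    """Mean KL(p_syn || p_real) over paired rows (assumed form of the alignment term)."""
    p = softmax(syn_emb)
    q = softmax(real_emb)
    return float(np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)))


# Toy usage: 550 synthetic embeddings, choose 50 whose real counterparts would be acquired.
rng = np.random.default_rng(0)
syn_embeddings = rng.normal(size=(550, 16))
chosen = greedy_diverse_subset(rbf_kernel(syn_embeddings, gamma=0.05), k=50)

# Placeholder "real" embeddings for the chosen pairs (synthetic plus a shift, for illustration only).
real_embeddings = syn_embeddings[chosen] + rng.normal(scale=0.5, size=(50, 16))
loss = kl_alignment_loss(real_embeddings, syn_embeddings[chosen])
```

In a training loop the alignment term would be added to the detection loss and minimized with respect to the feature extractor; the NumPy version here only evaluates it. The sizes 550 and 50 mirror the numbers in the abstract, while the embedding dimension and kernel bandwidth are arbitrary.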

How to Cite

Zhuge, H., Lee, X. Y., Sin, G., Ahmed, R., Vidyaratne, L., Kumar, A., & Farahat, A. (2026). Active Sim-to-Real Gap Reduction for Industrial Inspection via Digital Twin and Embedding Analysis. PHM Society European Conference, 9(1), 1–11. https://doi.org/10.36001/phme.2026.v9i1.5037

Keywords

Sim-to-Real Transfer, Digital Twin, Feature Alignment, Active Data Acquisition

Issue

Vol. 9 No. 1 (2026): Proceedings of the European Conference of the PHM Society 2026

Section

Technical Papers

This work is licensed under a Creative Commons Attribution 3.0 Unported License.
