Improving Combined Detection and Classification of TEM Defects via Mask-Conditioned Latent Diffusion Augmentation

2026-06-01 · Computer Vision and Pattern Recognition

AI summary

The authors developed a method for generating synthetic but realistic microscope images of metal alloys with defects, which helps train computer models to find these defects in real microscope images. They used a mask-conditioned latent diffusion model, a type of generative AI, to produce images that come with automatic labels, so the synthetic images need no hand labeling. Adding these synthetic images to small real datasets (10, 50, or 100 labeled images) improved a defect detection and classification model slightly, by up to 0.02 in the harmonic mean of the detection and classification F1 scores. They also found that how much of the improvement comes from detection versus classification depends on how the training and testing data are split. Their work suggests that targeted generative models can help when only a few labeled microscope images are available.

transmission electron microscopy, microstructural defects, data augmentation, latent diffusion model, mask-conditioned generation, Mask R-CNN, defect detection, image synthesis, deep learning, labeled data scarcity
Authors
Ni Li, Nuohao Liu, Ryan Jacobs, Ajay Annamareddy, Maciej P. Polak, Kevin Field, Izabela Szlufarska, Dane Morgan
Abstract
Analyzing microstructural defects in transmission electron microscopy (TEM) images, particularly in irradiated metal alloys, is often limited by the availability of high-quality, labeled data. To address this, we introduce a generative data augmentation approach using a mask-conditioned latent diffusion model (LDM) for synthesizing realistic TEM images with controllable, automatically labeled multi-class defect masks. Without requiring manual annotations for generation, our method enables the creation of synthetic image-mask pairs by sampling distributions learned from experimental masks. These generated data were used to augment small experimental datasets of varying sizes (10, 50, and 100 labeled experimental images) to train a Mask Region-based Convolutional Neural Network (Mask R-CNN) model for defect detection and classification. Our results show that generative augmentation yields small overall improvements in model performance, with a gain of up to 0.02 in the harmonic mean of the detection and classification F1 scores. However, we also find that the relative contributions of detection and classification to this improvement depend on the specific train/test data split. These findings highlight the potential of targeted generative models to enhance deep learning performance in data-scarce, microscopy-based image quantification tasks.
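The headline metric, the harmonic mean of the detection and classification F1 scores, can be illustrated with a minimal sketch. It assumes the standard definitions of F1 and of the two-value harmonic mean; the abstract does not spell out the exact formulation used in the paper. The function names and the counts below are hypothetical and are not results from the study.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Standard F1 from true positives, false positives, and false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def harmonic_mean(a: float, b: float) -> float:
    """Harmonic mean of two non-negative scores; returns 0.0 if both are 0."""
    return 2 * a * b / (a + b) if (a + b) > 0 else 0.0


# Hypothetical counts, for illustration only (not taken from the paper).
detection_f1 = f1_score(tp=80, fp=20, fn=20)
classification_f1 = f1_score(tp=60, fp=20, fn=20)

# Assumed combination: harmonic mean of the two F1 scores.
combined = harmonic_mean(detection_f1, classification_f1)
print(f"detection F1 = {detection_f1:.3f}")
print(f"classification F1 = {classification_f1:.3f}")
print(f"combined score = {combined:.3f}")
```

Because the harmonic mean is pulled toward the lower of its two inputs, a gain in only one of detection or classification moves the combined score less than a balanced gain in both. This may be one reason the reported improvement is sensitive to how each component behaves under a given train/test split.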