This paper surveys metrics for assessing the quality of images produced by generative models. Objective evaluation of image quality requires specialized metrics. A comparative analysis shows that no single metric is sufficient: a comprehensive evaluation of generation quality requires a combination of metrics. Perceptual metrics are effective for assessing image quality from the perspective of machine systems, while metrics that evaluate structure and detail are useful for analyzing human perception. Text-based metrics assess image-text alignment but cannot replace metrics focused on visual or structural evaluation. The results will be useful to specialists in machine learning and computer vision and can contribute to improving generative algorithms and expanding the applications of diffusion models.
Keywords: deep learning, metric, generative model, image quality, image
This article analyzes modern image generation methods: variational autoencoders (VAEs), generative adversarial networks (GANs), and diffusion models. It focuses on a comparative analysis of their performance, generation quality, and computational requirements. Image quality is assessed with the Fréchet Inception Distance (FID), for which lower values indicate better quality. Diffusion models achieved the best result (FID 20.8), outperforming GANs (FID 38.9) and VAEs (FID 59.75), but they require significant computational resources. VAEs train stably but generate blurry images. GANs produce high-quality images but suffer from training instability and mode collapse. Diffusion models, through step-by-step denoising, combine fine detail with coherent structure, which makes them the most promising approach. Image-to-image generation methods used for image modification are also considered. The results will be useful to specialists in machine learning and computer vision and can contribute to improving algorithms and expanding the areas of application of generative models.
Keywords: deepfake, deep learning, artificial intelligence, GAN, VAE, diffusion model
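To make the FID values quoted in the abstract above more concrete, the following is a minimal sketch of the Fréchet distance between two sets of feature vectors. It assumes the features have already been extracted; FID uses Inception-v3 activations, and random vectors stand in for them here. The function name `frechet_distance` and the toy data are hypothetical and are not taken from the article.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Fréchet distance between Gaussians fitted to two feature sets.

    feats_a, feats_b: arrays of shape (n_samples, n_features), n_features >= 2.
    In FID these would be Inception-v3 activations (an assumption here).
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)

    # Tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative for
    # positive semi-definite covariances (clipped to absorb rounding noise).
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)


# Toy stand-in features: 500 samples with 4 dimensions each.
rng = np.random.default_rng(0)
real_feats = rng.normal(size=(500, 4))
shifted_feats = real_feats + 3.0  # same covariance, mean shifted by 3 per dimension

same_score = frechet_distance(real_feats, real_feats)
shifted_score = frechet_distance(real_feats, shifted_feats)
print(same_score, shifted_score)
```

In this toy case the distance of a set to itself is zero up to rounding, and the shifted set scores about 36, which is the squared mean shift (3 squared times 4 dimensions) because the covariances are identical. Lower values mean the two feature distributions are closer.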