Assessing UHD Image Quality from Aesthetics, Distortions, and Saliency

Read original: arXiv:2409.00749 - Published 9/4/2024 by Wei Sun, Weixia Zhang, Yuqin Cao, Linhan Cao, Jun Jia, Zijian Chen, Zicheng Zhang, Xiongkuo Min, Guangtao Zhai

Overview

Examines how to assess the quality of ultra-high-definition (UHD) images based on aesthetics, distortions, and visual saliency
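Proposes a multi-branch deep neural network that extracts aesthetic, distortion, and salient-content features and regresses them into a single quality score
Evaluates the model on the UHD-IQA benchmark dataset, where it achieved the best performance and won first prize in the ECCV AIM 2024 UHD-IQA Challenge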

Plain English Explanation

The researchers wanted to find a way to automatically assess the quality of ultra-high-definition (UHD) images. They looked at three important factors that contribute to image quality:

Aesthetics: How visually appealing and artistic the image is.
Distortions: Any technical issues or flaws in the image, like blurriness or artifacts.
Saliency: What parts of the image draw the viewer's attention the most.

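To evaluate image quality, the researchers developed a deep learning model that analyzes all three of these aspects at once and combines them into a single quality score. Because a full UHD image is too large to process efficiently, each aspect is measured on a smaller view of the image: a downsampled copy for aesthetics, a mosaic of small full-resolution patches for distortions, and a crop of the most salient region.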

They tested their model on the UHD-IQA dataset, a collection of UHD images rated by expert human raters. This showed how well the automated approach matched human judgments of image quality.

Technical Explanation

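The researchers proposed a multi-branch deep neural network (DNN) that assesses UHD image quality from three perspectives. Feeding a full-resolution UHD image to the network is computationally expensive, while plain resizing or cropping can lose substantial detail, so each branch receives a different reduced view of the image:

Global aesthetics: a low-resolution image downsampled from the UHD original, which loses high-frequency texture but preserves the global aesthetic characteristics
Local technical distortions: a fragment image composed of mini-patches cropped from the UHD image using a grid mini-patch sampling strategy
Salient content: a crop of the detected salient region of the UHD image

Each branch uses a Swin Transformer Tiny backbone to extract features. The features from the three branches are concatenated and regressed into a single quality score by a two-layer multi-layer perceptron (MLP). Training combines a mean square error (MSE) loss, which targets prediction accuracy, with a fidelity loss, which targets prediction monotonicity.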
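The paper's abstract describes the final step as concatenating the features from the three branches and regressing them into a quality score with a two-layer MLP. The following is a minimal pure-Python sketch of that fusion step only, not the authors' implementation. The backbones are replaced by random stand-in vectors, and the names `fuse_and_regress`, `init_layer`, `FEATURE_DIM`, and `HIDDEN_DIM`, the layer sizes, and the ReLU activation are all assumptions made for illustration.

```python
import random


def linear(x, weights, bias):
    """Dense layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    """Element-wise ReLU (assumed activation; the source does not name one)."""
    return [max(0.0, v) for v in x]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases, purely illustrative."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def fuse_and_regress(aesthetic, distortion, salient, params):
    """Concatenate the three branch features and map them to one quality score."""
    features = aesthetic + distortion + salient  # list concatenation
    (w1, b1), (w2, b2) = params
    hidden = relu(linear(features, w1, b1))      # first MLP layer
    return linear(hidden, w2, b2)[0]             # second MLP layer, scalar output


rng = random.Random(0)
FEATURE_DIM = 8  # assumption: toy size, real backbone features are much larger
HIDDEN_DIM = 4   # assumption: the hidden width is not given in the text
params = (
    init_layer(3 * FEATURE_DIM, HIDDEN_DIM, rng),
    init_layer(HIDDEN_DIM, 1, rng),
)

# Random stand-ins for the outputs of the three feature extractors.
aesthetic = [rng.random() for _ in range(FEATURE_DIM)]
distortion = [rng.random() for _ in range(FEATURE_DIM)]
salient = [rng.random() for _ in range(FEATURE_DIM)]

score = fuse_and_regress(aesthetic, distortion, salient, params)
print(f"predicted quality score (untrained): {score:.4f}")
```

Fusing by concatenation keeps the three branches independent until the final regression, so each branch can work on a different reduced view of the same image.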

The model was evaluated on the UHD-IQA benchmark database (Hosu et al.), which comprises 6,073 UHD-1 (4K) images annotated with perceptual quality ratings from ten expert raters.

According to the paper, the model achieves the best performance on the UHD-IQA dataset while maintaining the lowest computational complexity, and it won first prize in the ECCV AIM 2024 UHD-IQA Challenge. This supports the value of considering multiple quality factors together when assessing UHD images.
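The paper's abstract states that training uses an MSE loss for prediction accuracy and a fidelity loss for prediction monotonicity. The sketch below shows one common way to write such a combination. It is an assumption-laden illustration: the pairwise form of the fidelity loss, the unit-variance Thurstone-style preference model, the `weight` parameter, and the function names are not taken from the paper.

```python
import math


def mse_loss(pred, target):
    """Mean squared error between predicted and human quality scores."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def fidelity_loss(pred, target):
    """Mean pairwise fidelity loss over all image pairs in a batch."""
    total, pairs = 0.0, 0
    for i in range(len(pred)):
        for j in range(i + 1, len(pred)):
            # Ground-truth preference: 1 if image i is rated higher, 0 if lower, 0.5 if tied.
            if target[i] > target[j]:
                p = 1.0
            elif target[i] < target[j]:
                p = 0.0
            else:
                p = 0.5
            # Predicted preference (assumption: Thurstone-style model, unit variance).
            p_hat = normal_cdf((pred[i] - pred[j]) / math.sqrt(2.0))
            total += 1.0 - math.sqrt(p * p_hat) - math.sqrt((1.0 - p) * (1.0 - p_hat))
            pairs += 1
    return total / pairs if pairs else 0.0


def total_loss(pred, target, weight=1.0):
    """MSE plus weighted fidelity loss; the weight is a hypothetical hyperparameter."""
    return mse_loss(pred, target) + weight * fidelity_loss(pred, target)


human = [3.1, 4.2, 2.5, 4.8]      # made-up human quality scores
well_ranked = [3.0, 4.0, 2.7, 4.9]
mis_ranked = [4.9, 2.7, 4.0, 3.0]

for name, pred in (("well ranked", well_ranked), ("mis-ranked", mis_ranked)):
    print(f"{name}: mse={mse_loss(pred, human):.3f} fidelity={fidelity_loss(pred, human):.3f}")
```

The MSE term penalises the distance between each prediction and its human score, while the pairwise term penalises predictions that order two images differently from the human ratings, which is why the two are complementary.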

Critical Analysis

The work has several limitations. The UHD-IQA benchmark used for evaluation, while large, may not capture the full diversity of real-world UHD images, since it focuses on highly aesthetic photos of high technical quality. Additionally, the human ratings used to evaluate the model could be subjective and inconsistent.

The model also does not provide explanations for its quality predictions, making it a "black box" system. More interpretable approaches could give users better insights into what aspects of an image are driving the quality assessment.

Further research could explore incorporating additional quality factors, like semantic content or emotional impact, to build even more comprehensive UHD image quality models. Testing the generalization of these models across different domains and datasets would also be valuable.

Conclusion

This research presents an important step towards automated, holistic assessment of UHD image quality. By jointly modeling aesthetics, distortions, and saliency, the proposed deep learning model provides a more nuanced way to evaluate ultra-high-resolution visual content.

As UHD imaging becomes more prevalent, tools like this could have significant applications in areas like photography, video production, and image-based machine learning. Continued advancement in this field could lead to smarter, more intuitive image quality assessment systems.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Assessing UHD Image Quality from Aesthetics, Distortions, and Saliency

Wei Sun, Weixia Zhang, Yuqin Cao, Linhan Cao, Jun Jia, Zijian Chen, Zicheng Zhang, Xiongkuo Min, Guangtao Zhai

UHD images, typically with resolutions equal to or higher than 4K, pose a significant challenge for efficient image quality assessment (IQA) algorithms, as adopting full-resolution images as inputs leads to overwhelming computational complexity and commonly used pre-processing methods like resizing or cropping may cause substantial loss of detail. To address this problem, we design a multi-branch deep neural network (DNN) to assess the quality of UHD images from three perspectives: global aesthetic characteristics, local technical distortions, and salient content perception. Specifically, aesthetic features are extracted from low-resolution images downsampled from the UHD ones, which lose high-frequency texture information but still preserve the global aesthetics characteristics. Technical distortions are measured using a fragment image composed of mini-patches cropped from UHD images based on the grid mini-patch sampling strategy. The salient content of UHD images is detected and cropped to extract quality-aware features from the salient regions. We adopt the Swin Transformer Tiny as the backbone networks to extract features from these three perspectives. The extracted features are concatenated and regressed into quality scores by a two-layer multi-layer perceptron (MLP) network. We employ the mean square error (MSE) loss to optimize prediction accuracy and the fidelity loss to optimize prediction monotonicity. Experimental results show that the proposed model achieves the best performance on the UHD-IQA dataset while maintaining the lowest computational complexity, demonstrating its effectiveness and efficiency. Moreover, the proposed model won first prize in ECCV AIM 2024 UHD-IQA Challenge. The code is available at https://github.com/sunwei925/UIQA.
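The abstract above says technical distortions are measured on a fragment image composed of mini-patches cropped with a grid mini-patch sampling strategy. A minimal sketch of that idea follows, assuming a uniform grid with one randomly placed patch per cell. The function name `grid_mini_patch_sample`, the grid and patch sizes, and the toy 16x16 image are illustrative choices, not values from the paper.

```python
import random


def grid_mini_patch_sample(image, grid=4, patch=2, rng=None):
    """Build a fragment image from one small patch per grid cell.

    `image` is a 2D list of pixels (rows of columns). The image is divided
    into grid x grid cells, a patch x patch crop is taken at a random position
    inside each cell, and the crops are stitched together in grid order.
    """
    rng = rng or random.Random()
    height, width = len(image), len(image[0])
    cell_h, cell_w = height // grid, width // grid
    if patch > cell_h or patch > cell_w:
        raise ValueError("patch must fit inside a grid cell")

    fragment = [[None] * (grid * patch) for _ in range(grid * patch)]
    for gy in range(grid):
        for gx in range(grid):
            # Random top-left corner that keeps the patch inside its cell.
            top = gy * cell_h + rng.randint(0, cell_h - patch)
            left = gx * cell_w + rng.randint(0, cell_w - patch)
            for dy in range(patch):
                for dx in range(patch):
                    fragment[gy * patch + dy][gx * patch + dx] = image[top + dy][left + dx]
    return fragment


# Toy image whose pixels record their own (row, column) position.
rng = random.Random(0)
image = [[(r, c) for c in range(16)] for r in range(16)]
fragment = grid_mini_patch_sample(image, grid=4, patch=2, rng=rng)
print(len(fragment), len(fragment[0]))  # 8 8
```

Because each patch is copied at native resolution, pixel-level artifacts such as noise or blur survive inside the patches, while the grid layout keeps coverage spread across the whole image.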

9/4/2024

Towards Ultra-High-Definition Image Deraining: A Benchmark and An Efficient Method

Hongming Chen, Xiang Chen, Chen Wu, Zhuoran Zheng, Jinshan Pan, Xianping Fu

Although significant progress has been made in image deraining, existing approaches are mostly carried out on low-resolution images. The effectiveness of these methods on high-resolution images is still unknown, especially for ultra-high-definition (UHD) images, given the continuous advancement of imaging devices. In this paper, we focus on the task of UHD image deraining, and contribute the first large-scale UHD image deraining dataset, 4K-Rain13k, that contains 13,000 image pairs at 4K resolution. Based on this dataset, we conduct a benchmark study on existing methods for processing UHD images. Furthermore, we develop an effective and efficient vision MLP-based architecture (UDR-Mixer) to better solve this task. Specifically, our method contains two building components: a spatial feature rearrangement layer that captures long-range information of UHD images, and a frequency feature modulation layer that facilitates high-quality UHD image reconstruction. Extensive experimental results demonstrate that our method performs favorably against the state-of-the-art approaches while maintaining a lower model complexity. The code and dataset will be available at https://github.com/cschenxiang/UDR-Mixer.

5/28/2024

UHD-IQA Benchmark Database: Pushing the Boundaries of Blind Photo Quality Assessment

Vlad Hosu, Lorenzo Agnolucci, Oliver Wiedemann, Daisuke Iso, Dietmar Saupe

We introduce a novel Image Quality Assessment (IQA) dataset comprising 6073 UHD-1 (4K) images, annotated at a fixed width of 3840 pixels. Contrary to existing No-Reference (NR) IQA datasets, ours focuses on highly aesthetic photos of high technical quality, filling a gap in the literature. The images, carefully curated to exclude synthetic content, are sufficiently diverse to train general NR-IQA models. Importantly, the dataset is annotated with perceptual quality ratings obtained through a crowdsourcing study. Ten expert raters, comprising photographers and graphics artists, assessed each image at least twice in multiple sessions spanning several days, resulting in 20 highly reliable ratings per image. Annotators were rigorously selected based on several metrics, including self-consistency, to ensure their reliability. The dataset includes rich metadata with user and machine-generated tags from over 5,000 categories and popularity indicators such as favorites, likes, downloads, and views. With its unique characteristics, such as its focus on high-quality images, reliable crowdsourced annotations, and high annotation resolution, our dataset opens up new opportunities for advancing perceptual image quality assessment research and developing practical NR-IQA models that apply to modern photos. Our dataset is available at https://database.mmsp-kn.de/uhd-iqa-benchmark-database.html

9/5/2024

Adapting Pretrained Networks for Image Quality Assessment on High Dynamic Range Displays

Andrei Chubarau, Hyunjin Yoo, Tara Akhavan, James Clark

Conventional image quality metrics (IQMs), such as PSNR and SSIM, are designed for perceptually uniform gamma-encoded pixel values and cannot be directly applied to perceptually non-uniform linear high-dynamic-range (HDR) colors. Similarly, most of the available datasets consist of standard-dynamic-range (SDR) images collected in standard and possibly uncontrolled viewing conditions. Popular pre-trained neural networks are likewise intended for SDR inputs, restricting their direct application to HDR content. On the other hand, training HDR models from scratch is challenging due to limited available HDR data. In this work, we explore more effective approaches for training deep learning-based models for image quality assessment (IQA) on HDR data. We leverage networks pre-trained on SDR data (source domain) and re-target these models to HDR (target domain) with additional fine-tuning and domain adaptation. We validate our methods on the available HDR IQA datasets, demonstrating that models trained with our combined recipe outperform previous baselines, converge much quicker, and reliably generalize to HDR inputs.
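For readers unfamiliar with PSNR, one of the conventional metrics named in the abstract above, here is a minimal sketch of its standard definition on gamma-encoded pixel values. The 8-bit peak of 255, the flat-list image representation, and the sample pixel values are assumptions made for illustration.

```python
import math


def psnr(reference, distorted, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


reference = [100, 100, 100, 100]
distorted = [110, 90, 110, 90]  # every pixel is off by 10 code values
print(f"PSNR: {psnr(reference, distorted):.2f} dB")
```

The formula depends on a fixed peak value and treats equal numeric errors as equally visible, which is reasonable for gamma-encoded SDR values and is the property the abstract says does not carry over to linear HDR colors.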

5/2/2024