PKU-AIGIQA-4K: A Perceptual Quality Assessment Database for Both Text-to-Image and Image-to-Image AI-Generated Images

Read original: arXiv:2404.18409 - Published 4/30/2024 by Jiquan Yuan, Fanyi Yang, Jihe Li, Xinyan Cao, Jinming Che, Jinlong Lin, Xixin Cao

Overview

Rapid advancements in image generation technology have led to the creation of a vast array of AI-generated images (AIGIs), but the quality of these images is highly inconsistent.
The AI-generated image quality assessment (AIGIQA) field aims to evaluate the quality of AIGIs from the perspective of human perception, but current research has not fully explored this area.
Existing databases for AIGIQA are limited to images generated from single scenario settings, such as text-to-image generative models.
This paper presents a large-scale perceptual quality assessment database for both text-to-image and image-to-image AIGIs, called PKU-AIGIQA-4K, and proposes three AIGIQA methods based on pre-trained models.

Plain English Explanation

AI-powered image generation has made huge strides in recent years, allowing computers to create all kinds of images with remarkable realism. However, the quality of these AI-generated images can be quite uneven, with some looking great and others appearing low-quality or distorted.

Researchers have been working on ways to assess the quality of AI-generated images from the perspective of human perception, in a field called AI-generated image quality assessment (AIGIQA). But so far, the research in this area has been limited, with most existing databases only including images generated from a single type of scenario, like text-to-image models.

To address this gap, the researchers in this paper have created a large database called PKU-AIGIQA-4K that includes both text-to-image and image-to-image AI-generated images. They then conducted experiments to collect quality ratings for these images from human participants. Building on this database, the researchers also developed three new methods for assessing the quality of AI-generated images, using pre-trained AI models.

Technical Explanation

The paper first highlights the rapid progress in image generation technology and the resulting proliferation of AI-generated images (AIGIs). However, the authors note that the quality of these AIGIs is highly inconsistent, which can severely impact the visual experience of users.

To address this issue, the field of AI-generated image quality assessment (AIGIQA) has emerged, focused on evaluating the quality of AIGIs from a human perception perspective. However, the authors observe that current research in this area is limited, with existing databases restricted to images generated from single scenario settings, such as text-to-image generative models.

To fill this gap, the authors present a large-scale perceptual quality assessment database called PKU-AIGIQA-4K, which includes AIGIs generated from both text-to-image and image-to-image scenarios. They then conduct a comprehensive subjective experiment to collect quality labels for the AIGIs in the database.
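As a rough illustration of how ratings from a subjective experiment are commonly turned into per-image quality labels, the sketch below averages raw scores into a mean opinion score (MOS) per image, with an optional per-subject z-score step to reduce differences in how strictly individual raters score. This summary does not describe the paper's exact aggregation procedure, so the function names, the rating scale, and the normalization choice here are all assumptions.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Each rating is a (subject_id, image_id, score) tuple. This layout is a
# hypothetical one chosen for the sketch, not the database's actual format.

def z_normalize_per_subject(ratings):
    """Replace each score with its z-score within the same subject's ratings."""
    by_subject = defaultdict(list)
    for subject, _, score in ratings:
        by_subject[subject].append(score)
    stats = {s: (mean(v), pstdev(v)) for s, v in by_subject.items()}

    normalized = []
    for subject, image, score in ratings:
        mu, sigma = stats[subject]
        # A subject who gave every image the same score carries no ranking
        # information, so map their ratings to 0 instead of dividing by zero.
        z = 0.0 if sigma == 0 else (score - mu) / sigma
        normalized.append((subject, image, z))
    return normalized

def mean_opinion_scores(ratings):
    """Average all scores given to each image."""
    per_image = defaultdict(list)
    for _, image, score in ratings:
        per_image[image].append(score)
    return {image: mean(scores) for image, scores in per_image.items()}

if __name__ == "__main__":
    ratings = [
        ("s1", "img_a", 5), ("s1", "img_b", 1),
        ("s2", "img_a", 4), ("s2", "img_b", 2),
    ]
    print(mean_opinion_scores(ratings))
    print(mean_opinion_scores(z_normalize_per_subject(ratings)))
```

Normalizing per subject before averaging is one common design choice: it keeps a habitually harsh or lenient rater from shifting an image's label.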

Building on the PKU-AIGIQA-4K database, the authors propose three AIGIQA methods based on pre-trained models: a no-reference method (NR-AIGCIQA), a full-reference method (FR-AIGCIQA), and a partial-reference method (PR-AIGCIQA). The three methods differ in how they use the image prompt (the source image supplied in image-to-image generation) during training, that is, in how much reference information is available alongside the generated image.
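To make the distinction between the three settings concrete, here is a toy sketch of how the model input might be assembled in each case. It is not the authors' architecture: `toy_features` is a hypothetical stand-in for a pre-trained backbone, and treating the partial-reference setting as "use the image prompt when one exists, otherwise a zero placeholder" is an assumption made for illustration only.

```python
from typing import List, Optional, Sequence

Vector = List[float]

def toy_features(pixels: Sequence[float]) -> Vector:
    """Hypothetical stand-in for a pre-trained feature extractor.

    Returns two summary statistics (mean and value range) of the pixels.
    """
    return [sum(pixels) / len(pixels), max(pixels) - min(pixels)]

def build_input(generated: Sequence[float],
                image_prompt: Optional[Sequence[float]],
                mode: str) -> Vector:
    """Assemble the feature vector a quality regressor would receive.

    mode "NR": generated image only.
    mode "FR": generated image plus image prompt (prompt is required).
    mode "PR": generated image plus image prompt if available, else zeros.
    """
    gen = toy_features(generated)
    if mode == "NR":
        return gen
    if mode == "FR":
        if image_prompt is None:
            raise ValueError("full-reference mode needs an image prompt")
        return gen + toy_features(image_prompt)
    if mode == "PR":
        if image_prompt is None:
            ref = [0.0] * len(gen)  # placeholder when no reference exists
        else:
            ref = toy_features(image_prompt)
        return gen + ref
    raise ValueError(f"unknown mode: {mode}")

if __name__ == "__main__":
    generated = [0.0, 1.0]
    prompt = [1.0, 1.0]
    for mode, ref in [("NR", None), ("FR", prompt), ("PR", None)]:
        print(mode, build_input(generated, ref, mode))
```

In a real system the concatenated features would feed a regression head trained against the human quality labels; that part is omitted here.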

Finally, the authors use the PKU-AIGIQA-4K database to run extensive benchmark experiments, comparing the performance of the proposed methods against existing IQA methods.
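Benchmarks of this kind typically score a method by how well its predictions correlate with the human quality labels. The sketch below computes two correlation measures widely used in IQA work, the Pearson linear correlation coefficient (PLCC) and the Spearman rank-order correlation coefficient (SRCC), in plain Python. This summary does not state which metrics the paper reports, so their use here is an assumption, and the sample numbers are made up.

```python
from math import sqrt

def pearson(x, y):
    """Pearson linear correlation (PLCC) between two equal-length sequences.

    Assumes neither sequence is constant; a constant input would make the
    denominator zero.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation (SRCC): Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

if __name__ == "__main__":
    predicted = [1.0, 2.0, 3.0, 4.0]   # made-up model outputs
    human = [1.0, 4.0, 9.0, 16.0]      # made-up human labels
    print("PLCC:", pearson(predicted, human))
    print("SRCC:", spearman(predicted, human))
```

SRCC only checks whether images are ranked in the same order as humans rank them, while PLCC also penalizes a nonlinear relationship between predicted and human scores, which is why the two are usually reported together.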

Critical Analysis

The key strength of this research is the creation of the PKU-AIGIQA-4K database, which addresses a critical gap in the existing AIGIQA literature by including AIGIs generated from both text-to-image and image-to-image scenarios. This broader coverage of AI-generated image types is an important step forward in developing more comprehensive and realistic AIGIQA tools.

However, the paper does not delve into the specific details of the image generation models and scenarios used to create the PKU-AIGIQA-4K database. It would be useful to have more information on the diversity of these models and scenarios, as well as any potential biases or limitations inherent in the database.

Additionally, while the proposed AIGIQA methods show promising results, the authors do not provide a thorough discussion of their limitations or potential areas for improvement. A more critical analysis of these methods, including their generalizability and robustness, would strengthen the overall contribution of the research.

Finally, the paper could benefit from a deeper exploration of the real-world implications and applications of the PKU-AIGIQA-4K database and the AIGIQA methods. Understanding how these advances could impact users, content creators, and the broader AI ecosystem would provide valuable context for the significance of this work.

Conclusion

This paper advances AI-generated image quality assessment (AIGIQA) by establishing the large-scale PKU-AIGIQA-4K database, which includes AIGIs generated from both text-to-image and image-to-image scenarios, whereas earlier databases such as AGIQA-1K, AGIQA-3K, and AIGCIQA2023 cover only text-to-image generation.

Building on the PKU-AIGIQA-4K database, the authors propose three AIGIQA methods based on pre-trained models, demonstrating the potential of leveraging AI to assess the quality of AI-generated images. These methods could have important applications in areas such as content creation, user experience optimization, and AI model development.

Overall, this research contributes valuable insights and resources to the growing field of AIGIQA, laying the groundwork for further advancements in understanding and improving the quality of AI-generated images.

Related Papers

PKU-AIGIQA-4K: A Perceptual Quality Assessment Database for Both Text-to-Image and Image-to-Image AI-Generated Images

Jiquan Yuan, Fanyi Yang, Jihe Li, Xinyan Cao, Jinming Che, Jinlong Lin, Xixin Cao

In recent years, image generation technology has rapidly advanced, resulting in the creation of a vast array of AI-generated images (AIGIs). However, the quality of these AIGIs is highly inconsistent, with low-quality AIGIs severely impairing the visual experience of users. Due to the widespread application of AIGIs, the AI-generated image quality assessment (AIGIQA), aimed at evaluating the quality of AIGIs from the perspective of human perception, has garnered increasing interest among scholars. Nonetheless, current research has not yet fully explored this field. We have observed that existing databases are limited to images generated from single scenario settings. Databases such as AGIQA-1K, AGIQA-3K, and AIGCIQA2023, for example, only include images generated by text-to-image generative models. This oversight highlights a critical gap in the current research landscape, underscoring the need for dedicated databases catering to image-to-image scenarios, as well as more comprehensive databases that encompass a broader range of AI-generated image scenarios. Addressing these issues, we have established a large scale perceptual quality assessment database for both text-to-image and image-to-image AIGIs, named PKU-AIGIQA-4K. We then conduct a well-organized subjective experiment to collect quality labels for AIGIs and perform a comprehensive analysis of the PKU-AIGIQA-4K database. Regarding the use of image prompts during the training process, we propose three image quality assessment (IQA) methods based on pre-trained models that include a no-reference method NR-AIGCIQA, a full-reference method FR-AIGCIQA, and a partial-reference method PR-AIGCIQA. Finally, leveraging the PKU-AIGIQA-4K database, we conduct extensive benchmark experiments and compare the performance of the proposed methods and the current IQA methods.

4/30/2024

AIGIQA-20K: A Large Database for AI-Generated Image Quality Assessment

Chunyi Li, Tengchuan Kou, Yixuan Gao, Yuqin Cao, Wei Sun, Zicheng Zhang, Yingjie Zhou, Zhichao Zhang, Weixia Zhang, Haoning Wu, Xiaohong Liu, Xiongkuo Min, Guangtao Zhai

With the rapid advancements in AI-Generated Content (AIGC), AI-Generated Images (AIGIs) have been widely applied in entertainment, education, and social media. However, due to the significant variance in quality among different AIGIs, there is an urgent need for models that consistently match human subjective ratings. To address this issue, we organized a challenge towards AIGC quality assessment on NTIRE 2024 that extensively considers 15 popular generative models, utilizing dynamic hyper-parameters (including classifier-free guidance, iteration epochs, and output image resolution), and gather subjective scores that consider perceptual quality and text-to-image alignment altogether comprehensively involving 21 subjects. This approach culminates in the creation of the largest fine-grained AIGI subjective quality database to date with 20,000 AIGIs and 420,000 subjective ratings, known as AIGIQA-20K. Furthermore, we conduct benchmark experiments on this database to assess the correspondence between 16 mainstream AIGI quality models and human perception. We anticipate that this large-scale quality database will inspire robust quality indicators for AIGIs and propel the evolution of AIGC for vision. The database is released on https://www.modelscope.cn/datasets/lcysyzxdxc/AIGCQA-30K-Image.

4/5/2024

Bringing Textual Prompt to AI-Generated Image Quality Assessment

Bowen Qu, Haohui Li, Wei Gao

AI-Generated Images (AGIs) are inherently multimodal. Unlike traditional image quality assessment (IQA) on natural scenarios, AGI quality assessment (AGIQA) takes the correspondence between an image and its textual prompt into consideration. This correspondence is coupled into the ground truth score, which confuses unimodal IQA methods. To solve this problem, we introduce IP-IQA (AGIs Quality Assessment via Image and Prompt), a multimodal framework for AGIQA via corresponding image and prompt incorporation. Specifically, we propose a novel incremental pretraining task named Image2Prompt for better understanding of AGIs and their corresponding textual prompts. An effective and efficient image-prompt fusion module, along with a novel special [QA] token, are also applied. Both are plug-and-play and beneficial for the cooperation of an image and its corresponding prompt. Experiments demonstrate that our IP-IQA achieves state-of-the-art results on the AGIQA-1k and AGIQA-3k datasets. Code will be available at https://github.com/Coobiw/IP-IQA.

5/22/2024

Understanding and Evaluating Human Preferences for AI Generated Images with Instruction Tuning

Jiarui Wang, Huiyu Duan, Guangtao Zhai, Xiongkuo Min

Artificial Intelligence Generated Content (AIGC) has grown rapidly in recent years, among which AI-based image generation has gained widespread attention due to its efficient and imaginative image creation ability. However, AI-generated Images (AIGIs) may not satisfy human preferences due to their unique distortions, which highlights the necessity to understand and evaluate human preferences for AIGIs. To this end, in this paper, we first establish a novel Image Quality Assessment (IQA) database for AIGIs, termed AIGCIQA2023+, which provides human visual preference scores and detailed preference explanations from three perspectives including quality, authenticity, and correspondence. Then, based on the constructed AIGCIQA2023+ database, this paper presents a MINT-IQA model to evaluate and explain human preferences for AIGIs from Multi-perspectives with INstruction Tuning. Specifically, the MINT-IQA model first learns and evaluates human preferences for AI-generated Images from multiple perspectives, then via the vision-language instruction tuning strategy, MINT-IQA attains powerful understanding and explanation ability for human visual preference on AIGIs, which can be used for feedback to further improve the assessment capabilities. Extensive experimental results demonstrate that the proposed MINT-IQA model achieves state-of-the-art performance in understanding and evaluating human visual preferences for AIGIs, and the proposed model also achieves competitive results on traditional IQA tasks compared with state-of-the-art IQA models. The AIGCIQA2023+ database and MINT-IQA model will be released to facilitate future research.

5/14/2024