https://devfeature-collection.sl.nsw.gov.au/record/TN_cdi_proquest_journals_3129869551

SEAGULL: No-reference Image Quality Assessment for Regions of Interest via Vision-Language Instruction Tuning

About this item

Full title

SEAGULL: No-reference Image Quality Assessment for Regions of Interest via Vision-Language Instruction Tuning

Author / Creator

Chen, Zewen; Wang, Juan; Wang, Wen; Xu, Sunhan; Xiong, Hang; Zeng, Yun; Guo, Jian; Wang, Shuxun; Yuan, Chunfeng; Li, Bing; Hu, Weiming

Publisher

Ithaca: Cornell University Library, arXiv.org

Journal title

arXiv.org, 2024-11

Language

English

Formats

Articles

More information

Scope and Contents

Contents

Existing Image Quality Assessment (IQA) methods achieve remarkable success in analyzing the quality of an overall image, but few works explore quality analysis for Regions of Interest (ROIs). Quality analysis of ROIs can provide fine-grained guidance for image quality improvement and is crucial for scenarios that focus on region-level quality. This paper proposes a novel network, SEAGULL, which can SEe and Assess ROI quality with GUidance from a Large vision-Language model. SEAGULL incorporates a vision-language model (VLM), masks generated by the Segment Anything Model (SAM) to specify ROIs, and a meticulously designed Mask-based Feature Extractor (MFE) to extract global and local tokens for the specified ROIs, enabling accurate fine-grained IQA for ROIs. Moreover, this paper constructs two ROI-based IQA datasets, SEAGULL-100w and SEAGULL-3k, for training and evaluating ROI-based IQA. SEAGULL-100w comprises about 1 million (100w) synthetic distortion images with 33 million ROIs for pre-training, to improve the model's regional quality perception, and SEAGULL-3k contains about 3k authentic distortion ROIs to enhance the model's ability to perceive real-world distortions. After pre-training on SEAGULL-100w and fine-tuning on SEAGULL-3k, SEAGULL shows remarkable performance on fine-grained ROI quality assessment. Code and datasets are publicly available at https://github.com/chencn2020/Seagull....

Authors, Artists and Contributors

Author / Creator

Chen, Zewen
Wang, Juan
Wang, Wen
Xu, Sunhan
Xiong, Hang
Zeng, Yun
Guo, Jian
Wang, Shuxun
Yuan, Chunfeng
Li, Bing
Hu, Weiming

Identifiers

Primary Identifiers

Record Identifier

TN_cdi_proquest_journals_3129869551

Other Identifiers

E-ISSN

2331-8422

How to access this item

Full text available
