MMC: Advancing Multimodal Chart Understanding with Large-scale Instruction Tuning

https://devfeature-collection.sl.nsw.gov.au/record/TN_cdi_proquest_journals_2894089588

About this item

Full title

MMC: Advancing Multimodal Chart Understanding with Large-scale Instruction Tuning

Publisher

Ithaca: Cornell University Library, arXiv.org

Journal title

arXiv.org, 2024-04

Language

English

More information

Scope and Contents

Contents

With the rapid development of large language models (LLMs) and their integration into large multimodal models (LMMs), there has been impressive progress in zero-shot completion of user-oriented vision-language tasks. However, a gap remains in the domain of chart image understanding due to the distinct abstract components in charts. To address this, we introduce a large-scale MultiModal Chart Instruction (MMC-Instruction) dataset comprising 600k instances supporting diverse tasks and chart types. Leveraging this data, we develop MultiModal Chart Assistant (MMCA), an LMM that achieves state-of-the-art performance on existing chart QA benchmarks. Recognizing the need for a comprehensive evaluation of LMM chart understanding, we also propose a MultiModal Chart Benchmark (MMC-Benchmark), a comprehensive human-annotated benchmark with nine distinct tasks evaluating reasoning capabilities over charts. Extensive experiments on MMC-Benchmark reveal the limitations of existing LMMs in correctly interpreting charts, even for the most recent GPT-4V model. Our work provides an instruction-tuning methodology and benchmark to advance multimodal understanding of charts. Code and data are available at https://github.com/FuxiaoLiu/MMC....

Identifiers

Primary Identifiers

Record Identifier

TN_cdi_proquest_journals_2894089588

Permalink

https://devfeature-collection.sl.nsw.gov.au/record/TN_cdi_proquest_journals_2894089588

Other Identifiers

E-ISSN

2331-8422
