Benchmarking Sequential Visual Input Reasoning and Prediction in Multimodal Large Language Models

About this item

Full title

Benchmarking Sequential Visual Input Reasoning and Prediction in Multimodal Large Language Models

Publisher

Ithaca: Cornell University Library, arXiv.org

Journal title

arXiv.org, 2023-10

Language

English

More information

Scope and Contents

Contents

Multimodal large language models (MLLMs) have shown great potential in perception and interpretation tasks, but their capabilities in predictive reasoning remain under-explored. To address this gap, we introduce a novel benchmark that assesses the predictive reasoning capabilities of MLLMs across diverse scenarios. Our benchmark targets three impor...

Identifiers

Primary Identifiers

Record Identifier

TN_cdi_proquest_journals_2880592668

Permalink

https://devfeature-collection.sl.nsw.gov.au/record/TN_cdi_proquest_journals_2880592668

Other Identifiers

E-ISSN

2331-8422
