PaLI-3 Vision Language Models: Smaller, Faster, Stronger

About this item

Full title

PaLI-3 Vision Language Models: Smaller, Faster, Stronger

Publisher

Ithaca: Cornell University Library, arXiv.org

Journal title

arXiv.org, 2023-10

Language

English

More information

Scope and Contents

Contents

This paper presents PaLI-3, a smaller, faster, and stronger vision language model (VLM) that compares favorably to similar models that are 10x larger. As part of arriving at this strong performance, we compare Vision Transformer (ViT) models pretrained using classification objectives to contrastively (SigLIP) pretrained ones. We find that, while sl...

Identifiers

Primary Identifiers

Record Identifier

TN_cdi_proquest_journals_2878323455

Permalink

https://devfeature-collection.sl.nsw.gov.au/record/TN_cdi_proquest_journals_2878323455

Other Identifiers

E-ISSN

2331-8422
