DOCTR: Disentangled Object-Centric Transformer for Point Scene Understanding

https://devfeature-collection.sl.nsw.gov.au/record/TN_cdi_proquest_journals_2986601222

About this item

Full title

DOCTR: Disentangled Object-Centric Transformer for Point Scene Understanding

Publisher

Ithaca: Cornell University Library, arXiv.org

Journal title

arXiv.org, 2024-03

Language

English

More information

Scope and Contents

Contents

Point scene understanding is a challenging task on real-world scene point clouds that aims to segment each object, estimate its pose, and reconstruct its mesh simultaneously. The recent state-of-the-art method first segments each object and then processes the objects independently in multiple stages for the different sub-tasks. This makes the pipeline complex to optimize and makes it hard to exploit the relationship constraints between multiple objects. In this work, we propose a novel Disentangled Object-Centric TRansformer (DOCTR) that explores object-centric representation to facilitate learning with multiple objects for the multiple sub-tasks in a unified manner. Each object is represented as a query, and a Transformer decoder is adapted to iteratively optimize all the queries while taking their relationships into account. In particular, we introduce a semantic-geometry disentangled query (SGDQ) design that enables the query features to attend separately to the semantic information and the geometric information relevant to the corresponding sub-tasks. A hybrid bipartite matching module is employed to make good use of the supervision from all the sub-tasks during training. Qualitative and quantitative experimental results demonstrate that our method achieves state-of-the-art performance on the challenging ScanNet dataset. Code is available at https://github.com/SAITPublic/DOCTR....
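
The abstract does not spell out how the hybrid bipartite matching works, so the following is only a generic sketch of the underlying idea: predicted object queries are assigned one-to-one to ground-truth objects by minimizing a cost that mixes a semantic term and a geometric term. The names hybrid_cost and match_queries, the dictionary fields, the weights, and the exhaustive search are all assumptions made for illustration and are not taken from the DOCTR code.

```python
from itertools import permutations


def hybrid_cost(pred, gt, w_sem=1.0, w_geo=1.0):
    """Hypothetical combined cost: label mismatch plus center distance."""
    sem = 0.0 if pred["label"] == gt["label"] else 1.0
    geo = sum((p - g) ** 2 for p, g in zip(pred["center"], gt["center"])) ** 0.5
    return w_sem * sem + w_geo * geo


def match_queries(preds, gts):
    """Find the minimum-cost one-to-one assignment of ground truths to queries.

    Exhaustive search over assignments, suitable for tiny inputs only.
    Assumes len(preds) >= len(gts). Returns (total_cost, [(query_idx, gt_idx)]).
    """
    best_cost, best_pairs = float("inf"), []
    for perm in permutations(range(len(preds)), len(gts)):
        cost = sum(hybrid_cost(preds[q], gts[j]) for j, q in enumerate(perm))
        if cost < best_cost:
            best_cost = cost
            best_pairs = [(q, j) for j, q in enumerate(perm)]
    return best_cost, best_pairs


# Toy data (made up): three predicted queries, two ground-truth objects.
preds = [
    {"label": "chair", "center": (0.0, 0.0, 0.0)},
    {"label": "table", "center": (2.0, 0.0, 0.0)},
    {"label": "chair", "center": (5.0, 5.0, 0.0)},
]
gts = [
    {"label": "table", "center": (2.0, 0.0, 0.0)},
    {"label": "chair", "center": (0.0, 0.0, 0.0)},
]

cost, pairs = match_queries(preds, gts)
print(cost, pairs)
```

In this toy case the third query stays unmatched, which is how query-based detectors usually treat surplus queries ("no object"). A practical implementation would replace the exhaustive search with the Hungarian algorithm and would likely use richer cost terms per sub-task.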

Identifiers

Primary Identifiers

Record Identifier

TN_cdi_proquest_journals_2986601222

Permalink

https://devfeature-collection.sl.nsw.gov.au/record/TN_cdi_proquest_journals_2986601222

Other Identifiers

E-ISSN

2331-8422
