Location-Sensitive Visual Recognition with Cross-IOU Loss
Location-Sensitive Visual Recognition with Cross-IOU Loss
About this item
Full title
Author / Creator
Duan, Kaiwen , Xie, Lingxi , Qi, Honggang , Bai, Song , Huang, Qingming and Tian, Qi
Publisher
Ithaca: Cornell University Library, arXiv.org
Journal title
Language
English
Formats
Publication information
Publisher
Ithaca: Cornell University Library, arXiv.org
Subjects
More information
Scope and Contents
Contents
Object detection, instance segmentation, and pose estimation are popular visual recognition tasks which require localizing the object by internal or boundary landmarks. This paper summarizes these tasks as location-sensitive visual recognition and proposes a unified solution named location-sensitive network (LSNet). Based on a deep neural network as the backbone, LSNet predicts an anchor point and a set of landmarks which together define the shape of the target object. The key to optimizing the LSNet lies in the ability of fitting various scales, for which we design a novel loss function named cross-IOU loss that computes the cross-IOU of each anchor point-landmark pair to approximate the global IOU between the prediction and ground-truth. The flexibly located and accurately predicted landmarks also enable LSNet to incorporate richer contextual information for visual recognition. Evaluated on the MS-COCO dataset, LSNet set the new state-of-the-art accuracy for anchor-free object detection (a 53.5% box AP) and instance segmentation (a 40.2% mask AP), and shows promising performance in detecting multi-scale human poses. Code is available at https://github.com/Duankaiwen/LSNet...
Alternative Titles
Full title
Location-Sensitive Visual Recognition with Cross-IOU Loss
Authors, Artists and Contributors
Author / Creator
Identifiers
Primary Identifiers
Record Identifier
TN_cdi_proquest_journals_2512176252
Permalink
https://devfeature-collection.sl.nsw.gov.au/record/TN_cdi_proquest_journals_2512176252
Other Identifiers
E-ISSN
2331-8422