College of Computer Science and Software Engineering, SZU

Regional Relation Modeling for Visual Place Recognition

 

International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR)

 

Yingying Zhu*1    Biao Li1    Jiong Wang2    Zhou Zhao2

1Shenzhen University    2Zhejiang University

Abstract

In the process of visual perception, humans perceive not only the appearance of the objects in a place but also their relationships (e.g., spatial layout). However, dominant works on visual place recognition rest on the assumption that two images depict the same place if they contain enough similar objects, while relation information is neglected. In this paper, we propose a regional relation module that models the relationships between regions and converts convolutional feature maps into relational feature maps. We further design a cascaded pooling method that yields discriminative relation descriptors by suppressing the influence of confusing relations while preserving as much useful information as possible. Extensive experiments on two place recognition benchmarks demonstrate that training with the proposed regional relation module improves the appearance descriptors, and that the relation descriptors are complementary to the appearance descriptors. When the two kinds of descriptors are concatenated, the resulting combined descriptors outperform state-of-the-art methods.
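The paper's code is not reproduced on this page. As an illustration only, the following minimal PyTorch sketch shows one way pairwise regional relations could be computed over a convolutional feature map, in the spirit of the module described above: each spatial cell is treated as a region, a small MLP embeds every ordered region pair, and each region's relations are averaged into a relational feature. The class name, the relation function g, and the mean aggregation are assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class RegionalRelationSketch(nn.Module):
    """Hypothetical sketch, not the authors' code: treats each spatial
    cell of a conv feature map as a region, embeds every ordered region
    pair with a small MLP, and averages each region's relations into a
    relational feature map of the same spatial size."""

    def __init__(self, in_dim, rel_dim):
        super().__init__()
        # g(.) embeds a concatenated pair of regional features
        self.g = nn.Sequential(
            nn.Linear(2 * in_dim, rel_dim),
            nn.ReLU(inplace=True),
            nn.Linear(rel_dim, rel_dim),
        )

    def forward(self, x):
        # x: (B, C, H, W) convolutional feature maps
        b, c, h, w = x.shape
        n = h * w
        regions = x.flatten(2).transpose(1, 2)        # (B, N, C)
        xi = regions.unsqueeze(2).expand(b, n, n, c)  # region i of each pair
        xj = regions.unsqueeze(1).expand(b, n, n, c)  # region j of each pair
        rel = self.g(torch.cat([xi, xj], dim=-1))     # (B, N, N, R)
        rel_feat = rel.mean(dim=2)                    # (B, N, R)
        return rel_feat.transpose(1, 2).reshape(b, -1, h, w)  # (B, R, H, W)

# Example: convert 512-channel conv maps into 256-d relational features
rrm = RegionalRelationSketch(in_dim=512, rel_dim=256)
rel_maps = rrm(torch.randn(2, 512, 7, 7))  # -> (2, 256, 7, 7)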

Figure 1: Motivation of our method. The first row shows two images from the Tokyo 24/7 dataset that were taken at different places but contain similar buildings. The second row depicts the similarity comparison (red dashed line) between the two images and the relation information (blue dashed line) between objects. The two images may receive a high similarity score and be falsely matched when only building appearance is considered; when the relationships between the buildings are also considered, the similarity becomes very low.

Figure 2: Illustration of our network architecture for regional relation modeling. The regional relation module models regional relations and converts the convolutional feature maps into relational feature maps. The cascaded pooling method is proposed to aggregate the relational feature maps into a global relation descriptor. Row-wise KMP denotes row-wise K-Max pooling, as described in Section 4.2.
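Section 4.2 of the paper gives the precise definition of the cascaded pooling. As a hedged sketch only, row-wise K-Max pooling followed by averaging could look like the snippet below, where k is an assumed hyper-parameter and the final L2 normalization is an illustrative choice rather than a detail taken from the paper:

import torch
import torch.nn.functional as F

def cascaded_pool(rel_maps, k=5):
    # Illustrative cascaded pooling (not the paper's exact code):
    # stage 1 keeps the k strongest spatial responses per channel
    # (row-wise K-Max pooling on the flattened maps), suppressing weak
    # or confusing relations; stage 2 averages the k maxima, preserving
    # more information than a single global max would.
    # rel_maps: (B, C, H, W) relational feature maps -> (B, C) descriptor
    rows = rel_maps.flatten(2)          # (B, C, H*W): one row per channel
    topk, _ = rows.topk(k, dim=-1)      # (B, C, k): row-wise K-Max pooling
    desc = topk.mean(dim=-1)            # (B, C): average the k maxima
    return F.normalize(desc, dim=-1)    # L2-normalize the global descriptor

# Example: pool 256-channel relational maps into a 256-d relation descriptor
descriptor = cascaded_pool(torch.randn(2, 256, 7, 7), k=5)  # -> (2, 256)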

Figure 3: Retrieval examples. From top to bottom: the query images and the Top-1 retrieval results of the three kinds of descriptors. All results are based on AlexNet, for which the performance gaps between the descriptors are larger than those of VGGNet. True matches are marked with green borders; false matches with red.
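The combined descriptors compared above concatenate the appearance and relation descriptors. A minimal sketch of such a combination follows; the per-part L2 normalization and the descriptor dimensions are assumptions, not details from the paper:

import torch
import torch.nn.functional as F

# Hypothetical combination of the two descriptor types for retrieval:
# L2-normalize each part, concatenate, and re-normalize so that nearest-
# neighbour ranking weighs both. (The paper's exact scheme may differ.)
appearance = F.normalize(torch.randn(1, 4096), dim=-1)  # appearance descriptor
relation = F.normalize(torch.randn(1, 256), dim=-1)     # relation descriptor
combined = F.normalize(torch.cat([appearance, relation], dim=-1), dim=-1)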

Acknowledgement

This work was supported by: (i) the National Natural Science Foundation of China (Grant No. 61602314); (ii) the Natural Science Foundation of Guangdong Province of China (Grant No. 2016A030313043); (iii) the Major Fundamental Research Project in the Science and Technology Plan of Shenzhen (Grant No. JCYJ20190808172007500).

 

Bibtex

@inproceedings{DBLP:conf/sigir/ZhuLWZ20,
  author    = {Yingying Zhu and Biao Li and Jiong Wang and Zhou Zhao},
  title     = {Regional Relation Modeling for Visual Place Recognition},
  booktitle = {Proceedings of the 43rd International {ACM} {SIGIR} conference on research and development in Information Retrieval, {SIGIR} 2020, Virtual Event, China, July 25-30, 2020},
  pages     = {821--830},
  publisher = {{ACM}},
  year      = {2020},
  url       = {https://doi.org/10.1145/3397271.3401176},
  doi       = {10.1145/3397271.3401176},
  biburl    = {https://dblp.org/rec/conf/sigir/ZhuLWZ20.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

 

Downloads