Abstract
Current feature matching methods prioritize improving modeling capabilities to better align outputs with ground-truth matches, which constitute the theoretical upper bound on matching results and are metaphorically depicted as the “ceiling”. However, these enhancements fail to address the underlying issues that directly limit ground-truth matches: the scarcity of matchable points in small-scale images, matching conflicts in dense methods, and the reliance on keypoint repeatability in sparse methods. We propose a novel feature matching method named RCM, which Raises the Ceiling of Matching in three respects. 1) RCM introduces a dynamic view switching mechanism that addresses the scarcity of matchable points in source images by strategically swapping the roles of the two images in a pair. 2) RCM proposes a conflict-free coarse matching module that resolves matching conflicts in the target image through a many-to-one matching strategy. 3) By integrating the semi-sparse paradigm with the coarse-to-fine architecture, RCM retains both high efficiency and global search capability, mitigating the reliance on keypoint repeatability. As a result, RCM enables more matchable points in the source image to be matched exhaustively and without conflict in the target image, yielding a substantial 260% increase in ground-truth matches. Comprehensive experiments show that RCM achieves remarkable performance and efficiency compared to state-of-the-art methods.
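To make the many-to-one strategy and the view-switching idea concrete, below is a minimal PyTorch sketch. It is not the authors' implementation: the matchability scores, function names, and confidence threshold are all illustrative assumptions. The key point is that every source feature is assigned independently to its best coarse cell in the target image, so multiple source points may legitimately map to the same target location, avoiding the conflicts that a one-to-one (e.g. mutual-nearest-neighbour) assignment would create.

```python
import torch

def dynamic_view_switch(feats0, feats1, score0, score1):
    # Hypothetical switching rule: make the image with the LOWER total
    # matchability score the source, so its scarcer matchable points are
    # the ones searched exhaustively in the other (target) view.
    if score0.sum() <= score1.sum():
        return feats0, feats1, False   # keep original roles
    return feats1, feats0, True        # roles switched

def many_to_one_coarse_match(src, tgt, threshold=0.2):
    # Conflict-free coarse matching: each source feature independently
    # picks its best target feature (row-wise softmax + argmax), so
    # several source points may share one target cell (many-to-one)
    # and no mutual-nearest-neighbour conflict can arise.
    sim = torch.einsum("nd,md->nm", src, tgt) / src.shape[-1] ** 0.5
    prob = sim.softmax(dim=1)          # per-source distribution over targets
    conf, idx = prob.max(dim=1)        # best target index per source point
    keep = conf > threshold            # discard low-confidence assignments
    src_ids = torch.arange(src.shape[0])[keep]
    return src_ids, idx[keep], conf[keep]

# Toy usage with random descriptors (threshold 0 keeps all assignments).
f0 = torch.nn.functional.normalize(torch.randn(480, 256), dim=1)
f1 = torch.nn.functional.normalize(torch.randn(600, 256), dim=1)
src, tgt, switched = dynamic_view_switch(f0, f1, torch.rand(480), torch.rand(600))
src_ids, tgt_ids, conf = many_to_one_coarse_match(src, tgt, threshold=0.0)
print(switched, src_ids.shape[0], "coarse matches")
```

In an actual coarse-to-fine pipeline, the indices returned here would select coarse grid cells that are subsequently refined to sub-pixel matches; the sketch only illustrates why row-wise assignment cannot produce conflicts in the target image.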
Acknowledgments
This work was supported by the SEU Innovation Capability Enhancement Plan for Doctoral Students under grant CXJH SEU 24128.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Lu, X., Du, S. (2025). Raising the Ceiling: Conflict-Free Local Feature Matching with Dynamic View Switching. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds.) Computer Vision – ECCV 2024. Lecture Notes in Computer Science, vol. 15100. Springer, Cham. https://doi.org/10.1007/978-3-031-72946-1_15
DOI: https://doi.org/10.1007/978-3-031-72946-1_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72945-4
Online ISBN: 978-3-031-72946-1
