Abstract
Current feature matching methods prioritize improving modeling capabilities to better align outputs with ground-truth matches, which constitute the theoretical upper bound on matching results and are metaphorically depicted as the “ceiling”. However, these enhancements fail to address the underlying issues that directly limit ground-truth matches: the scarcity of matchable points in small-scale images, matching conflicts in dense methods, and the reliance on keypoint repeatability in sparse methods. We propose a novel feature matching method named RCM, which Raises the Ceiling of Matching in three respects. 1) RCM introduces a dynamic view switching mechanism that addresses the scarcity of matchable points in source images by strategically swapping the roles of the two images in a pair. 2) RCM proposes a conflict-free coarse matching module that resolves matching conflicts in the target image through a many-to-one matching strategy. 3) By integrating the semi-sparse paradigm with the coarse-to-fine architecture, RCM retains both high efficiency and global search capability, mitigating the reliance on keypoint repeatability. As a result, RCM enables more matchable points in the source image to be matched exhaustively and without conflict in the target image, yielding a substantial 260% increase in ground-truth matches. Comprehensive experiments show that RCM achieves remarkable performance and efficiency compared to state-of-the-art methods.
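To make the many-to-one strategy and the view-switching idea concrete, below is a minimal PyTorch sketch. It is not the authors' implementation: the matchability scores, function names, and confidence threshold are all illustrative assumptions. The key point is that every source feature is assigned independently to its best coarse cell in the target image, so multiple source points may legitimately map to the same target location, avoiding the conflicts that a one-to-one (e.g. mutual-nearest-neighbour) assignment would create.

```python
import torch

def dynamic_view_switch(feats0, feats1, score0, score1):
    # Hypothetical switching rule: make the image with the LOWER total
    # matchability score the source, so its scarcer matchable points are
    # the ones searched exhaustively in the other (target) view.
    if score0.sum() <= score1.sum():
        return feats0, feats1, False   # keep original roles
    return feats1, feats0, True        # roles switched

def many_to_one_coarse_match(src, tgt, threshold=0.2):
    # Conflict-free coarse matching: each source feature independently
    # picks its best target feature (row-wise softmax + argmax), so
    # several source points may share one target cell (many-to-one)
    # and no mutual-nearest-neighbour conflict can arise.
    sim = torch.einsum("nd,md->nm", src, tgt) / src.shape[-1] ** 0.5
    prob = sim.softmax(dim=1)          # per-source distribution over targets
    conf, idx = prob.max(dim=1)        # best target index per source point
    keep = conf > threshold            # discard low-confidence assignments
    src_ids = torch.arange(src.shape[0])[keep]
    return src_ids, idx[keep], conf[keep]

# Toy usage with random descriptors (threshold 0 keeps all assignments).
f0 = torch.nn.functional.normalize(torch.randn(480, 256), dim=1)
f1 = torch.nn.functional.normalize(torch.randn(600, 256), dim=1)
src, tgt, switched = dynamic_view_switch(f0, f1, torch.rand(480), torch.rand(600))
src_ids, tgt_ids, conf = many_to_one_coarse_match(src, tgt, threshold=0.0)
print(switched, src_ids.shape[0], "coarse matches")
```

In an actual coarse-to-fine pipeline, the indices returned here would select coarse grid cells that are subsequently refined to sub-pixel matches; the sketch only illustrates why row-wise assignment cannot produce conflicts in the target image.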
Acknowledgments
This work was supported by the SEU Innovation Capability Enhancement Plan for Doctoral Students under grant CXJH SEU 24128.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Lu, X., Du, S. (2025). Raising the Ceiling: Conflict-Free Local Feature Matching with Dynamic View Switching. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds.) Computer Vision – ECCV 2024. Lecture Notes in Computer Science, vol. 15100. Springer, Cham. https://doi.org/10.1007/978-3-031-72946-1_15
DOI: https://doi.org/10.1007/978-3-031-72946-1_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72945-4
Online ISBN: 978-3-031-72946-1
