Abstract
This paper studies graph clustering with application to feature matching and proposes an effective method, termed GC-LAC, that establishes reliable feature correspondences and simultaneously discovers all potential visual patterns. In particular, we regard each putative match as a node and encode the geometric relationships between matches into edges, so that a visual pattern sharing similar motion behavior corresponds to a strongly connected subgraph. In this setting, it is natural to formulate the feature matching task as a graph clustering problem. To construct a geometrically meaningful graph, we follow best practices and adopt a local affine strategy. By investigating the motion coherence prior, we further propose an efficient and deterministic geometric solver (MCDG) to extract the local geometric information that helps construct the graph. The resulting graph is sparse and general for various image transformations. Subsequently, a novel robust graph clustering algorithm (D2SCAN) is introduced, which defines the notion of density-reachability on the graph through replicator dynamics optimization. Extensive experiments on both the individual components and the complete GC-LAC pipeline, covering various practical vision tasks including relative pose estimation, homography and fundamental matrix estimation, loop-closure detection, and multi-model fitting, demonstrate that GC-LAC is more competitive than current state-of-the-art methods in terms of generality, efficiency, and effectiveness. The source code for this work is publicly available at: https://github.com/YifanLu2000/GCLAC.
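To make the idea of a "strongly connected subgraph" found by replicator dynamics more concrete, the following is a minimal sketch of the classical discrete-time replicator update on a small affinity graph, as used in dominant-set style clustering. It is not the authors' D2SCAN algorithm or their graph construction. The affinity matrix `A`, the function names, the tolerance, and the support threshold are all illustrative assumptions.

```python
# Minimal sketch (assumption: NOT the paper's D2SCAN; only the classical
# discrete-time replicator update on a hand-made toy affinity matrix).

def replicator_dynamics(A, max_iter=1000, tol=1e-12):
    """Iterate x_i <- x_i * (A x)_i / (x^T A x), starting from the uniform
    distribution. A is a symmetric, non-negative affinity matrix with zero
    diagonal, given as a list of lists."""
    n = len(A)
    x = [1.0 / n] * n
    for _ in range(max_iter):
        Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        denom = sum(x[i] * Ax[i] for i in range(n))
        if denom <= 0.0:  # graph without edges: nothing to optimize
            break
        new_x = [x[i] * Ax[i] / denom for i in range(n)]
        delta = sum(abs(new_x[i] - x[i]) for i in range(n))
        x = new_x
        if delta < tol:  # weights have stopped changing
            break
    return x


def dominant_set(A, threshold=1e-6):
    """Return the nodes that keep non-negligible weight after convergence."""
    x = replicator_dynamics(A)
    return [i for i, w in enumerate(x) if w > threshold]


# Hypothetical toy graph: nodes 0-2 stand for mutually consistent matches
# (high affinity); nodes 3-4 stand for outliers (low affinity to everything).
A = [
    [0.0, 0.9, 0.9, 0.1, 0.1],
    [0.9, 0.0, 0.9, 0.1, 0.1],
    [0.9, 0.9, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.0, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.0],
]

cluster = dominant_set(A)
print(cluster)
```

On this toy graph the weight concentrates on nodes 0, 1, and 2, while the weight of the two weakly connected nodes decays toward zero, which is the behavior that makes replicator dynamics attractive for separating coherent groups of matches from outliers.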






















Data availability
The data and code used in the current study are available from the corresponding author on reasonable request.
Acknowledgements
This work was supported by the National Natural Science Foundation of China under Grant no. 62276192.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Communicated by Ondra Chum.
About this article
Cite this article
Lu, Y., Ma, J. Feature Matching via Graph Clustering with Local Affine Consensus. Int J Comput Vis 133, 2259–2286 (2025). https://doi.org/10.1007/s11263-024-02291-5
DOI: https://doi.org/10.1007/s11263-024-02291-5

