{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,27]],"date-time":"2025-10-27T16:18:13Z","timestamp":1761581893537,"version":"build-2065373602"},"reference-count":56,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2019,11,15]],"date-time":"2019-11-15T00:00:00Z","timestamp":1573776000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61841602"],"award-info":[{"award-number":["61841602"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Provincial Science and Technology Innovation Special Fund Project of Jilin Province","award":["20190302026GX"],"award-info":[{"award-number":["20190302026GX"]}]},{"name":"Jilin Province Development and Reform Commission Industrial Technology Research and Development Project","award":["2019C054-4"],"award-info":[{"award-number":["2019C054-4"]}]},{"name":"Higher Education Research Project of Jilin Association for Higher Education","award":["JGJX2018D10"],"award-info":[{"award-number":["JGJX2018D10"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>The massive number of images demands highly efficient image retrieval tools. Deep distance metric learning (DDML) is proposed to learn image similarity metrics in an end-to-end manner based on the convolution neural network, which has achieved encouraging results. The loss function is crucial in DDML frameworks. However, we found limitations to this model. When learning the similarity of positive and negative examples, the current methods aim to pull positive pairs as close as possible and separate negative pairs into equal distances in the embedding space. Consequently, the data distribution might be omitted. 
In this work, we focus on the distribution structure learning loss (DSLL) algorithm that aims to preserve the geometric information of images. To achieve this, we firstly propose a metric distance learning for highly matching figures to preserve the similarity structure inside it. Second, we introduce an entropy weight-based structural distribution to set the weight of the representative negative samples. Third, we incorporate their weights into the process of learning to rank. So, the negative samples can preserve the consistency of their structural distribution. Generally, we display comprehensive experimental results drawing on three popular landmark building datasets and demonstrate that our method achieves state-of-the-art performance.<\/jats:p>","DOI":"10.3390\/e21111121","type":"journal-article","created":{"date-parts":[[2019,11,15]],"date-time":"2019-11-15T11:25:56Z","timestamp":1573817156000},"page":"1121","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":14,"title":["Distribution Structure Learning Loss (DSLL) Based on Deep Metric Learning for Image Retrieval"],"prefix":"10.3390","volume":"21","author":[{"given":"Lili","family":"Fan","sequence":"first","affiliation":[{"name":"College of Computer Science and Technology, Jilin University, Changchun 130012, China"}]},{"given":"Hongwei","family":"Zhao","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Jilin University, Changchun 130012, China"},{"name":"Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China"}]},{"given":"Haoyu","family":"Zhao","sequence":"additional","affiliation":[{"name":"Editorial Department of Journal (Engineering and Technology Edition), Jilin University, Changchun 130012, 
China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0196-7913","authenticated-orcid":false,"given":"Pingping","family":"Liu","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Jilin University, Changchun 130012, China"},{"name":"Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China"}]},{"given":"Huangshui","family":"Hu","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, Changchun University of Technology, Changchun 130012, China"}]}],"member":"1968","published-online":{"date-parts":[[2019,11,15]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Philbin, J., Chum, O., Isard, M., Sivic, J., and Zisserman, A. (2007, January 17\u201322). Object retrieval with large vocabularies and fast spatial matching. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Minneapolis, MN, USA.","DOI":"10.1109\/CVPR.2007.383172"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"J\u00e9gou, H., Douze, M., Schmid, C., and P\u00e9rez, P. (2010, January 13\u201318). Aggregating local descriptors into a compact image representation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, San Francisco, CA, USA.","DOI":"10.1109\/CVPR.2010.5540039"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Perronnin, F., and Dance, C. (2007, January 17\u201322). Fisher Kernels on Visual Vocabularies for Image Categorization. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Minneapolis, MN, USA.","DOI":"10.1109\/CVPR.2007.383266"},{"key":"ref_4","unstructured":"Ke, Y., Wang, Y., Liang, D., Huang, T., and Tian, Y. (2016, January 6\u20139). CNN vs. SIFT for Image Retrieval: Alternative or Complementary?. 
Proceedings of the ICMR\u201916 Acm International Conference on Multimedia Retrieval, New York, NY, USA."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Seddati, O., Dupont, S., Mahmoudi, S., and Parian, M. (2017, January 22\u201329). Towards Good Practices for Image Retrieval Based on CNN Features. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCVW.2017.150"},{"key":"ref_6","unstructured":"Zhu, Y., Mottaghi, R., Kolve, E., Lim, J.J., Gupta, A., Fei-Fei, L., and Farhadi, A. (June, January 29). Target-driven visual navigation in indoor scenes using deep reinforcement learning. Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Singapore."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Schroff, F., Kalenichenko, D., and Philbin, J. (2015, January 7\u201312). FaceNet: A Unified Embedding for Face Recognition and Clustering. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (Cvpr), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298682"},{"key":"ref_8","unstructured":"Weinberger, K.Q.J.J. (2006, January 3\u20136). Distance Metric Learning for Large Margin Nearest Neighbor Classification. Proceedings of the Advances in Neural Information Processing Systems 19 (Nips 2006), Vancouver, BC, Canada."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Zhong, G., Zheng, Y., Li, S., and Fu, Y. (2016, January 24\u201329). Scalable large margin online metric learning. Proceedings of the International Joint Conference on Neural Networks (IJCNN), Vancouver, BC, Canada.","DOI":"10.1109\/IJCNN.2016.7727478"},{"key":"ref_10","unstructured":"Song, H.O., Yu, X., Jegelka, S., and Savarese, S. (July, January 26). Deep Metric Learning via Lifted Structured Feature Embedding. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"4269","DOI":"10.1109\/TIP.2017.2717505","article-title":"Discriminative deep metric learning for face and kinship verification","volume":"26","author":"Lu","year":"2017","journal-title":"IEEE Trans. Image Proc."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"2281","DOI":"10.1109\/TPAMI.2017.2749576","article-title":"Sharable and Individual Multi-View Metric Learning","volume":"40","author":"Hu","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"237","DOI":"10.1007\/s11263-017-1016-8","article-title":"End-to-End Learning of Deep Visual Representations for Image Retrieval","volume":"124","author":"Gordo","year":"2017","journal-title":"Int. J. Comput. Vis."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"2644","DOI":"10.1109\/TCSVT.2017.2711015","article-title":"Deep Localized Metric Learning IEEE Transactions on Circuits and Systems for Video Technology","volume":"28","author":"Duan","year":"2017","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_15","unstructured":"Yumin, S., Bohyung, H., Wonsik, K., and Kyoung, K. (2019, January 18\u201320). Stochastic Class-Based Hard Example Mining for Deep Metric Learning. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Denton, TX, USA."},{"key":"ref_16","unstructured":"Hadsell, R., Chopra, S., and Lecun, Y. (2006, January 17). Dimensionality Reduction by Learning an Invariant Mapping. Proceedings of the IEEE Computer Society Conference on Computer Vision & Pattern Recognition (CVPR), New York, NY, USA."},{"key":"ref_17","unstructured":"Chopra, S., Hadsell, R., and Lecun, Y. (2005, January 20). Learning a Similarity Metric Discriminatively, with Application to Face Verification. 
Proceedings of the IEEE Computer Society Conference on Computer Vision & Pattern Recognition (CVPR), Toronto, ON, Canada."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Wang, J., Song, Y., Leung, T., Rosenberg, C., Wang, J., Philbin, J., Chen, B., and Wu, Y. (2014, January 24\u201327). Learning Fine-Grained Image Similarity with Deep Ranking. Proceedings of the IEEE Computer Society Conference on Computer Vision & Pattern Recognition (CVPR), Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.180"},{"key":"ref_19","unstructured":"Yin, C., Feng, Z., Lin, Y., and Belongie, S. (July, January 26). Fine-grained Categorization and Dataset Bootstrapping using Deep Metric Learning with Humans in the Loop. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (Cvpr), Las Vegas, NV, USA."},{"key":"ref_20","unstructured":"Sohn, K. (2016, January 6\u20137). Improved Deep Metric Learning with Multi-class N-pair Loss Objective. Proceedings of the Advances in Neural Information Processing Systems 29 (Nips 2016), Barcelona, Spain."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Song, H.O., Jegelka, S., Rathod, V., and Murphy, K. (2017, January 21\u201326). Deep Metric Learning via Facility Location. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.237"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Movshovitz-Attias, Y., Toshev, A., Leung, T.K., Ioffe, S., and Singh, S. (2017, January 21\u201326). No Fuss Distance Metric Learning using Proxies. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/ICCV.2017.47"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Wang, X., Hua, Y., Kodirov, E., Hu, G., and Robertson, N.M. (2019). Ranked List Loss for Deep Metric Learning. 
arXiv.","DOI":"10.1109\/CVPR.2019.00535"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Philbin, J., Chum, O., Isard, M., Sivic, J., and Zisserman, A. (2008, January 24\u201326). Lost in quantization: Improving particular object retrieval in large scale image databases. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK, USA.","DOI":"10.1109\/CVPR.2008.4587635"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Jegou, H., Douze, M., and Schmid, C. (2008, January 12\u201318). Hamming embedding and weak geometric consistency for large scale image search. I. Proceedings of the 10th European Conference on Computer Vision, ECCV, Marseille, France.","DOI":"10.1007\/978-3-540-88682-2_24"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Wu, C.Y., Manmatha, R., Smola, A.J., and Kr\u00e4henb\u00fchl, P. (2017, January 22\u201329). Sampling Matters in Deep Embedding Learning. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.309"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"346","DOI":"10.1016\/j.cviu.2007.09.014","article-title":"Speeded-Up Robust Features (SURF)","volume":"110","author":"Bay","year":"2008","journal-title":"Comput. Vis. Image Underst."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Mohedano, E., McGuinness, K., O\u2019Connor, N.E., Salvador, A., Marques, F., and Giro-I-Nieto, X. (2016, January 6\u20139). Bags of Local Convolutional Features for Scalable Instance Search. Proceedings of the ICMR\u201916: Acm International Conference on Multimedia Retrieval, New York, NY, USA.","DOI":"10.1145\/2911996.2912061"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Wei, S., Wu, X., and Dong, X. (2013). Partitioned K-Means Clustering for Fast Construction of Unbiased Visual Vocabulary. 
The Era of Interactive Media, Springer.","DOI":"10.1007\/978-1-4614-3501-3_40"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Yandex, A.B., and Lempitsky, V. (2015, January 13\u201316). Aggregating Local Deep Features for Image Retrieval. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile.","DOI":"10.1109\/ICCV.2015.150"},{"key":"ref_31","unstructured":"Tolias, G., and Sicre, R. (2015). Particular object retrieval with integral max-pooling of CNN activations. arXiv."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Jegou, H., and Chum, O. (2012, January 7\u201313). Negative evidences and co-occurences in image retrieval: The benefit of pca and whitening. Pt II. Proceedings of the Computer Vision\u2014ECCV, Florence, Italy.","DOI":"10.1007\/978-3-642-33709-3_55"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Gordo, A., Almazan, J., Revaud, J., and Larlus, D. (2016, January 11\u201314). Deep image retrieval: Learning global representations for image search. VI. Proceedings of the Computer Vision\u2014Eccv 2016, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46466-4_15"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Radenovi\u0107, F., Tolias, G., and Chum, O. (2016, January 11\u201314). CNN Image Retrieval Learns from BoW: Unsupervised Fine-Tuning with Hard Examples. Proceedings of the Computer Vision\u2014Eccv 2016, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46448-0_1"},{"key":"ref_35","unstructured":"Simonyan, K., and Zisserman, A.J.C.S. (2014). Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Sch\u00f6nberger, J.L., Radenovi\u0107, F., Chum, O., and Frahm, J.M. (2015, January 7\u201312). From single image query to detailed 3d reconstruction. 
Proceedings of the Computer Vision & Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7299148"},{"key":"ref_37","first-page":"1655","article-title":"Fine-tuning CNN Image Retrieval with No Human Annotation","volume":"41","author":"Tolias","year":"2018","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"1437","DOI":"10.1109\/TPAMI.2017.2711011","article-title":"NetVLAD: CNN Architecture for Weakly Supervised Place Recognition","volume":"40","author":"Arandjelovic","year":"2018","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Razavian, A.S., Azizpour, H., Sullivan, J., and Carlsson, S. (2014, January 23\u201328). CNN Features off-the-shelf: An astounding baseline for recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Columbus, OH, USA.","DOI":"10.1109\/CVPRW.2014.131"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"541","DOI":"10.1109\/LSP.2018.2810106","article-title":"Off-Feature Information Incorporated Metric Learning for Face Recognition","volume":"25","author":"Huang","year":"2018","journal-title":"IEEE Signal Process. Lett."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Feng, G., Liu, W., Tao, D., and Zhou, Y. (2019). Hessian Regularized Distance Metric Learning for People Re-Identification. Neural Process. Lett.","DOI":"10.1007\/s11063-019-10000-4"},{"key":"ref_42","unstructured":"Rui, W., Wu, X.J., Chen, K.X., and Kittler, J. (2018, January 20\u201324). Multiple Manifolds Metric Learning with Application to Image Set Classification. Proceedings of the 24th International Conference on Pattern Recognition (ICPR), Beijing, China."},{"key":"ref_43","unstructured":"Min, T., Jun, Y., Zhou, Y., Fei, G., Yong, R., and Tao, D. (2018, January 22\u201324). 
User-Click-Data-Based Fine-Grained Image Recognition via Weakly Supervised Metric Learning. Proceedings of the ICMR\u201920: Acm International Conference on Multimedia Retrieval, Galway, Ireland."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Cao, R., Zhang, Q., Zhu, J., Li, Q., and Qiu, G. (2019). Enhancing Remote Sensing Image Retrieval with Triplet Deep Metric Learning Network. arXiv.","DOI":"10.1080\/2150704X.2019.1647368"},{"key":"ref_45","unstructured":"Xiang, J., Zhang, G., Hou, J., Nong, S., and Rui, H. (2018). Multiple Target Tracking by Learning Feature Representation and Distance Metric Jointly. arXiv."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Yang, J., She, D., Lai, Y.-K., and Yang, M.-H. (2018, January 2\u20137). Retrieving and classifying affective images via deep metric learning. Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, LA, USA.","DOI":"10.1609\/aaai.v32i1.11275"},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1016\/j.neucom.2016.01.052","article-title":"Individual Adaptive Metric Learning for Visual Tracking","volume":"191","author":"Yi","year":"2016","journal-title":"Neurocomputing"},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"2460","DOI":"10.1109\/TCSVT.2017.2726526","article-title":"SLMOML: Online Metric Learning With Global Convergence","volume":"28","author":"Zhong","year":"2017","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_49","unstructured":"Krizhevsky, A., Sutskever, I., and Hinton, G.E. (2012, January 3\u20136). ImageNet Classification with Deep Convolutional Neural Networks. Proceedings of the International Conference on Neural Information Processing Systems, Lake Tahoe, NV, USA."},{"key":"ref_50","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (July, January 26). Deep Residual Learning for Image Recognition. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA."},{"key":"ref_51","unstructured":"Radenovic, F., Schonberger, J.L., Ji, D., Frahm, J.-M., Chum, O., and Matas, J. (July, January 26). From Dusk Till Dawn: Modeling in the Dark. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (Cvpr), Las Vegas, NV, USA."},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Papandreou, G., Kokkinos, I., and Savalle, P.A. (2015, January 7\u201312). Modeling Local and Global Deformations in Deep Learning: Epitomic Convolution, Multiple Instance Learning, and Sliding Window Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (Cvpr), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298636"},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., and Li, F.F. (2009, January 18\u201320). ImageNet: A Large-Scale Hierarchical Image Database. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Jose, CA, USA.","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"ref_54","first-page":"251","article-title":"Applications a Baseline for Visual Instance Retrieval with Deep Convolutional Networks","volume":"4","author":"Razavian","year":"2016","journal-title":"ITE Trans. Media Technol. Appl."},{"key":"ref_55","doi-asserted-by":"crossref","unstructured":"Kalantidis, Y., Mellina, C., and Osindero, S. (2016). Cross-Dimensional Weighting for Aggregated Deep Convolutional Features. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-46604-0_48"},{"key":"ref_56","unstructured":"Ong, E.J., Husain, S., and Bober, M. (2017). Siamese Network of Deep Fisher-Vector Descriptors for Image Retrieval. 
arXiv."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/21\/11\/1121\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T13:34:47Z","timestamp":1760189687000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/21\/11\/1121"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,11,15]]},"references-count":56,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2019,11]]}},"alternative-id":["e21111121"],"URL":"https:\/\/doi.org\/10.3390\/e21111121","relation":{},"ISSN":["1099-4300"],"issn-type":[{"type":"electronic","value":"1099-4300"}],"subject":[],"published":{"date-parts":[[2019,11,15]]}}}