Abstract
Image annotation is an important research problem in content-based image retrieval (CBIR) and computer vision, with broad applications. A major challenge is the so-called “semantic gap” between low-level visual features and high-level semantic concepts, which makes it difficult to extract semantic concepts from an image and annotate it effectively. In an image with multiple semantic concepts, the objects corresponding to different concepts often appear in different parts of the image. If the image can be properly partitioned into regions, the semantic concepts are likely to be better represented within the regions, and the annotation of the image as a whole can therefore be more accurate. Motivated by this observation, in this paper we develop a novel stratification-based approach to image annotation. First, an image is segmented into regions that are likely to be semantically meaningful. Each region is represented by a set of discretized visual features. A naïve Bayesian method is proposed to model the relationship between the discrete visual features and the semantic concepts. The topic-concept distribution and the significance of the regions in the image are also considered. An extensive experimental study using real data sets shows that our method significantly outperforms many traditional methods. It is comparable to the state-of-the-art Continuous-space Relevance Model in accuracy but is much more efficient: it is over 200 times faster in our experiments.
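The pipeline outlined in the abstract (discretized region features, a naïve Bayesian model linking features to concepts, and region significance) can be illustrated with a minimal Python sketch. This is not the authors' model: the abstract gives no formulas, so the sketch makes several assumptions. Every region of a training image inherits all of that image's concept labels, feature likelihoods use Laplace smoothing, region significance is a caller-supplied weight such as area fraction, and the topic-concept distribution is omitted. All function and variable names (`train`, `region_posterior`, `annotate`, `n_bins`) are hypothetical.

```python
import math
from collections import defaultdict


def train(images, n_bins):
    """Count statistics for a naive Bayes model over discretized region features.

    images: list of (regions, concepts); each region is a tuple of bin
    indices in range(n_bins), and concepts is a set of annotation words.
    Assumption: each region inherits every concept of its image.
    """
    concept_count = defaultdict(int)  # concept -> number of labeled regions
    feat_count = defaultdict(int)     # (concept, feature index, bin) -> count
    for regions, concepts in images:
        for region in regions:
            for c in concepts:
                concept_count[c] += 1
                for j, v in enumerate(region):
                    feat_count[(c, j, v)] += 1
    return concept_count, feat_count, n_bins


def region_posterior(model, region):
    """Return P(concept | region) under the naive independence assumption."""
    concept_count, feat_count, n_bins = model
    total = sum(concept_count.values())
    log_scores = {}
    for c, n_c in concept_count.items():
        lp = math.log(n_c / total)  # prior P(c)
        for j, v in enumerate(region):
            # Laplace-smoothed likelihood P(f_j = v | c)
            lp += math.log((feat_count.get((c, j, v), 0) + 1) / (n_c + n_bins))
        log_scores[c] = lp
    # Normalize in log space for numerical stability.
    m = max(log_scores.values())
    unnorm = {c: math.exp(lp - m) for c, lp in log_scores.items()}
    z = sum(unnorm.values())
    return {c: u / z for c, u in unnorm.items()}


def annotate(model, regions, weights, k=2):
    """Combine region posteriors, weighted by significance, and keep the top k."""
    scores = defaultdict(float)
    for region, w in zip(regions, weights):
        for c, p in region_posterior(model, region).items():
            scores[c] += w * p
    return sorted(scores, key=lambda c: (-scores[c], c))[:k]


# Toy usage: two discretized features per region, two bins per feature.
toy_images = [
    ([(0, 0)], {"sky"}),
    ([(0, 0)], {"sky"}),
    ([(1, 1)], {"grass"}),
    ([(1, 1)], {"grass"}),
]
toy_model = train(toy_images, n_bins=2)
toy_labels = annotate(toy_model, [(0, 0), (1, 1)], weights=[0.7, 0.3], k=2)
```

Because the features are discrete, both training and annotation reduce to counting and table lookups. That is one plausible reason a model of this kind can be much cheaper than a continuous-space model that evaluates kernel densities, although the sketch itself makes no claim about the reported speedup.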
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ye, J., Zhou, X., Pei, J., Chen, L., Zhang, L. (2005). A Stratification-Based Approach to Accurate and Fast Image Annotation. In: Fan, W., Wu, Z., Yang, J. (eds) Advances in Web-Age Information Management. WAIM 2005. Lecture Notes in Computer Science, vol 3739. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11563952_26
DOI: https://doi.org/10.1007/11563952_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29227-2
Online ISBN: 978-3-540-32087-6
eBook Packages: Computer Science, Computer Science (R0), Springer Nature Proceedings Computer Science