Using Supervised Clustering to Enhance Classifiers

Eick, Christoph F.; Zeidat, Nidal

doi:10.1007/11425274_26


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3488)

Included in the following conference series:

International Symposium on Methodologies for Intelligent Systems


Abstract

This paper centers on a novel data mining technique we term supervised clustering. Unlike traditional clustering, supervised clustering is applied to classified examples and has the goal of identifying class-uniform clusters that have a high probability density. This paper focuses on how data mining techniques in general, and classification techniques in particular, can benefit from knowledge obtained through supervised clustering. We discuss how better nearest neighbor classifiers can be constructed with the knowledge generated by supervised clustering, and provide experimental evidence that they are more efficient and more accurate than a traditional 1-nearest-neighbor classifier. Finally, we demonstrate how supervised clustering can be used to enhance simple classifiers.
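The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that a supervised clustering step has already produced one representative per class-uniform cluster, measures how class-uniform the resulting clusters are, and then classifies a query by its nearest representative instead of its nearest training example. The function names (`purity`, `assign_to_representatives`, `classify`), the Euclidean distance, and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from math import dist


def purity(labels, assignment):
    """Fraction of examples whose class equals the majority class of their cluster."""
    clusters = {}
    for label, cluster_id in zip(labels, assignment):
        clusters.setdefault(cluster_id, []).append(label)
    majority_total = sum(
        Counter(members).most_common(1)[0][1] for members in clusters.values()
    )
    return majority_total / len(labels)


def assign_to_representatives(points, representatives):
    """Assign each point to the index of its closest representative."""
    return [
        min(range(len(representatives)), key=lambda j: dist(p, representatives[j]))
        for p in points
    ]


def classify(query, representatives, rep_labels):
    """Return the class label of the representative nearest to the query."""
    nearest = min(
        range(len(representatives)), key=lambda j: dist(query, representatives[j])
    )
    return rep_labels[nearest]


# Toy data, made up for illustration only.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (3.0, 3.0), (3.2, 2.9), (2.9, 3.1)]
labels = ["a", "a", "a", "b", "b", "b"]

# Hypothetical output of a supervised clustering step: one representative
# per cluster, each carrying the class of its cluster.
representatives = [(0.2, 0.1), (3.0, 3.0)]
rep_labels = ["a", "b"]

assignment = assign_to_representatives(points, representatives)
print(purity(labels, assignment))
print(classify((0.4, 0.2), representatives, rep_labels))
```

In this sketch the classifier compares a query against two representatives rather than six training examples, which illustrates where an efficiency gain over a plain 1-nearest-neighbor classifier could come from. Whether accuracy also improves depends on how the representatives are chosen, which the paper addresses and this sketch does not.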



Author information

Authors and Affiliations

Department of Computer Science, University of Houston, Houston, TX, 77204-3010, USA
Christoph F. Eick & Nidal Zeidat


Editor information

Editors and Affiliations

LIRIS - UFR d’Informatique, Université Claude Bernard Lyon 1, 43, boulevard du 11 novembre 1918, 69622, Villeurbanne, France
Mohand-Said Hacid
Department of Computer Science, State University of New York, 12222, Albany, NY, USA
Neil V. Murray
Department of Computer Science, University of North Carolina, 28223, Charlotte, NC, USA
Zbigniew W. Raś
Shimane University, 89-1 Enya-cho Izumo, 6938501, Shimane, Japan
Shusaku Tsumoto


About this paper

Cite this paper

Eick, C.F., Zeidat, N. (2005). Using Supervised Clustering to Enhance Classifiers. In: Hacid, MS., Murray, N.V., Raś, Z.W., Tsumoto, S. (eds) Foundations of Intelligent Systems. ISMIS 2005. Lecture Notes in Computer Science, vol 3488. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11425274_26


DOI: https://doi.org/10.1007/11425274_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25878-0
Online ISBN: 978-3-540-31949-8
eBook Packages: Computer Science; Computer Science (R0); Springer Nature Proceedings Computer Science
