Abstract
Parallel processing seems to be the great hope to speed up and scale up data mining algorithms, in order to cope with the huge size of real-world databases and data warehouses. However, most projects on parallel data mining have focused on the parallelization of a single kind of algorithm or knowledge discovery paradigm. This tutorial will present a considerably broader view of the area of parallel data mining. In particular, it will discuss the parallelization of algorithms of four different knowledge discovery paradigms, namely rule induction, instance-based learning (or nearest neighbours), genetic algorithms and neural networks. In addition, this tutorial will address both the use of “general-purpose” parallel machines and the use of commercially available parallel database servers. Different parallelization strategies will be discussed and compared for each of the four above-mentioned knowledge discovery paradigms.
© 1998 Springer-Verlag Berlin Heidelberg
Cite this paper
Freitas, A.A. (1998). Scalable, high-performance data mining with parallel processing. In: Żytkow, J.M., Quafafou, M. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 1998. Lecture Notes in Computer Science, vol 1510. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0094852
DOI: https://doi.org/10.1007/BFb0094852
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65068-3
Online ISBN: 978-3-540-49687-8
