Natural Induction and Conceptual Clustering: A Review of Applications




Michalski, Ryszard S.
Kaufman, Kenneth A.
Pietrzykowski, Jaroslaw
Wojtusiak, Janusz
Mitchell, Scott
Seeman, Doug




Natural induction and conceptual clustering are two methodologies pioneered by the GMU Machine Learning and Inference Laboratory for discovering conceptual relationships in data and presenting them in forms that are easy for people to interpret and understand. The first methodology is for supervised learning (learning from examples) and the second is for unsupervised learning (clustering). Examples of their application to a wide range of practical domains are presented, including bioinformatics, medicine, agriculture, volcanology, demographics, intrusion detection and computer user modeling, manufacturing, civil engineering, optimization of functions of a very large number of variables (100-1000), design of complex engineering systems, tax fraud detection, and musicology. Most of the results were obtained by applying our recent natural induction program, AQ21, which is available for download. To give the reader a quick insight into the differences between natural induction as implemented in AQ21 and some well-known learning methods, such as those implemented in C4.5, RIPPER, and CN2, as well as between conceptual clustering and conventional clustering, Sections 15 and 16 describe results from applying all these methods to very simple, designed problems.



Data mining, Machine learning, Natural induction, Cluster analysis, Conceptual clustering


Michalski, R. S., Kaufman, K., Pietrzykowski, J., Wojtusiak, J., Mitchell, S. and Seeman, W.D., "Natural Induction and Conceptual Clustering: A Review of Applications," Reports of the Machine Learning and Inference Laboratory, MLI 06-3, George Mason University, Fairfax, VA, June, 2006 (Updated: August 23, 2006).