Beneficial Sequential Combination of Data Mining Algorithms

Authors
M. Goller, M. Humer, M. Schrefl
Paper
Goll06a (2006)
Citation
Yannis Manolopoulos, Joaquim Filipe, Panos Constantopoulos, José Cordeiro (eds): Proceedings of the 8th International Conference on Enterprise Information Systems (ICEIS 2006), May 23-27, 2006, Paphos, Cyprus, ISBN 972-8865-41-4, pp. 135-143, 2006.
Resources
Copy (to obtain a copy, please send an email with the subject Goll06a to dke.win@jku.at)

Abstract

Depending on its goal, an instance of the Knowledge Discovery in Databases (KDD) process may require more than a single data mining algorithm to determine a solution. Sequences of data mining algorithms offer room for improvement that is as yet unexploited.

If it is known that an algorithm is the first of a sequence of algorithms and there will be future runs of other algorithms, the first algorithm can determine intermediate results that the succeeding algorithms need. The anteceding algorithm can also determine helpful statistics for succeeding algorithms. As the anteceding algorithm has to scan the data anyway, computing intermediate results happens as a by-product of computing the anteceding algorithm's result. On the one hand, a succeeding algorithm can save time because several steps of that algorithm have already been pre-computed.

On the other hand, additional information about the analysed data can improve the quality of results, such as the accuracy of classification, as demonstrated in experiments with synthetic and real data.
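
The idea of collecting intermediate results as a by-product of the first algorithm's scan can be illustrated with a minimal sketch. It is not the method from the paper: the choice of a nearest-centroid assignment as the first algorithm, the per-cluster statistics (count, sum, sum of squares), and all names such as ClusterStats, assign_and_collect and gaussian_params are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class ClusterStats:
    """Statistics gathered as a by-product of the first scan (hypothetical choice)."""
    count: int = 0
    total: float = 0.0      # sum of the values in the cluster
    total_sq: float = 0.0   # sum of the squared values in the cluster


def assign_and_collect(values, centroids):
    """First algorithm: assign each value to its nearest centroid.

    The scan is needed for the assignment anyway, so the per-cluster
    statistics are accumulated in the same pass.
    """
    labels = []
    stats = [ClusterStats() for _ in centroids]
    for v in values:
        idx = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
        labels.append(idx)
        s = stats[idx]
        s.count += 1
        s.total += v
        s.total_sq += v * v
    return labels, stats


def gaussian_params(stats):
    """Succeeding step: derive (mean, variance) per cluster from the
    precomputed statistics, without scanning the data again."""
    params = []
    for s in stats:
        if s.count == 0:
            params.append(None)  # empty cluster: nothing to estimate
            continue
        mean = s.total / s.count
        var = s.total_sq / s.count - mean * mean
        params.append((mean, max(var, 0.0)))  # clamp tiny negative rounding errors
    return params


values = [1.0, 2.0, 3.0, 10.0, 12.0]
labels, stats = assign_and_collect(values, centroids=[2.0, 11.0])
params = gaussian_params(stats)
```

In this sketch the succeeding step only reads the small stats list, so its cost no longer depends on the number of data rows. A succeeding classifier could, for example, use such per-cluster means and variances as additional information about the data.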