Abstract
Cluster analysis is among the most widely used data mining techniques. Cluster analysis of time series has received considerable attention only recently, mainly because of the several difficult issues involved. Among the available methods, genetic algorithms have proved able to handle this problem efficiently. Several candidate partitions are considered and iteratively selected according to an adequacy criterion. In this artificial “struggle for survival”, partitions are allowed to interact and mutate so as to improve and eventually produce a “high quality” solution. Given a set of time series, two genetic algorithms are considered for clustering (the number of clusters is assumed unknown). Both algorithms require a model to be fitted to each time series to obtain model parameters and residuals. These methods are applied to a real data set concerning the flow of visitors recorded in state-owned museums with paid admission in the Lazio region of Italy.
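The selection, interaction and mutation loop described above can be illustrated with a minimal sketch. It is not the chapter's algorithm: the abstract does not specify the encoding, the adequacy criterion or the genetic operators, so everything below is an assumption made for illustration. Each time series is assumed to be summarised by a numeric feature vector (for instance, parameters of a fitted model), a partition is encoded as a list of cluster labels, and the hypothetical criterion is the within-cluster sum of squares plus a penalty on the number of clusters used, since the number of clusters is unknown.

```python
import random


def within_cluster_ss(points, labels):
    """Total within-cluster sum of squared deviations from cluster centroids."""
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    total = 0.0
    for members in groups.values():
        dim = len(members[0])
        centroid = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        total += sum((m[d] - centroid[d]) ** 2 for m in members for d in range(dim))
    return total


def fitness(points, labels, penalty=1.0):
    """Assumed adequacy criterion (lower is better): compactness plus a
    penalty per cluster, so that many tiny clusters are discouraged."""
    return within_cluster_ss(points, labels) + penalty * len(set(labels))


def genetic_clustering(points, max_clusters=4, pop_size=30, generations=60,
                       mutation_rate=0.05, penalty=1.0, seed=0):
    """Evolve a population of partitions; return the best label vector found."""
    rng = random.Random(seed)
    n = len(points)
    # Initial population: random assignments of series to cluster labels.
    population = [[rng.randrange(max_clusters) for _ in range(n)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: the better half survives unchanged (elitism).
        population.sort(key=lambda lab: fitness(points, lab, penalty))
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            # Interaction: uniform crossover of two parent partitions.
            child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            # Mutation: occasionally move a series to a random cluster.
            child = [rng.randrange(max_clusters) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = survivors + children
    return min(population, key=lambda lab: fitness(points, lab, penalty))


if __name__ == "__main__":
    # Hypothetical feature vectors: two visibly separated groups of series.
    features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
                (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
    print(genetic_clustering(features))
```

Label vectors are a simple but redundant encoding, because relabelling the clusters gives the same partition; a practical implementation would likely use an encoding or crossover operator that accounts for this.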
Copyright information
© 2006 Springer-Verlag Heidelberg
About this paper
Cite this paper
Baragona, R., Vitrano, S. (2006). Genetic Algorithms-based Approaches for Clustering Time Series. In: Zani, S., Cerioli, A., Riani, M., Vichi, M. (eds) Data Analysis, Classification and the Forward Search. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-35978-8_1
DOI: https://doi.org/10.1007/3-540-35978-8_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-35977-7
Online ISBN: 978-3-540-35978-4
eBook Packages: Mathematics and Statistics (R0)