ABSTRACT
Classification of time series has been attracting great interest over the past decade. Recent empirical evidence has strongly suggested that the simple nearest neighbor algorithm is very difficult to beat for most time series problems. While this may be considered good news, given the simplicity of implementing the nearest neighbor algorithm, there are some negative consequences of this. First, the nearest neighbor algorithm requires storing and searching the entire dataset, resulting in a time and space complexity that limits its applicability, especially on resource-limited sensors. Second, beyond mere classification accuracy, we often wish to gain some insight into the data.
In this work we introduce a new time series primitive, time series shapelets, which addresses these limitations. Informally, shapelets are time series subsequences which are in some sense maximally representative of a class. As we shall show with extensive empirical evaluations in diverse domains, algorithms based on the time series shapelet primitives can be interpretable, more accurate and significantly faster than state-of-the-art classifiers.
Supplemental Material
- Anon. 1525. Founders' and benefectors' book of Tewkesbury Abbey, in Latin England. Online version www.bodley.ox.ac.uk/dept/scwmss/wmss/medieval/mss/top/glouc/d/002.htmGoogle Scholar
- Breiman, L., Friedman, J., Olshen, R. A., and Stone, C. J. 1984. Classification and regression trees. Wadsworth.Google Scholar
- Chiu, B., Keogh, E., and Lonardi, S. 2003. Probabilistic Discovery of Time Series Motifs. In Proc of the 9th ACM SIGKDD. 493--498. Google ScholarDigital Library
- Ding, H., Trajcevski, G., Scheuermann, P., Wang, X., and Keogh, E. 2008. Querying and Mining of Time Series Data: Experimental Comparison of Representations and Distance Measures. In Proc of the 34th VLDB. 1542--1552. Google ScholarDigital Library
- Geurts, P. 2001. Pattern Extraction for Time Series Classification. In Proc of the 5th PKDD, 115--127. Google ScholarDigital Library
- Jeffery, C. 2005. http://public.lanl.gov/eads/datasets/emp/index.htmlGoogle Scholar
- Keogh, E. and Kasetty, S. 2002. On the Need for Time Series Data Mining Benchmarks: A Survey and Empirical Demonstration. In Proc' of the 8th ACM SIGKDD. 102--111. Google ScholarDigital Library
- Keogh, E., Wei, L., Xi, X., Lee, S., and Vlachos, M. 2006. LB_Keogh Supports Exact Indexing of Shapes under Rotation Invariance with Arbitrary Representations and Distance Measures. In the Proc of 32th VLDB. 882--893. Google ScholarDigital Library
- Koschorreck,W. and Werner, W., editors. 1981. Facsimile edition with commentary: Kommentar zum Faksimile des Codex Manesse: Die grosse Heidelberger Liederhandschrift.Google Scholar
- Montagu, J.A. 1840. A guide to the study of heraldry. Publisher: London : W. Pickering. Online version www.archive.org/details/guidetostudyofhe00montuoftGoogle Scholar
- Rodríguez, J.J. and Alonso, C.J. 2004. Interval and dynamic time warping-based decision trees. In Proc of the 2004 ACM Symposium on Applied Computing, 548--552. Google ScholarDigital Library
- Salzberg, S.L. 1997. On comparing classifiers: Pitfalls to avoid and a recommended approach. Data Mining and Knowledge Discovery, 1, 317--328, 1997. Google ScholarDigital Library
- Wilcoxon, F. 1945. Individual Comparisons by Ranking Methods. Biometrics, 1, 80--83.Google ScholarCross Ref
- Xi, X., Keogh, E., Shelton, C., Wei, L., and Ratanamahatana, C. A. 2006. Fast Time Series Classification Using Numerosity Reduction. In the Proc of the 23rd ICML. 1033--1040. Google ScholarDigital Library
- Ye, L. 2009. The Time Series Shapelet Webpage. www.cs.ucr.edu/~lexiangy/shapelet.htmlGoogle Scholar
Index Terms
- Time series shapelets: a new primitive for data mining
Recommendations
Logical-shapelets: an expressive primitive for time series classification
KDD '11: Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data miningTime series shapelets are small, local patterns in a time series that are highly predictive of a class and are thus very useful features for building classifiers and for certain visualization and summarization tasks. While shapelets were introduced only ...
Time series shapelets: a novel technique that allows accurate, interpretable and fast classification
Classification of time series has been attracting great interest over the past decade. While dozens of techniques have been introduced, recent empirical evidence has strongly suggested that the simple nearest neighbor algorithm is very difficult to beat ...
Fast Time Series Classification Based on Infrequent Shapelets
ICMLA '12: Proceedings of the 2012 11th International Conference on Machine Learning and Applications - Volume 01Time series shapelets are small and local time series subsequences which are in some sense maximally representative of a class. E.Keogh uses distance of the shapelet to classify objects. Even though shapelet classification can be interpretable and more ...
Comments