skip to main content
10.1145/1557019.1557122acmconferencesArticle/Chapter ViewAbstractPublication PageskddConference Proceedingsconference-collections
research-article

Time series shapelets: a new primitive for data mining

Authors Info & Claims
Published:28 June 2009Publication History

ABSTRACT

Classification of time series has been attracting great interest over the past decade. Recent empirical evidence has strongly suggested that the simple nearest neighbor algorithm is very difficult to beat for most time series problems. While this may be considered good news, given the simplicity of implementing the nearest neighbor algorithm, there are some negative consequences of this. First, the nearest neighbor algorithm requires storing and searching the entire dataset, resulting in a time and space complexity that limits its applicability, especially on resource-limited sensors. Second, beyond mere classification accuracy, we often wish to gain some insight into the data.

In this work we introduce a new time series primitive, time series shapelets, which addresses these limitations. Informally, shapelets are time series subsequences which are in some sense maximally representative of a class. As we shall show with extensive empirical evaluations in diverse domains, algorithms based on the time series shapelet primitives can be interpretable, more accurate and significantly faster than state-of-the-art classifiers.

Skip Supplemental Material Section

Supplemental Material

p947-ye.mp4

mp4

99.8 MB

References

  1. Anon. 1525. Founders' and benefectors' book of Tewkesbury Abbey, in Latin England. Online version www.bodley.ox.ac.uk/dept/scwmss/wmss/medieval/mss/top/glouc/d/002.htmGoogle ScholarGoogle Scholar
  2. Breiman, L., Friedman, J., Olshen, R. A., and Stone, C. J. 1984. Classification and regression trees. Wadsworth.Google ScholarGoogle Scholar
  3. Chiu, B., Keogh, E., and Lonardi, S. 2003. Probabilistic Discovery of Time Series Motifs. In Proc of the 9th ACM SIGKDD. 493--498. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Ding, H., Trajcevski, G., Scheuermann, P., Wang, X., and Keogh, E. 2008. Querying and Mining of Time Series Data: Experimental Comparison of Representations and Distance Measures. In Proc of the 34th VLDB. 1542--1552. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Geurts, P. 2001. Pattern Extraction for Time Series Classification. In Proc of the 5th PKDD, 115--127. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Jeffery, C. 2005. http://public.lanl.gov/eads/datasets/emp/index.htmlGoogle ScholarGoogle Scholar
  7. Keogh, E. and Kasetty, S. 2002. On the Need for Time Series Data Mining Benchmarks: A Survey and Empirical Demonstration. In Proc' of the 8th ACM SIGKDD. 102--111. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Keogh, E., Wei, L., Xi, X., Lee, S., and Vlachos, M. 2006. LB_Keogh Supports Exact Indexing of Shapes under Rotation Invariance with Arbitrary Representations and Distance Measures. In the Proc of 32th VLDB. 882--893. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Koschorreck,W. and Werner, W., editors. 1981. Facsimile edition with commentary: Kommentar zum Faksimile des Codex Manesse: Die grosse Heidelberger Liederhandschrift.Google ScholarGoogle Scholar
  10. Montagu, J.A. 1840. A guide to the study of heraldry. Publisher: London : W. Pickering. Online version www.archive.org/details/guidetostudyofhe00montuoftGoogle ScholarGoogle Scholar
  11. Rodríguez, J.J. and Alonso, C.J. 2004. Interval and dynamic time warping-based decision trees. In Proc of the 2004 ACM Symposium on Applied Computing, 548--552. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Salzberg, S.L. 1997. On comparing classifiers: Pitfalls to avoid and a recommended approach. Data Mining and Knowledge Discovery, 1, 317--328, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Wilcoxon, F. 1945. Individual Comparisons by Ranking Methods. Biometrics, 1, 80--83.Google ScholarGoogle ScholarCross RefCross Ref
  14. Xi, X., Keogh, E., Shelton, C., Wei, L., and Ratanamahatana, C. A. 2006. Fast Time Series Classification Using Numerosity Reduction. In the Proc of the 23rd ICML. 1033--1040. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Ye, L. 2009. The Time Series Shapelet Webpage. www.cs.ucr.edu/~lexiangy/shapelet.htmlGoogle ScholarGoogle Scholar

Index Terms

  1. Time series shapelets: a new primitive for data mining

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      KDD '09: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
      June 2009
      1426 pages
      ISBN:9781605584959
      DOI:10.1145/1557019

      Copyright © 2009 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 28 June 2009

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • research-article

      Acceptance Rates

      Overall Acceptance Rate1,133of8,635submissions,13%

      Upcoming Conference

      KDD '24

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader