An overview of data warehousing and OLAP technology

Published: 01 March 1997

Abstract

Data warehousing and on-line analytical processing (OLAP) are essential elements of decision support, which has increasingly become a focus of the database industry. Many commercial products and services are now available, and all of the principal database management system vendors now have offerings in these areas. Decision support places some rather different requirements on database technology compared to traditional on-line transaction processing applications. This paper provides an overview of data warehousing and OLAP technologies, with an emphasis on their new requirements. We describe back end tools for extracting, cleaning and loading data into a data warehouse; multidimensional data models typical of OLAP; front end client tools for querying and data analysis; server extensions for efficient query processing; and tools for metadata management and for managing the warehouse. In addition to surveying the state of the art, this paper also identifies some promising research issues, some of which are related to problems that the database research community has worked on for years, but others are only just beginning to be addressed. This overview is based on a tutorial that the authors presented at the VLDB Conference, 1996.


Published in

ACM SIGMOD Record, Volume 26, Issue 1, March 1997 (77 pages)
ISSN: 0163-5808
DOI: 10.1145/248603
Copyright © 1997 Authors
Publisher: Association for Computing Machinery, New York, NY, United States
