Handling measurement function changes with Slowly Changing Measures

M. Goller, S. Berger
Berg15a (2015)
Journal: Information Systems, Elsevier, Vol. 53, Nov. 2015, pp. 107-123, DOI:10.1016/j.is.2014.12.009.
Copy: to obtain a copy, please send an email with the subject "Berg15a" to dke.win@jku.at.

Abstract (English)

Data Warehouses (DWs) are historical databases of business events, organized as multi-dimensional hypercubes that support analytical decision making. Extract, Transform, and Load (ETL) processes apply measurement functions to compute parameterized scores from the business events, such as sales figures, customer reliability scores, churn likelihood, or sentiment indices. These scores, stored as measures in the DW, serve as the basis for analytical reports and corporate performance analysis. Dimensions model subject-oriented data used as analysis perspectives when interpreting the measures. While measures and measurement functions are traditionally regarded as stable within the DW schema, dimension values commonly change over time.

In reality, measures are also subject to change if DW designers (i) tune a parameter of the underlying measurement function, or (ii) update the scoring algorithm as a whole. In both scenarios, the changes must be apparent to the business analysts. Otherwise, the changed measure semantics lead to incomparable measure values, and thus to unsound and worthless analysis results.

To handle measure evolution properly, this paper proposes Slowly Changing Measures (SCMs) as an additional DW modeling concept. Its core idea is to assign a valid time to measurement functions, analogous to the valid time of dimensions. The paper proposes four increasingly rich SCM types, each specifying a set of options for managing the change history of measure definitions. Most of the SCM types anticipate probable changes of measurement functions at design time, reducing manual intervention to the necessary minimum when a change actually occurs.
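
The valid-time idea can be illustrated with a minimal sketch. It does not reproduce any of the paper's four SCM types or its prototype. The names MeasureVersion, make_weighted_score, VERSIONS, and compute_measure, as well as the weights and dates, are hypothetical and chosen only for illustration. Each version of a measurement function carries a validity interval, and every computed measure value is returned together with the version that produced it, so a parameter change stays visible to the analyst.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class MeasureVersion:
    """One version of a measurement function plus its valid-time interval."""
    name: str
    func: Callable[[int], int]
    valid_from: date            # inclusive start of validity
    valid_to: Optional[date]    # exclusive end; None means "currently valid"

    def covers(self, day: date) -> bool:
        """Return True if this version is valid on the given day."""
        return self.valid_from <= day and (
            self.valid_to is None or day < self.valid_to
        )


def make_weighted_score(weight: int) -> Callable[[int], int]:
    """Build a parameterized measurement function (hypothetical scoring rule)."""
    return lambda quantity: quantity * weight


# Assumed change history: the weight parameter was tuned on 2015-01-01.
VERSIONS = [
    MeasureVersion("score_v1", make_weighted_score(2), date(2014, 1, 1), date(2015, 1, 1)),
    MeasureVersion("score_v2", make_weighted_score(3), date(2015, 1, 1), None),
]


def compute_measure(quantity: int, event_day: date) -> Tuple[int, str]:
    """Apply the version valid on event_day; return the value and version name."""
    for version in VERSIONS:
        if version.covers(event_day):
            return version.func(quantity), version.name
    raise LookupError(f"no measurement function valid on {event_day}")
```

For example, compute_measure(100, date(2014, 6, 1)) uses score_v1, whereas the same quantity on date(2015, 1, 1) is scored with score_v2. Returning the version name alongside the value is one simple way to keep values computed under different definitions from being compared unknowingly.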

Each SCM type is explained in detail and illustrated using a practical scenario. Furthermore, the paper presents a proof-of-concept prototype based on the TPC-H business model and implemented in a simple relational database system. The pros and cons of every SCM type are discussed, and recommendations are given for their implementation in a practical DW system.