Generate data warehouse matrix
 


The Enterprise Bus Matrix

The Enterprise Bus Matrix is a Data Warehouse planning tool and model created by Ralph Kimball, and is part of the Data Warehouse Bus Architecture. The Matrix is the logical definition of one of the core concepts of Kimball’s approach to Dimensional Modeling – Conformed dimensions.

The Bus Matrix defines part of the Data Warehouse Bus Architecture and is an output of the Business Requirements phase in The Kimball Lifecycle. It is applied in the following phases of dimensional modeling and development of the Data Warehouse . The matrix can be categorized as a hybrid model, being part technical design tool, part project management tool and part communication tool 

Background

The Enterprise Bus Matrix stems from the issue of how one goes about creating the overall Data Warehouse environment. Historically there has been the structure of the centralized and planned approach and the more loosely defined, department specific, solutions developed in a more independent matter. Autonomous projects can result in a range of isolated stove pipe data marts. Naturally each approach has its issues; the overall visionary approach often struggles with long delivery cycles and lack of reaction time as the formalities and scope issues is evident. On the other hand the development of isolated data marts, leading to Stovepipe systems that lacks synergy in development. Over time this approach will lead to a so called data-mart-in-a-box architecture  where interoperability and lack of cohesion is apparent, and can hinder the realization of an overall enterprise Data Warehouse. As an attempt to handle this matter Ralph Kimball introduced the enterprise bus.

The Bus Matrix

In short words the bus matrix purpose is one of high abstraction and visionary planning on the Data Warehouse architectural level. By dictating coherency in the development and implementation of an overall Data Warehouse the Bus Architecture approach enables an overall vision of the broader enterprise integration and consistency while at the same time dividing the problem into more manageable parts  – all in a technology and software independent manner .

The bus matrix and architecture builds upon the concept of conformed dimensions - creating a structure of common dimensions that ideally can be used across the enterprise by all business processes related to the DW and the corresponding fact tables from which they derive their context. According to Kimball and Marg Rosses article “Differences of Opinion”  "The Enterprise Data warehouse built on the bus architecture ”identifies and enforces the relationship between business process metrics (facts) and descriptive attributes (dimensions)”.

The concept of a bus is well known in the language of Information Technology, and is what reflects the conformed dimension concept in the Data Warehouse, creating the skeletal structure where all parts of a system connect, ensuring interoperability and consistency of data, and at the same time considers future expansion. This makes the conformed dimensions act as the integration ‘glue’, creating a robust backbone of the enterprise Data Warehouse.

Establishment and applicability

Figure 1 shows the base for a single document planning tool for the whole of the DW implementation - a graphical overview of the enterprises core business processes or events each correspond to a measurement table of facts, that typically is complemented by a major source system in the horizontal rows. In the vertical columns the groups of contextual data is found as the common, conformed dimensions.

In this way the shared dimensions are defined, as each process indicates what dimensions it applies to through the cells figure 2. By this definition and coordination of conformed dimensions and processes the development of the overall data DW bus architecture is realized. The matrix identifies the shared dimensions related to processes and fact tables, and can be a tool for planning, prioritizing what needs to be approached, coordinating implementation and communicating the importance for conformed dimensions .

Kimball extends the matrix bus in detail as seen in figure 3 by introducing the other steps of the Datawarehouse Methodology; The Fact tables, Granularity, and at last the description of the needed facts. description of the fact tables, granularity and fact instances of each process, structuring and specifying what is needed across the enterprise in a more specific matter, further exemplifying how the matrix can be used as a planning too.


Bus matrix  allow us to present the relations between measure group (business process) and dimensions (group by / filter by). By creating a bus matrix it is much easier to understand the overall objective of the data warehouse by both technical and business people which helps a lot when it comes to understanding each other.

Few definisions

Measures is everything what you can measure in a business process so basically any "fields" where you can apply aggregate fuctions like min,max, count, sum. An example of a measure is "Total Visits" = count of visits. So when you visited this webpage our count of visits increased by 1.

Measure Group is simply a collection of measures related to one "business process" for instance we can have Visit as a business process and we can measure it by "Total Visits" but also we can have "Average Time Per Visit" or "Average Position from search engine". All of these measures belong to one measure groups (business process) called Visits.

Dimensions are "a subject" important from business perspective with a set of attributes that you can use to group by or filter measures. For instance you can have Country dimension that will have two attributes: Country Name and Continent. So you can show Total Visits by Country or Continent (or both). So if you visited this website from UK total visits would increase by 1 for UK country and Europe Continent. You could also show total visits by country (group by) for Europe only (filter by).

NOTE: Dimension "natural key" is on the lowest level of "the subject" and you use attributes (other fields) to describe it usually on higher level. Date is a very good example. You have Date dimension which "natural key" is date for instance 14th of April 2012 and you have attributes (fields) that describe it for instance DayOfWeek: Saturday, Month: April, Year: 2012. The same applies to other dimension and thanks to this approach you will generally have very few dimensions but with many attributes inside that is very easy to understand and intuitive to business users.