Certified Data Mining and Warehousing Professional Design


Design is the creation of a plan or convention for the construction of an object or a system (as in architectural blueprints, engineering drawing, business process, circuit diagrams and sewing patterns). Design has different connotations in different fields. In some cases the direct construction of an object (as in pottery, engineering, management, cowboy coding and graphic design) is also considered as design.


Design the Dimensional Model

User requirements and data realities drive the design of the dimensional model, which must address business needs, grain of detail, and what dimensions and facts to include.

The dimensional model must suit the requirements of the users and support ease of use for direct access. The model must also be designed so that it is easy to maintain and can adapt to future changes. The model design must result in a relational database that supports OLAP cubes to provide "instantaneous" query results for analysts.

An OLTP system requires a normalized structure to minimize redundancy, provide validation of input data, and support a high volume of fast transactions. A transaction usually involves a single business event, such as placing an order or posting an invoice payment. An OLTP model often looks like a spider web of hundreds or even thousands of related tables.

In contrast, a typical dimensional model uses a star or snowflake design that is easy to understand and relate to business needs, supports simplified business queries, and provides superior query performance by minimizing table joins.

For example, contrast the very simplified OLTP data model in the first diagram below with the data warehouse dimensional model in the second diagram. Which one better supports the ease of developing reports and simple, efficient summarization queries?

 Click here for larger image

Figure 2. Flow Chart (click for larger image)


Figure 3. Star Diagram

Back to top

Dimensional Model Schemas

The principal characteristic of a dimensional model is a set of detailed business facts surrounded by multiple dimensions that describe those facts. When realized in a database, the schema for a dimensional model contains a central fact table and multiple dimension tables. A dimensional model may produce a star schema or a snowflake schema.

Star Schemas

A schema is called a star schema if all dimension tables can be joined directly to the fact table. The following diagram shows a classic star schema.

 Click here for larger image

Figure 4. Classic star schema, sales (click for larger image)

The following diagram shows a clickstream star schema.

 Click here for larger image

Figure 5. Clickstream star schema (click for larger image)

Snowflake Schemas

A schema is called a snowflake schema if one or more dimension tables do not join directly to the fact table but must join through other dimension tables. For example, a dimension that describes products may be separated into three tables (snowflaked) as illustrated in the following diagram.

 Click here for larger image

Figure 6. Snowflake, three tables (click for larger image)

A snowflake schema with multiple heavily snowflaked dimensions is illustrated in the following diagram.

 Click here for larger image

Figure 7. Many dimension snowflake (click for larger image)

Star or Snowflake

Both star and snowflake schemas are dimensional models; the difference is in their physical implementations. Snowflake schemas support ease of dimension maintenance because they are more normalized. Star schemas are easier for direct user access and often support simpler and more efficient queries. The decision to model a dimension as a star or snowflake depends on the nature of the dimension itself, such as how frequently it changes and which of its elements change, and often involves evaluating tradeoffs between ease of use and ease of maintenance. It is often easiest to maintain a complex dimension by snow flaking the dimension. By pulling hierarchical levels into separate tables, referential integrity between the levels of the hierarchy is guaranteed. Analysis Services reads from a snowflaked dimension as well as, or better than, from a star dimension. However, it is important to present a simple and appealing user interface to business users who are developing ad hoc queries on the dimensional database. It may be better to create a star version of the snowflaked dimension for presentation to the users. Often, this is best accomplished by creating an indexed view across the snowflaked dimension, collapsing it to a virtual star.


It consists of the following topics -

 For Support