BI Life Cycle
Any data warehouse project can seem overwhelming at the beginning as project work includes:
- Project planning, organization and management
- Working with the business to gather business requirements, create data models based on these requirements and designing BI functionality to meet these needs
- Designing and implementing DW and BI architecture
- Designing data models and databases to implement these models
- Integrating data from other applications such as CRM, SCM, ERP, and Web
- Selection and use of Extract, Transform & Load (ETL) tools to populate the necessary databases
- Selection and use of Business Intelligence (BI) & On-Line Analytical Processing (OLAP) tools to provide the necessary business functionality
The key steps of work:
Project scope & initial analysis
- Business & IT Requirements
- Project plan
- Resource plan
- Logical data models
- Extract, transform and load (ETL)
- Reporting & analytical functionality
- Infrastructure setup
- Physical databases
- Extract, transform and load (ETL)
- Reporting and analytical functionality
- Operations startup
- Production roll-out
A typical BI Project life cycle which comprises of following phases:
1. Project Initiation and Analysis:
For any project to start a business need is a must. for example "replacing an old customized reporting system with a new reporting and analysis system which is more flexible, cost effective, fast and requires less maintainance" or "Need to consolidate data from disparate sources and a common standardized platform for reporting and analysis" could be a business need.This business need is evaluated at a very high level as to how critical is the need and how it well it falls in line with the strategic objectives of an Organization. Formally a Project manager is identified and appointed who further investigate and perform some first level of analysis.
It is then Project manager who creates a document with details like Business case, initial risk assessment, scheduling and budgeting, stakeholders identification etc taking help from the available resources and then a formal approval is taken.
2. Planning and Designing:
Requirements gathering is the most important and critical part of the whole project. A requirement document is created for this purpose and all the requirements are documented in much details. The requirements are finalized and then project scope is created. To fulfill these requirements, various tools are evaluated and whichever is the best fit is selected. Now the team resources are identified. All the logistics, hardware and software requirements are identified and procured.
Data models are designed and it is documented in a Technical design document which also specifies other technical details like features and functionalities of the design, security and authentication, best practices, repository variables, layers, database connections, connection pools etc etc..
Prototypes are designed to show the features and functionalities as documented in the requirement document and is formally approved. Project kickoff.
Before starting the development of the data models and reports, the project scope is divided into small more manageable modules or batches.
It is a good practice to divide it on the basis of functional areas and subject areas.
So let's suppose we decided to do it for functional area Human Resources among other functional areas like Sales and Distribution, finance, Inventory and Manufacturing etc under which we are planning to create different subject areas like
Employee Compensation and Bonus, Learning and Development, Employee movements and transfers etc. You may have one data model for complete HR or seperate data models for each subject areas.
For smaller organizations with less complex OLTP data structures, it is possible and feasible to have a single data model for complete Human Resources.
For large and complex OLTP structures, it is generally not possible as otherwise the size of the fact table will be extremely large horizontally as well as vertically. This will give an extremely slow performance as well as from maintainance perspective also the time taken to load the fact table will be more and unpractical.
Once we decide on our strategy for the development, we start with developing the data model as per the designs created in the previous phase i.e Planning and Designing.
The data model is developed generally in layers. In Oracle OBIEE, there are three layers and Cognos allows you to build as many layers as you wish and BO provides 2 layers(I am not very sure on this and would request some comments on this).
In Qlikview, we can make it single layered or 2 layered by renaming the column names in the script.
For all practical purposes, upto 3 layers is a good idea but you may agree or disagree on that. Based on your
requirements of maintainance you can decide on that.
OBIEE has 3 predefined layers namely Physical Layer, Business and Modeling layer and Presentation layer.
Physical layer is where we simply make connections to the database and import the metadata for database objects like tables, columns, views, primary and foreign key relationships. Now we do not make any changes related to changing the names of the columns which help the administrator and developers from maintainance perspective.
Based on the available physical layer objects we create our OLAP models in Business layer by defining dimensions, facts and Hierarchies.
In the presentation layer, we categorize the objects based on subject areas from the objects available in OLAP model in Business layer. We rename the objects present in Presentation layer from end users perspective or business terminology.
This whole process really helps the developers to understand and visualize the complete model and saves lot of time in debugging or day to day maintainance activities. This process oriented approach is again an attempt to divide and rule and making our life a bit simpler.
Once you are done with your model, the next step is to start developing your reports and bringing in the identified resources for report development into the team.
The reports based on the subject area are divided among the team and with the help of report specifications available in Technical design document created in the previous phase.The reports are generally designed by report developers.
While the report designers are engaged, the data model developers work developing the next subject area. Initially the team size is less and as the work keeps on growing, more people are added in the team.
The development also includes setting up the object level as well as data level security , creating users and groups and creating variables as per the technical design document.
Testing is one of the most critical phase and also sometimes most time consuming phase of a BI project Implementation.
There are 3 types of testing done on a project namely Unit Testing (UT), System Integration Testing (SIT) and User Acceptance Testing (UAT).
Unit Testing is generally done by the developers and they test that the code or report they have developed is working as per the requirement or specifications. This is performed in Development environment where Developers had developed the
report. Developers prepare a test case plan for themselves listing all the cases they would like to test. These cases could be testing the font size, color, spell check, prompts, filters, data values etc.
Sometimes developers may exchange the reports with their team members to perform a peer unit testing and this is a good practice as it is little easier to find out mistakes in other's report than your own.
Once Unit testing is complete, the code (data models and reports) are transferred to Test Environment.
This Testing Environment is generally similar to the production environment but that is not the case always. Having the test environment same as production environment allows us to anticipate the performance and behavior of the developed solution in a much better way.
Once the code is transferred to Test environment, System Integration Testing is performed. SIT checks how all the individually pieces works collectively and are integrated well to produce the desired output. This test is performed by the IT team members or by identified testers from the client side. However before they perform the test a sampling based dry run is required to be performed by the development team.
Once the testing team start testing the application, they put all the defects in a defect log sheet mentioning the
details of the defect.
At this point of time, it is recommended to appoint some dedicated members from the development team to fix those identified defects and update the defect log sheet. While this activity is going on, other team members are assigned next set of development work and they keep working on developing next batch of reports. It may happen that the same team will fix the defects by allocating some portion of their time and rest of the time in developing next batch reports. But this may bring some imbalance or turbulence in the system as it will become very difficult to really work on two things simultaneously. Bug fixing involves lot of coordination with ETL team as well as testers and sometimes consumes time more than what was anticipated which ultimately may impact the development activities. Having a dedicated team for bux fixing activities would be very useful and effective.
Once all bugs are identified and fixed, the Business users are asked to perform the User acceptance test. The test cases are prepared by business users and they check if all the requirements are fulfilled and they really getting what they want. Here business users compare the result set with the result set from their existing system and verify the results.
One of the biggest challenge in SIT or UAT is if any data related inaccuracies are found, it becomes really difficult to find the root cause. The developer needs to check which version is true. There may be some internal logic or transformations or formulas applied in the existing application and this analysis consumes whole lot of time requiring lot of coordination with the team supporting the existing application or system.
After the code is tested and verified completely, it is transferred to the production environment and opened for the real end users. Before that general awareness sessions and training sessions are held for the end users to use the new system. For some time the new system is put on a stabilization period (Generally ranges from 15 days to 2 or 3 months but it could be even more) where in case a bug occurs, it is fixed by the same development team.
During this time the new system or application is made to run parallel with the existing system or application and once the stabilization period is over, the old system is replaced partially or completely.
Once the stabilization period is over and the system gets stabilized, the support team is provided with all the project related knowledge and the development team is rolled off.
The implementation of BI Project gets over.