Certified Data Mining and Warehousing Professional Agile Development process

Agile Development process
 


"Agile Development" is an umbrella term for several iterative and incremental software development methodologies. The most popular agile methodologies include Extreme Programming (XP), Scrum, Crystal, Dynamic Systems Development Method (DSDM), Lean Development, and Feature-Driven Development (FDD).

While each of the agile methods is unique in its specific approach, they all share a common vision and core values. They all fundamentally incorporate iteration and the continuous feedback that it provides to successively refine and deliver a software system. They all involve continuous planning, continuous testing, continuous integration, and other forms of continuous evolution of both the project and the software. They are all lightweight (especially compared to traditional waterfall-style processes) and inherently adaptable. Just as important, they all focus on empowering people to collaborate and make decisions together quickly and effectively.

The Evolution of Agile Development

Many of the individual principles and practices that are promoted by agile development have been around for years, even decades. Rather than implementing these best practices piecemeal, agile methodologies have "packaged" various customer, management, and in some cases engineering practices and principles together in a way that helps guide teams through the process of rapidly planning and delivering working, tested software. Each of the agile methodologies combines both old and new ideas into refinements that are greater than the sum of their parts.

While it is true that many of the practices associated with agile development have been around for quite some time, the average software development team has yet to embrace many of them. Even today, the average software team does not iterate, does not deliver software incrementally, and does not practice continuous planning or automate testing. Now that these practices have been combined in a manner that can more easily be understood and adopted, the trend appears to be rapidly changing for the better, especially during the last several years.

As with any new way of doing business, though, agile methods have generated quite a bit of controversy within the software community. Yet since their emergence, in project after project, they have continued to deliver higher-quality software systems in less time than traditional processes. If you are a software development professional, you owe it to yourself to become familiar with the theory and practice of agile development. Hopefully the information presented on this site can assist.

 

Agile Manifesto

In February 2001, 17 software developers met at the Snowbird resort in Utah to discuss lightweight development methods. They published the Manifesto for Agile Software Development to define the approach now known as agile software development. Some of the manifesto's authors formed the Agile Alliance, a nonprofit organization that promotes software development according to the manifesto's principles.

The Agile Manifesto reads, in its entirety, as follows:

We are uncovering better ways of developing software by doing it and helping others do it. Through this work we have come to value:

Individuals and interactions over processes and tools
Working software over comprehensive documentation
Customer collaboration over contract negotiation
Responding to change over following a plan

That is, while there is value in the items on the right, we value the items on the left more.

The meanings of the manifesto items on the left within the agile software development context are described below:

  • Individuals and interactions – in agile development, self-organization and motivation are important, as are interactions like co-location and pair programming.
  • Working software – working software will be more useful and welcome than just presenting documents to clients in meetings.
  • Customer collaboration – requirements cannot be fully collected at the beginning of the software development cycle, therefore continuous customer or stakeholder involvement is very important.
  • Responding to change – agile development is focused on quick responses to change and continuous development.

Twelve principles underlie the Agile Manifesto, including:

  • Customer satisfaction by rapid delivery of useful software
  • Welcome changing requirements, even late in development
  • Working software is delivered frequently (weeks rather than months)
  • Working software is the principal measure of progress
  • Sustainable development, able to maintain a constant pace
  • Close, daily co-operation between business people and developers
  • Face-to-face conversation is the best form of communication (co-location)
  • Projects are built around motivated individuals, who should be trusted
  • Continuous attention to technical excellence and good design
  • Simplicity, the art of maximizing the amount of work not done, is essential
  • Self-organizing teams
  • Regular adaptation to changing circumstances

In 2005, a group headed by Alistair Cockburn and Jim Highsmith wrote an addendum of project management principles, the Declaration of Interdependence, to guide software project management according to agile development methods.

Characteristics

There are many specific agile development methods. Most promote development, teamwork, collaboration, and process adaptability throughout the life-cycle of the project.

Agile methods break tasks into small increments with minimal planning and do not directly involve long-term planning. Iterations are short time frames (timeboxes) that typically last from one to four weeks. Each iteration involves a cross-functional team working in all functions: planning, requirements analysis, design, coding, unit testing, and acceptance testing. At the end of the iteration a working product is demonstrated to stakeholders. This minimizes overall risk and allows the project to adapt to changes quickly. An iteration might not add enough functionality to warrant a market release, but the goal is to have an available release (with minimal bugs) at the end of each iteration. Multiple iterations might be required to release a product or new features.

Team composition in an agile project is usually cross-functional and self-organizing, without consideration for any existing corporate hierarchy or the corporate roles of team members. Team members normally take responsibility for tasks that deliver the functionality an iteration requires. They decide individually how to meet an iteration's requirements.

Agile methods emphasize face-to-face communication over written documents when the team is all in the same location. Most agile teams work in a single open office (called a bullpen), which facilitates such communication. Team size is typically small (5-9 people) to simplify team communication and team collaboration. Larger development efforts can be delivered by multiple teams working toward a common goal or on different parts of an effort. This might require a coordination of priorities across teams. When a team works in different locations, they maintain daily contact through videoconferencing, voice, e-mail, etc.
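One way to see why small teams simplify communication is to count the possible person-to-person conversations. The sketch below is an illustration rather than part of any agile method: it assumes every pair of team members is a potential communication path, and the function name communication_paths is made up for this example.

```python
def communication_paths(team_size: int) -> int:
    """Count the distinct pairs of people in a team of the given size."""
    if team_size < 0:
        raise ValueError("team_size must be non-negative")
    # Each of the n people can pair with n - 1 others; halve to avoid double counting.
    return team_size * (team_size - 1) // 2


if __name__ == "__main__":
    for size in (5, 9, 20):
        print(f"{size} people: {communication_paths(size)} possible pairs")
```

Under this assumption a team of 5 has 10 pairs and a team of 9 has 36, while a group of 20 has 190, which is one argument for splitting larger efforts across several small teams.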

No matter what development disciplines are required, each agile team will contain a customer representative. This person is appointed by stakeholders to act on their behalf and makes a personal commitment to being available for developers to answer mid-iteration questions. At the end of each iteration, stakeholders and the customer representative review progress and re-evaluate priorities with a view to optimizing the return on investment (ROI) and ensuring alignment with customer needs and company goals.

Most agile implementations hold a routine, formal, daily face-to-face meeting among team members. This specifically includes the customer representative and any interested stakeholders as observers. In a brief session, team members report to each other what they did the previous day, what they intend to do today, and what their roadblocks are. This face-to-face communication exposes problems as they arise. These meetings, sometimes referred to as daily stand-ups or daily scrum meetings, are held at the same place and same time every day and should last no more than 15 minutes. Standing up usually enforces that rule.

Agile development emphasizes working software as the primary measure of progress. This, combined with the preference for face-to-face communication, produces less written documentation than other methods. The agile method encourages stakeholders to prioritize "wants" with other iteration outcomes, based exclusively on business value perceived at the beginning of the iteration (also known as value-driven).
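Value-driven prioritization can be sketched as a small planning routine. This is a minimal illustration under stated assumptions, not a prescribed algorithm: the Requirement fields, the numeric business_value score, the effort estimates in days, and the sample backlog are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    name: str
    business_value: int  # hypothetical score agreed with stakeholders; higher is better
    estimate_days: int   # hypothetical effort estimate from the team


def plan_iteration(backlog, capacity_days):
    """Walk the backlog in descending business value, taking each item that still fits."""
    ranked = sorted(backlog, key=lambda r: r.business_value, reverse=True)
    selected = []
    remaining = capacity_days
    for req in ranked:
        if req.estimate_days <= remaining:
            selected.append(req)
            remaining -= req.estimate_days
    return selected


if __name__ == "__main__":
    backlog = [
        Requirement("sales by region report", 8, 4),
        Requirement("customer churn dashboard", 9, 5),
        Requirement("inventory feed", 5, 3),
        Requirement("ad hoc export", 3, 1),
    ]
    # A two-week iteration, assumed here to offer 10 working days of capacity.
    for req in plan_iteration(backlog, capacity_days=10):
        print(req.name)
```

Because priorities are re-evaluated at the end of every iteration, the scores would be refreshed and the routine re-run each time rather than planned once for the whole project.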

Specific tools and techniques, such as continuous integration, automated unit testing (xUnit), pair programming, test-driven development, design patterns, domain-driven design, and code refactoring, are often used to improve quality and enhance project agility.

Light Agile Development (LAD) is a flavor of agile methodology that applies hand-picked techniques from the wider range of agile practices to suit different companies, development teams, situations and environments. Another key aspect of LAD is that it tends to be user-centric: it focuses primarily on the user experience and usable software interfaces, and uses agile methodologies to deliver them. Most real-world implementations of Agile are really LAD in practice, since a core value of the methodology is to be flexible and sensible and to focus on getting stuff done.

In agile software development, an information radiator is a (normally large) physical display placed in a prominent location in an office, where passers-by can see it, and which presents an up-to-date summary of the status of a software product or products. The name was coined by Alistair Cockburn, and described in his 2002 book Agile Software Development. A build light indicator may be used to inform a team about the current status of their project.

 

Data Warehousing and agile development

The following best practices describe ways to reduce overall risk on your project while increasing the probability that you will deliver a DW or BI solution which meets the actual needs of its end users.  Successful DW/BI projects take an evolutionary approach to development, and better yet an agile one.  The principles, practices and philosophies of the Agile Modeling (AM) and Agile Data (AD) methods are applied throughout.  I've organized the best practices into the following categories:

  1. Do some initial architecture envisioning.  At the beginning of a project, during Iteration 0, you want to do some initial architecture modeling to identify a potential vision for how your team will build the data warehouse.  As Agile Model Driven Development (AMDD) suggests, you do not need to create a comprehensive, detailed model up front, you only need a high-level vision at the beginning of the project and the details can be identified on a just-in-time (JIT) basis via model storming.  Sometimes a simple whiteboard sketch is all you need to understand your architectural vision.  If so, then just do that.  For a BI/DW project, the initial architecture views would likely be some form of deployment diagram capturing the technologies you intend to use and a high-level domain model overviewing the business entities and the relationships between them.
  2. Model the details just in time (JIT).  The best time to model details isn't at the beginning of a project; instead, model storm them throughout the project in a JIT manner.  There are several reasons for this.  First, like it or not, the requirements are going to change throughout the project.  Second, by waiting to analyze the details JIT, you have much more domain knowledge than if you had done so at the beginning of a project.  For example, if a requirement is to be implemented three months into a project and you explore its details at that point, you have three months more domain knowledge than if you had done so at the beginning of the project, and therefore you can ask more intelligent questions.  Third, if you've been delivering working software on a regular basis (see below) your stakeholders now have three months' worth of experience with the system that you've built.  In other words, they can give you better answers.  Fourth, modeling everything up front appears to result in significant waste.
  3. Prove the architecture early.  Everything works in PowerPoint slides, on a whiteboard, or in CASE tool models, but it isn't until you prove it with code that you know that your architecture actually works.  Processes such as Rational Unified Process (RUP), Agile Unified Process (AUP), and OpenUP suggest that you build a working, end-to-end "skeleton" of your system to prove that all aspects of it work.  In the case of a DW, this would entail showing that you can access the major legacy data sources, that your extract-transform-load (ETL) strategy works, that your database regression testing strategy works, and that your reporting tools can access your DW.
  4. Focus on usage.  If you want to develop a system effectively, including a DW/BI system, then you need to understand how people will potentially use it to support their business objectives. This means that we need a usage-centered approach to development driven by use cases or usage scenarios, not a data-centered one driven by data models. Data is clearly an important part of the overall picture, but it's only one of many parts. If we focus on data and not usage we run the risk of building something that nobody is interested in using, an all-too-common occurrence on traditional data warehouse efforts.
  5. Don't get hung up on "the one truth".  The "one truth" philosophy says that it is desirable to have a single definition for a data element or business term, that there should be a common, shared definition for your master reference data and perhaps even your major business entities. Getting to this "one truth", when it is possible, often requires significant effort that goes past the point of diminishing returns.  "One truth" is a nice vision to work toward, but don't let it prevent your team from delivering important business value in a timely manner. The fact is that various portions of your organization have different ways of working, different priorities, and different constraints.  Seeking the one truth for a data element often proves to be an artificial constraint imposed by traditional data professionals, not by the actual business.  You can in fact take an Agile approach to Master Data Management (MDM).
  6. Organize your work by requirements.  On agile projects we perform work based on prioritized requirements, not by technical issues such as source systems. Each iteration we do the work to fulfill the highest priority stakeholder requirements which fit into that iteration. During each iteration we get a little more data from system X, and some more from system Y, and some more from system Z, and so on.  If our iterations are two weeks in length, we pull two weeks' worth of work from the top of the priority stack. By working in this manner we are always in a position where we are achieving the maximum benefit for our stakeholders' IT investment, thereby reducing their risk.
  7. Active stakeholder participation.  Stakeholder involvement is critical throughout your project. Better yet is active stakeholder participation, where stakeholders are not only involved with your project on a daily basis but are also directly involved in the actual modeling effort itself.
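The "prove the architecture early" practice can be illustrated with a very small end-to-end skeleton. The sketch below is only an assumption-laden stand-in: it uses two in-memory SQLite databases in place of a real legacy source and a real warehouse, and the table names (legacy_orders, fact_orders), column names, and sample rows are invented for the example.

```python
import sqlite3


def extract(source):
    """Read raw rows from the (stand-in) legacy source."""
    return source.execute("SELECT id, name, amount FROM legacy_orders").fetchall()


def transform(rows):
    """Tidy customer names and drop rows that have no amount."""
    return [
        (row_id, name.strip().title(), float(amount))
        for row_id, name, amount in rows
        if amount is not None
    ]


def load(dw, rows):
    """Write the transformed rows into the (stand-in) warehouse."""
    dw.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders "
        "(id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    dw.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", rows)
    dw.commit()


def report(dw):
    """A reporting query, showing that reporting tools could reach the warehouse."""
    return dw.execute(
        "SELECT customer, SUM(amount) FROM fact_orders "
        "GROUP BY customer ORDER BY customer"
    ).fetchall()


# Hypothetical legacy source with deliberately messy sample data.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE legacy_orders (id INTEGER, name TEXT, amount TEXT)")
source.executemany(
    "INSERT INTO legacy_orders VALUES (?, ?, ?)",
    [(1, " alice ", "10.5"), (2, "BOB", None), (3, "alice", "4.5")],
)

dw = sqlite3.connect(":memory:")
load(dw, transform(extract(source)))

# A tiny database regression check: the report must match the known expected result.
result = report(dw)
assert result == [("Alice", 15.0)], result
print(result)
```

Each function covers one of the points the skeleton is meant to prove: source access, the ETL strategy, a regression check, and reporting access. In a real project each would be replaced by the actual technology under evaluation.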

More Tips

  1. Adopt common development standards.  Just like when you're building any other system, you should follow your organization's common development guidance.  This includes modeling style guidelines, coding guidelines, data naming conventions, report design guidelines, and so on.  Consistency is an important contributor to quality, and guidance which is willingly followed by developers seems to be more effective in practice than guidance which is mandated and enforced by your governing bodies.  If such guidance doesn't exist within your organization then you'll want to develop some, or better yet, adopt existing ones from industry. It's amazing what you can find on the Internet these days just by looking for it.
  2. Use good tools.  DW/BI development is complex, and good tools will help to address that complexity. In addition to the standard tools for data modeling, extract-transform-load (ETL), and reporting, you'll also need tools which support evolutionary development techniques such as database refactoring, database testing, and database deployment.  The agile/evolutionary tools are currently emerging in the marketplace.
  3. Don't underestimate legacy data challenges.  Existing data sources are often a mess, as I discuss in The Joy of Legacy Data.  Ideally you'll refactor the source to fix any data quality problems, but if that's not an option then you'll need to cleanse the source data as much as possible as you extract it from the legacy sources.  In my experience data cleansing should be seen as a process smell which indicates the need for legacy data source owners to become better at database evolution.
  4. Travel light.  Serial approaches to development are typically documentation heavy, often in a naive attempt to counteract the inherent communication risks of waterfall lifecycles.  With the serial approach you often see teams create comprehensive logical and physical data models and detailed report specifications.  With an Agile approach where you develop working software each iteration, you quickly discover two things.  First, an evolutionary approach to development demands an evolutionary approach to data modeling, and therefore you don't need to create detailed data models up front.  Second, by having developers work closely with stakeholders you don't need to create detailed report specifications; instead, it is more effective to simply create a report and get feedback from your stakeholders, which you then act on iteratively.
  5. Adopt a lean approach to data governance.  Traditional, command-and-control approaches to data governance appear to work very poorly in practice.  The 2006 DDJ survey into the current state of data management practices showed that 66% of development teams will choose to "work around" their organization's data group, and that when they do so, 75% of the time it is because they find the data group too difficult to work with or too slow to respond, or because the data group doesn't provide sufficient value to justify the effort of working with them. This is clearly problematic.  It is possible to take a lean/agile approach to data governance.
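
Cleansing source data as you extract it, as suggested in the legacy data tip, might look like the following sketch. The field names (customer, signup_date), the list of accepted date formats, and the placeholder values treated as missing are all assumptions made for illustration; a real source would need its own rules.

```python
from datetime import datetime

# Hypothetical date formats seen in the legacy source.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%m-%d-%Y")
# Hypothetical placeholder strings that really mean "no value".
MISSING_MARKERS = {"N/A", "UNKNOWN"}


def parse_date(raw):
    """Return an ISO date string, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def cleanse(record):
    """Return (clean_record, problems) for one extracted legacy record."""
    problems = []
    name = (record.get("customer") or "").strip()
    if not name or name.upper() in MISSING_MARKERS:
        problems.append("missing customer")
    signup = parse_date(record.get("signup_date") or "")
    if signup is None:
        problems.append("unparseable signup_date")
    clean = {"customer": name.title(), "signup_date": signup}
    return clean, problems


if __name__ == "__main__":
    rows = [
        {"customer": "  acme corp ", "signup_date": "31/12/2005"},
        {"customer": "N/A", "signup_date": "soon"},
    ]
    for row in rows:
        print(cleanse(row))
```

Returning the list of problems alongside the cleaned record keeps a log of what had to be repaired, which is the kind of evidence you could take back to the legacy source owners when treating cleansing as a process smell.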