Arbutus Blog

Agile Technologies and Data Warehousing

“An ounce of planning is worth a pound of prevention.” In the DW world, this pearl of wisdom could not be more true. Many traditional data warehouse projects require months of planning before the earliest implementation steps are reached. Users do not get a chance to view or vet the implementation until considerable time and resources have already been spent. Although there can be many reasons for this, one of the key contributing factors is arguably the implementation methodology itself, especially at the start of a project.

An idea growing in acceptance is the adoption of agile technologies at the earliest stages of implementing a data warehouse or data mart. This has proven to reduce the timelines for such projects. One of the key advantages of agile technologies is that they enable you to profile, test and see your data warehouse or data mart before it’s actually built.

With an agile technology, users are given immediate access to core information before the first implementation steps. They can see the results right away and provide feedback. This feedback can be easily incorporated, yielding a warehouse model that reflects the evolving scope and requirements of your data warehouse project. If needed, you can continue using the prototype for a period of time to ensure that the requirements have stabilized before you undertake the full implementation.

Requirements of agile technologies for data warehousing:

  • Directly reading real source data, including complex mainframe legacy sources when needed
  • Providing real interfaces to end-user applications
  • Integrating any number of disparate data sources into a single logical view
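To make the "single logical view" requirement concrete, here is a minimal Python sketch, not the method of any particular product. It assumes two hypothetical extracts, a fixed-width file of the kind a mainframe job might produce and a CSV file from a newer system, and reads both through one iterator without first copying them into a warehouse. The column names, record layout and sample values are invented for illustration.

```python
import csv
import io
from itertools import chain
from typing import Dict, Iterator, List, Tuple


def read_fixed_width(text: str, layout: List[Tuple[str, int, int]]) -> Iterator[Dict[str, str]]:
    """Yield one record per line; layout entries are (name, start, end) offsets."""
    for line in text.splitlines():
        yield {name: line[start:end].strip() for name, start, end in layout}


def read_csv(text: str) -> Iterator[Dict[str, str]]:
    """Yield one record per CSV row, keyed by the header line."""
    yield from csv.DictReader(io.StringIO(text))


# Hypothetical extracts; a real project would read these from the source systems.
legacy_extract = "0001ACME LTD  00120\n0002GLOBEX    00000\n"
layout = [("account", 0, 4), ("customer", 4, 14), ("due", 14, 19)]
modern_extract = "account,customer,due\n0003,Initech,00045\n"

# One logical view over both sources, evaluated only when it is consumed.
logical_view = list(
    chain(read_fixed_width(legacy_extract, layout), read_csv(modern_extract))
)
```

Because each reader yields records with the same keys, downstream profiling or reporting code does not need to know which source a record came from.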

Key features of agile technologies for data warehousing:

  • The ability to fully implement business rules
  • The ability to perform any data mapping or profiling
  • The capabilities to implement any necessary data cleansing or conversion requirements
  • The flexibility to add new or modified tables, columns, or data relationships
  • The adaptability to modify or add cleansing, transformation or mapping rules
  • The ability to provide filtering
  • The ability to perform dynamic calculations
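Several of these features (mapping, cleansing, filtering and dynamic calculations) can be pictured as rules applied to rows as they are read. The Python sketch below is an illustration under stated assumptions rather than a description of a specific tool: `prototype_view`, the column names and the sample rows are all hypothetical.

```python
from typing import Callable, Dict, Iterable, Iterator

Record = Dict[str, object]


def prototype_view(
    rows: Iterable[Record],
    mapping: Dict[str, str],
    cleansers: Dict[str, Callable[[object], object]],
    keep: Callable[[Record], bool],
    calculated: Dict[str, Callable[[Record], object]],
) -> Iterator[Record]:
    """Lazily map, cleanse, filter and extend source rows."""
    for row in rows:
        # Mapping: rename source columns to the target model's names.
        out = {target: row.get(source) for source, target in mapping.items()}
        # Cleansing: normalise individual columns.
        for column, clean in cleansers.items():
            out[column] = clean(out[column])
        # Filtering: skip rows outside the current scope.
        if not keep(out):
            continue
        # Dynamic calculations: derive columns at read time.
        for column, compute in calculated.items():
            out[column] = compute(out)
        yield out


# Hypothetical source rows, as they might arrive from a legacy extract.
source = [
    {"CUST_NM": "  acme ltd ", "AMT_DUE": "120.50", "AMT_PAID": "20.50"},
    {"CUST_NM": "Globex", "AMT_DUE": "0", "AMT_PAID": "0"},
]

view = list(
    prototype_view(
        source,
        mapping={"CUST_NM": "customer", "AMT_DUE": "due", "AMT_PAID": "paid"},
        cleansers={
            "customer": lambda v: str(v).strip().title(),
            "due": lambda v: float(v),
            "paid": lambda v: float(v),
        },
        keep=lambda r: r["due"] > 0,
        calculated={"outstanding": lambda r: r["due"] - r["paid"]},
    )
)
```

Since every rule is passed in as data, changing a mapping or adding a cleansing rule after user feedback means editing one dictionary entry rather than rewriting a conversion program.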

Important benefits of agile technologies for data warehousing:

  • Saving time to model and implement the data warehouse
  • Saving money by minimizing the subsequent re-work required
  • Ensuring end-user buy-in by incorporating end-user feedback early and often to create a functional and acceptable model

Once the final model has been established and has achieved acceptance, implementation can proceed using standard data warehousing tools and techniques. As the project proceeds, there can be a much higher comfort level with the knowledge that the users have already seen, worked with and participated in the development of the final data model.

Agile ETL: it’s never too late to save a Data Migration Project

"We read to know we're not alone" - C.S. Lewis

As referenced in our last blog entry, 67% of all data migration projects are not delivered on time. Last year we worked with a large IT outsourcing firm that was well on its way to contributing to this unfortunate statistic. They were the lead firm on a project for a large government department to capture and convert financial collection data from selected source applications. The requirements were to consolidate and transform the data for loading into the new SAP database, creating a single collection process. The source data and program logic were spread over multiple applications and government agencies, primarily on IBM mainframes with IMS and DB2 databases. The scope of the project required the data migration work to be split into two stages. Each stage involved different government departments but the same types of mainframe data sources. The conversion included the typical data quality and transformation tasks required to move data meaningfully across applications.

The Problem

As is typical of many data migration approaches, multiple conversion programs were written in PL/I and COBOL to convert the isolated legacy data into the format required by the SAP data loaders. The initial development of the conversion programs took weeks to months of research and coding but was completed on schedule.

The delays started later in stage one, with the discovery of many data quality issues and of unexpected additional programming needed to transform the data. As a result, most of the project's resources and budget were consumed by the initial stage of the conversion, while the equally important second stage had yet to be started. The success of the whole project was contingent on completing both stages.

The project team struggled in three areas:

  • limited ability to query and understand the data in the source legacy system(s)
  • difficulty separating the specifically required data from the rest of the legacy systems
  • the need to write PL/I and COBOL programs to convert and cleanse the isolated legacy data into the format required by the SAP data loaders

The Solution

On the recommendation of one of the client’s sub-contractors working on the conversion effort, the client agreed to look at the use of agile technologies to speed up data conversion activities. This met with some skepticism because their history was with their known methodologies, especially for such a large data conversion project. In addition, since the initial stage of the conversion was already well under way, changing the approach at such a late date ran the risk of introducing further delays.

The Results

Within one month from start to finish, the project team was able to complete the initial conversion stage as well as the critical second-stage conversion process. This included acquiring and installing the agile tool and performing data analysis, cleansing and transformation. They were also able to finish writing and refining the automated procedures that perform these tasks and export the final data in a tab-delimited format to meet the SAP application's data loading requirements. In the end, the project was completed within the original timeframe and at a fraction of the budgeted cost.
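The final export step is simple to picture. The sketch below writes cleansed records as tab-delimited text using Python's standard `csv` module. The actual layout expected by the SAP loaders is not described here, so the column names, the header row and the helper name `to_tab_delimited` are assumptions made purely for illustration.

```python
import csv
import io
from typing import Dict, Iterable, List


def to_tab_delimited(rows: Iterable[Dict[str, str]], columns: List[str]) -> str:
    """Serialise rows as tab-delimited text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(columns)
    for row in rows:
        # Emit columns in a fixed order so every line has the same layout.
        writer.writerow([row[column] for column in columns])
    return buffer.getvalue()


# Hypothetical converted records.
records = [
    {"account": "A-100", "amount": "250.00"},
    {"account": "A-101", "amount": "75.10"},
]
output = to_tab_delimited(records, ["account", "amount"])
```

Using the `csv` writer rather than joining strings by hand means any value that itself contains a tab or a quote is quoted consistently.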

At the end of the project, the client acknowledged that the main reason they were able to save it was the decision to adopt the agile technology mid-stream. The technology gave them the power and flexibility to streamline and simplify the conversion process and overcome the issues that were causing the delays.
