Posted by Arbutus Software on Wed, Mar 03, 2010 @ 12:41 PM
"We read to know we're not alone" - C.S. Lewis
As referenced in our last blog entry, 67% of all data migration projects are not delivered on time. Last year we worked with a large IT outsourcing firm that was well on its way to contributing to this unfortunate statistic. They were the lead firm on a project for a large government department to capture and convert financial collection data from selected source applications. The requirement was to consolidate and transform that data for loading into the new SAP database to create a single collection process. The source data and program logic were spread over multiple applications and government agencies, primarily on IBM mainframes with IMS and DB2 databases. The scope of the project required the data migration work to be split into two stages. Each stage involved different government departments but the same types of mainframe data sources. The conversion included the typical data quality and transformation tasks required to move data meaningfully across applications.
The Problem
As is typical of many data migration approaches, multiple conversion programs were written in PL/I and COBOL to convert the isolated legacy data into the format required by the SAP data loaders. The initial development of the conversion programs took weeks to months of research and coding but was completed on schedule.
The delays started later in stage one, with the discovery of many data quality issues and the unexpected additional programming required to transform the data. As a result, most of the project's resources and budget were consumed by the initial stage of the conversion, while the equally important second stage had yet to be started. The success of the whole project was contingent on completing both stages.
The project team struggled in three areas:
- limited ability to query and understand the data in the source legacy system(s)
- difficulty separating out the specifically required data from the legacy systems
- the need to write PL/I and COBOL programs to convert and cleanse the isolated legacy data into the format required by the SAP data loaders
The Solution
On the recommendation of one of the client’s sub-contractors working on the conversion effort, the client agreed to look at the use of agile technologies to speed up data conversion activities. This met with some skepticism because their experience was with the methodologies they already knew, especially for such a large data conversion project. In addition, since the initial stage of the conversion was already well under way, changing the approach at such a late date ran the risk of introducing further delays.
The Results
Within one month from start to finish, the project team was able to complete both the initial conversion stage and the critical second stage. This included acquiring and installing the agile tool and performing data analysis, cleansing and transformation. They were also able to finish writing and refining the automated procedures that perform these tasks and export the final data in a tab-delimited format to meet the SAP application data loading requirements. In the end, the project was completed within the original timeframe and at a fraction of the budgeted cost.
The client acknowledged at the end of the project that the main reason they were able to save the project was the decision to adopt the agile technology mid-stream. The technology gave them the power and flexibility to streamline and simplify the conversion process and overcome the issues that were causing the delays.
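As a rough illustration of the cleanse, transform and export steps described above, here is a minimal Python sketch. It is not the tool or the procedures used on the project. The fixed-width layout, the field names (account_id, name, amount_cents) and the validation rule are all hypothetical, and the actual SAP loader format is not specified in this post.

```python
import csv
import io

# Hypothetical fixed-width layout for a legacy extract: (field name, start, end).
LAYOUT = [("account_id", 0, 8), ("name", 8, 28), ("amount_cents", 28, 37)]


def parse_record(line):
    """Slice one fixed-width line into a dict of stripped field values."""
    return {name: line[start:end].strip() for name, start, end in LAYOUT}


def cleanse(record):
    """Return a cleaned output row, or None when the record fails a basic check."""
    if not record["account_id"].isdigit() or not record["amount_cents"].isdigit():
        return None
    cents = int(record["amount_cents"])
    # Format with integer arithmetic to avoid floating-point rounding on money.
    amount = f"{cents // 100}.{cents % 100:02d}"
    return [record["account_id"], record["name"].title(), amount]


def to_tab_delimited(lines):
    """Convert legacy lines to tab-delimited text and collect rejected lines."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["account_id", "name", "amount"])
    rejected = []
    for line in lines:
        row = cleanse(parse_record(line))
        if row is None:
            rejected.append(line)  # keep bad records for review, do not drop silently
        else:
            writer.writerow(row)
    return out.getvalue(), rejected
```

Collecting rejected records separately, rather than discarding them, is one way to surface data quality issues while the conversion runs instead of after the load fails.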
Posted by Arbutus Software on Fri, Feb 05, 2010 @ 09:26 AM
The harsh statistics are that 67% of all data migration projects are not delivered on time and average a 41% time overrun.
Eleanor Roosevelt said “Learn from the mistakes of others. You can’t live long enough to make them all yourself.”
What we have learned is that the traditional approach to data migration projects relies upon developing conversion programs in third-generation languages (3GLs) or with complex ETL tools, which typically takes many weeks or months of research and coding. Often data quality issues are not found until late in the process, when the conversion programs are finally being run and tested. This often results in further delays, as missing or additional data conversion requirements uncovered at that point still need to be programmed into the conversion process.
As a result, many companies struggle with the complexities of migrating legacy data into new systems like SAP or creating a new data mart. There are a variety of reasons for this, including:
- A lack of legacy system skill sets and tools to understand and work with the source data
- An inability to foresee and validate all the data quality issues and business rules within the source data, so that data problems are often not identified until the data migration is already well underway
- Rigidity in the overall process, which prevents or delays changes needed to address unexpected data issues or changing requirements, or makes them very costly
Increasingly, companies faced with data migration projects are looking into using more agile technologies to enable a much more proactive approach to data migration and to allow emerging issues or changing needs to be dealt with immediately.
For agile technologies to be effective in a data migration or data warehousing project, they need to:
- Provide access to source data directly from the legacy systems
- Enable data querying/profiling to proactively discover data quality issues and to validate/determine business rules in the source data
- Perform the necessary data transformations required for loading data into the new system
- Facilitate timely changes to the migration process based on emerging issues or changing needs
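To make the data profiling requirement in the list above more concrete, here is a minimal Python sketch of the idea: summarise each field of the source data and count values that break a candidate business rule before any conversion code is written. The function name, the sample field names and the rules are hypothetical and are not taken from any particular product.

```python
from collections import Counter


def profile(rows, rules):
    """Summarise blanks, distinct values and rule violations for each field.

    rows  -- list of dicts, one per source record
    rules -- dict mapping a field name to a function that returns True for valid values
    """
    report = {}
    fields = rows[0].keys() if rows else []
    for field in fields:
        values = [(row.get(field) or "").strip() for row in rows]
        present = [v for v in values if v]
        rule = rules.get(field)
        report[field] = {
            "blank": len(values) - len(present),
            "distinct": len(set(present)),
            # Only non-blank values are checked against the rule.
            "violations": sum(1 for v in present if rule and not rule(v)),
            "top_values": Counter(present).most_common(3),
        }
    return report
```

For example, calling profile(rows, {"status": lambda v: v in {"A", "C"}}) would count every status code other than A or C as a violation, which is the kind of finding that is cheaper to act on early than late.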
Is Agile Data Migration possible? In a future post we will share a real-life story from last year that suggests that, yes, Agile Data Migration is possible.