Thursday 27 September 2012

Why migrate SAP BW data flows from 3.x to 7.x?

This post starts with some discoveries about, and useful features of, the BW 7.3 data flow migration wizard, and then expands on the topic of the post: why bother migrating an SAP BW data flow from the 3.x objects over to 7.x? This question has persisted in the SAP BW community for some time now, since the introduction of BW 7.0 data flows in 2005. With the recent release of BW 7.3, the case for migration has become even more compelling.
 

Migration Wizard

 

Prior to the BW 7.3 release, one had to migrate each object in a data flow manually and separately from the 3.x version to 7.x. This can include the 3.x update rules, InfoSources, transfer rules and DataSources. With the migration wizard, however, it is now possible to automate the migration of entire data flows to 7.x, including the addition of transformations, Data Transfer Processes (DTPs) and 7.x InfoSources (if required). The screenshot below shows the options available in the wizard.



Migration Options

One particularly useful feature of the wizard is the automated update of the migrated loading processes (specifically the additional DTPs) into process chains. This has saved me a lot of searching and manual re-work in process chains post migration, adding DTPs in the right spot.

There is also a clear concept in the wizard of segmenting data flow migrations into projects, which enables more sophisticated management of data flow migration, and also of recovery, than previously existed.

I had some successes and some failures with the migration wizard's ability to automate the migration of an entire data flow. Well, ‘failures’ is probably too harsh, but some of the migrations were not successful and did require some manual re-work. Here’s a screenshot of the error log to give you an idea of the ‘look and feel’ of the error reporting:

Migration Wizard - Start Routine Errors


Errors were particularly apparent with update rules whose start routines relied on the old DATA_PACKAGE concept and were automatically migrated into 7.x transformations with SOURCE_PACKAGE start routines. I encountered a few syntax errors there, with the COMM_STRUCTURE missing components for some forms, but this was no major problem; it is hard to cover all coding scenarios in one migration tool.

One annoying bug that I encountered was not related to the migration wizard, but did occur post migration. This resulted in the following error:

Start Routine Syntax Error


This bug centered around the automatic update of the _ty_s_SC_1_full structure (used in the migrated form routine_9998 in the start routine). This structure was not automatically updated to reflect the source InfoSource structure, post the addition of some custom Z InfoObjects to the source InfoSource. Even after a manual update of the structure to include the additional Z InfoObjects and a successful save, the change was unable to be activated. The _ty_s_SC_1_full structure would always revert back automatically to a previous version, sans Z fields (See SAP Note 1052648 for further details on what is supposed to occur). Weird – but we were able to work around the issue without using the _ty_s_SC_1_full structure, while this bug is being fixed.

No tool, particularly a newly released one, is ever perfect. I found the migration tool to be a great accelerator in setting up the necessary 7.x objects (RSDS, etc.), with only a little bit of extra tinkering around the edges required to get everything up and running.

Another great feature of the migration tool is that it provides feedback on which objects were not successfully migrated. This enables you to channel your efforts into the right place to fix errors, without having to go through the whole data flow wondering what migrated successfully and what didn’t.

Have a look at the ‘BW 7.30: Data Flow Migration tool’ blog entry on SCN for step-by-step screenshots on using the migration wizard.

So what’s the point of migrating?

 

The above section presupposes that one would wish to migrate a data flow from 3.x to 7.x. This raises the question of why someone would wish to perform such a migration in the first place. To me, this situation arises in three categories:

1. Legacy 3.x implementation that used the 3.x transfer/update rules data flow concept;

2. 7.x implementation that used the 3.x data flow concept;

3. Any implementation that has installed a 3.x business content data flow and not migrated it!
 
Let’s look at these categories in further detail:
1. Legacy 3.x implementations
For the first point on legacy 3.x implementations, fair enough, your data flow model was built using the ETL logic that existed at the time. The question now is, does one invest in migrating the logic over to 7.x transformations, or leave things as they are… after all things are working just fine for the moment with transfer/update rules.

a. New Objects in BW 7.0

The arrival of SAP BW 7.0 (also known as SAP NetWeaver 2004s) brought with it a large change in the management of data flows. These changes helped to streamline and clearly segregate data flow objects based on their primary roles, i.e. persisting source data (DataSource), acquiring source data (InfoPackage), transforming data (Transformation) and moving data between persistent storage objects (Data Transfer Processes).

Improvements were also made in terms of data transfer performance, including parallelisation, reduction of loading steps through the elimination of transfer and update rules, error handling, etc. This is explained more comprehensively in SAP documentation.

b. Faster Data Load in BW 7.3

In the BW 7.3 release, further improvements have been made to the speed and capabilities of data transformation. These include look-ups from DSOs (using the Read from DataStore option) and from master data objects (using navigation attributes as a source). The suggested improvement in performance is between 10% and 20%. Without 7.x transformations in your data flow, you’ll be missing out on these benefits.

As a side note, data activation has also been enhanced in BW 7.3 through the use of a package fetch of the active table, as opposed to single look-ups. The suggested improvement in data activation is between 15% and 30%.

c. Go-forward ETL strategy

The above benefits available in BW 7.3 serve as an indication of where the focus of the development team at SAP lies. Legacy 3.x data flow support is rightly being maintained; however, the focus is on improving the ‘new’ 7.x data flow with its transformation/DTP concepts, and not the 3.x data flow. Indeed, in SAP BW 7.3, new capabilities such as Semantic Partitioning have been introduced, and these require 7.x transformations and InfoSources.

All in all with these factors in mind, the 7.x data flow is correctly classified as the ‘go-forward’ ETL strategy in BW. Furthermore, the migration of existing data flows to 7.x is generally recommended by SAP in cases where one wishes to ‘realise benefits of the new concepts and technology’.

d. Re-implementation Risk

With 7.x data flows established as the go-forward ETL strategy, this raises the spectre of the dreaded ‘re-implementation risk’ for customers that continue to use 3.x data flows. PepsiCo’s CTO, Javed Hussain, refers to the general SAP upgrade dilemma in a Jan 2012 ASUG news article, in which he states ‘you don’t maintain a certain level of version or capability, then you’re going to fall behind two ways: you’re going to have a support problem with what you’re running today; and you’re going to fall behind your competition because you haven’t upgraded yet.’

This rings true for the BW ETL strategy. If you stick with the older 3.x technology, you’re also going to ‘fall behind’, as you won’t be able to leverage the development effort that SAP is putting into modelling and transforming data using the ‘go-forward’ 7.x data flow. Also, at some point you’re going to run into a support issue, as either (a) 3.x data flow tech won’t be supported by SAP in the future, and/or (b) your staff/consultants could become less and less familiar with this ‘obsolete’ data flow over time.

However, the period of time until you run into a support issue could be an extended one. For example, some of the major banks in Australia are still running on good old COBOL and being supported by an aging army of consultants. No doubt the Aussie banks have benefited from a reduced IT CAPEX spend over the years by continuing to support and flog the COBOL horse. This perfectly reflects Hussain’s comment above, in that the systems (being core banking) are (a) becoming increasingly difficult to support, and (b) perhaps now lacking in competitive advantage.

Indeed, it could be argued that these support issues are linked to a spate of recent bank IT failures. Some analysts are suggesting that, due to this legacy, Australian bank consumers should be prepared for ‘15 years of bank IT failures’. So guess what the banks are doing… core upgrades, otherwise known as the dreaded ‘re-implementation’. Pretty dramatic, yes… but getting back to SAP BW, I don’t want to be doing a manual re-implementation of an entire data flow sometime down the track.
2. 7.x implementation that used 3.x data flow
My opinion is that point number two should basically never occur. Given the benefits of the 7.x data flow, including enhanced performance, functionality and being the ‘go-forward’ ETL strategy, there’s no logical reason to implement an obsolete 3.x data flow from scratch in a 7.x environment.
3. Any implementation that has installed a 3.x business content data flow and not migrated it!
Point number three however, is a bit of a different story. SAP continues to supply a mixture of 3.x and 7.x data flows in their business content. Business content is a collective term for pre-configured templates of objects, canned reports and data modelling scenarios from DataSource to Data Mart that are based on SAP and customer experience. I find business content great and use it often as an implementation accelerator, but have found some drawbacks with the continuing supply of some data flows by SAP in 3.x format. 

So do I take the leap and migrate?

 

So far I’ve extolled the virtues of migrating your data flow from 3.x to 7.x, but is it time to take the leap and migrate? What are the risks and is this wise in a productive environment?

The focus of this conversation should be flipped around to ‘why shouldn’t you migrate?’. Given the points raised, I think any business needs to be able to justify why it would implement, or continue to support, older legacy ETL data flows in a 7.x BW system. I’m not saying that continued use is never justifiable, and I have seen scenarios where the sheer breadth of productive BI data flows would carry significant cost in a migration project. However, in such scenarios, where the use of 3.x data flows is continued, a ‘go forward’ plan should be in place, i.e. is it realistic to still be on 3.x data flows in five years’ time?

In terms of the risks, there is always a risk that the data flow migration may not be successfully completed, with conversion errors occurring in one or more objects in the data flow. This risk still persists even with the use of the mass migration wizard, and I’ve highlighted some of the errors I’ve encountered with the wizard in the first section of this post.

Like any development, the migration work should take place in the development system and never in production. I’d recommend using the migration wizard to automate the conversion to the 7.x data flow as much as possible and to also ensure that you can recover back to the 3.x data flow if the conversion is not successful. To aid in this, each data flow should have a separate migration project. 

Some ABAP knowledge will be required in more complex transformations with start/end routines and ABAP transformation code, to ensure that the conversion utility has properly handled the transformation of the code to ABAP OO where applicable. 

For testing purposes, I’d recommend taking a sample snapshot of data target results prior to the data flow conversion. This should then be compared to the same data sample post data flow conversion, after you’ve loaded data into the data target using a DTP. There are various ways of doing this depending on your comfort level; the main point is that you need to compare the same set of data pre and post conversion. To aid in this, you may wish to move the pre-conversion data into a copied data target, or just save the results in a spreadsheet, etc., prior to reloading the data target post conversion.
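
If you go the spreadsheet route, the comparison itself can be scripted. Below is a minimal Python sketch, not an SAP tool: it assumes you have exported the same selection from the data target to CSV before and after the conversion. The field names (COMPANY_CODE, FISCAL_PERIOD, AMOUNT), the sample rows and the helper functions are all hypothetical and only there for illustration.

```python
import csv
import io
from decimal import Decimal


def load_snapshot(csv_text, key_fields, value_fields):
    """Sum the key figures per combination of characteristics in a CSV export."""
    totals = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = tuple(row[f] for f in key_fields)
        sums = totals.setdefault(key, [Decimal("0")] * len(value_fields))
        for i, field in enumerate(value_fields):
            sums[i] += Decimal(row[field])
    return totals


def compare_snapshots(before, after):
    """Return (keys missing after, keys added after, keys with changed values)."""
    missing = sorted(set(before) - set(after))
    added = sorted(set(after) - set(before))
    changed = sorted(k for k in set(before) & set(after) if before[k] != after[k])
    return missing, added, changed


# Hypothetical sample exports: same selection, pre and post conversion.
BEFORE_CSV = """COMPANY_CODE,FISCAL_PERIOD,AMOUNT
1000,2012001,150.00
1000,2012002,200.50
2000,2012001,75.25
"""

AFTER_CSV = """COMPANY_CODE,FISCAL_PERIOD,AMOUNT
1000,2012001,150.00
1000,2012002,200.05
3000,2012001,10.00
"""

KEYS = ["COMPANY_CODE", "FISCAL_PERIOD"]  # assumed characteristics
VALUES = ["AMOUNT"]                        # assumed key figure

before = load_snapshot(BEFORE_CSV, KEYS, VALUES)
after = load_snapshot(AFTER_CSV, KEYS, VALUES)
missing, added, changed = compare_snapshots(before, after)

print("Missing after conversion:", missing)
print("Added after conversion:", added)
print("Changed key figures:", changed)
```

Summing per key before comparing means the check does not depend on row order or on how the records were split across requests. Using Decimal rather than float avoids spurious differences from rounding.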

To mitigate the risk in transporting the migrated data flow to a production environment, the entire data flow, with process chain and all should be thoroughly tested in a QA environment (if you’re fortunate enough to have one). Also if you are fortunate enough to be able to get away with it, insist that you only finally transport data flows in a modular fashion, flow by flow, into production and don’t go with a big bang approach.

Hope you liked this post and found it useful.


2 comments:

  1. I am William..
    I just browsing through some blogs and came across yours!
    Excellent blog, good to see someone actually uses for quality posts.
    Your site kept me on for a few minutes unlike the rest :)
    Keep up the good work!Thanks for sharing a important information on sharepoint

  2. Dear Venugopala,

    Let me congratulate you on coming so close to the crux of matter. You just missed it un-fortunately.

    Ideally, you must have started with data flows in 3.x and said "they just do not exist" and instead are created on fly to show the entire flow of data. Data flow, being an object now, is SAP term for flow of data and thus mentioning it is a must.

    Following this,
    a) you could have focussed on why data flows exists as objects in 7.0
    a.1) Cant 7.0 do the same as 3.x generate data flows on fly (from all SAP metadata stored)

    b) why and how 7.0 emulates objects from 3.x
    b.1) Any loss of functionality while emulating 3.x within 7.0 system?
    b.2) Is 7.0 any better running in 7.0 style over 3.x emulation?
    b.3) Why cant 7.3 emulate 3.x objects like 7.0 did/could?

    c) 7.3 related content could have been explained here

    Lastly, 3.x and 7.0 are both SAP and dis-owning 3.x objects by calling them legacy is just un-professional.
