Tuesday 27 March 2012

Write-Optimized DSO

The objective of a Write-Optimized DSO is to save data as efficiently as possible, so that it can be processed further without activation and without the additional effort of generating SIDs, aggregating, or maintaining a data-record-based delta. It is a staging DataStore used for faster uploads.

The Write-Optimized DSO has been primarily designed as the initial staging area for source system data, from which the data can be transferred to a Standard DSO or an InfoCube.



  • Data is saved quickly in the write-optimized DataStore object and stored in its most granular form. Document headers and items are extracted using a DataSource and stored in the DataStore.
  • The data can then be written immediately to further data targets in the architected data mart layer for optimized multidimensional analysis.

The key benefit of using a write-optimized DataStore object is that the data is immediately available for further processing in the active version, so you save activation time across the landscape. The system does not generate SIDs for write-optimized DataStore objects, which achieves a faster upload. Reporting is also possible on the basis of these DataStore objects. However, SAP recommends using the Write-Optimized DataStore as an EDW inbound layer and updating the data into further targets such as standard DataStore objects or InfoCubes.

When is it recommended to use a Write-Optimized DataStore?

Here are typical scenarios for a Write-Optimized DataStore:
  • Fast EDW inbound layer.
  • SAP recommends using the Write-Optimized DSO as the first layer, the Enterprise Data Warehouse (EDW) layer. Since not all Business Content comes with this DSO layer, you may need to build your own. You can check table RSDODSO for version D and type "Write-Optimized" (see the ABAP sketch after this list).
  • There is often a need for faster data loads. A DSO can be configured as write-optimized, so the data load happens faster and the load window is shorter.
  • Used where fast loads are essential, for example multiple loads per day or short source system access times (worldwide system landscapes).
  • If the DataSource is not delta-enabled. In this case, you would use a Write-Optimized DataStore as the first stage in BI and then pull the delta request into a cube.
  • A write-optimized DataStore object is used as a temporary storage area for large sets of data when executing complex transformations for this data before it is written to the target DataStore object. Subsequently, the data can be updated to further InfoProviders; you only have to create the complex transformations once for all incoming data.
  • Write-optimized DataStore objects can serve as the staging layer for saving data. Business rules are applied only when the data is updated to additional InfoProviders.
  • If you want to retain history at request level. In this case you may not need a PSA archive; you can use a Write-Optimized DataStore instead.
  • If multidimensional analysis is not required and you only need operational reports, you can use a Write-Optimized DataStore first and then feed the data into a Standard DataStore.
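
For the RSDODSO check mentioned above, here is a minimal ABAP sketch that lists write-optimized DSOs delivered with Business Content. The type code 'W' for write-optimized and the exact field names are assumptions you should verify in SE11 on your own system.

    * Minimal sketch: list delivered (Business Content) write-optimized DSOs.
    * Assumptions to verify in SE11: RSDODSO-OBJVERS = 'D' is the delivered
    * version and RSDODSO-ODSOTYPE = 'W' marks a write-optimized DSO.
    DATA: lt_dso TYPE STANDARD TABLE OF rsdodso,
          ls_dso TYPE rsdodso.

    SELECT * FROM rsdodso INTO TABLE lt_dso
      WHERE objvers  = 'D'
        AND odsotype = 'W'.

    LOOP AT lt_dso INTO ls_dso.
      WRITE: / ls_dso-odsobject.    " name of the DSO
    ENDLOOP.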
  
Functionality of Write-Optimized DataStore 

Only active data table (DSO key: request ID, Package No, and Record No):
  • No change log table and no activation queue.
  • Size of the DataStore is maintainable.
  • Technical key is unique.
  • Every record has a new technical key, only inserts.
  • Data is stored at request level like PSA table.
No SID generation:
  • Reporting is possible (but performance is not optimized).
  • BEx Reporting is switched off.
  • Can be included in an InfoSet or MultiProvider.
  • Performance improvement during data load.
Fully integrated in data flow:
  • Used as a data source and a data target.
  • Export into InfoProviders via request delta.
  • Can be included in a process chain without an activation step.
  • Partitioned on request ID (automatic).
  • Allows parallel loads.
Uniqueness of Data:
  • Checkbox “Do not check Uniqueness of data”.
  • If this indicator is set, the active table of the DataStore object could contain several records with the same key. 
  • You cannot use reclustering for write-optimized DataStore objects, since their data is not meant for querying. You can only use reclustering for standard DataStore objects and DataStore objects for direct update.
A Write-Optimized DataStore is partitioned on request ID automatically, so you do not need to create additional partitions manually on the active table. Optimized write performance is achieved by request-level insertions, similar to the F table of an InfoCube. As we know, the F fact table is write-optimized while the E fact table is read-optimized.

Understanding Write-Optimized DataStore Keys:

Since data is written directly into the active table of a write-optimized DataStore, you do not need to activate the request as is necessary with the standard DataStore object. The loaded data is not aggregated; the history of the data is retained at request level. If two data records with the same logical key are extracted from the source, both records are saved in the DataStore object. The record mode responsible for aggregation remains, so the aggregation of data can take place later in standard DataStore objects.
The system generates a unique technical key for the write-optimized DataStore object. The technical key consists of the Request GUID field (0REQUEST), the Data Package field (0DATAPAKID) and the Data Record Number field (0RECORD). Only new data records are loaded to this key.
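
To make the technical key tangible, here is a hedged ABAP sketch that reads a few rows of the generated active table. The DSO name ZWO_SLS is hypothetical; a write-optimized DSO's active table is generated as /BIC/A<name>00, and the technical key columns typically appear as REQUEST, DATAPAKID and RECORD (verify the generated names in SE11).

    * Illustrative look at the technical key of a write-optimized DSO.
    * ZWO_SLS is a hypothetical DSO; its generated active table would be
    * /BIC/AZWO_SLS00 with assumed key fields REQUEST, DATAPAKID, RECORD.
    DATA: lt_data TYPE STANDARD TABLE OF /bic/azwo_sls00,
          ls_data TYPE /bic/azwo_sls00.

    SELECT * FROM /bic/azwo_sls00 INTO TABLE lt_data
      UP TO 100 ROWS.                    " sample only

    LOOP AT lt_data INTO ls_data.
      WRITE: / ls_data-request,          " request GUID (0REQUEST)
               ls_data-datapakid,        " data package  (0DATAPAKID)
               ls_data-record.           " record number (0RECORD)
    ENDLOOP.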

The standard key fields are not necessary with this type of DataStore object; you can define a Write-Optimized DataStore without a standard key. If standard key fields exist anyway, they are called semantic keys so that they can be distinguished from the technical key.
Semantic keys can be defined as standard keys in a further target DataStore. The purpose of the semantic key is to identify erroneous or duplicate records among the incoming data. All subsequent data records with the same key are written to the error stack along with the incorrect data records; they are not updated to the data targets. A maximum of 16 key fields and 749 data fields are permitted. Semantic keys protect data quality, but they do not appear as a key at database level. To process erroneous or duplicate records, you must define a semantic group in the DTP (data transfer process), which is used to define a key for the evaluation. If you assume that there are no incoming duplicates or erroneous records, you do not need to define a semantic group; it is not mandatory.
The semantic key determines which records should be detained during processing. For example, if you define order number and item as the key, and one erroneous record arrives with order number 123456, item 7, then any other records in the same request or in subsequent requests with order number 123456, item 7 will also be detained. This applies to duplicate records as well.
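
The detention logic can be pictured with a small simulation. The following ABAP sketch is purely illustrative and is not the actual DTP implementation: once a record with a given semantic key (order number plus item) fails, every later record with the same key goes to the error stack instead of the target.

    * Purely illustrative simulation of semantic-key detention; the real
    * logic lives inside the DTP error-stack handling.
    TYPES: BEGIN OF ty_rec,
             ordno TYPE n LENGTH 10,   " order number (semantic key part 1)
             item  TYPE n LENGTH 3,    " item number  (semantic key part 2)
             valid TYPE c LENGTH 1,    " ' ' = erroneous record
           END OF ty_rec,
           BEGIN OF ty_key,
             ordno TYPE n LENGTH 10,
             item  TYPE n LENGTH 3,
           END OF ty_key.

    DATA: lt_incoming TYPE STANDARD TABLE OF ty_rec,
          lt_target   TYPE STANDARD TABLE OF ty_rec,
          lt_errstack TYPE STANDARD TABLE OF ty_rec,
          lt_badkeys  TYPE SORTED TABLE OF ty_key
                           WITH UNIQUE KEY ordno item,
          ls_rec      TYPE ty_rec,
          ls_key      TYPE ty_key.

    LOOP AT lt_incoming INTO ls_rec.
      ls_key-ordno = ls_rec-ordno.
      ls_key-item  = ls_rec-item.
      READ TABLE lt_badkeys FROM ls_key TRANSPORTING NO FIELDS.
      IF sy-subrc = 0 OR ls_rec-valid = ' '.
        APPEND ls_rec TO lt_errstack.          " detain the record
        INSERT ls_key INTO TABLE lt_badkeys.   " remember the failed key
      ELSE.
        APPEND ls_rec TO lt_target.            " clean record goes through
      ENDIF.
    ENDLOOP.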
The semantic key definition integrates the write-optimized DataStore with the error stack through the semantic group in the DTP. As of SAP NetWeaver 2004s BI SPS10, the write-optimized DataStore object is fully connected to the DTP error stack function.
If you want to use a write-optimized DataStore object in BEx queries, it is recommended that you define a semantic key and run a check to ensure that the data is unique. In this case, the write-optimized DataStore object behaves like a standard DataStore object. If the DataStore object does not have these properties, unexpected results may be produced when the data is aggregated in the query.
Delta Administration: 
Data that is loaded into Write-Optimized DataStore objects is available immediately for further processing; the activation step that used to be necessary is no longer required. Note that the loaded data is not aggregated. If two data records with the same logical key are extracted from the source, both records are saved in the DataStore object, since each record receives its own unique technical key. The record mode (0RECORDMODE) responsible for aggregation remains, so the aggregation of data can take place later in standard DataStore objects. A Write-Optimized DataStore does not support image-based delta; it supports request-level delta, and you get a brand-new delta request for each data load.
  • Since write-optimized DataStore objects do not have a change log, the system does not create delta (in the sense of a before image and an after image). When you update data into the connected InfoProviders, the system only updates the requests that have not yet been posted.
  • A Write-Optimized DataStore supports request-level delta only. To capture before- and after-image deltas, you must post the latest request into further targets such as Standard DataStore objects or InfoCubes (the request-level mechanism is sketched below).
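
A small simulation may help to picture request-level delta. This ABAP sketch is illustrative only and does not show real DTP internals: the delta mechanism simply transfers whichever requests have not yet been posted to the connected target.

    * Illustrative simulation of request-level delta (not the actual DTP
    * internals): only requests not yet posted to the target are moved.
    TYPES: BEGIN OF ty_req,
             reqid  TYPE i,
             posted TYPE c LENGTH 1,
           END OF ty_req.

    DATA: lt_requests TYPE STANDARD TABLE OF ty_req,
          ls_req      TYPE ty_req.

    * Three loads have arrived; request 1 was already posted earlier.
    ls_req-reqid = 1. ls_req-posted = 'X'. APPEND ls_req TO lt_requests.
    ls_req-reqid = 2. ls_req-posted = ' '. APPEND ls_req TO lt_requests.
    ls_req-reqid = 3. ls_req-posted = ' '. APPEND ls_req TO lt_requests.

    * Delta extraction: pick up only requests not yet posted.
    LOOP AT lt_requests INTO ls_req WHERE posted = ' '.
      WRITE: / 'Transferring request', ls_req-reqid.
      ls_req-posted = 'X'.
      MODIFY lt_requests FROM ls_req.
    ENDLOOP.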

Reporting Write-Optimized DataStore Data:

For performance reasons, SID values are not created for the characteristics that are loaded. The data is still available for BEx queries, but compared with standard DataStore objects you can expect somewhat worse query performance, because the SID values have to be determined during reporting. It is therefore recommended that you use write-optimized objects as a consolidation layer and update the data to standard DataStore objects or InfoCubes.
From an OLAP BEx query perspective, there is no big difference between a Write-Optimized DataStore and a Standard DataStore: the technical key is not visible for reporting, so the look and feel is just like a regular DataStore. As noted above, for BEx use you should define a semantic key and ensure the data is unique, so that the object behaves like a standard DataStore object; otherwise unexpected results may be produced when the data is aggregated in the query.
In a nutshell, a Write-Optimized DSO is not meant for reporting; it is a staging DataStore used for faster uploads. Direct reporting on this object is possible without activation, but from a performance perspective it is better to report through an InfoSet or a MultiProvider.
