Use ADF and SSIS to Copy Data from SAP BW: the Comparison of Options

Executive Summary

As part of the modern enterprise data analytics journey, many SAP customers look for ways to bring SAP data onto Azure data platform, to take advantage of the storage and compute power of Azure. In the past series of articles, we explained both high level architecture and detailed technical configurations of integrating SAP data with Azure using AecorSoft features.

Specifically, when it comes to SAP BW data, Azure Data Factory already offers native SAP BW connector out of the box, while AecorSoft Integration Service also offers ADF V2 ready component for SAP BW. At high level, both approaches attempt to solve the same BW-to-Azure integration problem, but the product implementation principles and results are vastly different, which will be reviewed and explained thoroughly in this paper.

From the performance, scalability, features, and DevOps experience perspectives, AecorSoft BW connector demonstrates significant advantages over ADF native BW connector.

Test Setup

Systems

SAP IDES Ehp7 with BW, running on i7 quad-core server, with the buffer roll_area is set as default value 2GB.

All Azure services, including ADF, ADF SSIS IR, Azure blob storage, are configured in West US2 region.

SSISDB storing the SSIS packages is in East US region.

Azure Self-hosted IR local runtime runs on i7 quad-core server.

Azure SSIS IR is configured with budget setup of Standard D2_V2 with 7168MB memory, 1 node.

SSIS on-premise is i7 quad-core server.

BW Source Object

BW InfoCube: IDES demo cube

Destination

Azure Blob Storage

Test Cases

  1. InfoCube extract with 7K records

  2. InfoCube extract with 200K records

  3. InfoCube extract with 600K records

Result

Test Details

Test with ADF’s own BW Connector

Prerequisite

One of the prerequisites of using ADF BW connector is the setup of Self-hosted Integration Runtime, which has to be installed either on-premise or on Azure VM as part of the Azure virtual network. It acts as an agent to connect on-premise resources and ADF.

Pipeline Construction

Once the IR is set up and configured, a Copy activity can be defined in ADF portal. We chose Azure US West2 as the region to minimize the network latency. Through Linked Service, the SAP BW system access information can be specified. Notice that only basic system access method is available, which means there is no SAP load balancing, and all the BW data extraction activities will occur on the BW central instance.

Next step is to configure the source object. In the ADF portal, a “Browse SAP” function is provided to navigate the list of BW InfoCubes and BEx QueryCubes. When the cube is finally discovered, dimensions and measures can be selected to form the basic MDX query. There is no advanced navigation experience and configuration option available. The only alternative way to overcome the limitation is to manually build the MDX statement without using the “Browse SAP” function.

The configuration of “Sink”, or the destination, can be done well through the web experience, if the Sink is one of the Azure data storage services.

Execution

Once the pipeline is constructed, the Copy activity can be published and triggered.

The test results are:

Result

Test Cases for ADF native BW Connector


7K records


200K records


Although theoretically the buffer size can be adjusted, there has to be an upper limit. It is important to expose the scalability issue.

600K records


Test with AecorSoft’s BW Source Component

Prerequisite

AecorSoft BW component is part of the AecorSoft Integration Service component set for SSIS, which requires SQL Server Data Tools (SSDT) to develop the SSIS ETL packages. The developed packages can be deployed to on-premise SSIS server, IaaS based SSIS server, or ADF v2 SSIS IR as a complete PaaS experience. In order for SSIS IR to access on-premise resources, Azure virtual network is required.

Pipeline Construction

At design time, SSIS packages are developed in SSDT Visual Studio. The AecorSoft SAP connection manager supports load balancing which is critical to control SAP system load.

Next, in the BW InfoCube and QueryCube source object configuration, the user experience is extremely snappy without latency. Users have the choice to select dimension display in Key or Text, define filters, and specify variables for BEx queries. The support for filters and variables is extremely important for the “push-dwon” query experience. It ensures that only the query result set, instead of the entire source InfoCube or QueryCube content, is sent across the wire.

Here is an example of BEx query view with variable and filter customization interface. Please notice this is not the source object for test comparison. The tests were using BW demo cube.

Through the fluent and feature-rich user interface, the BW source configuration can be completed very productively.

The last step in the data flow design is to configure the destination component. The Azure Blob and Azure Data Lake destinations can be configured by using Azure Feature Pack for SSIS, which is a free download from Microsoft. The Azure SQL Database destination can be configured through the high performing and reliable OLE DB Destination component.

Execution

Once the SSIS data flow is built, it can be deployed onto either local SSIS server or ADF SSIS IR.

Here is the test result from the execution of on-premise SSIS packages.

Result

Test Cases for On-Prem SSIS execution with AecorSoft

7K records


200K records


600K records


Here is the result result from the execution of the packages in ADF SSIS-IR.

Result

Test Cases for ADF SSIS IR with AecorSoft

7K records


200K records


600K records


Conclusion

For light requirement of SAP BW data copy requirement with small amount of data load, simple data models, and no pressing demand for performance, ADF BW connector could satisfy such requirement just fine. If scalability, productivity, load balancing, and performance are you concerns, go for professional solution like AecorSoft BW component.