
File Formats

When data is sent to Azure Data Lake Storage (ADLS), the following file formats are supported:

  • Flat file: plain text or gzip-compressed

  • Parquet: uncompressed, or compressed with one of several compression options

  • Avro: uncompressed, or compressed with one of several compression options

  • ORC: uncompressed, or compressed with one of several compression options
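
To make the flat-file option concrete, the sketch below writes the same rows either as plain text or as a gzip-compressed file, using only the Python standard library. It is an illustration of the two output variants, not ADI's implementation; the function names, file name, and sample column names (MATNR, MENGE) are hypothetical.

```python
import csv
import gzip
import os
import tempfile


def write_flat_file(path, header, rows, compress=False):
    """Write rows as a delimited flat file, optionally gzip-compressed."""
    # gzip.open and open share the same text-mode interface,
    # so the only difference between the two variants is the opener.
    opener = gzip.open if compress else open
    with opener(path, "wt", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)


def read_flat_file(path, compress=False):
    """Read a flat file back into a list of rows (header first)."""
    opener = gzip.open if compress else open
    with opener(path, "rt", newline="", encoding="utf-8") as f:
        return list(csv.reader(f))


if __name__ == "__main__":
    # Hypothetical sample data and target path, for illustration only.
    target = os.path.join(tempfile.mkdtemp(), "materials.csv.gz")
    write_flat_file(target, ["MATNR", "MENGE"], [["100", "5"], ["200", "7"]], compress=True)
    print(read_flat_file(target, compress=True))
```

The columnar formats (Parquet, Avro, ORC) need a dedicated writer library and are not shown here.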

File Partitioning / Chunking

Optionally, a partition size can be configured for the target files. For example, during an initial full load from a transactional table with 500M rows, the target Parquet output can be split into 50 files of 10M rows each in order to speed up distributed queries. Once configured, ADI applies this partitioning automatically during data transfer.
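
The arithmetic behind the example above can be sketched as follows. This is a minimal illustration of fixed-size chunking, assuming each target file holds a contiguous range of rows; it does not describe how ADI plans chunks internally, and `plan_chunks` is a hypothetical helper name.

```python
def plan_chunks(total_rows, rows_per_chunk):
    """Return (start, end) row offsets, end exclusive, one pair per target file."""
    if total_rows < 0 or rows_per_chunk <= 0:
        raise ValueError("total_rows must be >= 0 and rows_per_chunk must be > 0")
    # The last chunk is shorter when total_rows is not an exact multiple.
    return [
        (start, min(start + rows_per_chunk, total_rows))
        for start in range(0, total_rows, rows_per_chunk)
    ]


if __name__ == "__main__":
    chunks = plan_chunks(500_000_000, 10_000_000)
    print(len(chunks))   # 50
    print(chunks[0])     # (0, 10000000)
    print(chunks[-1])    # (490000000, 500000000)
```

When the row count is not an exact multiple of the partition size, this sketch places the remainder in a smaller final chunk.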

Maximum Performance and Scalability

ADI has a significant performance and scalability advantage over comparable tools when sending SAP data to ADLS. For example, ADI’s Parquet writer is twice as fast as the Parquet writer in the Microsoft SSIS Azure Feature Pack, and five times as fast as ADF’s native SAP connector combined with its Parquet writer.