Data Export Service alternative using Dataverse Web API and Azure Data Factory

  • Match the existing DES solution as closely as possible in terms of architecture, technology and performance.
  • Keep the changes to the Power BI reporting to a minimum.
  • Keep costs low, considering that DES is a virtually free add-on service.
  • Have a facility to track and synchronise metadata changes between Dataverse and the reporting database.

De-facto alternative — Synapse Link

As per Microsoft documentation, the de-facto, ready-made alternative to DES is Azure Synapse Link for Dataverse (https://powerapps.microsoft.com/en-us/blog/do-more-with-data-from-data-export-service-to-azure-synapse-link-for-dataverse/). The following are some of the pros and cons that were observed after evaluating the Synapse Link service, taking into consideration the client's requirements stated earlier.

  • No development work required and can be configured and put into production immediately
  • Has all the features of DES, including handling of metadata changes
  • Virtually no changes would be required on the Power BI reporting
  • The Synapse Link service only maintains a near real-time data sync between Dataverse and an Azure Storage Account.
  • An additional Azure Data Factory (ADF) pipeline is required to push data to a SQL database, although this is readily available via an ADF pipeline template in the pipeline gallery (https://learn.microsoft.com/en-us/power-apps/maker/data-platform/azure-synapse-link-pipelines?tabs=synapse-analytics).
  • There is considerable latency between the data being synced to the Storage Account and the subsequent push to the SQL database via the ADF pipeline, at times reaching one hour.
  • To take full advantage of this solution, a Synapse Workspace is required, with the Power BI reporting sourced from this workspace instead of the SQL database. This could require a considerable amount of changes to the Power BI reports.
  • This solution is likely to cost more than any other alternative due to the potential need for a Synapse Workspace and the associated additional compute and storage requirements.

Custom built ADF alternative

There is apparently no like-for-like DES replacement available from Microsoft. There may, of course, be third-party vendors providing packaged solutions in the market; however, for this client we only explored options utilising products and services available on the Microsoft data integration platform.

FetchXML

  • Queries are easier to write as the syntax is similar to regular SQL
  • Performs relatively well
  • There is an apparent limitation of 5,000 rows that can be fetched at a time, and this may require special handling within ADF
  • Columns with a large number of null values may get dropped if they are not within the FetchXML sample set
  • Only a limited set of metadata is exposed to FetchXML, so it may not be possible to easily build a metadata maintenance pipeline to track and handle metadata changes between Dataverse and the SQL database using ADF
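As a rough illustration of the special handling mentioned above, FetchXML supports paging through the `page`, `count` and `paging-cookie` attributes on the `<fetch>` element. The helper below is a hypothetical sketch of how a paged query string could be assembled; the entity and attribute names passed to it are placeholders, not part of the actual solution.

```python
# Sketch of FetchXML paging around the 5,000-row limit. Each request must
# carry the page number and the paging cookie returned by the previous page.
from xml.sax.saxutils import quoteattr
from typing import Optional

def build_paged_fetchxml(entity: str, attributes: list[str],
                         page: int, count: int = 5000,
                         paging_cookie: Optional[str] = None) -> str:
    """Wrap a simple FetchXML query with the paging attributes Dataverse expects."""
    # quoteattr escapes and quotes the cookie, which is itself an XML fragment
    cookie_attr = f" paging-cookie={quoteattr(paging_cookie)}" if paging_cookie else ""
    attrs = "".join(f'<attribute name="{a}" />' for a in attributes)
    return (
        f'<fetch page="{page}" count="{count}"{cookie_attr}>'
        f'<entity name="{entity}">{attrs}</entity>'
        f"</fetch>"
    )
```

In ADF this would translate to an Until activity that increments the page number and threads the cookie through a pipeline variable between iterations.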
Dataverse Web API (OData)

  • Queries need to be constructed as part of the URL
  • Performance is similar to FetchXML
  • There is a limitation of 5,000 rows per page, but this can be handled via OData paging rules within the ADF REST API source connector without the need to build any additional looping
  • Columns with null values do not get dropped
  • Able to retrieve all metadata relating to entities and attributes, thereby making it possible to build a metadata maintenance pipeline to handle metadata changes between Dataverse and the SQL database using ADF
  • Also able to retrieve both OptionSetMetadata and GlobalOptionSetMetadata
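The paging behaviour described above follows the standard OData contract: each page of results carries an `@odata.nextLink` property until the final page. ADF's REST source follows this link automatically via its pagination rules, but the same contract can be sketched in plain Python. The URL and the injected `fetch_page` stub below are illustrative; a real client would issue authenticated GETs with a `Prefer: odata.maxpagesize=5000` header.

```python
# Follow @odata.nextLink until it disappears, yielding every record.
# fetch_page is injected so the sketch stays self-contained and testable.
from typing import Callable

def iter_records(first_url: str, fetch_page: Callable[[str], dict]):
    """Yield all records across pages, following @odata.nextLink."""
    url = first_url
    while url:
        body = fetch_page(url)             # JSON body of one page
        yield from body.get("value", [])
        url = body.get("@odata.nextLink")  # absent on the final page

# Stub standing in for the Web API, returning two pages of accounts:
pages = {
    "https://example.crm.dynamics.com/api/data/v9.2/accounts":
        {"value": [{"name": "A"}], "@odata.nextLink": "page2"},
    "page2": {"value": [{"name": "B"}]},
}
records = list(iter_records(
    "https://example.crm.dynamics.com/api/data/v9.2/accounts",
    pages.__getitem__))
```

This is exactly what the ADF REST connector's `AbsoluteUrl` pagination rule keyed on `$['@odata.nextLink']` does for you, which is why no additional looping is needed.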

High Level Design

The proposed solution consists of two ADF pipelines: one to track and update metadata changes between the Dataverse model and the SQL database, and another to copy the data from Dataverse to the SQL database. The first, the metadata maintenance pipeline, collects and maintains all the metadata needed to drive the data load pipeline. Although it is possible to run both pipelines together, for the best performance and operational efficiency it is recommended to separate the two and run the metadata pipeline in a controlled manner during a maintenance window, where any changes made to the Dataverse model are announced in advance. This also gives the reporting side a chance to make any additional changes if required.
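To make the metadata maintenance idea concrete, the sketch below diffs the attribute list reported by Dataverse (via `EntityDefinitions` in the Web API) against the columns already present in the reporting database and emits the DDL needed to catch up. The type mapping and function names are assumptions for illustration only; they do not reflect the actual scripts in the repository.

```python
# Hypothetical metadata diff: Dataverse attribute types -> SQL DDL.
# Only additions are handled here; renames and drops would need similar rules.
TYPE_MAP = {"String": "NVARCHAR(MAX)", "Integer": "INT",
            "DateTime": "DATETIME2", "Uniqueidentifier": "UNIQUEIDENTIFIER"}

def metadata_diff(table: str, dataverse_attrs: dict[str, str],
                  sql_columns: set[str]) -> list[str]:
    """Return ALTER TABLE statements for attributes missing from SQL."""
    ddl = []
    for name, attr_type in sorted(dataverse_attrs.items()):
        if name not in sql_columns:
            sql_type = TYPE_MAP.get(attr_type, "NVARCHAR(MAX)")
            ddl.append(f"ALTER TABLE {table} ADD [{name}] {sql_type} NULL;")
    return ddl
```

In the pipeline, statements like these would be executed by a Script or Stored Procedure activity before the data load pipeline runs, which is one reason to schedule the metadata pipeline inside a maintenance window.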




Git Repo

The SQL scripts supporting this solution have been placed in the following GitHub repository: https://github.com/rarpal/DynamicsWebAPIToSQL.git
