Azure Synapse Analytics in the Azure Architecture Centre
October 22, 2021
Microsoft provide an Azure Architecture Centre which contains resources to help understand which Azure services are appropriate for particular scenarios and workloads. The centre contains a series of reference architectures which organisations can use as a blueprint to build out solutions. Among these are reference architectures which include Azure Synapse Analytics. If you are looking to build out a solution using Synapse Analytics, these reference architectures may prove useful to ratify architecture and workload decisions.
In this blog post we’ll look at 3 architecture patterns that Azure Synapse Analytics is included in, and the scenario/workload pertaining to each architecture. These 3 architectures have been selected because they are reasonably general across any industry and vary in how Synapse Analytics is being used. There is also a list of reference architectures in the Reference List section.
If you would like to jump straight into a list of reference architectures available that includes Synapse Analytics then click here. There are architectures that are specific to certain industries and use-cases.
What is important to note is that whilst the Architecture Centre lists Synapse Analytics in various scenarios and workloads, and shows which services can work alongside it, it does not go into detail on how to implement the service or what mechanisms connect each service. Microsoft's product-specific documentation is available to help implement and work with Synapse Analytics.
Azure Architecture Centre Overview
The Azure Architecture Centre contains the following main categories to provide guidance when building out Azure solutions:
- Reference Architectures
- Microsoft Azure Well-Architected Framework
- Azure Application Architecture Guide
- What’s new in Azure Architecture Centre
The image below shows the architecture reference list under the Azure Categories heading.
The homepage has jumping off points into browsing and searching for Azure architectures alongside the Well-Architected framework and architecture guides.
The Browse Architectures link on the homepage will take you to a search page in which you can define which general products and services you wish to find architectures for. Searching for synapse analytics will return relevant reference architectures and solution ideas.
In the next section we’re focusing on a set of 3 architecture patterns that cover general Synapse Analytics usage. These architectures are not specific to any industry and look to show where Synapse Analytics sits within a data architecture.
Data Warehousing And Analytics
The link to the reference architecture is here. In terms of presenting Synapse Analytics as a storage solution, this architecture provides a reasonably simple view of how Synapse Analytics is used. From this architecture we can see that:
- Data Factory is used to load data from Relational and Non-Relational sources into Azure Storage.
- Data Factory is used to load data from Azure Storage into Synapse Analytics Dedicated SQL Pools.
- Azure Analysis Services is used as the semantic layer loading data from Synapse Analytics.
- Power BI is used to connect to Azure Analysis Services to provide the reporting and dashboarding.
Of particular interest in this architecture pattern is that it’s the only pattern that features Azure Analysis Services. Whilst there are use-cases for migrating from Azure Analysis Services to Power BI Premium, there are still plenty of use-cases where Azure Analysis Services (AAS) is a better fit. AAS is a separate service which allows the creation and deployment of Tabular models. The service has a maximum tier size of 400GB RAM, and you can create replicas to scale out during busy times.
Another interesting call-out is the use of Data Factory. Synapse Analytics contains the Pipelines service, which is based on Data Factory, however the two do not have exact feature parity. For example, Data Factory contains the Power Query data flow whilst Pipelines does not. That said, Data Factory in this pattern could perhaps be removed and Synapse Pipelines used in its place.
In terms of migrating from an On-Premises SQL Server infrastructure that includes BI stack elements such as SQL Server, Integration Services, Analysis Services, and Reporting Services, this architecture could provide a simpler migration roadmap.
Modern Data Warehouse for Small and Medium Business
The link to the reference architecture is here. This is an interesting architecture in that only the Serverless SQL Pools service within Synapse Analytics is being used. One reason for this is that Serverless SQL Pools is a lightweight service to deploy: unlike the Dedicated SQL Pools service, no data loading or movement is required. A platform can therefore be built to connect to source data in a Data Lake and “prove out” the usage of Synapse Analytics. Client tools such as Power BI can connect directly to Serverless SQL Pools and load data for appropriate semantic model creation, report authoring and dashboarding.
The usage of Serverless SQL Pools within this architecture can be defined as:
- Synapse Analytics Pipelines are used to extract data of various types, loading it into Azure Data Lake Gen2 and loading structured data into SQL Database.
- Serverless SQL Pools is connected to Azure Data Lake Gen2 to query data.
- Power BI connects to Serverless SQL Pools to load data in for data modelling, reporting, and dashboarding.
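The "no data loading" point can be illustrated with the kind of query Serverless SQL Pools runs directly over files in the lake. The following is a minimal sketch that only builds the T-SQL text using the OPENROWSET function. It assumes the files are Parquet, and the storage account, container and folder names are hypothetical placeholders rather than anything taken from the reference architecture.

```python
def build_openrowset_query(account: str, container: str, folder: str, top: int = 10) -> str:
    """Return a T-SQL query that reads Parquet files in place from Azure Data Lake Gen2."""
    # Wildcard path over every Parquet file in the folder.
    # The account, container and folder values are placeholders for this sketch.
    url = f"https://{account}.dfs.core.windows.net/{container}/{folder}/*.parquet"
    return (
        f"SELECT TOP {int(top)} *\n"
        "FROM OPENROWSET(\n"
        f"    BULK '{url}',\n"
        "    FORMAT = 'PARQUET'\n"
        ") AS [result];"
    )


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(build_openrowset_query("contosolake", "raw", "sales"))
```

A statement like this could be run from any SQL client connected to the serverless endpoint. Wrapping it in a view is one way to give Power BI a stable object to load from.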
In this architecture, small to medium sized businesses have the opportunity to build out a Synapse Analytics infrastructure by using Serverless SQL Pools and Pipelines. If in the future the volume of data grows to the point where Dedicated SQL Pools are required, then the infrastructure is already in place.
Analytics End-to-End with Azure Synapse
The link to the reference architecture is here. In this architecture, Synapse Analytics is being used within an enterprise-wide data hub providing a single source of truth. However, this architecture looks complicated as there is a wide variety of services connecting to, or being connected from, Synapse Analytics. Let’s break this diagram down into smaller chunks.
- Synapse Analytics Pipelines is responsible for loading all types of data into Azure Data Lake Gen2.
- Dedicated SQL Pools are used as a central data warehouse loaded from the Data Lake.
- Serverless SQL Pools are used as an ad-hoc data exploration tool querying data in the Data Lake.
- Event Hubs/IoT Hubs and Stream Analytics are used to process and load data into the Data Lake (long term storage) and also into Power BI for real-time analysis.
- Spark Pools are used to process data from the Data Lake and integrate with Azure Cognitive Services and Azure Machine Learning to run AI/ML algorithms over this data and return the results.
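One way to make sense of the diagram is to write the bullets above down as a small dependency graph and ask it questions. The sketch below is an illustration only: the component labels are shorthand chosen for this example, the edges are read from the bullets rather than from the Architecture Centre itself, and the Cognitive Services and Machine Learning integrations are left out because Spark Pools call them rather than read data from them. It uses `graphlib` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Each key lists the components it reads data from, following the bullets above.
# Labels are shorthand for this sketch, not official service identifiers.
READS_FROM = {
    "Data Lake Gen2": {"Synapse Pipelines", "Stream Analytics"},
    "Stream Analytics": {"Event Hubs / IoT Hubs"},
    "Dedicated SQL Pools": {"Data Lake Gen2"},
    "Serverless SQL Pools": {"Data Lake Gen2"},
    "Spark Pools": {"Data Lake Gen2"},
    "Power BI (real-time)": {"Stream Analytics"},
}


def build_order(reads_from):
    """Return one valid order in which components could be stood up, upstream first."""
    return list(TopologicalSorter(reads_from).static_order())


def upstream_of(component, reads_from):
    """Return every component the given one depends on, directly or indirectly."""
    seen = set()
    stack = [component]
    while stack:
        for parent in reads_from.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


if __name__ == "__main__":
    print(build_order(READS_FROM))
    print(sorted(upstream_of("Dedicated SQL Pools", READS_FROM)))
```

Removing a key from the dictionary drops that component from the picture without touching the rest, which is a quick way to check what a trimmed-down version of the architecture still depends on.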
Although this architecture references a wide variety of services, it does not mean that an architecture must include all of them. Perhaps the Enrich stage (Azure Cognitive Services, Azure Machine Learning) does not need to be included in an organisation's data infrastructure at the outset. It can always be added at some point in the future.
Reference List
The following section includes links to both Architectures and Solution Ideas, grouped under each Azure Category. Architectures lay out each service and which services are connected, whilst Solution Ideas take those services and provide a real-world example of how they can be used.
Azure Categories: AI + Machine Learning
- Defect prevention with predictive maintenance using analytics and machine learning: Link
Azure Categories: Analytics
- Automated Enterprise BI: Link
- Analytics end-to-end with Azure Synapse: Link
- Modern data warehouse for small and medium business: Link
- Data warehousing and analytics: Link
- Advanced analytics architecture: Link
- Big data analytics with Azure Data Explorer: Link
- Big data analytics with enterprise-grade security using Azure Synapse: Link
- Discovery Hub with cloud scale analytics: Link
- Deliver highly scalable customer service and ERP applications: Link
- Modern analytics architecture with Azure Databricks: Link
- Oil and Gas Tank Level Forecasting: Link
- Real Time Analytics on Big Data Architecture: Link
Azure Categories: Databases
- Health data consortium on Azure: Link
- DataOps for the modern data warehouse: Link
- Hybrid ETL with Azure Data Factory: Link
- Master data management with Azure and CluedIn: Link
- Master data management with Profisee and Azure Data Factory: Link
Azure Categories: Internet of Things
- IoT connected light, power, and internet for emerging markets: Link
Azure Categories: Mainframe + Midrange
- Integrate IBM mainframe and midrange message queues with Azure: Link