Using Temporary Tables within Serverless SQL PoolsAugust 29, 2021
Temporary (temp) tables have been a feature of Microsoft SQL Server (and other database systems) for many years. Temp tables are supported within Azure Synapse Analytics in both Dedicated SQL Pools and Serverless SQL Pools. However, the Serverless SQL Pools “Polaris” engine is a newly built engine, so can we expect the same support for temp tables as seen in SQL Server, Azure SQL Database, and Synapse Dedicated SQL Pools? The short answer is no, there are caveats to using temp tables within Serverless SQL Pools.
In this blog we’ll walk through supported and unsupported scenarios and look at the limitations of temp tables within Serverless SQL Pools.
Supported Features & Scenarios
Here is a summary table of scenarios for inserting, joining, and filtering using temp tables. We’ll be exploring the main insert and join scenarios in this blog post.
|Create a local temp table||Yes|
|Create a global temp table||No|
|INSERT INTO #temptable VALUES(‘<value>’),(‘value’)||Yes|
|INSERT INTO #temptable SELECT FROM External table/View/OPENROWSET||No|
|INSERT INTO #temptable SELECT FROM Spark External Table||No|
|INSERT INTO #temptable EXEC storedprocedure (External Table/View/OPENROWSET)||Yes|
|INSERT INTO #temptable using a WITH…then INSERT||No|
|JOIN to External tables/Views/OPENROWSET queries||No|
|JOIN to other temp tables||Yes|
|WHERE IN (SELECT FROM #temptable)||No|
|Maximum Size Per Temp Table||100MB|
|Maximum Number of Temp Tables||100|
Creating Temp Tables
When creating a temp table, only local temp tables are supported and run within the same query execution context. You are unable to create a temp table that can then be referenced by another process outside the current execution context.
CREATE TABLE #tempTable ( RowID INT, TextComment VARCHAR(255), CreatedDateTime DATETIME );
Inserting Data into Temp Tables
After we have created a temp table, what are the options available to insert data into the temp table? Not all expected methods are supported.
We can use a simple multi-value insert statement to insert values into a temp table.
--create temp table CREATE TABLE #loopdates ( rownum INT, textvalue VARCHAR(10) ); --insert values INSERT INTO #loopdates( rownum, textvalue ) VALUES (1,'one'), (2,'two'), (3,'three'), (4,'Four');
Inserting using Select from an External Table, View, or OPENROWSET query
In this scenario we are not able to insert the results of a SELECT statement from either an External Table, a View, or an OPENROWSET query. We receive the following error is we attempt this.
--create temp table CREATE TABLE #factsalesaggregate ( OrderDateKey INT, TotalQuantity BIGINT ); --insert from View with OPENROWSET query to Azure Data Lake --insert from View with OPENROWSET query to Azure Data Lake INSERT INTO #factsalesaggregate ( OrderDateKey, TotalQuantity ) SELECT fct.OrderDateKey, SUM(fct.Quantity) FROM OPENROWSET ( BULK 'conformednew/facts/factsales/*/*/*/*.parquet', DATA_SOURCE = 'ExternalDataSourceDataLake', FORMAT = 'Parquet' ) AS fct GROUP BY fct.OrderDateKey;
Inserting using the results of an Exec Proc Statement
However, we are able to use the results of an EXEC Stored Procedure statement to insert into a temp table. We could therefore wrap the SELECT statement in a stored procedure and insert into a temp table. For code brevity, a View has been created for selecting Sales data rather than repeating the OPENROWSET syntax for each select.
--create wrapper procedure CREATE PROC LDW.uspGetFactSales AS BEGIN --aggregate query to select from OPENROWSET View from Azure Data Lake SELECT OrderDateKey, SUM(Quantity) AS TotalQuantity FROM LDW.vwFactSales GROUP BY OrderDateKey END; --create tamp table CREATE TABLE #factsalesaggregate ( OrderDateKey INT, TotalQuantity BIGINT ); --insert results of the procedure INSERT INTO #factsalesaggregate EXEC LDW.uspGetFactSales --show data SELECT * FROM #factsalesaggregate;
This time our efforts are successful:
We can therefore see that although we cannot insert from a select statement directly, we are able to wrap the select within a stored procedure and insert data into a temp table.
Joining Temp Tables
Once we have data within a temp table, what can we do with it? Can we join to other Serverless SQL Pools objects? The following section shows that we are limited in scope to joining temp tables.
Joining Temp Tables with Serverless SQL Objects
When attempting to join a temp table with other Serverless SQL Pools objects such as external tables and views we are unable to do so. This also includes external tables created in Spark and accessible through Serverless SQL Pools.
--create temp table CREATE TABLE #reportdatekeys ( reportdatekey INT ); --insert values INSERT INTO #reportdatekeys ( reportdatekey ) VALUES ('20130401'), ('20130402'), ('20130403') --join to select SELECT fct.OrderDateKey, SUM(fct.Quantity) AS TotalQuantity FROM LDW.vwFactSales fct INNER JOIN #reportdatekeys rdk ON rdk.reportdatekey = fct.OrderDateKey GROUP BY fct.OrderDateKey;
Joining Temp Tables Together
However, we are able to join a temp table to another temp table. Let’s use the wrapper SELECT stored procedure from earlier and join to another temp table.
--create tamp table CREATE TABLE #factsalesaggregate ( OrderDateKey INT, TotalQuantity BIGINT ); --insert results of the procedure INSERT INTO #factsalesaggregate EXEC LDW.uspGetFactSales --create temp table for report dates CREATE TABLE #reportdatekeys ( reportdatekey INT ); --insert values INSERT INTO #reportdatekeys ( reportdatekey ) VALUES ('20130401'), ('20130402'), ('20130403'); --restrict data by joining SELECT fct.OrderDateKey, fct.TotalQuantity FROM #factsalesaggregate fct INNER JOIN #reportdatekeys rdk ON rdk.reportdatekey = fct.OrderDateKey;
According to the official Microsoft documentation there can be a total of 100 temp tables created, with their total size being 100MB. However, during testing this did not seem to be the case with a much larger dataset being inserted successfully into a temp table. A question has been raised with Microsoft to clarify the 100MB limit, you can track the message here.
Looping Process Scenario
We could use temp tables to control a looping process, this could be useful for executing loading logic from a source to a destination controlled by values from a temp table.
--create temp table CREATE TABLE #loopdates ( rownum INT, textvalue VARCHAR(10) ); --insert values INSERT INTO #loopdates2 VALUES (1,'one'), (2,'two'), (3,'three'); --loop through temp table DECLARE @startloop INT = 1, @endloop INT SELECT @endloop = MAX(rownum) FROM #loopdates WHILE @startloop <= @endloop BEGIN SELECT * FROM #loopdates WHERE rownum = @startloop SET @startloop = @startloop +1 END
Although temp tables are supported within Synapse Analytics Serverless SQL Pools their usage is limited and there are caveats around being able to insert data and also join temp tables. Perhaps temp tables may be useful for situations where a small dataset needs holding to provide looping logic, or to hold a small dataset for reporting purposes.