Thursday, June 4, 2009

Temp Tables vs Table Variables

The Need For Temporary Storage

When working with Stored Procedures or scripts, developers / DBAs are frequently required to work on complex SQL operations that are required to save result sets into temporary locations which are then to be pull up and used later. You might need to physically persist the state of your data somewhere. This is where functionalities provided by Temp Tables and Table Variables come into play. Both exist to provide the same functionality: a temporary storage place for result sets yet they are different in many other ways.
Let’s start with Temp tables first.

First, the basics:
How do you use temp tables in SQL Server? Simple. Let’s take a look at the example and then proceed on understanding them even further.
Temp Table:
eg

CREATE TABLE #TempEmail
(
RowId INT IDENTITY (1,1),
EmployeeId INT,
EmployeeName VARCHAR (100)
)
Notice that we use the CREATE keyword instead of DECLARE unlike a normal T-SQL variable. Further, a Temp Table object always starts with either a ‘#’ or a ‘##’. Where ‘##’ refers to a global temp table object and a single ‘#’ refers to a local temp table object. The rest of the syntax is just like a normal permanent table.
Local Temporary Tables And Global Temporary Tables
The use of either ‘#’ indicates to SQL Server that this table object is visible to it only during this particular session of SQL Server i.e., only the connection that has created this table can access it. You are therefore creating it as a private temporary table.
You can create a global temporary table from any connection or database and this table is accessible from any connection to enable with the performing of normal operations on it. Therefore, with this property in mind, you should be careful in creating global temporary table names as this might lead to a conflict when other connections create a similar global temporary table with the same name. A global temporary table will persist only till the creating connection stops.

Question: When do I use what?

Use a global temporary table object if you want the table to persist only during the life time of the connection and want it to be accessible from any database and another other session. A local temporary table will be accessible only by the level that created it (eg: Stored Procedure, Triggers and when they are nested (A stored procedure call from within another stored procedure).
To be frank, I haven’t seen or heard of a real-time scenario where a global temporary table variable could really be put to use because the moment you think of a situation where it could be used, the first thought that might cross your mind will be that it will make more sense to store that information in a permanent table instead.

Temporary Tables and tempdb

A temporary table has a representation within the tempdb. This means that it is physically stored on disk. However, there is a little misconception regarding this aspect and there are exceptions when this does not happen. For example when SQL Server finds out that there is enough memory to spare, their pages are instead contained within cache blocks. This is common when you tend to create, update and delete them pretty quickly and hence, they tend to get stored in memory and a physical operation on disk is avoided by SQL Server as it really is not required.

Features of Temp Tables

We’ll list out features that differentiate a Temp Table between either a Permanent Table or a Table variable. These pointers will be helpful to keep in mind when you consider Table Variables in the next post.

Scope: Within a connection, a temporary table object is visible to the creating level and inner levels (nested). For example, if you create a stored procedure and declare a temporary table object within it, you can call another stored procedure from that stored procedure (a nested stored procedure) and perform operations like inserting, updating and deleting that temporary table object. Once the main creating level terminates, the temp table is automatically destroyed. But don’t be too complacent – you’ll have to wait for the system to perform a clean up and therefore, it is highly recommended that you manually drop your temporary table.

Locking: The prospect of table locking is reduced when it comes to local temporary tables since this table is being used by only one user. One aspect where you might want to keep this in mind is that if you cancel a transaction which contains the creation of a temp table object and then cancel that query, an exclusive and update lock can appear on the tempdb. This lock will persist till the complete transaction has closed with a COMMIT or a ROLLBACK

Logging: There is less logging involved with temporary tables compared to permanent tables.

Transaction: When using a temporary table, a temporary table is an integral part of an outer transaction and therefore, ROLLBACKs must be supported by Logging

Indexing: We can create indexes on temporary tables explicitly on them. Hence, there is scope for performance enhancement when you talk about temp tables.

Constraints: All constraints are available for exploiting on a temp table EXCEPT when it comes to referring a FOREIGN KEY constraint.

Statistics: SQL Server can create Statistics for temp tables just like we do for permanent tables and therefore, the query optimizer has the option of choosing different plans. Hence, with this in mind, be aware of the scope of Stored Procedure Recompiles.

Recompiles: There is scope for a large number of Stored Procedure Recompiles especially when you have DDL statements mixed anywhere within your Stored Procedure.

Temp Table Size: Can hold any volume of data. This will be a strong part of the deciding factor when you want to choose between a temporary table and a table variable.

I hope you’ve got a gist of what SQL Server Temp Tables can do and some of the main pointers regarding them. If you have any points that you feel should be addressed or should be pointed out, please feel free to give a buzz and we shall discuss (and that’s the point of this blog!). I was expecting to write a quickie post since it’s way past midnight but I think I’ve already crossed a couple of pages. In my next post, we’ll do a similar study on Table Variables and establish some solid ground on both SQL Server features before we get to the crux of what I really wanted to convey – When to use what and why!
Coming to the topic of Table Variables, the one thing I’ve noticed so frequently is that when you talk about Table Variables to most developers, they automatically have this preconceived notion that they’re the best option compared to Temp Tables for better performance. The fact of the matter is that this is only true in certain cases and absolutely not in most other cases where variable factors can come into play.

Table Variables
Before we start, let’s review the basics. As we all know, when working on the database front, there are many situations where we might need to persist result sets in a temporary location which would then be required to be retrieved later. SQL Server achieves this through the use of Temp Tables and Table Variables.Let me illustrate the use of a Temp Variable eg:
DECLARE @Employee TABLE
(
RowId INT IDENTITY (1,1),
EmployeeId INT,
EmployeeName VARCHAR (100)
)
Some observations:
1. Unlike a normal Table or a Temp Table, we use the DECLARE keyword instead of CREATE
2. We also use the ‘@’ symbol preceeding our Table Variable name like the use of any other variable.

Features of Table Variables
Now that you’ve got a hang around working with a Table Variable, let me mention the main pointers that we need to keep in mind while working with.

Transactions: Table Variables are not bound to any transaction as they are just like any other variable (in most cases .i.e).
You could try these simple set of queries to prove it:

BEGIN TRAN
DECLARE @Employee TABLE
(
RowId INT IDENTITY (1,1),
EmployeeId INT,
EmployeeName VARCHAR (100)
) INSERT INTO @Employee
SELECT 101, ‘John Carmack’ UNION
SELECT 102, ‘Jay Wilbur’ UNION
SELECT 103, ‘Adrian Carmack’ UNION
SELECT 104, ‘George P. Alexander Jr. ‘ SELECT * FROM @Employee — see the Table Variable values? ROLLBACK — we roll back here. Normal Table or Temp Table would loose everything SELECT * FROM @Employee — we still see an Employee table with John, Jay and Adrian!
If you try the same thing with a Temp Table, doing a SELECT * at the end of everything would give you an error since the Table was never created in the first place (due to a ROLLBACK). Now this is interesting because as there is less logging and Database locking involved, performance can come into consideration.
Minimum Constraints: A Table Variable permits us to use only the PRIMARY KEY, UNIQUE KEY and NULL constraint only. What this implies behind the scenes is that we can have unique indexes. The only possibility of creating a non-unique index is if we add attributes and make that blend unique and have a PRIMARY KEY or a UNIQUE KEY on the combination we just made.
And yeah, this point also implies that we don’t have FOREIGN KEY constraints incase you asked..
No SELECT INTO: We cannot use a SELECT INTO with Table Variables in SQL Server 2000 though the feature is available with Table Variables in SQL Server 2005. Likewise, we can also have INSERT INTO working with Table Variables against a SELECT but not against an EXEC StoredProcedure.
Eg:
INSERT INTO @Employee SELECT EmployeeId, EmployeeName FROM TempTable will work while an
INSERT INTO @Employee StoredProcedure will not work.
Similarly, SELECT ‘4005′,’John Carmack’ INTO @Employee won’t work either.
No ALTER TABLE Variable: We cannot ALTER a Table once it has been declared. This may look a little rigid but remember that recompilations can come out like wild fire when there are DDL (Data Definition Language .i.e Schema) changes and therefore, this helps to avoid recompilations.
Table Variable Name!=Table Variable Column Name: err�I don’t know a better and shorter way to say that a Table Variable’s name cannot be the same as a Table Variable’s column name.
Scope: Just like any other variable, a Table Variable’s scope exists only within the context of the current level. Therefore, unlike Temp Tables, it is not accessible to sub levels (of Stored Procedures)

Table Variables and The TempDb: Okay, now I’ll touch upon one of the most common myths that exist among developers: that Table Variables have nothing to do with TempDb and therefore, they have no physical representation in the TempDb and therefore, they reside in ONLY memory� and therefore they’re the best option for efficient processing.
Not entirely true. Table Variables do indeed have a physical representation within the TempDb and this can proved with a simple query in your database against the TempDb:

SELECT * FROM tempdb.INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME LIKE ‘%#%’ — no result in TABLE_NAME column
GO DECLARE @Employee TABLE
(
RowId INT IDENTITY (1,1),
EmployeeId INT,
EmployeeName VARCHAR (100)
) SELECT * FROM tempdb.INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME LIKE ‘%#%’ — notice that a new #something exists in TABLE_NAME columnGOSo yes, Table Variables do have a physical representation within the TempDb. So how does this affect our understanding of Temp Tables residing in TempDb and Table Variables also residing in TempDb?
The fact of the matter is that this depends entirely on the scenario and in certain cases, Temp Variables are faster and in certain cases, Temp Tables are faster. Table Variable’s content will reside in memory when it is available but when that limit gets crossed, it will resort to physical I/O.
No Statistics: When it comes to Table Variables, the SQL Server optimizer does not create distribution statistics. Therefore, you run the risk of referring not-so-good query plans when the SQL Server optimizer selects after checking up with histograms. And if you consider this aspect with Tables Variables that contain huge amounts of data, we fall into serious I/O thrashing. Hence, as stated in the closing section of the last point, we have to have a thorough understanding of our scenario to choose a temporary object for the context.
So I hope you got a gist of more than a simple “what Table Variables are”. Essentially, you might have noticed that their usage has a lot of strings attached also though, at the same time, there are seemingly certain performance gains while in certain situations, the tables (no pun intended) can be turned and invariable we can end up with inefficient queries that may lead to a performance hit.
Feel free to raise any point or drop in something that I might have missed or anything from your experience that might help us glean even further into this beautiful functionality that SQL Server developers have at their fingertips while building the back end of their applications.
We’ll finish of this series with one more post that will compare both Table Variables and Temp Tables so that we can understand the best possible scenario to use each one.

Ref : http://sqlpractices.wordpress.com

No comments:

Post a Comment