Friday, March 20, 2009

Accessing a Data from an Application

Before I describe dynamic SQL, I like to briefly discuss the various ways you can access data from an application to give an overview of what I'll be talking about in this article.
(Note: all through this text I will refer to client as anything that accesses SQL Server from the outside. In the overall application architecture that may in fact be a middle tier or a business layer, but as that is of little interest to this article, I use client in the sake of brevity.)
There are two main roads to go, and then there are forks and sub-forks.
1. Send SQL statements from the client to SQL Server.
2. Rely on SQL generated by the client API, using options like CommandType.TableDirect and methods like .Update.
3. Compose the SQL strings in the client code.
4. Build the entire SQL string with parameter values expanded.
5. Use parameterised queries.
6. Perform access through stored procedures.
7. Stored procedures in T-SQL
8. Use static SQL only.
9. Use dynamic SQL together with static SQL.
10. Stored procedures in a CLR language such as C# or VB .Net. (SQL 2005 only.)
Fork 1-a may be good for simple tasks, but you are likely to find that you outgrow it as the complexity of your application increases. In any case, this approach falls entirely outside the scope of this article.
Many applications are built along the principles of fork 1-b, and as long as you take the sub-fork 1-b-ii, it does not have to be bad. (Why 1-b-i is bad, is something I will come back to. Here I will just drop two keywords: SQL Injection and Query-Plan Reuse.) Nonetheless, in many shops the mandate is that you should use stored procedures. When you use stored procedures with only static SQL, users do not need direct permissions to access the tables, only permissions to execute the stored procedures, and thus you can use the stored procedure to control what users may and may not do.
The main focus for this text is sub-fork 2-a-ii. When used appropriately, dynamic SQL in stored procedures can be a powerful addition to static SQL. But some of the questions on the newsgroups leads to dynamic SQL in stored procedures that is so meaningless, that these people would be better off with fork 1-b instead.
Finally, fork 2-b, stored procedures in the CLR, is in many regards very similar to fork 1-b, since all data access from CLR procedures is through generated SQL strings, parameterised or unparameterised. If you have settled on SQL procedures for your application, there is little point in rewriting them into the CLR. However, CLR code can be a valuable supplement for tasks that are difficult to perform in T-SQL, but you yet want to perform server-side.
Introducing Dynamic SQL
In this chapter I will first look at some quick examples of dynamic SQL and point out some very important implications of using dynamic SQL. I will then describe sp_executesql and EXEC() in detail, the two commands you can use to invoke dynamic SQL from T-SQL.
A First Encounter
Understanding dynamic SQL itself is not difficult. Au contraire, it's rather too easy to use. Understanding the fine details, though, takes a little longer time. If you start out using dynamic SQL casually, you are bound to face accidents when things do not work as you have anticipated.
One of the problems listed in the introduction was how to write a stored procedure that takes a table name as its input. Here are two examples, based on the two ways to do dynamic SQL in Transact-SQL:
CREATE PROCEDURE general_select1 @tblname sysname,
@key varchar(10) AS
DECLARE @sql nvarchar(4000)
SELECT @sql = ' SELECT col1, col2, col3 ' +
' FROM dbo.' + quotename(@tblname) +
' WHERE keycol = @key'
EXEC sp_executesql @sql, N'@key varchar(10)', @key
CREATE PROCEDURE general_select2 @tblname nvarchar(127),
@key varchar(10) AS
EXEC('SELECT col1, col2, col3
FROM ' + @tblname + '
WHERE keycol = ''' + @key + '''')
Before I say anything else, permit me to point out that these are examples of bad usage of dynamic SQL. Passing a table name as a parameter is not how you should write stored procedures, and one aim of this article is to explain this in detail. Also, the two examples are not equivalent. While both examples are bad, the second example has several problems that the first does not have. What these problems are will be apparent as you read this text.
Whereas the above looks very simple and easy, there are some very important things to observe. The first thing is permissions. You may know that when you use stored procedures, users do not need permissions to access the tables accessed by the stored procedure. This does not apply when you use dynamic SQL! For the procedures above to execute successfully, the users must have SELECT permission on the table in @tblname. In SQL 2000 and earlier this is an absolute rule with no way around it. SQL 2005 provides alternative ways, something I will come back to in the section The Permission System.
Next thing to observe is that the dynamic SQL is not part of the stored procedure, but constitutes its own scope. Invoking a block of dynamic SQL is akin to call a nameless stored procedure created ad-hoc. This has a number of consequences:
• Within the block of dynamic SQL, you cannot access local variables (including table variables) or parameters of the calling stored procedure. But you can pass parameters – in and out – to a block of dynamic SQL if you use sp_executesql.
• Any USE statement in the dynamic SQL will not affect the calling stored procedure.
• Temp tables created in the dynamic SQL will not be accessible from the calling procedure since they are dropped when the dynamic SQL exits. (Compare to how temp tables created in a stored procedure go away when you exit the procedure.) The block of dynamic SQL can however access temp tables created by the calling procedure.
• If you issue a SET command in the dynamic SQL, the effect of the SET command lasts for the duration of the block of dynamic SQL only and does not affect the caller.
• The query plan for the stored procedure does not include the dynamic SQL. The block of dynamic SQL has a query plan of its own.
As you've seen there are two ways to invoke dynamic SQL, sp_executesql and EXEC(). sp_executesql was added in SQL 7, whereas EXEC() has been around since SQL 6.0. In application code, sp_executesql should be your choice 95% of the time for reasons that will prevail. For now I will only give two keywords: SQL Injection and Query-Plan Reuse. EXEC() is mainly useful for quick throw-away things and DBA tasks, but also comes to the rescue in SQL 2000 and SQL 7 when the SQL string exceeds 4000 characters. And, obviously, in SQL 6.5, EXEC() is the sole choice. In the next two sections we will look at these two commands in detail.
sp_executesql
sp_executesql is a built-in stored procedure that takes two pre-defined parameters and any number of user-defined parameters.
The first parameter @stmt is mandatory, and contains a batch of one or more SQL statements. The data type of @stmt is ntext in SQL 7 and SQL 2000, and nvarchar(MAX) in SQL 2005. Beware that you must pass an nvarchar/ntext value (that is, a Unicode value). A varchar value won't do.
The second parameter @params is optional, but you will use it 90% of the time. @params declares the parameters that you refer to in @stmt. The syntax of @params is exactly the same as for the parameter list of a stored procedure. The parameters can have default values and they can have the OUTPUT marker. Not all parameters you declare must actually appear in the SQL string. (Whereas all variables that appear in the SQL string must be declared, either with a DECLARE inside @stmt, or in @params.) Just like @stmt, the data type of @params in SQL 7/2000 is ntext and nvarchar(MAX) in SQL 2005.
The rest of the parameters are simply the parameters that you declared in @params, and you pass them as you pass parameters to a stored procedure, either positional or named. To get a value back from your output parameter, you must specify OUTPUT with the parameter, just like when you call a stored procedure. Note that @stmt and @params must be specified positionally. You can provide the parameter names for them, but these names are blissfully ignored.
Let's look at an example. Say that in your database, many tables have a column LastUpdated, which holds the time a row last was updated. You want to be able to find out how many rows in each table that were modified at least once during a period. This is not something you run as part of the application, but something you run as a DBA from time to time, so you just keep it as a script that you have a around. Here is how it could look like:
DECLARE @tbl sysname,
@sql nvarchar(4000),
@params nvarchar(4000),
@count int

DECLARE tblcur CURSOR STATIC LOCAL FOR
SELECT object_name(id) FROM syscolumns WHERE name = 'LastUpdated'
ORDER BY 1
OPEN tblcur

WHILE 1 = 1
BEGIN
FETCH tblcur INTO @tbl
IF @@fetch_status <> 0
BREAK

SELECT @sql =
N' SELECT @cnt = COUNT(*) FROM dbo.' + quotename(@tbl) +
N' WHERE LastUpdated BETWEEN @fromdate AND ' +
N' coalesce(@todate, ''99991231'')'
SELECT @params = N'@fromdate datetime, ' +
N'@todate datetime = NULL, ' +
N'@cnt int OUTPUT'
EXEC sp_executesql @sql, @params, '20060101', @cnt = @count OUTPUT

PRINT @tbl + ': ' + convert(varchar(10), @count) + ' modified rows.'
END

DEALLOCATE tblcur
I've put the lines that pertain directly to the dynamic SQL in bold face. You can see that I have declared the @sql and @params variables to be of the maximum length for nvarchar variables in SQL 2000. In SQL 2005, you may want to make it a routine to make @sql nvarchar(MAX), more about this just below.
When I assign the @sql variable, I am careful to format the statement so that it is easy to read, and I leave in spaces to avoid that two concatenated parts are glued together without space in between, which could cause a syntax error. I put the table name in quotename() in case a table name has any special characters in it. I also prefix the table name with "dbo.", which is a good habit, as we will see when we look at dynamic SQL and query plans. Overall, I will cover this sort of good practices more in detail later in the text. Note also the appearance of '' around the date literal – the rule in T-SQL is that to include the string delimiter in a string, you must double it.
In this example, the dynamic SQL has three parameters: one mandatory input parameter, one optional input parameter, and one output parameter. I've assumed that this time the DBA wanted to see all changes made after 2006-01-01, which is why I've left out @todate in the call to sp_executesql. Since I left out one variable, I must specify the last, @cnt by name – the same rules as when you call a stored procedure. Note also that the variable is called @cnt in the dynamic SQL, but @count in the surrounding script. Normally, you might want to use the same name, but I wanted to stress that the @cnt in the dynamic SQL is only visible within the dynamic SQL, whereas @count is not visible there.
You may note that I've prepend the string literals with N to denote that they are Unicode strings. As @sql and @params are declared as nvarchar, technically this is not necessary (as long as you stick your 8-bit character set). However, would you provide any of the strings directly in the call to sp_executesql, you must specify the N, as in this fairly silly example:
EXEC sp_executesql N'SELECT @x', N'@x int', @x = 2
If you remove any of the Ns, you will get an error message. Since sp_executesql is a built-in stored procedure, there is no implicit conversion from varchar.
You may wonder why I do not pass @tbl as a parameter as well. The answer is that you can't. Dynamic SQL is just like any other SQL. You can't specify a table name through a variable in T-SQL, that's the whole story. Thus, when you need to specify things like table names, column names etc dynamically, you must interpolate them into the string.
If you are on SQL 2000 or SQL 7, there is a limitation with sp_executesql when it comes to the length of the SQL string. While the parameter is ntext, you cannot use this data type for local variables. Thus, you will have to stick to nvarchar(4000). In many cases this will do fine, but it is not uncommon to exceed that limit. In this case, you will need to use EXEC(), described just below.
On SQL 2005, this is not an issue. Here you can use the new data type nvarchar(MAX) which can hold as much data as ntext, but without the many restrictions of ntext.
EXEC()
EXEC() takes one parameter which is an SQL statement to execute. The parameter can be a concatenation of string variables and string literals, but cannot include calls to functions or other operators. For very simple cases, EXEC() is less hassle than sp_executesql. For instance, say that you want to run UPDATE STATISTICS WITH FULLSCAN on some selected tables. It could look like this:
FETCH tblcur INTO @tbl
IF @@fetch_status <> 0 BREAK
EXEC('UPDATE STATISTICS [' + @tbl + '] WITH FULLSCAN')
In the example with sp_executesql, I used qoutename(), but here I've let it suffice with adding brackets, in case there is a table named Order Details (which there is in the Northwind database). Since EXEC only permits string literals and string variables to be concatenated and not arbitrary expressions, this is not legal:
EXEC('UPDATE STATISTICS ' + quotename(@tbl) + ' WITH FULLSCAN')
Best practice is to always use a variable to hold the SQL statement, so the example would better read:
FETCH tblcur INTO @tbl
IF @@fetch_status <> 0 BREAK
SELECT @sql = 'UPDATE STATISTICS ' + quotename(@tbl) + ' WITH FULLSCAN'
EXEC(@sql)
The fact that you can concatenate strings within EXEC() can permit you to make very quick things, which can be convenient at times, but can lead to poor habits in application code. However, there are situations where this is an enormous blessing. As I mentioned, in SQL 7 and SQL 2000, you can in practice only use 4000 characters in your SQL string with sp_executesql. EXEC does not have this limitation, since you can say
EXEC(@sql1 + @sql2 + @sql3)
Where all of @sql1, @sql2 and @sql3 can be 4000 characters long – or even 8000 characters as EXEC() permits you to use varchar.
Since you cannot use parameters, you cannot as easily get values out from EXEC() as you can with sp_executesql. You can, however, use INSERT-EXEC to insert the result set from EXEC() into a table. I will show you an example later on, when I also show you how you can use EXEC() to pass longer strings than 4000 characters to sp_executesql.
In SQL 2005, EXEC() permits impersonation so that you can say:
EXEC(@sql) AS USER = 'mitchell'
EXEC(@sql) AS LOGIN = 'CORDOBA\Miguel'
This is mainly a syntactical shortcut that saves you from embedding the invocation of dynamic SQL in EXECUTE AS and REVERT. (I discuss these statements more in detail in my article Granting Permissions Through Stored Procedures.)
SQL 2005 adds a valuable extension to EXEC(): you can use it to execute strings on linked servers. I will cover this form of EXEC() in a separate section later in this Blog.

No comments:

Post a Comment