Sunday, March 29, 2009

New Features Of SQL2005

The fanfare from the US TechEd 2005 this past June was vintage Microsoft: lots of happy noise about miraculous products with tongue-in-cheek delivery dates, as many of us suppressed a yawn. But let's give credit where it's due; we've waited a long time for a new SQL Server, and the features Microsoft is delivering with its upgraded database technology are what we've asked for and then some. Here's a rundown on the features we'll find most useful.
1. There is now an XML data type
If there's any feature of SQL Server 2005 to jump up and down about, it's the new native XML data type. Why? Apart from the giant leap forward of an already Web-friendly agenda, the new type offers us design options that are atypical of Microsoft, which generally likes to do our designing for us.
The new XML data type:
• can be used in a table column
• can be used in a stored procedure, as a parameter or as a variable
• can store untyped data
• can check against a schema to see if data stored in a column typed as XML matches that associated schema (if there's no schema, the data is considered untyped)
And the mapping between XML data and relational data is bidirectional.
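As a sketch of how this might look (table and column names here are hypothetical), an untyped XML column can be declared and queried like this:

```sql
-- An untyped XML column: any well-formed XML is accepted
CREATE TABLE Orders
(
    OrderId int,
    Details xml        -- no schema collection bound, so the data is untyped
)

INSERT INTO Orders VALUES (1, N'<order id="1"><item sku="A42" qty="3"/></order>')

-- XML variables and parameters work the same way; the value() method
-- extracts a scalar from the XML using an XQuery expression
SELECT Details.value('(/order/@id)[1]', 'int') AS OrderNo
FROM Orders
```

Binding the column to a schema collection (via CREATE XML SCHEMA COLLECTION) is what makes the data typed and schema-checked.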
2. Distributed Management Objects (DMO) becomes SQL Server Management Objects (SMO)
SQL Server Management Objects (SMO) is a .NET Framework-based management framework that lets you create custom applications for server management. SMO (like DMO before it) allows you to handle columns, tables, databases, and servers as objects, programmatically -- and SMO supports SQL Server 2005's new features, like Service Broker.
SMO is optimised: it does not instantiate objects fully (with all their properties retrieved) until an object is explicitly referenced. You can also batch SQL commands, and generate scripts that create objects.
Your custom server management apps can be used to manage SQL Server 7 and SQL Server 2000 systems as well.
3. Common Table Expressions (CTEs) -- recursive queries
A common table expression (CTE) enables queries to be recursive. A CTE can be self-referential, with an upper limit on the number of recursions. You define the CTE in a WITH clause, and can then use it in a SELECT, UPDATE, INSERT or DELETE statement.
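A minimal sketch of a recursive CTE, assuming a hypothetical Emp table with a self-referencing MgrId column; the MAXRECURSION hint supplies the upper limit mentioned above:

```sql
WITH OrgChart (EmpId, MgrId, Lvl) AS
(
    -- Anchor member: the root of the hierarchy
    SELECT EmpId, MgrId, 0 FROM Emp WHERE MgrId IS NULL
    UNION ALL
    -- Recursive member: references the CTE itself
    SELECT e.EmpId, e.MgrId, o.Lvl + 1
    FROM Emp e JOIN OrgChart o ON e.MgrId = o.EmpId
)
SELECT EmpId, Lvl FROM OrgChart
OPTION (MAXRECURSION 50)   -- cap the recursion depth
```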
4. The Service Broker makes SQL Server traffic asynchronous
There's a front-end queuing system, and it changes everything. You can now manage SQL Server traffic by rendering it asynchronous with the new Service Broker feature. It enhances scalability by enabling your system to handle more traffic logically than it can handle physically. The Service Broker can be accessed via SQL commands and allows transactions to include queued events.
Those who know me well would never accuse me of being a Microsoft disciple, but this feature impresses me in no small measure and I'm pleased to call attention to it. Adding easily-configured asynchronicity to the data layer of an enterprise system is a boon to developers and opens up huge possibilities for Web apps. The economy with which those apps can now scale can't be overstated. Service Broker alone is a reason to consider upgrading to SQL Server 2005.
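As a hedged sketch of the plumbing involved (all object names here are made up), a Service Broker setup and a queued, asynchronous send look roughly like this:

```sql
-- Define what can be said, and in which direction
CREATE MESSAGE TYPE [//demo/OrderMsg] VALIDATION = WELL_FORMED_XML
CREATE CONTRACT [//demo/OrderContract] ([//demo/OrderMsg] SENT BY INITIATOR)

-- Queues hold the messages; services are the addressable endpoints
CREATE QUEUE OrderQueue
CREATE SERVICE OrderService ON QUEUE OrderQueue ([//demo/OrderContract])
CREATE QUEUE ReplyQueue
CREATE SERVICE ReplyService ON QUEUE ReplyQueue

-- Queue a message; the sender does not wait for the receiver
DECLARE @h uniqueidentifier
BEGIN DIALOG CONVERSATION @h
    FROM SERVICE ReplyService
    TO SERVICE 'OrderService'
    ON CONTRACT [//demo/OrderContract]
    WITH ENCRYPTION = OFF;
SEND ON CONVERSATION @h
    MESSAGE TYPE [//demo/OrderMsg] (N'<order id="1"/>');
```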
5. Create .NET triggers
SQL Server 2005 is .NET-integrated to a promising degree (it has distressed us for some time that Microsoft's commitment to .NET is as hedged as it is), and one useful consequence of this integration is the ability to create user-defined triggers through Visual Studio 2005.
The Trigger option can be selected from the template list in Visual Studio, generating a file for the code to be triggered. The mechanism tying this code to SQL Server is a SqlPipe. The code is deployed via Build | Deploy. You can work in the other direction (i.e., from the CLR) by referencing the trigger method in a T-SQL CREATE TRIGGER command.
6. SQL Server 2005 configuration is dynamic
If you're running SQL Server 2005 on Windows Server 2003, its configuration is fully dynamic -- you can change configuration values on-the-fly without restarting the server, and get immediate response (the same is true for Address Windowing Extensions).
7. Define your own data types
The user-defined type, enabled by the integration of SQL Server 2005 and the .NET CLR, is a consolidation of previous practices, allowing you to create application- or environment-specific types. You can extend more general types into variations that only accept values you define -- no more triggers or constraints just for validation. Validation is built into the type itself.
8. Many active result sets, one connection
This is another feature not just to make note of, but to get excited about. MARS (Multiple Active Result Sets) enables you to execute multiple queries yielding multiple results, over a single connection. An application can move between open result sets as needed. The performance and scalability benefits are obvious.
This new trick is courtesy of the new ADO.NET, in tandem with SQL Server 2005's ability to accommodate multiple active commands. Since MARS is part SQL Server 2005 and part ADO.NET 2.0, it is only available if you're using both.
9. WAITFOR ... RECEIVE
In previous versions of SQL Server, WAITFOR was static. We fed it some wait-time value, and that was all it could do. Now WAITFOR is dynamic; you can tell it to wait for a RECEIVE statement's results, whenever they might be delivered.
Beyond the usual this-is-cool, we can appreciate this feature because of the manner in which it accommodates the new Service Broker (see #4). Since Service Broker makes database querying asynchronous via queuing (and therefore extremely dynamic), and a particular query may sit in a queue for an undetermined period, the new dynamic WAITFOR is ideal for responding to RECEIVE results that emerge at the discretion of Service Broker.
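A sketch of the dynamic form, assuming a hypothetical Service Broker queue named OrderQueue; the TIMEOUT clause keeps the wait from being indefinite:

```sql
-- Block until a message arrives on the queue, or give up after 5 seconds
WAITFOR
(
    RECEIVE TOP (1)
        conversation_handle,
        message_type_name,
        CAST(message_body AS xml) AS body
    FROM OrderQueue
), TIMEOUT 5000
```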
10. DTS is now Integration Services
There's a new architecture underlying data transformation. The very popular and widely used DTS is now Integration Services, and consists of a Data Transformation Pipeline and a Data Transformation Runtime.
The pipeline connects data source to data target by means of data adapters, with transformations between them. It's a conventional structure, but implemented in such a way as to enable considerable complexity: for instance, you can do one-to-many mappings, and create columns with output derived from a transform.
The Data Transformation Runtime gives you components for organising data loading and transformation processes into production-oriented operations, within which you can manage connections and manipulate variables. It's basically a run-time object framework that can be bundled into managed .NET apps.
DTP and DTR components are used to create Integration Services packages, similar in principle to the familiar DTS packages but with much greater levels of configurability and control, particularly in the area of workflow.

Tuesday, March 24, 2009

Clustered and Non-Clustered Index

Non-Clustered Index
Creating a Table
To better explain SQL Server non-clustered indexes, let's start by creating a new table and populating it with some sample data using the following scripts. I assume you have a database you can use for this. If not, you will want to create one for these examples.
Create Table DummyTable1
(
EmpId Int,
EmpName Varchar(8000)
)
When you first create a new table, there is no index created by default. In technical terms, a table without an index is called a “heap”. We can confirm the fact that this new table doesn’t have an index by taking a look at the sysindexes system table, which contains a row for this table with an indid of 0. The sysindexes table, which exists in every database, tracks table and index information. “Indid” refers to Index ID, and is used to identify indexes. An indid of 0 means that a table does not have an index, and is stored by SQL Server as a heap.
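To make that check concrete, something like this confirms the heap (the sysindexes columns shown are as they appear in SQL Server 2000):

```sql
-- indid = 0 means the table is a heap (no clustered index)
SELECT indid
FROM sysindexes
WHERE id = OBJECT_ID('DummyTable1')
```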
Now let’s add a few records in this table using this script:
Insert Into DummyTable1 Values (4, Replicate ('d',2000))
GO
Insert Into DummyTable1 Values (6, Replicate ('f',2000))
GO
Insert Into DummyTable1 Values (1, Replicate ('a',2000))
GO
Insert Into DummyTable1 Values (3, Replicate ('c',2000))
GO
Now, let’s view the contents of the table by executing the following command in Query Analyzer for our new table.
Select EmpID From DummyTable1
GO
Empid
4
6
1
3
As you would expect, the data we inserted earlier is displayed. Note that the rows come back in the same order I inserted them, which is no particular order at all.
Now, let’s execute the following commands to display the actual page information for the table we created and is now stored in SQL Server.
dbcc ind(dbid, tabid, -1) – This is an undocumented command.
DBCC TRACEON (3604)
GO
Declare @DBID Int, @TableID Int
Select @DBID = db_id(), @TableID = object_id('DummyTable1')
DBCC ind(@DBID, @TableID, -1)
GO
This script will display many columns, but we are only interested in three of them, as shown below.
PagePID IndexID PageType
26408 0 10
26255 0 1
26409 0 1
Here’s what the information displayed means:
PagePID is the physical page number used to store the table. In this case, three pages are currently used to store the data.
IndexID is the type of index,
Where:
0 – Data page (heap)
1 – Clustered index
2 or greater – Non-clustered index page (including ordinary indexes)
PageType tells you what kind of data is stored in each page,
Where:
10 – IAM (Index Allocation MAP)
1 – Datapage
2 – Index page
Now, let us execute the DBCC PAGE command. This is also an undocumented command.
DBCC page(dbid, fileno, pageno, option)
Where:
dbid = the database ID.
fileno = the file number of the page. Usually this will be 1, unless we use more than one file for the database.
pageno = a page number taken from the output of DBCC IND above.
option = can be 0, 1, 2 or 3. I use 3 to get a display of the data. You can try the other options yourself.
Run this script to execute the command (note that @DBID must be declared again here, because the earlier declaration went out of scope at the batch-ending GO):
DBCC TRACEON (3604)
GO
Declare @DBID Int
Select @DBID = db_id()
DBCC page(@DBID, 1, 26408, 3)
GO
The output will be page allocation details.
DBCC TRACEON (3604)
GO
Declare @DBID Int
Select @DBID = db_id()
dbcc page(@DBID, 1, 26255, 3)
GO
The data will be displayed in the order it was entered into the table. This is how SQL Server stores the data in pages. Actually, pages 26255 and 26409 are both data pages.
I have displayed the data page information for page 26255 only. This is how MS SQL Server stores the contents of data pages: each column name with its respective value.
Record Type = PRIMARY_RECORD
EmpId = 4
EmpName = ddddddddddddddddddddddddddddddddddddddddddddddddddd
ddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd
ddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd
ddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd
ddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd
Record Type = PRIMARY_RECORD
EmpId = 6
EmpName = ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
Record Type = PRIMARY_RECORD
EmpId = 1
EmpName = aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
This displays the exact data storage in SQL, without any index on table. Now, let’s go and create a unique non-clustered index on the EmpID column.
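That next step would look like this (the index name is my own choice):

```sql
CREATE UNIQUE NONCLUSTERED INDEX IX_DummyTable1_EmpId
    ON DummyTable1 (EmpId)
```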

Ref: http://www.sql-server-performance.com/articles/per/index_data_structures_p1.aspx

Monday, March 23, 2009

New features of Microsoft SQL Server 2008

Transparent Data Encryption
Enable encryption of an entire database, data files, or log files, without the need for application changes. Benefits include: searching encrypted data using both range and fuzzy searches, securing data from unauthorized users, and encrypting data without any required changes to existing applications.
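A hedged sketch of enabling Transparent Data Encryption (certificate and database names here are made up); because TDE encrypts data at rest, queries against the database continue to work unchanged:

```sql
-- The master key and certificate live in the master database
USE master;
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'Str0ngP@ssword!';
CREATE CERTIFICATE TDECert WITH SUBJECT = 'TDE certificate';

-- The database encryption key is created in the user database
USE SalesDB;
CREATE DATABASE ENCRYPTION KEY
    WITH ALGORITHM = AES_128
    ENCRYPTION BY SERVER CERTIFICATE TDECert;
ALTER DATABASE SalesDB SET ENCRYPTION ON;
```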
Extensible Key Management
Encryption and key management are a growing need. SQL Server 2008 delivers an excellent solution to this need by supporting third-party key management and hardware security module (HSM) products.
Auditing
Create and manage auditing via DDL, while simplifying compliance by providing more comprehensive data auditing. This enables organizations to answer common questions, such as, "What data was retrieved?"
Enhanced Database Mirroring
SQL Server 2008 builds on SQL Server 2005 by providing a more reliable platform that has enhanced database mirroring, including automatic page repair, improved performance, and enhanced supportability.
Automatic Recovery of Data Pages
SQL Server 2008 enables the principal and mirror machines to transparently recover from 823/824 types of data page errors by requesting a fresh copy of the suspect page from the mirroring partner transparently to end users and applications.
Log Stream Compression
Database mirroring requires data transmissions between the participants of the mirroring implementations. With SQL Server 2008, compression of the outgoing log stream between the participants delivers optimal performance and minimizes the network bandwidth used by database mirroring.
Resource Governor
Provide a consistent and predictable response to end users with the introduction of Resource Governor, allowing organizations to define resource limits and priorities for different workloads, which enable concurrent workloads to provide consistent performance to their end users.
Predictable Query Performance
Enable greater query performance stability and predictability by providing functionality to lock down query plans, enabling organizations to promote stable query plans across hardware server replacements, server upgrades, and production deployments.
Data Compression
Enable data to be stored more effectively, and reduce the storage requirements for your data. Data compression also provides significant performance improvements for large I/O bound workloads, like data warehousing.
Hot Add CPU
Dynamically scale a database on demand by allowing CPU resources to be added to SQL Server 2008 on supported hardware platforms without forcing any downtime on applications. Note that SQL Server already supports the ability to add memory resources online.
Policy-Based Management
Policy-Based Management is a policy-based system for managing one or more instances of SQL Server 2008. Use this with SQL Server Management Studio to create policies that manage entities on the server, such as the instance of SQL Server, databases, and other SQL Server objects.
Streamlined Installation
SQL Server 2008 introduces significant improvements to the service life cycle for SQL Server through the re-engineering of the installation, setup, and configuration architecture. These improvements separate the installation of the physical bits on the hardware from the configuration of the SQL Server software, enabling organizations and software partners to provide recommended installation configurations.
Performance Data Collection
Performance tuning and troubleshooting are time-consuming tasks for the administrator. To provide actionable performance insights to administrators, SQL Server 2008 includes more extensive performance data collection, a new centralized data repository for storing performance data, and new tools for reporting and monitoring.
Language Integrated Query (LINQ)
Enable developers to issue queries against data, using a managed programming language, such as C# or VB.NET, instead of SQL statements. Enable seamless, strongly typed, set-oriented queries written in .NET languages to run against ADO.NET (LINQ to SQL), ADO.NET DataSets (LINQ to DataSets), the ADO.NET Entity Framework (LINQ to Entities), and to the Entity Data Service Mapping provider. Use the new LINQ to SQL provider that enables developers to use LINQ directly on SQL Server 2008 tables and columns.
ADO.NET Data Services
The Object Services layer of ADO.NET enables the materialization, change tracking, and persistence of data as CLR objects. Developers using the ADO.NET framework can program against a database, using CLR objects that are managed by ADO.NET. SQL Server 2008 introduces more efficient, optimized support that improves performance and simplifies development.

DATE/TIME
SQL Server 2008 introduces new date and time data types:
o DATE—A date-only type
o TIME—A time-only type
o DATETIMEOFFSET—A time-zone-aware datetime type
o DATETIME2—A datetime type with larger fractional seconds and year range than the existing DATETIME type
The new data types enable applications to have separate date and time types while providing larger date ranges and user-defined precision for time values.
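For illustration, declarations using each of the new types (the literals are arbitrary):

```sql
DECLARE @d   date           = '2009-03-23'                     -- date only
DECLARE @t   time(3)        = '10:30:00.123'                   -- time only, 3 fractional digits
DECLARE @dt2 datetime2(7)   = '2009-03-23 10:30:00.1234567'    -- wider range, more precision
DECLARE @dto datetimeoffset = '2009-03-23 10:30:00 +05:30'     -- time-zone aware
SELECT @d, @t, @dt2, @dto
```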
HIERARCHY ID
Enable database applications to model tree structures in a more efficient way than currently possible. New system type HierarchyId can store values that represent nodes in a hierarchy tree. This new type will be implemented as a CLR UDT, and will expose several efficient and useful built-in methods for creating and operating on hierarchy nodes with a flexible programming model.
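A small sketch of the type in use (the table name is hypothetical):

```sql
CREATE TABLE OrgNodes
(
    Node hierarchyid PRIMARY KEY,
    Name nvarchar(50)
)

DECLARE @root hierarchyid = hierarchyid::GetRoot()
INSERT INTO OrgNodes VALUES (@root, N'CEO')
-- GetDescendant generates a child node between two siblings (NULL, NULL = first child)
INSERT INTO OrgNodes VALUES (@root.GetDescendant(NULL, NULL), N'CTO')

-- ToString() renders the node's position as a /1/2/-style path
SELECT Node.ToString() AS Path, Name FROM OrgNodes
```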
FILESTREAM Data
Allow large binary data to be stored directly in an NTFS file system, while preserving an integral part of the database and maintaining transactional consistency. Enable the scale-out of large binary data traditionally managed by the database to be stored outside the database on more cost-effective storage without compromise.
Integrated Full Text Search
Integrated Full Text Search makes the transition between Text Search and relational data seamless, while enabling users to use the Text Indexes to perform high-speed text searches on large text columns.
Sparse Columns
NULL data consumes no physical space, providing a highly efficient way of managing empty data in a database. For example, Sparse Columns allow object models that typically have numerous null values to be stored in a SQL Server 2008 database without experiencing large space costs.
Large User-Defined Types
SQL Server 2008 eliminates the 8-KB limit for User-Defined Types (UDTs), allowing users to dramatically expand the size of their UDTs.
Spatial Data Types
Build spatial capabilities into your applications by using the support for spatial data.
o Implement Round Earth solutions with the geography data type. Use latitude and longitude coordinates to define areas on the Earth's surface.
o Implement Flat Earth solutions with the geometry data type. Store polygons, points, and lines that are associated with projected planar surfaces and naturally planar data, such as interior spaces.
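For example, computing the distance between two points on the Round Earth model (the coordinates are arbitrary; note that in well-known text the order is longitude then latitude):

```sql
DECLARE @a geography = geography::STGeomFromText('POINT(-122.335 47.608)', 4326)
DECLARE @b geography = geography::STGeomFromText('POINT(-122.121 47.674)', 4326)
-- STDistance returns meters for the geography type
SELECT @a.STDistance(@b) AS DistanceInMeters
```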

Backup Compression
Keeping disk-based backups online is expensive and time-consuming. With SQL Server 2008 backup compression, less storage is required to keep backups online, and backups run significantly faster since less disk I/O is required.
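Enabling it is a single option on the BACKUP statement (the path here is illustrative):

```sql
BACKUP DATABASE SalesDB
    TO DISK = 'D:\Backups\SalesDB.bak'
    WITH COMPRESSION
```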
Partitioned Table Parallelism
Partitions enable organizations to manage large growing tables more effectively by transparently breaking them into manageable blocks of data. SQL Server 2008 builds on the advances of partitioning in SQL Server 2005 by improving the performance on large partitioned tables.
Star Join Query Optimizations
SQL Server 2008 provides improved query performance for common data warehouse scenarios. Star Join Query optimizations reduce query response time by recognizing data warehouse join patterns.
Grouping Sets
Grouping Sets is an extension to the GROUP BY clause that lets users define multiple groupings in the same query. Grouping Sets produces a single result set that is equivalent to a UNION ALL of differently grouped rows, making aggregation querying and reporting easier and faster.
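A small sketch, assuming a hypothetical Sales table; one statement produces per-region totals, per-year totals, and a grand total:

```sql
SELECT Region, SalesYear, SUM(Amount) AS Total
FROM Sales
GROUP BY GROUPING SETS
(
    (Region),       -- totals per region
    (SalesYear),    -- totals per year
    ()              -- grand total
)
```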
Change Data Capture
With Change Data Capture, changes are captured and placed in change tables. It captures complete content of changes, maintains cross-table consistency, and even works across schema changes. This enables organizations to integrate the latest information into the data warehouse.
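Enabling capture is procedural (the schema and table names are made up):

```sql
EXEC sys.sp_cdc_enable_db   -- enable CDC for the current database

EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'Orders',
    @role_name     = NULL    -- no gating role; changes land in change tables
```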
MERGE SQL Statement
With the introduction of the MERGE SQL Statement, developers can more effectively handle common data warehousing scenarios, like checking whether a row exists, and then executing an insert or update.
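A typical upsert sketch with made-up table names:

```sql
MERGE INTO dbo.Customers AS tgt
USING dbo.StagedCustomers AS src
    ON tgt.CustId = src.CustId
WHEN MATCHED THEN
    UPDATE SET tgt.Name = src.Name
WHEN NOT MATCHED THEN
    INSERT (CustId, Name) VALUES (src.CustId, src.Name);
```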
SQL Server Integration Services (SSIS) Pipeline Improvements
Data Integration packages can now scale more effectively, making use of available resources and managing the largest enterprise-scale workloads. The new design improves the scalability of the runtime onto multiple processors.
SQL Server Integration Services (SSIS) Persistent Lookups
The need to perform lookups is one of the most common ETL operations. This is especially prevalent in data warehousing, where fact records need to use lookups to transform business keys to their corresponding surrogates. SSIS increases the performance of lookups to support the largest tables.

Analysis Scale and Performance
SQL Server 2008 drives broader analysis with enhanced analytical capabilities and with more complex computations and aggregations. New cube design tools help users streamline the development of the analysis infrastructure enabling them to build solutions for optimized performance.
Block Computations
Block Computations provides a significant improvement in processing performance enabling users to increase the depth of their hierarchies and complexity of the computations.
Writeback
New MOLAP-enabled writeback capabilities in SQL Server 2008 Analysis Services remove the need to query ROLAP partitions. This provides users with enhanced writeback scenarios from within analytical applications without sacrificing traditional OLAP performance.
Enterprise Reporting Engine
Reports can easily be delivered throughout the organization, both internally and externally, with simplified deployment and configuration. This enables users to easily create and share reports of any size and complexity.
Internet Report Deployment
Customers and suppliers can effortlessly be reached by deploying reports over the Internet.
Manage Reporting Infrastructure
Increase supportability and the ability to control server behaviour with memory management, infrastructure consolidation, and easier configuration through a centralized store and API for all configuration settings.
Report Builder Enhancements
Easily build ad hoc reports and author reports with any structure through Report Designer.
Forms Authentication Support
Support for Forms authentication enables users to choose between Windows and Forms authentication.
Report Server Application Embedding
Report Server application embedding enables the URLs in reports and subscriptions to point back to front-end applications.
Microsoft Office Integration
SQL Server 2008 provides new Word rendering that enables users to consume reports directly from within Microsoft Office Word. In addition, the existing Excel renderer has been greatly enhanced to accommodate the support of features, like nested data regions, sub-reports, as well as merged cell improvements. This lets users maintain layout fidelity and improves the overall consumption of reports from Microsoft Office applications.
Predictive Analysis
SQL Server Analysis Services continues to deliver advanced data mining technologies. Better Time Series support extends forecasting capabilities. Enhanced Mining Structures deliver more flexibility to perform focused analysis through filtering as well as to deliver complete information in reports beyond the scope of the mining model. New cross-validation enables confirmation of both accuracy and stability for results that you can trust. Furthermore, the new features delivered with SQL Server 2008 Data Mining Add-ins for Office 2007 empower every user in the organization with even more actionable insight at the desktop.

Friday, March 20, 2009

Accessing a Data from an Application

Before I describe dynamic SQL, I'd like to briefly discuss the various ways you can access data from an application, to give an overview of what I'll be talking about in this article.
(Note: all through this text I will refer to the client as anything that accesses SQL Server from the outside. In the overall application architecture that may in fact be a middle tier or a business layer, but as that is of little interest to this article, I use client for the sake of brevity.)
There are two main roads to go, and then there are forks and sub-forks.
1. Send SQL statements from the client to SQL Server.
   a. Rely on SQL generated by the client API, using options like CommandType.TableDirect and methods like .Update.
   b. Compose the SQL strings in the client code.
      i. Build the entire SQL string with parameter values expanded.
      ii. Use parameterised queries.
2. Perform access through stored procedures.
   a. Stored procedures in T-SQL.
      i. Use static SQL only.
      ii. Use dynamic SQL together with static SQL.
   b. Stored procedures in a CLR language such as C# or VB .NET. (SQL 2005 only.)
Fork 1-a may be good for simple tasks, but you are likely to find that you outgrow it as the complexity of your application increases. In any case, this approach falls entirely outside the scope of this article.
Many applications are built along the principles of fork 1-b, and as long as you take the sub-fork 1-b-ii, it does not have to be bad. (Why 1-b-i is bad, is something I will come back to. Here I will just drop two keywords: SQL Injection and Query-Plan Reuse.) Nonetheless, in many shops the mandate is that you should use stored procedures. When you use stored procedures with only static SQL, users do not need direct permissions to access the tables, only permissions to execute the stored procedures, and thus you can use the stored procedure to control what users may and may not do.
The main focus for this text is sub-fork 2-a-ii. When used appropriately, dynamic SQL in stored procedures can be a powerful addition to static SQL. But some of the questions on the newsgroups lead to dynamic SQL in stored procedures that is so meaningless that those people would be better off with fork 1-b instead.
Finally, fork 2-b, stored procedures in the CLR, is in many regards very similar to fork 1-b, since all data access from CLR procedures is through generated SQL strings, parameterised or unparameterised. If you have settled on T-SQL procedures for your application, there is little point in rewriting them into the CLR. However, CLR code can be a valuable supplement for tasks that are difficult to perform in T-SQL but that you still want to perform server-side.
Introducing Dynamic SQL
In this chapter I will first look at some quick examples of dynamic SQL and point out some very important implications of using dynamic SQL. I will then describe sp_executesql and EXEC() in detail, the two commands you can use to invoke dynamic SQL from T-SQL.
A First Encounter
Understanding dynamic SQL itself is not difficult. Au contraire, it's rather too easy to use. Understanding the fine details, though, takes a little longer. If you start out using dynamic SQL casually, you are bound to run into accidents when things do not work as you anticipated.
One of the problems listed in the introduction was how to write a stored procedure that takes a table name as its input. Here are two examples, based on the two ways to do dynamic SQL in Transact-SQL:
CREATE PROCEDURE general_select1 @tblname sysname,
@key varchar(10) AS
DECLARE @sql nvarchar(4000)
SELECT @sql = ' SELECT col1, col2, col3 ' +
' FROM dbo.' + quotename(@tblname) +
' WHERE keycol = @key'
EXEC sp_executesql @sql, N'@key varchar(10)', @key
CREATE PROCEDURE general_select2 @tblname nvarchar(127),
@key varchar(10) AS
EXEC('SELECT col1, col2, col3
FROM ' + @tblname + '
WHERE keycol = ''' + @key + '''')
Before I say anything else, permit me to point out that these are examples of bad usage of dynamic SQL. Passing a table name as a parameter is not how you should write stored procedures, and one aim of this article is to explain this in detail. Also, the two examples are not equivalent. While both examples are bad, the second example has several problems that the first does not have. What these problems are will be apparent as you read this text.
Whereas the above looks very simple and easy, there are some very important things to observe. The first thing is permissions. You may know that when you use stored procedures, users do not need permissions to access the tables accessed by the stored procedure. This does not apply when you use dynamic SQL! For the procedures above to execute successfully, the users must have SELECT permission on the table in @tblname. In SQL 2000 and earlier this is an absolute rule with no way around it. SQL 2005 provides alternative ways, something I will come back to in the section The Permission System.
Next thing to observe is that the dynamic SQL is not part of the stored procedure, but constitutes its own scope. Invoking a block of dynamic SQL is akin to call a nameless stored procedure created ad-hoc. This has a number of consequences:
• Within the block of dynamic SQL, you cannot access local variables (including table variables) or parameters of the calling stored procedure. But you can pass parameters – in and out – to a block of dynamic SQL if you use sp_executesql.
• Any USE statement in the dynamic SQL will not affect the calling stored procedure.
• Temp tables created in the dynamic SQL will not be accessible from the calling procedure since they are dropped when the dynamic SQL exits. (Compare to how temp tables created in a stored procedure go away when you exit the procedure.) The block of dynamic SQL can however access temp tables created by the calling procedure.
• If you issue a SET command in the dynamic SQL, the effect of the SET command lasts for the duration of the block of dynamic SQL only and does not affect the caller.
• The query plan for the stored procedure does not include the dynamic SQL. The block of dynamic SQL has a query plan of its own.
As you've seen, there are two ways to invoke dynamic SQL, sp_executesql and EXEC(). sp_executesql was added in SQL 7, whereas EXEC() has been around since SQL 6.0. In application code, sp_executesql should be your choice 95% of the time, for reasons that will become apparent. For now I will only give two keywords: SQL Injection and Query-Plan Reuse. EXEC() is mainly useful for quick throw-away things and DBA tasks, but it also comes to the rescue in SQL 2000 and SQL 7 when the SQL string exceeds 4000 characters. And, obviously, in SQL 6.5, EXEC() is the sole choice. In the next two sections we will look at these two commands in detail.
sp_executesql
sp_executesql is a built-in stored procedure that takes two pre-defined parameters and any number of user-defined parameters.
The first parameter @stmt is mandatory, and contains a batch of one or more SQL statements. The data type of @stmt is ntext in SQL 7 and SQL 2000, and nvarchar(MAX) in SQL 2005. Beware that you must pass an nvarchar/ntext value (that is, a Unicode value). A varchar value won't do.
The second parameter @params is optional, but you will use it 90% of the time. @params declares the parameters that you refer to in @stmt. The syntax of @params is exactly the same as for the parameter list of a stored procedure. The parameters can have default values and they can have the OUTPUT marker. Not all parameters you declare must actually appear in the SQL string. (Whereas all variables that appear in the SQL string must be declared, either with a DECLARE inside @stmt, or in @params.) Just like @stmt, the data type of @params in SQL 7/2000 is ntext and nvarchar(MAX) in SQL 2005.
The rest of the parameters are simply the parameters that you declared in @params, and you pass them as you pass parameters to a stored procedure, either positional or named. To get a value back from your output parameter, you must specify OUTPUT with the parameter, just like when you call a stored procedure. Note that @stmt and @params must be specified positionally. You can provide the parameter names for them, but these names are blissfully ignored.
Let's look at an example. Say that in your database, many tables have a column LastUpdated, which holds the time a row was last updated. You want to find out how many rows in each table were modified at least once during a given period. This is not something you run as part of the application, but something you run as a DBA from time to time, so you just keep it as a script that you have around. Here is how it could look:
DECLARE @tbl    sysname,
        @sql    nvarchar(4000),
        @params nvarchar(4000),
        @count  int

DECLARE tblcur CURSOR STATIC LOCAL FOR
   SELECT object_name(id) FROM syscolumns WHERE name = 'LastUpdated'
   ORDER BY 1
OPEN tblcur

WHILE 1 = 1
BEGIN
   FETCH tblcur INTO @tbl
   IF @@fetch_status <> 0
      BREAK

   SELECT @sql =
          N' SELECT @cnt = COUNT(*) FROM dbo.' + quotename(@tbl) +
          N' WHERE LastUpdated BETWEEN @fromdate AND ' +
          N' coalesce(@todate, ''99991231'')'
   SELECT @params = N'@fromdate datetime, ' +
                    N'@todate   datetime = NULL, ' +
                    N'@cnt      int      OUTPUT'
   EXEC sp_executesql @sql, @params, '20060101', @cnt = @count OUTPUT

   PRINT @tbl + ': ' + convert(varchar(10), @count) + ' modified rows.'
END

DEALLOCATE tblcur
The lines that pertain directly to the dynamic SQL are the assignments of @sql and @params and the call to sp_executesql. You can see that I have declared the @sql and @params variables to be of the maximum length for nvarchar variables in SQL 2000. In SQL 2005, you may want to routinely declare @sql as nvarchar(MAX); more about this just below.
When I assign the @sql variable, I am careful to format the statement so that it is easy to read, and I leave in spaces to avoid two concatenated parts being glued together without a space in between, which could cause a syntax error. I put the table name in quotename() in case a table name has any special characters in it. I also prefix the table name with "dbo.", which is a good habit, as we will see when we look at dynamic SQL and query plans. I will cover these good practices in more detail later in the text. Note also the appearance of '' around the date literal – the rule in T-SQL is that to include the string delimiter in a string, you must double it.
In this example, the dynamic SQL has three parameters: one mandatory input parameter, one optional input parameter, and one output parameter. I've assumed that this time the DBA wanted to see all changes made after 2006-01-01, which is why I've left out @todate in the call to sp_executesql. Since I left out one parameter, I must specify the last one, @cnt, by name – the same rule as when you call a stored procedure. Note also that the variable is called @cnt in the dynamic SQL, but @count in the surrounding script. Normally you might want to use the same name, but I wanted to stress that @cnt in the dynamic SQL is only visible within the dynamic SQL, whereas @count is not visible there.
You may note that I've prepended the string literals with N to denote that they are Unicode strings. As @sql and @params are declared as nvarchar, technically this is not necessary (as long as you stick to your 8-bit character set). However, if you provide any of the strings directly in the call to sp_executesql, you must specify the N, as in this fairly silly example:
EXEC sp_executesql N'SELECT @x', N'@x int', @x = 2
If you remove any of the Ns, you will get an error message. Since sp_executesql is a built-in stored procedure, there is no implicit conversion from varchar.
You may wonder why I do not pass @tbl as a parameter as well. The answer is that you can't. Dynamic SQL is just like any other SQL: you can't specify a table name through a variable in T-SQL, and that's the whole story. Thus, when you need to specify things like table names or column names dynamically, you must interpolate them into the string.
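A quick sketch of the point, with a hypothetical table name for illustration:

```sql
-- This fails: a table name cannot come from a variable or a parameter.
-- EXEC sp_executesql N'SELECT COUNT(*) FROM @tbl', N'@tbl sysname', @tbl = 'Orders'

-- Instead, interpolate the name into the string, protected by quotename():
DECLARE @tbl sysname, @sql nvarchar(4000)
SELECT @tbl = N'Order Details'
SELECT @sql = N'SELECT COUNT(*) FROM dbo.' + quotename(@tbl)
EXEC sp_executesql @sql
```

Only values – dates, numbers, strings that appear where a value is expected – can be true parameters; object names are part of the statement text itself.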
If you are on SQL 2000 or SQL 7, there is a limitation with sp_executesql when it comes to the length of the SQL string. While the parameter is ntext, you cannot use this data type for local variables. Thus, you will have to stick to nvarchar(4000). In many cases this will do fine, but it is not uncommon to exceed that limit. In this case, you will need to use EXEC(), described just below.
On SQL 2005, this is not an issue. Here you can use the new data type nvarchar(MAX) which can hold as much data as ntext, but without the many restrictions of ntext.
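On SQL 2005 the declaration then simply becomes:

```sql
-- nvarchar(MAX) removes the 4000-character cap on the statement variable.
DECLARE @sql nvarchar(MAX)
SELECT @sql = N'SELECT COUNT(*) FROM dbo.Orders'   -- may grow far beyond 4000 chars
EXEC sp_executesql @sql
```

The rest of the pattern is unchanged; only the data type of the variable differs.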
EXEC()
EXEC() takes one parameter which is an SQL statement to execute. The parameter can be a concatenation of string variables and string literals, but cannot include calls to functions or other operators. For very simple cases, EXEC() is less hassle than sp_executesql. For instance, say that you want to run UPDATE STATISTICS WITH FULLSCAN on some selected tables. It could look like this:
FETCH tblcur INTO @tbl
IF @@fetch_status <> 0 BREAK
EXEC('UPDATE STATISTICS [' + @tbl + '] WITH FULLSCAN')
In the example with sp_executesql, I used quotename(), but here I've settled for adding brackets, in case there is a table named Order Details (which there is in the Northwind database). Since EXEC() only permits string literals and string variables to be concatenated, and not arbitrary expressions, this is not legal:
EXEC('UPDATE STATISTICS ' + quotename(@tbl) + ' WITH FULLSCAN')
Best practice is to always use a variable to hold the SQL statement, so the example would better read:
FETCH tblcur INTO @tbl
IF @@fetch_status <> 0 BREAK
SELECT @sql = 'UPDATE STATISTICS ' + quotename(@tbl) + ' WITH FULLSCAN'
EXEC(@sql)
The fact that you can concatenate strings within EXEC() permits you to do things very quickly, which can be convenient at times, but can lead to poor habits in application code. However, there are situations where this is an enormous blessing. As I mentioned, in SQL 7 and SQL 2000, you can in practice only use 4000 characters in your SQL string with sp_executesql. EXEC() does not have this limitation, since you can say
EXEC(@sql1 + @sql2 + @sql3)
where all of @sql1, @sql2 and @sql3 can be 4000 characters long – or even 8000 characters, as EXEC() permits you to use varchar.
Since you cannot use parameters, you cannot as easily get values out from EXEC() as you can with sp_executesql. You can, however, use INSERT-EXEC to insert the result set from EXEC() into a table. I will show you an example later on, when I also show you how you can use EXEC() to pass longer strings than 4000 characters to sp_executesql.
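As a foretaste, here is a minimal sketch of INSERT-EXEC; the temp table and the query are only assumptions for illustration:

```sql
-- Capture the result set produced by EXEC() into a table.
CREATE TABLE #counts (cnt int NOT NULL)

DECLARE @sql nvarchar(4000)
SELECT @sql = N'SELECT COUNT(*) FROM dbo.Orders'

INSERT #counts (cnt)
   EXEC(@sql)

SELECT cnt FROM #counts
DROP TABLE #counts
```

The column list of the target table must match the result set of the dynamic statement, column for column.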
In SQL 2005, EXEC() permits impersonation so that you can say:
EXEC(@sql) AS USER = 'mitchell'
EXEC(@sql) AS LOGIN = 'CORDOBA\Miguel'
This is mainly a syntactical shortcut that saves you from embedding the invocation of dynamic SQL in EXECUTE AS and REVERT. (I discuss these statements more in detail in my article Granting Permissions Through Stored Procedures.)
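That is, the one-line form above is shorthand for something like this (the user name is of course just an example):

```sql
-- The longhand that EXEC(@sql) AS USER = 'mitchell' abbreviates:
EXECUTE AS USER = 'mitchell'   -- run the dynamic SQL under mitchell's permissions
EXEC(@sql)
REVERT                         -- restore the original security context
```

With the shorthand, the impersonation is scoped to the single EXEC() and you cannot forget the REVERT.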
SQL 2005 adds a valuable extension to EXEC(): you can use it to execute strings on linked servers. I will cover this form of EXEC() in a separate section later in this blog.

Thursday, March 19, 2009

The Curse and Blessings of Dynamic SQL

An SQL text by Erland Sommarskog, SQL Server MVP.
If you follow the various newsgroups on Microsoft SQL Server, you often see people asking why they can't do:
SELECT * FROM @tablename
SELECT @colname FROM tbl
SELECT * FROM tbl WHERE x IN (@list)
For all three examples you can expect someone to answer Use dynamic SQL and give a quick example on how to do it. Unfortunately, for all three examples above, dynamic SQL is a poor solution. On the other hand, there are situations where dynamic SQL is the best or only way to go.
In this article I will discuss the use of dynamic SQL in stored procedures and, to a minor extent, from client languages. To set the scene, I start with a very quick overview of application architecture for data access. I then proceed to describe dynamic SQL as a feature, with a quick introduction followed by the gory syntax details. Next, I continue with a discussion of SQL injection, a security issue that you absolutely must have a good understanding of when you work with dynamic SQL. This is followed by a section where I discuss why we use stored procedures, and how that is affected by the use of dynamic SQL. I carry on with a section on good practices and tips for writing dynamic SQL. I conclude by reviewing a number of situations where you could use dynamic SQL and whether it is a good or bad idea to do so.
The article covers all versions of SQL Server from SQL 6.5 to SQL 2005, with emphasis on SQL 2000 and SQL 2005.
Contents:
Accessing Data from an Application
Introducing Dynamic SQL
A First Encounter
sp_executesql
EXEC()
SQL Injection – a Serious Security Issue
Dynamic SQL and Stored Procedures
The Permission System
Caching Query Plans
Reducing Network Traffic
Encapsulating Logic
Keeping Track of what Is Used
Ease of Writing SQL Code
Addressing Bugs and Problems
Good Coding Practices and Tips for Dynamic SQL
Use Debug Prints!
Nested Strings
Spacing and Formatting
Dealing with Dynamic Table and Column Names
Quotename, Nested Strings and Quotestring
QUOTED_IDENTIFIER
sp_executesql and Long SQL Strings in SQL 2000
Dynamic SQL in User-Defined Functions
Cursors and Dynamic SQL
EXEC() at Linked Server
Common Cases when to (Not) Use Dynamic SQL
SELECT * FROM @tablename
SELECT * FROM sales + @yymm
UPDATE tbl SET @colname = @value WHERE keycol = @keyval
SELECT * FROM @dbname + '..tbl'
SELECT * FROM tbl WHERE col IN (@list)
SELECT * FROM tbl WHERE @condition
Dynamic Search Conditions
SELECT * FROM tbl ORDER BY @col
SELECT TOP @n FROM tbl
CREATE TABLE @tbl
CREATE TABLE with Unknown Columns
Linked Servers
OPENQUERY
Dynamic Column Widths
Dynamic SQL and Maintenance Tasks
Acknowledgements and Feedback
Revision History
Note: many of the code samples in this text work against the pubs and Northwind databases that ship with SQL 2000 and SQL 7, but not with SQL 2005. You can download these databases from Microsoft's web site.