Server Configuration:
The performance gains described below were observed after applying these techniques and configuration parameters on a database server with the following configuration:
Windows Server 2008, 64-bit
32 GB of RAM
8 processors
Configuration Parameters
MaxDop – Otherwise known as Maximum Degree of Parallelism, this is what I consider the needle in the haystack. This parameter designates how many processors can be used to execute a single SQL statement. By changing this one parameter we saw significant increases in performance. It can be set either as an option on an individual query or as a global server setting. Not having access to the product SQL statements, I have always set the MAXDOP setting at the server level to 1. The default for MAXDOP is 4.
To verify or change your MaxDop setting in SQL Server Management Studio (SSMS), right-click on the server and select Properties. This will bring up a popup window on which you should select the Advanced page to see something similar to this:
After changing this parameter, you will need to restart the server before the change will take effect.
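If you prefer T-SQL to the SSMS dialog, the same server-level setting can be changed with sp_configure. A minimal sketch:
-- 'max degree of parallelism' is an advanced option, so expose it first,
-- then set server-wide MAXDOP to 1 as described above.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max degree of parallelism', 1;
RECONFIGURE;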
If you are not able to change this server-based parameter, it can be specified on an individual SQL statement as seen below.
Select *
From Lombardi.Product
Order by productId
OPTION (MAXDOP 1)
I do not suggest adding hints to embedded SQL statements; even though a hint could be helping now, it may come back to hinder you later on. In an environment where removing a hint from a SQL statement could involve a snapshot deployment and migration, SQL statements should be written in a way that avoids having to supply hints to the compiler.
Another option that I like to adjust is the parameterization option. Different people have different opinions on this one, but I like to force parameterization. This causes the engine to parse each statement and convert all of the literal values into parameters. The engine then builds the execution plan based on the parameterized queries. Depending on how the queries are written, the time spent parsing and parameterizing them is regained through reuse of the execution plans; in other cases it becomes costly and your performance will degrade a little.
To turn this on, right-click on the database in SSMS and select Properties. Select the Options page to see a screen looking like this:
If it is not already set to Forced, change the Parameterization parameter from Simple to Forced and restart the server to have this take effect.
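The same change can also be made with a single T-SQL statement. A sketch, where [YourDatabase] is a placeholder for your own database name:
-- The default parameterization setting is SIMPLE; this forces it.
ALTER DATABASE [YourDatabase] SET PARAMETERIZATION FORCED;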
Indexing
Contrary to what many believe, indexes are not the answer to every performance problem. In reality an index is just a table with a key and a pointer to the record in the table that the index is applied on. Indexes are primarily used for reading and finding data, but they actually slow down writing the data. Selecting rows from a table will use the “best” available index if one exists; otherwise it will do a table scan, which is very costly. I cannot emphasize enough that indexes are good, but too much of a good thing can be bad.
At a high level, the best place to look for what should be indexed is how the columns in the table are used and how the data will be referenced. Frequently, some of the best columns to index are foreign key columns and columns that are used in the where clause to select the data.
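As a hedged illustration, reusing the Lombardi.Product table from the examples in this article and assuming a hypothetical categoryId foreign key column:
-- Index a foreign key column that is used in joins and where clauses.
-- categoryId is an assumed column, for illustration only.
CREATE NONCLUSTERED INDEX IX_product_categoryId
    ON Lombardi.Product (categoryId);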
Many
people love to run a missing index report after a performance test and then
just blindly apply each and every index that has been suggested. Frequently the
SQL can be modified to use the proper joins to take advantage of the proper
indexes.
Covering Indexes
You can also use the SQL Server concept of ‘covering indexes’ to reduce the cost and increase the speed of SQL statements. In a simple index, the rows in the index are ordered by the values in the column and each row has a pointer to the row in the actual table. In a covering index, the index definition uses the keyword ‘include’, which stores additional fields in the index without indexing them. This way, instead of finding the row in the index and then using the index pointer to get to the row in the actual table, the system can find the row it is looking for in the index, read the additional returned column(s) directly from the index, and never have to go to the table.
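A sketch of a covering index, borrowing the product columns used elsewhere in this article and assuming queries look up products by productId but only return name:
-- The INCLUDE column is stored in the index leaf pages, so a query like
-- 'Select name From Lombardi.Product Where productId = ?' can be answered
-- entirely from the index, never touching the table.
CREATE NONCLUSTERED INDEX IX_product_productId_name
    ON Lombardi.Product (productId)
    INCLUDE (name);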
Filtered Indexes
Another index type that can be used in special situations is the ‘filtered index’. These indexes are a combination of an index and a where clause, which keeps the index smaller and faster to use.
The syntax used to create a filtered index is:
Create Nonclustered Index IX_product_name
On Lombardi.Product (name)
Where sales_region is null
Index Column Order
The column order in the index is an important part of index creation. The most selective column should be placed first in the index, followed by the second most selective, the third, and so on. Index on as few columns as needed, to make it easier for the SQL engine to use the index.
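A hedged sketch using the gdm_request table from the execution plan example below, and assuming clientId is far more selective than a hypothetical status column:
-- The most selective column (clientId) goes first; status is an assumed
-- column, included only to illustrate the ordering.
CREATE NONCLUSTERED INDEX IX_request_clientId_status
    ON gdm_request (clientId, status);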
Looking for Issues
SQL Server Management Studio (SSMS) can help you identify potential issues after you have narrowed down the suspect query to a single SQL statement. Before you start evaluating it, you need to turn on some features in SSMS.
To do this, bring up the query window that you will be working with. These options are window specific, so make sure you are setting them on the window you will be using.
To set these options, select Query and then Query Options… to open the popup window. After selecting the Advanced page you should see a window like this:
Check the ‘Set Showplan_Text’, ‘Set Statistics Time’, and ‘Set Statistics IO’ options and then select OK.
These options will do the following:
Set Showplan Text – This option turns on the display of the execution plan that the optimizer selected for the query. The plan will show you how the optimizer has decided to execute the query.
Set Statistics Time – This option displays the time each step takes from the CPU perspective.
Set Statistics IO – This option shows you the quantity of reads and writes for each step of the execution. Reducing the number of reads and writes will free up disk access and lower the execution costs.
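The same options can also be turned on for a session with T-SQL, which is handy when you are already in a query window. A minimal sketch:
-- Session-level equivalents of the Query Options checkboxes above.
SET STATISTICS TIME ON;
SET STATISTICS IO ON;
-- SHOWPLAN_TEXT returns the estimated plan instead of running the query,
-- so enable it only when you want the plan rather than the results.
SET SHOWPLAN_TEXT ON;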
Once you have identified a slow query, you need to figure out why it is slow. The three most common items to look at are:
Execution plans
I/O statistics
Execution time
Execution Plans
SSMS is able to show you the actual execution plan that will be used when the query is run on the server. In the graphical representation, the following query
Select name,
    financialSystemId,
    clientConfirmationDate
from gdm_request
join gdm_client on gdm_request.clientId = gdm_client.id
will present a plan that looks like this:
This is the plan to be used when the SQL is executed. A quick look can tell you what part of the query is taking the longest. A rule of thumb is that seek activities are better and faster than scan activities. Clicking on any of the activities will give you information on that activity. A few things to look for are the actual number of rows vs. the estimated number of rows, scans vs. seeks, and any special functions that the optimizer has to use to join properly (data conversions). There is a lot of information on the Web on how to evaluate a plan, so I will not cover it here.
Many people assume that this is the best plan, but that is not always the case. SQL Server tries to balance the time it takes to produce the best plan against the actual time it takes to execute the SQL. If it can look at all of the available options before time runs out, it will pick the best plan and use it. However, if the query has so many ways that it can be executed (i.e. lots of joins or sub-queries), time may run out before it finds the best plan and it will just pick one and go. To check for this, right-click on the first box (Select in this case) and select Properties. To the right of the window the following will appear:
Looking at the QueryHash and QueryPlanHash you can see that it selected one to use. Looking at the Reason For Early Termination property you can see that it found one good enough, so it stopped looking and executed that plan.
Sometimes, with complex queries involving multiple tables, embedded sub-queries and where clauses, time will run out on finding a plan and the engine will use the best plan so far. The way to spot that is that the reason for early termination will be ‘Timed out’ and the plan hash field will be empty.
I/O Statistics and Execution Time
On the Messages tab, you will see the amount of IO and the timing of each section. It will look like this:
Keep an eye out for large numbers of scans, physical reads and read-aheads. Always try to limit the amount of data you are dealing with as soon as possible, to reduce merging of data that is not going to be used.
Tuning the query is a complex issue and not really addressed here, but it can be as simple as rearranging the joins or adding or changing an index.
Slowest Query Report
After a performance test run, and before the servers are reset, you can run the following queries.
To get the results by the top IO users:
SELECT TOP 20
    qs.execution_count,
    (qs.total_physical_reads + qs.total_logical_reads + qs.total_logical_writes) AS [Total IO],
    (qs.total_physical_reads + qs.total_logical_reads + qs.total_logical_writes) / qs.execution_count AS [Avg IO],
    SUBSTRING(qt.[text], qs.statement_start_offset/2,
        (CASE
             WHEN qs.statement_end_offset = -1 THEN LEN(CONVERT(NVARCHAR(MAX), qt.[text])) * 2
             ELSE qs.statement_end_offset
         END - qs.statement_start_offset)/2
    ) AS query_text,
    qt.[dbid],
    qt.objectid,
    tp.query_plan
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text (qs.[sql_handle]) AS qt
OUTER APPLY sys.dm_exec_query_plan(qs.plan_handle) tp
ORDER BY [Total IO] DESC
To get the results by the top average execution time in ms:
SELECT
    (total_elapsed_time/execution_count)/1000 AS [Avg Exec Time in ms]
    , max_elapsed_time/1000 AS [MaxExecTime in ms]
    , min_elapsed_time/1000 AS [MinExecTime in ms]
    , (total_worker_time/execution_count)/1000 AS [Avg CPU Time in ms]
    , qs.execution_count AS NumberOfExecs
    , (total_logical_writes + total_logical_reads)/execution_count AS [Avg Logical IOs]
    , max_logical_reads AS MaxLogicalReads
    , min_logical_reads AS MinLogicalReads
    , max_logical_writes AS MaxLogicalWrites
    , min_logical_writes AS MinLogicalWrites
    , (SELECT SUBSTRING(text, statement_start_offset/2,
           (CASE
                WHEN statement_end_offset = -1 THEN LEN(CONVERT(nvarchar(max), text)) * 2
                ELSE statement_end_offset
            END - statement_start_offset)/2
           )
       FROM sys.dm_exec_sql_text(sql_handle)
      ) AS query_text
FROM sys.dm_exec_query_stats qs
ORDER BY [Avg Exec Time in ms] DESC
These two queries will show you which SQL statements are impacting your DB performance the most.
Missing Index Report
You can also run a query after a performance run to see a list of suggested indexes and their contents. This list is based on the queries that were run. DO NOT BLINDLY ADD THEM ALL. You will need to evaluate each suggested index and tie it to the query that used it. Then you will need to evaluate whether tuning the query will eliminate the need for the suggested index. Remember, an index is just another table. If a table has 8 indexes on it and you insert a row into the table, you are actually updating 9 tables.
The query to get the list of suggested indexes is:
SELECT
    CAST(SERVERPROPERTY('ServerName') AS [nvarchar](256)) AS [SQLServer],
    db.[database_id] AS [DatabaseID],
    db.[name] AS [DatabaseName],
    id.[object_id] AS [ObjectID],
    id.[statement] AS [FullyQualifiedObjectName],
    id.[equality_columns] AS [EqualityColumns],
    id.[inequality_columns] AS [InEqualityColumns],
    id.[included_columns] AS [IncludedColumns],
    gs.[unique_compiles] AS [UniqueCompiles],
    gs.[user_seeks] AS [UserSeeks],
    gs.[user_scans] AS [UserScans],
    gs.[last_user_seek] AS [LastUserSeekTime],
    gs.[last_user_scan] AS [LastUserScanTime],
    gs.[avg_total_user_cost] AS [AvgTotalUserCost],
    gs.[avg_user_impact] AS [AvgUserImpact],
    gs.[system_seeks] AS [SystemSeeks],
    gs.[system_scans] AS [SystemScans],
    gs.[last_system_seek] AS [LastSystemSeekTime],
    gs.[last_system_scan] AS [LastSystemScanTime],
    gs.[avg_total_system_cost] AS [AvgTotalSystemCost],
    gs.[avg_system_impact] AS [AvgSystemImpact],
    gs.[user_seeks] * gs.[avg_total_user_cost] * (gs.[avg_user_impact] * 0.01) AS [IndexAdvantage],
    'CREATE INDEX [Missing_IXNC_' + OBJECT_NAME(id.[object_id], db.[database_id]) + '_'
        + REPLACE(REPLACE(REPLACE(ISNULL(id.[equality_columns], ''), ', ', '_'), '[', ''), ']', '')
        + CASE
              WHEN id.[equality_columns] IS NOT NULL AND id.[inequality_columns] IS NOT NULL THEN '_'
              ELSE ''
          END
        + REPLACE(REPLACE(REPLACE(ISNULL(id.[inequality_columns], ''), ', ', '_'), '[', ''), ']', '') + '_'
        + LEFT(CAST(NEWID() AS [nvarchar](64)), 5) + ']'
        + ' ON ' + id.[statement] + ' (' + ISNULL(id.[equality_columns], '')
        + CASE
              WHEN id.[equality_columns] IS NOT NULL AND id.[inequality_columns] IS NOT NULL THEN ','
              ELSE ''
          END
        + ISNULL(id.[inequality_columns], '') + ')'
        + ISNULL(' INCLUDE (' + id.[included_columns] + ')', '') AS [ProposedIndex],
    CAST(CURRENT_TIMESTAMP AS [smalldatetime]) AS [CollectionDate]
FROM [sys].[dm_db_missing_index_group_stats] gs WITH (NOLOCK)
INNER JOIN [sys].[dm_db_missing_index_groups] ig WITH (NOLOCK)
    ON gs.[group_handle] = ig.[index_group_handle]
INNER JOIN [sys].[dm_db_missing_index_details] id WITH (NOLOCK)
    ON ig.[index_handle] = id.[index_handle]
INNER JOIN [sys].[databases] db WITH (NOLOCK)
    ON db.[database_id] = id.[database_id]
WHERE id.[database_id] > 4 -- Remove this to see for entire instance
ORDER BY [IndexAdvantage] DESC
OPTION (RECOMPILE);
Other Tricks
Data Volume
One of the things that you should remember is that the smaller the amount of data we are touching, the faster our response will be and the less stress will be put on the optimizer. A smaller quantity of data takes less memory, less disk space, less time and less CPU to process. So always restrict your data as much as possible, as soon as you can.
One way of doing this is to create a temporary table to hold the records that narrow your data the most, then join your main table to this temporary table to eliminate the rows that cannot be joined.
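A minimal sketch of the temp-table approach, reusing the Lombardi tables from the nolock examples below and assuming a hypothetical discontinued flag on the product table:
-- Build the narrowest row set first; 'discontinued' is an assumed column.
SELECT productId
INTO #active_products
FROM Lombardi.Product
WHERE discontinued = 0;

-- Then join the larger table only to the surviving rows.
SELECT PI.*
FROM #active_products AP
JOIN Lombardi.PImage PI ON PI.productId = AP.productId;

DROP TABLE #active_products;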
Indexed Views
When a set of tables is joined together on a regular basis, it might become easier to create a view of the joined tables and have your queries reference the simple view instead of all of the joined tables. A normal view will use the indexes associated with each of the individual tables as it sees fit when the view is queried. In the case of a normal view, the processing will be very close to the same as referencing all of the tables individually, as a view is a virtual representation that does not exist on disk until it is needed.
In some cases the creation of an indexed view can provide a performance gain. An indexed view is a view that is actually created on disk when the view syntax is executed. The SQL Server engine will keep the view up to date in real time, so you do not need to worry about that aspect. The benefits of an indexed view are that you can create indexes on the view, and the view definition can be written so that it excludes rows that you know will not be referenced (i.e. historical data). An example of this is when you have long-running processes with many tasks but a custom dashboard that only lists tasks in progress or waiting to be selected. In this case you could create an indexed view looking only at tasks for your application and only in those two statuses. This will create a smaller and faster table for your use and can eliminate the overhead of joins, as the view definition will already include the join. One of the drawbacks of this method is that the view has to be schema bound to the tables it references. Because of this, I always suggest that you drop any indexed views that reference system tables before upgrading, as tables that are schema bound to a view cannot be altered; without dropping the views, your upgrade might fail.
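A hedged sketch of such an indexed view; the task table and its columns are hypothetical stand-ins for your application's tables:
-- SCHEMABINDING is required for an indexed view, and referenced tables
-- must use two-part names.
CREATE VIEW dbo.v_active_tasks
WITH SCHEMABINDING
AS
SELECT taskId, processId, status
FROM dbo.task
WHERE status IN ('IN_PROGRESS', 'WAITING');
GO
-- The unique clustered index is what materializes the view on disk.
CREATE UNIQUE CLUSTERED INDEX IX_v_active_tasks
    ON dbo.v_active_tasks (taskId);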
Parameterized Queries
When a SQL statement is sent to the engine, one of the first things it does is look to see whether it has already calculated an execution plan for this query, in the hope that it does not have to take the time to do it again. With the goal of having a plan already cached in memory, it is much better to send the engine the statement
Select name from product where product_Id = ?
and send in a parameter of ‘12345’, rather than sending in the statement
Select name from product where product_Id = 12345
Now when you want the name for product 67890, the SQL engine will already have the best plan in memory and will not have to regenerate one if you use parameterized queries. If you had not used parameterized queries, the engine would have to look at all of the possible plans again to calculate the best one; it would come up with the same plan, but waste the resources and time of finding it. You can make this the standard DB setting (see the configuration setting above).
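From T-SQL, explicit parameterization can be done with sp_executesql. A minimal sketch using the product example above:
-- The plan is cached against the parameterized text, so any @productId
-- value reuses the same plan.
EXEC sp_executesql
    N'SELECT name FROM product WHERE product_Id = @productId',
    N'@productId int',
    @productId = 12345;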
Stored Procedures
Large, complex queries, or those that do processing on multiple individual rows, can be moved to stored procedures to increase performance. A few of the advantages of using a stored procedure are that you have separated the SQL logic from the application, allowing you to make changes without updating the application. Also, a stored procedure is precompiled code and will run faster than the same SQL issued straight to the DB. Using a stored procedure opens up the possibilities of what you can do in a single call to the database, including recursive processing and any data manipulation.
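A minimal sketch of wrapping the earlier product lookup in a stored procedure; the procedure name is illustrative:
CREATE PROCEDURE Lombardi.usp_GetProductName
    @productId int
AS
BEGIN
    SET NOCOUNT ON;
    SELECT name
    FROM Lombardi.Product
    WHERE productId = @productId;
END;
GO
-- Usage: a single call to the database.
EXEC Lombardi.usp_GetProductName @productId = 12345;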
Common Table Expressions
Often used as part of a stored procedure (but usable in a submitted query as well) is the Common Table Expression (CTE). A CTE is a temporary view that can be referenced by the next SQL statement in the same batch. This allows for easier reading of complex queries, two-pass processing, and not having to repeat multiple joins in different parts of unioned SQL statements. While the use of a CTE really has no effect on performance, it can make a very complex SQL statement more readable and its performance issues easier to find.
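A small sketch reusing the tables from the execution plan example, assuming clientConfirmationDate lives on gdm_request; the CTE names the narrowed row set once and the following statement reuses it:
WITH confirmed_requests AS (
    -- Narrow the data first, as suggested in the Data Volume section.
    SELECT clientId, clientConfirmationDate
    FROM gdm_request
    WHERE clientConfirmationDate IS NOT NULL
)
SELECT c.name, cr.clientConfirmationDate
FROM confirmed_requests cr
JOIN gdm_client c ON cr.clientId = c.id;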
No Lock
Another trick that I use on a regular basis is the nolock hint. Use of this hint tells SQL Server to read the data for read-only purposes and not to apply a lock on the rows being read. Of course, in SQL Server, row locks can become page or table locks, and session blocking can occur. The drawback here is that you can get what are considered ‘dirty’ reads, which are rows written to disk that have not been committed yet and could be rolled back. Most systems are OK doing dirty reads, but you will want to examine your system requirements to be sure. If your system is OK with system-wide dirty reads, this option can also be set at the connection level by using the following statement:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
To use the nolock hint on a statement-by-statement basis, use the ‘with (Nolock)’ syntax in the SQL statement. Below are examples of single-table and multiple-table select statements with the nolock hint in place.
Select *
From Lombardi.Product with (Nolock)
Order by productId

Select *
From Lombardi.Product P with (Nolock)
Left Outer Join Lombardi.PImage PI with (Nolock) on P.productId = PI.productId
Order by P.productId