Query Performance Insights
Find query hashes which are timing out
Compare query plans for timing out query hash
Look for avg_cpu_millisec column. The query plan with lower value is better. If there is only one query plan, then you need to change your query to use the right indexes.
Analyze the query plan
Once you have the query plan hash for which the query is timing out, you can view and analyze it by running this query
Force a query plan
If you have multiple query plans for a query, where one query plan is performing far better than the other, you can force that query plan. You need to analyze both query plans before you decide a force a specific plan so that it will work for both large and small volume of data.
The most common place to look for SQL Azure issues is Query
Performance Insights on Azure portal https://docs.microsoft.com/en-us/azure/sql-database/sql-database-query-performance
Troubleshoot SQL query timeouts
There are various Data Management Views (DMV’s) created by
SQL Azure team https://docs.microsoft.com/en-us/azure/sql-database/sql-database-monitoring-with-dmvs
I tweaked them a little below:
Find query hashes which are timing out
Look for Aborted_Execution_Count column.
--
Top 15 CPU consuming queries by query hash
--
note that a query hash can have many
query id if not parameterized or not parameterized properly
--
it grabs a sample query text by min
WITH
AggregatedCPU AS (SELECT q.query_hash, p.query_plan_hash, SUM(count_executions
* avg_cpu_time / 1000.0) AS total_cpu_millisec, SUM(count_executions *
avg_cpu_time / 1000.0)/ SUM(count_executions) AS avg_cpu_millisec,
MAX(rs.max_cpu_time / 1000.00) AS max_cpu_millisec, MAX(max_logical_io_reads)
max_logical_reads, COUNT(DISTINCT p.plan_id) AS number_of_distinct_plans,
COUNT(DISTINCT p.query_id) AS number_of_distinct_query_ids, SUM(CASE WHEN rs.execution_type_desc='Aborted'
THEN count_executions ELSE 0 END) AS Aborted_Execution_Count, SUM(CASE WHEN
rs.execution_type_desc='Regular' THEN count_executions ELSE 0 END) AS
Regular_Execution_Count, SUM(CASE WHEN rs.execution_type_desc='Exception' THEN
count_executions ELSE 0 END) AS Exception_Execution_Count,
SUM(count_executions) AS total_executions, MIN(qt.query_sql_text) AS
sampled_query_text
FROM
sys.query_store_query_text AS qt
JOIN
sys.query_store_query AS q ON qt.query_text_id=q.query_text_id
JOIN
sys.query_store_plan AS p ON q.query_id=p.query_id
JOIN
sys.query_store_runtime_stats AS rs ON rs.plan_id=p.plan_id
JOIN
sys.query_store_runtime_stats_interval AS rsi ON
rsi.runtime_stats_interval_id=rs.runtime_stats_interval_id
WHERE
rs.execution_type_desc IN ('Regular', 'Aborted', 'Exception')AND
rsi.start_time>=DATEADD(HOUR, -24, GETUTCDATE())
GROUP BY q.query_hash,
p.query_plan_hash), OrderedCPU AS (SELECT query_hash, query_plan_hash,
total_cpu_millisec, avg_cpu_millisec, max_cpu_millisec, max_logical_reads,
number_of_distinct_plans, number_of_distinct_query_ids, total_executions,
Aborted_Execution_Count, Regular_Execution_Count, Exception_Execution_Count,
sampled_query_text, ROW_NUMBER() OVER (ORDER BY total_cpu_millisec DESC,
query_hash ASC) AS RN
FROM AggregatedCPU)
SELECT
OD.query_hash, OD.query_plan_hash, OD.total_cpu_millisec, OD.avg_cpu_millisec,
OD.max_cpu_millisec, OD.max_logical_reads, OD.number_of_distinct_plans,
OD.number_of_distinct_query_ids, OD.total_executions, OD.Aborted_Execution_Count,
OD.Regular_Execution_Count, OD.Exception_Execution_Count,
OD.sampled_query_text, OD.RN
FROM
OrderedCPU AS OD
WHERE
OD.RN<=15
ORDER
BY total_cpu_millisec DESC;
Compare query plans for timing out query hash
Look for avg_cpu_millisec column. The query plan with lower value is better. If there is only one query plan, then you need to change your query to use the right indexes.
--
Top 15 CPU consuming queries by query hash
--
note that a query hash can have many
query id if not parameterized or not parameterized properly
--
it grabs a sample query text by min
WITH
AggregatedCPU AS (SELECT q.query_hash, p.query_plan_hash, SUM(count_executions
* avg_cpu_time / 1000.0) AS total_cpu_millisec, SUM(count_executions *
avg_cpu_time / 1000.0)/ SUM(count_executions) AS avg_cpu_millisec,
MAX(rs.max_cpu_time / 1000.00) AS max_cpu_millisec, MAX(max_logical_io_reads)
max_logical_reads, COUNT(DISTINCT p.plan_id) AS number_of_distinct_plans,
COUNT(DISTINCT p.query_id) AS number_of_distinct_query_ids, SUM(CASE WHEN
rs.execution_type_desc='Aborted' THEN count_executions ELSE 0 END) AS
Aborted_Execution_Count, SUM(CASE WHEN rs.execution_type_desc='Regular' THEN
count_executions ELSE 0 END) AS Regular_Execution_Count, SUM(CASE WHEN rs.execution_type_desc='Exception'
THEN count_executions ELSE 0 END) AS Exception_Execution_Count,
SUM(count_executions) AS total_executions, MIN(qt.query_sql_text) AS
sampled_query_text
FROM
sys.query_store_query_text AS qt
JOIN sys.query_store_query AS q ON
qt.query_text_id=q.query_text_id
JOIN
sys.query_store_plan AS p ON q.query_id=p.query_id
JOIN
sys.query_store_runtime_stats AS rs ON rs.plan_id=p.plan_id
JOIN
sys.query_store_runtime_stats_interval AS rsi ON
rsi.runtime_stats_interval_id=rs.runtime_stats_interval_id
WHERE
rs.execution_type_desc IN ('Regular', 'Aborted', 'Exception')AND
rsi.start_time>=DATEADD(HOUR, -24, GETUTCDATE())
GROUP BY q.query_hash,
p.query_plan_hash), OrderedCPU AS (SELECT query_hash, query_plan_hash,
total_cpu_millisec, avg_cpu_millisec, max_cpu_millisec, max_logical_reads,
number_of_distinct_plans, number_of_distinct_query_ids, total_executions,
Aborted_Execution_Count, Regular_Execution_Count, Exception_Execution_Count,
sampled_query_text, ROW_NUMBER() OVER (ORDER BY total_cpu_millisec DESC,
query_hash ASC) AS RN
FROM AggregatedCPU)
SELECT
OD.query_hash, OD.query_plan_hash, OD.total_cpu_millisec, OD.avg_cpu_millisec,
OD.max_cpu_millisec, OD.max_logical_reads, OD.number_of_distinct_plans,
OD.number_of_distinct_query_ids, OD.total_executions, OD.Aborted_Execution_Count,
OD.Regular_Execution_Count, OD.Exception_Execution_Count,
OD.sampled_query_text, OD.RN
FROM
OrderedCPU AS OD
WHERE
OD.query_hash= query_hash
ORDER
BY total_cpu_millisec DESC;
Analyze the query plan
Once you have the query plan hash for which the query is timing out, you can view and analyze it by running this query
select
qsq.query_hash
,qsp.query_plan_hash
,qsq.query_id
,qsp.plan_id
,qsq.query_text_id
,qsq.is_internal_query
,qsrts.first_execution_time
,qsrts.last_execution_time
,qsqt.query_sql_text
,CAST(qsp.query_plan AS XML)
,qsrts.count_executions
from
[sys].[query_store_runtime_stats] qsrts
inner join [sys].[query_store_plan] qsp on
qsrts.plan_id = qsp.plan_id
inner join [sys].[query_store_query] qsq on
qsp.query_id = qsq.query_id
inner join [sys].[query_store_query_text] qsqt
on qsq.query_text_id = qsqt.query_text_id
where qsp.query_plan_hash = query_plan_hash
Force a query plan
If you have multiple query plans for a query, where one query plan is performing far better than the other, you can force that query plan. You need to analyze both query plans before you decide a force a specific plan so that it will work for both large and small volume of data.
https://docs.microsoft.com/en-us/sql/relational-databases/system-stored-procedures/sp-query-store-force-plan-transact-sql?view=sql-server-2017
select
* from sys.query_store_plan where is_forced_plan=1
EXEC
sp_query_store_force_plan query_id, plan_id;
No comments:
Post a Comment