A closer look at everything related to SQL Server

TempDB Bottlenecks and More

There has been many articles, documents and blog posts written about various problems and resolutions of TempDB. This is my effort at understanding and gathering relevant information about TempDB bottlenecks in one place for myself and any audience of this blog.

INTRODUCTION OF TEMPDB

Tempdb is a SQL Server temporary system database. Everytime SQL Server restarts, new Tempdb database is created. Whatever data was stored in the tempdb is lost. When TempDB gets recreated its data and log files reset to their last configured size. SQL Server records only enough information in the tempdb transaction log to roll back a transaction, but not to redo transactions during database recovery. This feature increases the performance of INSERT statements in tempdb. Tempdb database is always in Simple recovery mode. If you try to change it to Full Recovery mode, you will receive the following error message.

If you try to backup TempDB, you will get another error saying backup and restore is not allowed on TempDB database.

This system database is used by all the other user databases on SQL Server instance for their temporary work. It is the most busy database on almost all production SQL Servers. Here is a short list of few of the features that use tempdb.

  • Temporary tables, table variables, and table-valued functions.
  • Work tables associated with Cursors, Order By, Group by and Hash Query Plans.
  • Version stores used by Read Committed Snapshot Isolation
  • Index creation, Online index creation
  • DBCC CHECKDB
  • User-defined functions

These objects stored in TempDB are categorized as following:

USER OBJECTS – such as temporary local and global tables, table variables, table-valued functions etc.

INTERNAL OBJECTS – such as objects created to support indexing operations and query execution like work tables.

VERSION STORES – such as created when RSCI or SI isolation levels are used by transactions.

Following TSQL code uses DMV sys.dm_db_file_space_usage to get the TempDB space usage, separating out the above 3 categories.

— tempdb data file usage
SELECT GETDATE() AS runtime,
SUM(user_object_reserved_page_count) * 8 AS usr_obj_kb,
SUM(internal_object_reserved_page_count) * 8 AS internal_obj_kb,
SUM(version_store_reserved_page_count) * 8 AS version_store_kb,
SUM(unallocated_extent_page_count) * 8 AS freespace_kb,
SUM(mixed_extent_page_count) * 8 AS mixedextent_kb
FROM sys.dm_db_file_space_usage;

TEMPDB PROBLEMS

Problem 1: TempDB database grows out of disk space and only option left is to restart the SQL Server service.

There could be many ways to approach the resolution. But I believe monitoring the space usage is the best weapon. Here is a great post by William Durkin, who creates an Extended Event session to monitor space usage of data and log files of TempDB database.

— XEvent Session to track tempdb file growths
CREATE EVENT SESSION TempDBFileGrowth
ON SERVER
ADD EVENT sqlserver.database_file_size_change(
SET collect_database_name = ( 0 )
ACTION (sqlserver.client_app_name, sqlserver.client_hostname,
sqlserver.database_id, sqlserver.session_nt_username,
sqlserver.sql_text)
WHERE ([database_id] = ( 2 )))
ADD TARGET package0.event_file
( SET filename = N’TempDBFileGrowth.xel’,
max_file_size = ( 10 ), max_rollover_files = ( 5 ))
WITH (
MAX_MEMORY = 16384KB,
EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS,
MAX_DISPATCH_LATENCY = 1 SECONDS,
MEMORY_PARTITION_MODE = PER_NODE,
TRACK_CAUSALITY = OFF,
STARTUP_STATE = ON
);
GO
ALTER EVENT SESSION TempDBFileGrowth ON SERVER STATE = START;
GO

Create this Extended Event session for monitoring tempdb space usage. But be careful, this is for troubleshooting only. You can leave it running continuously but first change the dispatch_latency to 30 seconds which is the default. Dispatch latency controls how often XEvent data in buffer pool will be flushed and written to the disk. If this frequency is very low, on busy systems, it could affect the performance of SQL Server.

Problem 2: Tempdb PFS and SGAM page contentions.

Imagine a particular SQL Server instance processing concurrent queries. Each one of them creating tiny temp tables, use them and then drop them. These temporary tables are stored in Tempdb database. PFS page in the tempdb data file keeps track of all the page allocations and deallocations. Only one thread at a time can make changes in PFS and as a result you can see it creates a bottleneck. This behavior of Tempdb is known as PFS Page Contention. This issue can be resolved by having multiple Tempdb data files. The guidance is that you should start with 1 data file per CPU core till you reach 8 files. Add more files in chunk of 4 if you are still seeing PFS page contentions. You also need to enable trace flag 1117 to have these multiple files grow at same rate. This is a server wide setting so it changes the behavior of all user databases on the instance and not only TempDB.

Sometimes Tempdb reports contention in SGAM page. This happens when SQL Server searches for a mixed pages to allocate to a new object created and scans SGAM page. This results in severe contention on the SGAM page when multiple mixed-page allocations are underway. Deallocation activities also modify this page and create more contentions. In this situation, trace flag 1118 can be implemented to allow SQL Server to allocate full extents to the new objects. This is a server wide setting so it changes the behavior of all user databases on the instance and not only TempDB. Monitor this behavior using DMV sys.dm_os_waiting_tasks. You will notice wait types along with resource_description such as 2:1:1 and 2:1:3 (PFS and SGAM Pages in TempDB database in File 1).

SELECT
[owt].[session_id], [owt].[exec_context_id],
[owt].[wait_duration_ms], [owt].[wait_type],
[owt].[blocking_session_id], [owt].[resource_description],
CASE [owt].[wait_type]
WHEN N’CXPACKET’ THEN
RIGHT ([owt].[resource_description],
CHARINDEX (N’=’, REVERSE ([owt].[resource_description])) – 1)
ELSE NULL
END AS [Node ID],
[es].[program_name], [est].text,
[er].[database_id], [eqp].[query_plan],
[er].[cpu_time]
FROM sys.dm_os_waiting_tasks [owt]
INNER JOIN sys.dm_exec_sessions [es] ON
[owt].[session_id] = [es].[session_id]
INNER JOIN sys.dm_exec_requests [er] ON
[es].[session_id] = [er].[session_id]
OUTER APPLY sys.dm_exec_sql_text ([er].[sql_handle]) [est]
OUTER APPLY sys.dm_exec_query_plan ([er].[plan_handle]) [eqp]
WHERE
[es].[is_user_process] = 1
ORDER BY
[owt].[session_id], [owt].[exec_context_id];
GO

The above query is from Paul Randal. Another useful blog about Tempdb is from Pamela Lahoud of Microsoft.

You will see later in this blog post that since SQL Server 2016, trace flags 1117 and 1118 are not needed because they are now the default behavior of TempDB database.

Problem 3: Tempdb Metadata contentions

Third type of bottleneck we see in Tempdb is from system tables. This is also known as metadata contention and it is occurring because of hardware improvements which enables creation and deletion of temporary objects at a faster rate in TempDB database. This is a good thing because we want TempDB to be faster but it created another type of contention which has to do with keeping track of all these metadata changes in the system table. SQL Server 2019 has pretty much resolved this issue by providing an option to place the TempDB system tables on In-Memory files. In-memory tables do not face any latches or locks as it is in RAM and not on the disk. Since these system tables only contain tempdb metadata their footprint is not that big on the SQL Server memory.

You can use the following script to view all the sessions that are waiting for page-related wait types and get information about the objects that the pages belong to. sys.fn_PageResCracker is a new function available in SQL Server 2019 that takes the Page_resource value from sys.dm_db_exec_requests dmv and give you the detail of locks and latches at the page level.

USE master
GO

SELECT
er.session_id, er.wait_type, er.wait_resource,
OBJECT_NAME(page_info.[object_id],page_info.database_id) as [object_name],
er.blocking_session_id,er.command,
SUBSTRING(st.text, (er.statement_start_offset/2)+1,
((CASE er.statement_end_offset
WHEN -1 THEN DATALENGTH(st.text)
ELSE er.statement_end_offset
END – er.statement_start_offset)/2) + 1) AS statement_text,
page_info.database_id,page_info.[file_id], page_info.page_id, page_info.[object_id],
page_info.index_id, page_info.page_type_desc
FROM sys.dm_exec_requests AS er
CROSS APPLY sys.dm_exec_sql_text(er.sql_handle) AS st
CROSS APPLY sys.fn_PageResCracker (er.page_resource) AS r
CROSS APPLY sys.dm_db_page_info(r.[db_id], r.[file_id], r.page_id, ‘DETAILED’) AS page_info
WHERE er.wait_type like ‘%page%’
GO

Problem 4: Version Store growth in read-only secondary replica

If you have Availability Group setup for your SQL Server instance, make sure to monitor version store growth in Tempdb database on secondary replica. In addition to following query, you can also monitor this with performance counters like Version Store Size (KB), Version Cleanup rate (KB/s) and Version Generation rate (KB/s). Here is an interesting write up about this problem on AG secondaries.

— Version store usage
SELECT DB_NAME(database_id) AS [database_name], reserved_page_count, reserved_space_kb
FROM sys.dm_tran_version_store_space_usage
ORDER BY reserved_space_kb DESC;

IMPROVEMENTS IN TEMPDB

  1. Optimizing SQL Server Eager Writes

One long standing issue with TempDB in the past was that when you perform SELECT INTO temporary table operation, poor performance on disk I/O occurred in tempdb database. This was resolved in SQL Server 2014 RTM and back ported to SQL Server 2012 SP2 CU1. SQL Server has a concept of Eager Writes that prevent flooding the buffer pool with pages that are newly created from bulk activities and need to be written to disk. SQL Server 2014 enhancements now relaxes the need to flush these pages to the tempdb data files as quickly as the older versions because it recognizes objects created in tempdb would be temporary.

2. Tempdb Installation Defaults

Having multiple tempdb data files to remove PFS and SGAM page contentions has been long standing recommendations from Microsoft. Prior to SQL Server 2016, DBAs would manually add the additional TempDB data files after installing SQL Servers. and enable trace flags 1118 and 1117 to evenly grow these files. SQL Server 2016 installation now includes this step in the setup. It recognizes the number of CPU cores and based on that gives the option of creating multiple files. You can change the no. of files, file location etc. Also there is no need to use the previously mentioned trace flags, as this has now became the default behavior of TempDB database.

3. Tempdb system tables move to In-Memory OLTP

Once again SQL Server 2019 brought a huge improvement to TempDB performance. You can enable “Optimize for Tempdb Metadata” feature and it will put all the TempDB metadata (system tables) in memory. This takes away all type of pagelatch_XX wait types. When temporary objects are created and removed at super fast pace, due to huge improvements in hardware, disk is no longer the bottleneck for these operations. These metadata objects reside in RAM (memory) now.

TEMPDB BEST PRACTICES

  1. Configure multiple tempdb data files. Start with 8 if you have 8 or more logical cores. Then add more in chunks of 4 if you still see PFS contention. Multiple files are beneficial even if all are on the same drive which is dedicated for Tempdb.
  2. Enable trace flag 1118 as a startup trace flag for SQL Server prior to SQL Server 2016. This prevents SGAM contentions.
  3. Enable trace flag 1117 as a startup trace flag for SQL Server prior to SQL Server 2016. This makes multiple tempdb files to grow equally.
  4. Always put the tempdb data and log files on fastest disk. Have it local to the machine rather than in SAN. It keeps your SAN path free for user database traffic
  5. Put Tempdb database on a separate volume then your user databases.
  6. Just pre-allocate the tempdb files as large as possible, with enough room to give you time to react to unforeseen scenarios such as SQL Server encounters long running query.
  7. Do not explicitly drop temp tables at the end of a stored procedure, they will get cleaned up when the session that created them ends.
  8. Do not alter temp tables after they have been created.
  9. Do not truncate temp tables.
  10. Move index creation statements on temp tables to the new inline index creation syntax that was introduced in SQL Server 2014.

SUMMARY

The key to keep your TempDB healthy is proactive and continuous monitoring with DMVs and XEvents. Setup up proper alerting to notify when space usage reaches certain threshold. Document the TempDB configurations into your SOP (Standard Operating Procedure) documentation to be consistent across all your SQL Servers. Keep an eye on version store usage and be careful about their growth on read-only secondary replicas. Consider using Memory Optimized Tempdb Metadata feature of SQL Server 2019.

Automatic Soft NUMA

NUMA is basically a memory and CPU architecture inside the computer machine. NUMA stands for Non-Uniform Memory Access. The purpose of NUMA is for Scheduler (CPU) to have faster access to memory. Each CPU talks to memory controller to get the memory. When there are more than 8 CPUs, single memory controller becomes  bottleneck. This became a scalability issue. NUMA is a hardware configuration which was built to scale beyond the processor and memory limitations of traditional Symmetric-Multi-Process (SMP) systems. If there are 16 CPUs, there will 2 NUMA nodes (node 0 and 1). Each node will have its own memory controller and serves up to 8 CPU. CPU and Memory are partitioned at the hardware level to improve the access to memory. SQL Server is NUMA aware application. You don’t have to enable it. Almost all modern computer hardware now has this hardware NUMA.

I would highly recommend anyone new to NUMA to read this article written by SQLSkills team.

Following is the image of SMP configuration where all processors have equal access to memory and I/O. As you can imagine, increasing the number of CPU in this architecture will create a bottleneck because there is only one memory controller and one I/O controller.

Fig 1

In Fig 2 below you are seeing 2 NUMA nodes and CPU and Memory are partitioned. Each NUMA node gets its own memory and I/O controller.

Fig 2

Automatic Soft NUMA:

Modern processors have multiple cores per socket. Each socket is represented, usually, as a single NUMA node. The SQL Server database engine partitions various internal structures and partitions service threads per NUMA node. With processors containing 10 or more cores per socket, using software NUMA to split hardware NUMA nodes generally increases scalability and performance. Prior to SQL Server 2014 (12.x) SP2, software-based NUMA (soft-NUMA) required you to edit the registry to add a node configuration affinity mask, and was configured at the host level, rather than per instance. Starting with SQL Server 2014 (12.x) SP2 and SQL Server 2016 (13.x), soft-NUMA is configured automatically at the database-instance level when the SQL Server Database Engine service starts. Please read this  documentation and this documentation for more understanding.

Note 1: Hot-add processors are not supported by soft-NUMA

Note 2: Starting with SQL Server 2014 (12.x) SP2, use trace flag 8079 to allow SQL Server to use Automatic Soft-NUMA. Starting with SQL Server 2016 (13.x) this behavior is controlled by the engine and trace flag 8079 is not required.

The benefits of soft-NUMA include reducing I/O and lazy writer bottlenecks on computers with many CPUs. Soft NUMA allows you to further divide schedulers in smaller group. You can read a really good description about how NUMA works here

NUMA can match memory with CPUs through specialized hardware (hardware NUMA) or by configuring SQL Server memory (soft-NUMA). During startup, SQL Server configures itself based on underlying operating system and hardware configuration or the soft-NUMA setting. For both hardware and soft-NUMA, when SQL Server starts in a NUMA configuration, the SQL Server log records a multimode configuration message for each node, along with the CPU mask.

Here are couple of useful queries I use to find out about NUMA layout of my customer’s SQL Server. I am using Glenn’s Performance Queries modified on September 13, 2020.

— Get instance-level configuration values for instance  (Query 4) (Configuration Values)

SELECT name, value, value_in_use, minimum, maximum, [description], is_dynamic, is_advanced FROM sys.configurations WITH (NOLOCK) ORDER BY name OPTION (RECOMPILE);

In above query result,  focus on these settings:

  1. automatic soft-NUMA disabled (should be 0 in most cases)
  2. backup checksum default (should be 1)
  3. backup compression default (should be 1 in most cases)
  4. clr enabled (only enable if it is needed)
  5. cost threshold for parallelism (depends on your workload)
  6. lightweight pooling (should be zero)
  7. max degree of parallelism (depends on your workload and hardware)
  8. max server memory (MB) (set to an appropriate value, not the default)
  9. optimize for ad hoc workloads (should be 1)
  10. priority boost (should be zero)
  11. remote admin connections (should be 1)
  12. tempdb metadata memory-optimized (0 by default, some workloads may benefit by enabling)

Following query gives you some useful information about the composition and relative load on your NUMA nodes. You want to see an equal number of schedulers on each NUMA node.  Watch out if SQL Server 2019 Standard Edition has been installed on a physical or virtual machine with more than four sockets or more than 24 physical cores.

–SQL Server NUMA Node information  (Query 12) (SQL Server NUMA Info)

SELECT node_id, node_state_desc, memory_node_id, processor_group, cpu_count, online_scheduler_count,   idle_scheduler_count, active_worker_count, avg_load_balance, resource_monitor_state FROM sys.dm_os_nodes WITH (NOLOCK)WHERE node_state_desc <> N’ONLINE DAC’ OPTION (RECOMPILE);

Automatic Soft-NUMA is enabled by default. It is recommended best practice to leave it enabled.

To disable Automatic Soft-NUMA, run the following command.

USE master; 

GO 

EXEC sp_configure ‘automatic soft-NUMA disabled’, ‘1’; 

RECONFIGURE WITH OVERRIDE; 

I attended PASS Summit 2019 in November last year, 6 months ago. Nice thing was that I took all my notes in OneNote which I had just recently started using. Then life got into my way and I totally forgot about it. Then COVID-19 happened and I still did not remember till today. So here it is.

This time, I didn’t purchase any PreCon. It is not because there were no good ones but the opposite that there were so many that I worried I will pay the steep price for one and it will not be as expected or I would wish later on that I had attended the other one. So I made a cowardly decision and went to none.

But I did enjoy spending time with several friends and met lot of people from my previous work places. No breakfast this time at PASS was a bummer. They gave us $25 Starbucks gift card in lieu of 3 day of breakfasts. Just to let organizer know, if they are reading this, a modest breakfast in Seattle was priced around $10 in 2019. Another suggestion, if prices for PASS merchandises are lowered, for example a $60 t-shirt down to $30 (not sacrificing the quality), I think more people would buy them and take souvenirs home from PASS summit and net profit in the end will come out to be the same. I will keep the rest of the suggestions to myself.

Disclaimer: These are my notes taken while attending sessions and any inaccuracies are my own.

Session: SQL Server 2019 new tools

Speaker: Vicky Harp (SQL Server tools team in Microsoft)

This was all demos, no slides session.

Since 2016 tools and Engine connection was broken when SSMS became available as a separate install from database engine installation.

Azure Data Studio (ADS) is a new tool to communicate with SQL Server engine and is more powerful than SSMS in some ways. But SSMS is going to stay. It is purely for SQL Server on Windows platform and it is a 32 bit application.

ADS supports hybrid environment and platform. Your job is to secure, HA and DR your data and not learn every new technology that comes your way. ADS serves this purpose very well. It is a one stop shop kind of tool. You can connect to windows, Linux and can even connect from your non windows client such as a Mac Book. It natively supports beside TSQL, PowerShell, Python, Scala, Spark, Java etc.

Notebook in ADS is another great feature. You can run the query typed in notebook and result and comments will be saved in the same notebook. You can create complete solution in it and can email this notebook to your colleague. You can open notebook in any other compatible tools such as SSMS. You can create SOP documents for your helpdesk team. It is not compatible with OneNote yet. But discussions on this subject is in progress.

There are lot of extensions available in ADS which further improves its usefulness. Extensions such as SQK server Agent, Admin Pack for SQL Server and SQL Server Profiler to name few are available. Each extension has multiple tasks.

SQL Server Edge running on IOT devices can analyze IOT data on the device itself instead of sending it to some server to analyze. Stream Analytics is used for this purpose.

Session: Working with SQL Server Containers

Speaker: Carlos Robles

Demos: C:\Users\amlalani\OneDrive\Training\PASS Summit 2019\SQL Server Containers

Why containers?

  • Containers are standard unit of software.
  • They provide security and isolation.
  • They are lightweight, portable and cross platform.
  • They are consistent and predictable and no surprises
  • Continuous Integration (CI) and Continuous Deliver (CD)
  • eliminates shared environments and resource contentions.

SQL Server as Application:

Before difficult to integrate, operational complexity,

Now Docker Compose uses YAML file for deployment

CI – Commit -> Trigger -> Notify -> Test

CD – Deliver -> Deploy

Jenkins is open source Automation Server. Pipelines are set of jobs.

Session: Intelligent Query and Performance

Speaker: Pedro Lopez and Joe Sack

  1. Intelligent DB vision:
    1. Anxiety free upgrades
    2. Don’t worry about regressions
    3. Query tuning assistance tools
  2. Performance IQP
    1. Resource Governor
    2. Soft NUMA became default in SQL Server 2019
    3. Many new improvements in newer version are not advertized so by just upgrading you get benefits out of the box.
  3. Management by default
    1. Accelerated Database Recovery ADR
    2. Online Resumable index creation
    3. Lightweight Profiling
    4. Automatic SQL
    5. Threat Protection
    6. GDPR questions can be answered with classification and vulnerability assessment tools

Memory Grant Feedback MGF:

Available in compatibility mode 140+

Memory misestimation can result in spills to disk and overestimation hurt concurrency.

If less than 50% memory need is granted, resulting in spills, then MGF will fix and add memory plus some buffer.

In 2017, Batch mode was introduced and was available for column-store indexes only.

In 2019, row-store Batch mode is available for all index types; row and column.

  1. Does Query optimizer has to do lot of work to first check if Query is eligible for Batch mode execution?

MFG has to learn again if plan is kicked out of cache for some reason.

Batch Mode Columnstore:

Built for Analytical workload and IO bound queries. Columnstore provides effective compression over traditional rowstore.

Rowstore Batch Mode:

If you cannot add Columnstore for some reason, maybe vendor does not support or table has trigger then Rowstore Batch Mode will come in handy. Adds new scan operator. Same for Heap and B-trees tables. Great for CPU bound queries.

Interleave MSTVFs:

Query processor treats it as a black box because it materializes at runtime only and QP cannot make any decision about the plan. SQL 2016 and prior version guesstimated 100 rows. Inefficient if 1000 or 10,000 rows actually returned. SQL 2017 + assumes 500 rows for 1st run and then adjusts. Downstream operations will benefit.

Batch Mode Adaptive Join (AJ):

If cardinality Estimate is wrong, wrong join operator will be selected in a plan which will be cached.

AJ will defer the choice of Hash or Nested join until after the first join input has been scanned.

Resource: aka.ms/IQP

Session: Data Classification

Speaker: Grant Fritchey

Security and compliance requirements are undeniable. It should be first and foremost.

Regulatory Landscape:

GDPR: European. Marriott and British Airways were fined. Half of UK firms are still not GDPR compliant.

FERPA: Department of Education law in US to protect adult student from parents. They don’t share any information even if the student is dependent and in hospital or sick.

HIPPA: Health regulation for medical industry. If a person has shared PII data knowingly they could go to jail.

CCPA: California flavor of GDPR regulation and will be enforced on Jan 1, 2020.

All of them have room for interpretation and none of them is perfect.

Session: Successfully communicate with Customer

Speaker: Dennise McInevney

Results of Bad communications:

Project Fails

Career hurt or ends

Frustration and stress

Time and Money wasted

  1. Empathy: Walk in your customer shoes. See their pain from their eyes. See how they use your product. See where can you help them.
  2. Us vs them mentality: Be one team. Make sure customer understands that you are on their side by you language and attitude.
  3. Talk/Speech: Be concise, clear and to the point in written and verbal communication
  4. Writing Skills:

Who are your customers? Identify them and identify their customers. Understand goals of both groups.

Requirements (How of things)  <  >   Goals (What of things)

  1. Understand customer goals first.
  2. Desired outcome second
  3. Understand business context

Don’t

Do

Jump to Solution

Listen carefully

Make assumptions

Ask Questions

Start designing

Be precise

Diagramming/coding

Identify approver first

Write down everything in a place easily accessible.

Don’t

Do

Email

Wiki\SharePoint

Chat

Agile Tools like Jira which keeps the history

Hallway conversation

Shared document

Get Alignment

Get Sign off

Ensure clarity

Don’t skip

Pick up phone when necessary

Joint accountability

Record the outcome in writing

If you are bilingual and can speak customer language, that is a great skill.

What you can do?

  • Practice Active listening
  • Don’t multitask
  • Check for understanding
  • Summarize
  • Review the facts

Change is inevitable

Don’t

Do

Stress Out

Take a deep breath

Get Mad

Ask questions

Cast blame

Update requirement

Managing Expectation:

After things are signed off, we need to manage the expectations such as how often we will have meetings, how status of projects will be reported and such.

Reporting Status:

  • Consistent format and time
  • If you want others to follow the reporting tool, make sure you are seriously using it.
  • No surprises
  • Alert if not working as expected, missed dependencies or duration.

Show as you Go:

Sketches, diagrams, demos at regular cadence and be open to feedback.

Deadlines:

Keep calm, communicate progress, what is okay to release , negotiate, trade off can happen.

Session: Adaptive QP to Intelligent QP

Speaker: Hugo Kornelis

SQLServerFast.com

http://sqlserverfast.com/blog

@Hugo_Kornelis

Query Processing (QP)

  • Phase 1 – Optimization (Cardinality Estimates, Execution Plan)
  • Phase 2 – Execution (Actual Estimates)

Merge is better if data is properly sorted. Puts everything in first input in memory.

Nested loop reads 1st row and start finding matching row in second source.

AJ is new join type and it has 3 inputs.

Only problem is AJ for Batch mode which was for columnstore index.

Interleaved Execution:

MSTVF – don’t know how big result is going to be

1 or 100 is hard-coded value and can never be right. In 2017, as soon as QP detect MSTVF, it paused the execution and starts another process that materialize the MSTVF first. You get the number of rows and now QP can give you the correct plan. But this plan is now cached and next time MSTVF comes with different parameter, it may not work efficiently. So the problem remains.

QP in SQL Server 2019:

Memory Grant Feedback (MGF):

Determines during compile time base on cardinality estimate. If underestimate, spills to the disk (tempdb). If overestimates, wasted memory and impacts concurrency.

First execution of MGF improves 2nd execution.

  • Adjusts down if > 50% memory unused the first time
  • Adjusts up after any SPILL + some buffer.

Stores last exec plan, forgets after recompile. After adapting, certain time for example x threshold. If more than that it stops adapting.

Extended events are available to track this info.

*** Check this demo out and put it in PTO workshop ***

UDF Inlining:

FROID is a framework that plugs in optimizer before optimization takes place. It looks at all the statements in UDF and converts into relational table and sends to QP which then generates the plan.

Session: Batch Mode on Rowstore

Speaker: Nikko Neuberger

Speaker Tip #1: Telling story and showing pictures and cartoons and bold words on screen worked very good with audience. He got laughs and attention (5 minutes)

Speaker Tip #2: If demo fails, keep talking while working and may refer to Demo GOD.

Read execution plant Right to left and also left to Right.

Exec plan query executes from Left to Right.

Nikko displayed drama script of row execution. He got 3 attendees and had piles of pages in different color.

Sort operator asks for data page from table one at a time.

In batch mode, all data pages are given at the same time to sort operator saving time.

Rt click + Act Ex rows + Est Exec rows -> shows row mode

*** Ask MS to remove est execution because not adding value and cluttering the GUI.

Rt click + see Bath execution mode and storage as Rowstore

He made an interesting analogy comparing Batch mode vs Row mode. Fast cars like Ferrari or Mustang vs a bicycle. Batch mode need little more space to work as fast car need highway to go fast.

In a Demo1 of 150 vs 140, timing was 12 sec and 18 sec respectively. CPU in both execution was high but in 140 it was more IO.

In a Demo2, Batch mode on Rowstore did 11 sec and Batch mode on columnstore did 6.8 sec. Fantastic! Since table had columnstore index, batch mode kicks in by default.

Demo3 was about Batch Mode Heuristics. You can see waittype of Batch_mode_scan_on_rowstore in an extended event sessions for Full batch when your run following queries.

  • Select count of rows in a table
  • Query table of 1 GB size
  • Query table of 10 GB size
  • Disable Batch mode and run query
  • Run query with TOP(20) and order by

In 2019, you will need columnstore because

  • Batch mode w/o columnstore index will not have compression
  • No advance storage engine features. Makes a difference in Logical Reads. Does the aggregation in memory instead of reading and bringing pages to query optimizer.
  • Columnstore index will always result in Optimization level = FULL
  • Hekaton disables Batch mode scan when columns with LOB types are present.

Session: Hash Match, the Operator (2.5 hour session)

Speaker: Hugo Kornelis

Speaker tip: He engaged audience just like Nikko did in his session. The dramatization in the beginning really catches audience attention. He had roller coaster rides picture and a story to go with it and he gave credits to all the pictures he used. Nice way to introduce the topic.

He said he is going to discuss 500 lever content of deep dive internals of operators and some of them are not documented. Such as hashing, hashing function, hashing table etc.

*** Download demos from fastsqlserver.com

Hashing … Laundry that needs sorting. But in Query optimizer world hashing is not sorting it is categorizing or bucketizing similar things.

  • Input  — Any
  • Output — Integer 0 – 9999
  • Deterministic — same input gets same output values
  • Hash Collision — Different inputs getting same output values.

Hashing function — organized by hash value or heap or bucket. It stores in memory entirely.

  1. There are 7 pointers for each hash value.
  2. Store 6 pointers in each structure
  3. Create and connect new 7-pointer structure

3 phases for hash Table match

  1. Build — Always used, process top input, builds hash table
  2. Probe — not always used, process second input if there is one match, detects has collision.
  3. Final — Returns data from Hash Table

Look at hash match operator properties in execution plan, you will see

Probe residual — will appear only if there are hash collision. Upto 4 byte of data, SQL Server gaurantees to return without Probe residual. It is needed when any operation on the column is done, when any non-integer data is present and when integer data is less than 4 byte. Impact of Probe residual on performance is not expensive. It is just one extra test.

Build Phase — Assuming no spills to the disk and all hash table fits in the memory.

  • Inner Join — Most simplest. Computes the hash and goes into bucket and finds the match, returns a joined row and goes to work with another row.
  • Rt. Outer Join — If no matching row is found, it returns rows from the outer table.
  • Both join type cause blocking in the build phase. There is no Final phase for them. Order of row is mostly retained but not guaranteed in real world where spills may occur.

Probe Phase

  • Lt. outer Join\Full outer Join — Marks the matched rows and returns the unmatched rows from the hash table. Adds a Final Phase for “Left returned” rows. No order for these rows is retained.
  • Lt. Semi Join (Exists) — Returns all the marked and matched rows from the hash table. No Final phase for them.
  • Lt. Anti Semi Join (not Exists) — returns all the unmarked and unmatched rows. Both semi and anti semi incur blocking in build and probe phase. All the output is produced in the Final Phase.
  • Rt. Semi Join — returns rows from Probe. Will not search for more rows when a first match is found.
  • Rt. Anti Semi Join — No match found return and no additional search is done. No final phase.

Hash match can also have 1 input.

Build Phase: (for aggregate)

Instead of adding to bucket, it searches if row already exists and does only update to the bucket. If no row found, adds a row and update the aggregation. This row is a place holder.

Final Phase — returns all aggregated rows from the hash table. There is no probe phase for it.

Example:

  • Select distinct(city) from customer — has match flow of distinct
  • Select city, count(*) from customer group by city — does hash aggregate

Hash Mach Scalar Aggregate:

Official name is Hash Match (Aggregate).

Stream aggregate more suitable for Select count(*) from table type of queries and is not supported in Batch mode. There is no Build Phase.

Hash match partial Aggregate:

Labelled differently in parallel parts of execution plan. There is no Build Phase. In compatibility mode 120 you can see this type of aggregate. No more visible in Modern SQL versions.

Hash Memory Usage:

Memory requested at the beginning of execution. Engine reserves at least requested memory. If requested memory not available, it waits. Tries to reserve up to requested memory before beginning execution. It does not provide a way to request additional memory during execution and hence when memory is insufficient, operators spell to tempdb on disk.

  • SQL Server 2016, batch mode may get extra memory during execution.
  • SQL Server 2017, batch mode on columnstore may get extra memory during execution
  • SQL Server 2019, all execution mode get extra memory during query execution.

Dynamic De-staging:

Happens when hash operator spills to disk.

  1. In-memory HJ — Never short of memory and everything fits nicely.
  2. Hybrid HJ — combines in-memory and Grace HJ. When out of memory, partitions buckets. If it is complex, then after 1st build and probe phases, does same as Grace HJ.
  3. Grace HJ — Algorithm is used when amount of data is more than the known memory. Divides set of buckets into partitions. Eg; buckets = 1 – 20,000. 5000 buckets at a time (hash partition function 1 – 4), 2 files for each partition is created. SQL server uses tempdb. They are actually structures in tempdb. Reads both inputs, stores rows in appropriate file. 1 file for build, 1 file for probe. Process each file individually.
  4. Recursive HJ — When Grace HJ mis-estimates partition size.
  • Dynamic Staging HM starts in memory.
  • Bail out happens when bucket size is larger than memory available. It abandons  further processing. HM then goes to Nested Loops or Sort and Merge join. Both are inefficient in this case.
  • Bit Vector filtering — is good. Always used when spills happen. In-memory bit map for has values of spilled partitions.
    1. Build: set bit for hash result of rows
    2. Probe: Check if bit is set for hash
      1. If set – write to file on disk
      2. If not – handle unmatched, don’t write to file

Memory Grant Computation:

Est size (row count + row size)

Note: For Var(char) data type it assumes to be 50% full

Est size (row count + row size) of probe input

Execution plan with multiple match operator.

Session: Query Store In-Depth

Speaker: Ola Hallengren

Query Store (QS) available since SQL Server 2016. It is like a flight recorder of query execution.

DBCC CLONEDATABASE includes query store Data.

Force a last know good plan after changing the compatibility level or cardinality estimation.

Disclaimer: These are my notes taken while attending sessions and any inaccuracies are my own.

I attended the PASS (Professional Association of SQL Server) Summit in Early November. I attended 2 pre-conferences on Monday and Tuesday.

Monday: SQL Server Security by Denny Cherry

Tuesday: Fixing Query Performance Problems from Estimates, Statistics, Heuristics, and Cardinality by Kimberly Tripp

This time I also participated in Speaker Idol competition.

I attended the regular sessions on Wednesday, Thursday and Friday.

 

Here is the brief summary of what I learned.

Monday 11-05-2018 (pre-conference)

SQL Server Security

Speaker: Denny Cherry

SQL Server security does not begin with SQL Server itself. It begins at your organization’s infrastructure.

Network Design:

This depends on what a company want to use; public IP or private IP addresses. Private IPs became more important when cloud infrastructure came into the picture. In the past, hackers used to exploit the public IP\DNS names and would broadcast messages but this problem is mostly gone. Additional protection has been put on.

Private IPs: 192.168.0.0 typically used for home networks, 172.16.0.0 and, 10.0.0.0  typically used by business networks.

Public IPs: Any IP address not part of the Private IP address is a public IP.

VLANS: Virtual Local Area Networks. This is great for segmenting SQL Server network and saves money on switches. A downside is that things need to be configured correctly the first time. A good strategy is to block all ports on web servers, only 1433 open to talking to SQL Server.

ACL (access control list) works in tandem with firewalls. ACL works on new access, not on the existing one. If there is a lot of access to configure, it is better to use the firewall to block everything and poke holes (swiss cheese) to let the stuff through.

Switches have ports which are connected to email servers, database servers and web servers. On top of the firewall is NAT (network address translation) layer to allow to talk to machine outsized of your network.

There are 3 types of network design. Each has its own benefits and shortcomings.

  1. Everything behind the router
  2. Web servers in front of the router
  3. Web server in DMZ (VLAN)

On each server, configured to block all access but required services. Inbound firewall is blocked by default. For best practice block public IP from outbound traffic. No matter what port number you are using for SQL Server. Web server web config will have that information so hacker scanning will get it. If your web config is encrypted only then it is true protection from a hacker.

Accessing SQL Server via the internet is not recommended. There are lots of VPN options available. Desktop sharing app should be regularly monitored because an unknowing employee can leave a back door open after one of the session. Your first layer of security is VPN.

Physical Server Security:

How are people allowed into your data center located at your business? The cleaning crew has access? Is there dual authentication to enter? E.g.; pin + biometric. Does alarm sound if accessed outside of defined and approved times?

Screensaver on company computers to have the company’s log (enforced by group policy) is a good practice.

Social Engineering is the easiest way to gain access to data. Need to train your people who have access to data.

Cloud security is as secure as you want it to be. Inherently cloud is not more or less secure than on-premises environments. Some companies require firewalls from 2 different companies if one fails other kicks in. Some companies require logins to the cloud only work from the office. You can bypass default Azure routing and can define your own. Microsoft or any other vendor (AWS, Google) cannot see your data in the cloud. Data is encrypted at the storage level. Active Directory account is required to access your data. Microsoft especially is taking data security very seriously. The engineers who designed the system are not allowed in Data Centers where they can identify which servers hold which company’s data.

With increased security comes challenges. For example, California is implementing GDPR kind regulation in 2019. Everyone has right to be forgotten after website visit. It is a double edge sword. Now forgotten customer cannot pay their bills online.

Data Encryption Options:

  • Application Encryption: Could be .net encryption or could be Always Encrypted. AE is the best, data is protected all the way. Introduced in SQL server 2016. Only small change in the application connection string “Column Encryption Setting=enabled”
  • SSL Encryption: SQL Server offers SSL protection for transit data. No client change is needed. It is configured via SQL Server Configuration Manager. Availability group can use different certificates on each replica.
  • Database Encryption: This is commonly known as column level encryption. This type does not require application code change. All encryption workload is centralized on the database server. Great solution for protecting PII data.
  • Storage Encryption: TDE (transparent data encryption) encrypts data at rest (at disk level). Data within the table is still plaintext. Backup files are protected and cannot be restored on another server without the proper encryption key. TDE disables instant file initialization. TDE will not encrypt Filestream data. To protect Filestream use bit locker.

Data Theft is not a matter of if, it’s a matter of when. Check Ihavebeenpawned.com. The biggest concern for the company is the reputation.

Data Classification is based on your business and based on what data is stored.

  • General Data
  • Low Business Value data
  • Medium Business Value data
  • High Business Value data
  • PII

Business users have to identify the data classification. DBA should not be driving it. DBA help business to accomplish it. SSMS 17.5 has data classification available. It only looks at column names and does not comb the data.

How can you protect the data if someone takes the picture or screenshot of PII data? None of the encryption methods we discussed so far will work in this case.

 

Protecting Password:

.NET apps can encrypt the password in config file when the first time app connects to DB.

Domain policies are guaranteed to protect Windows accounts. There is no guarantee domain policies will protect SQL accounts.

SQL CLR runs in a separate memory space from SQL Server binaries. MS supports to run it in SAFE mode external. An unsafe assembly can call any managed code. CLR may have security issues, therefore, this feature is gone from Azure. Text manipulation works better in CLR then SQL.

To protect TDE keys, keep on-prem preferably few copies, easier central location. Other copy put on USB and give it HR to keep in a fire-proof safe. Never store with backup.

Anti-Virus:

SQL Server should not need an anti-virus running on the server. Anti-virus product can cause CPU issues, IO issues and RAM issues. Their behavior may change over time. Affects SQL server performance. To check if anti-virus is installed, Select * from sys.dm_os_loaded_module.

 

 

Tuesday, 11-06-2018 (pre-conference):

Fixing Query Performance Problems from Estimates, Statistics, Heuristics, and Cardinality

Speaker: Kimberly Tripp (of SQLSkills.com)

The first step is to find out that Query really has a problem. There are many tools out there to do that. This session is about what to do after that. The first step is to identify if it is a statistics not updated on the column issue or plan caching issue. This is very important because based on that you will apply the remedy.

In lots of instances, statistics are not a problem but gets blamed anyway. Sometimes caching and parametrizing is the problem.

Generalized and Simplified Query execution steps:

  • Parsing (syntax checking)
  • Standardization /algebrization (Optimizer looks at the predicate and pushes down the tree to select the correct index if exists)
  • Cost-based optimization (chooses the good plan faster not the best plan)
  • Compilation
  • Execution

Cost-Based Optimization: More statistics on a table is not a bad thing. Optimizer’s job is not to find the best plan but to find good plan faster. It sometimes re-writes your query to lower the cost. Calculates and estimates how much data it is going to process before processing it.

In the following example, the optimizer will use an index for the second query.

  • Where monthly_salary X  12 = Value  (maybe written in haste in an Adhoc query, cost higher)
  • Where monthly_salary = Value / 12 (uses index)

How the plan caching occurs? One way is when you are testing your procedure with certain parameters over and over and after your final change forget to clear the plan. The optimizer chooses (faster) existing plan in cache for different parameters and results in the performance problem.

Solution: use Execute procedure WITH Recompile. It recompiles the whole procedure, not just the statements inside the proc having trouble. Instead, use OPTION (Recompile) at the statement level. Less harsh.

A single query can have > 10 plans just because ANSI settings or spaces (SET OPTIONS) are different.

To find out session settings run this. Select * from sys.dm_exec_sessions where user_sessions =1

Selectivity: plan also varies based on what percent of data a query is returning. Lots of narrow non-clustered indexes are bad because they are non-selective. The tipping point between an index scan and seek is anywhere between ¼ pages to 1/3 pages of the table.

How to see the statistics?

  • Dbcc show_statistics
  • Sp_autostats tablename

Index density: If there are 10,000 rows and index is on last name column, say 400 rows return for a particular last name. But if an index is not narrow and contains last name and first name combo maybe 5 rows returned. It is more selective.

The histogram has 201 total steps (included 1 row for NULLs)

When the statistics were last updated?

Use stats_DATE (object_Id, index_id) in a query.

Demo: use the Credit database and Kimberly’s script to see the usage of statistics. If you see statistics has been used a lot then maybe a good idea to create an index on that column.

Do update statistics using FULL in off hours if possible. Sometimes updating on SAMPLE create more problems for the query. Sometimes full update is not possible on a Very Large Table. In that case, use filtered statistics. Never turn off auto update stats. But do run your own update statistics job regularly in a maintenance window. Auto update is there to catch if your job fails for some reason.

An index rebuild will update statistics and will invalidate plans in the cache. So be careful when you do it.

If you use a filtered index, use option recompile to use this index. You have to manage its update and it cannot depend on auto stats update.

 

Wednesday, 11-07-2018 sessions:

Session 2: Getting Better Query Plans by Improving SQL’s Estimates

Speaker: Brent Ozar

Abstract: You’ve been writing T-SQL queries for a few years now, and when you have performance issues, you’ve been updating stats and using OPTION (RECOMPILE). It’s served you well, but every now and then, you hit a problem you can’t solve. Your data’s been growing larger, your queries are taking longer to run, and you’re starting to wonder: how can I start getting better query plans? The secret is often comparing the query plan’s estimated number of rows to the actual number of rows. If they’re different, it’s up to you – not the SQL Server engine – to figure out why the guesses are wrong. To improve ’em, you can change your T-SQL, the way the data’s structured and stored, or how SQL Server thinks about the data. This session won’t fix every query – but it’ll give you a starting point to understand what you’re looking at, and where to go next as you learn about the Cardinality Estimator.

Prerequisites: You’re comfortable writing queries to get the data you want, but once you get a bad query plan, your only fix is OPTION RECOMPILE.

Error: spilling to disk- estimates are wrong maybe >10X magnitude.

Maxdop =1 option in the query for a single threaded demo.

SQL 2019 (latest)

Stackoverflow2013 DB. Open source scripts

Go to old compatibility mode. Get an estimated plan.

Statistics update help correct estimation.

If change the compatibility to SQL2017 still did not solve estimation error. It has new cardinality estimator (CE).

Changing to SQL2019 compatibility level, the same query resolves the spill problem. Estimate still wrong but memory grant is greater because it adds batch mode operator.

How you resolve it in 2017? Create statistics. Did not resolve. Because statistics was created on sample size, not full scan. Do full scan statistics, but Exec plan is same. So clean plan cache. Finally, it removes the spill error and correct estimate. But the query is still slow.

If you have a choice to create a statistic or index on a column, create index instead and will save you lot of work. Plus it will give you a sorted list.

Stream aggregate does the computation faster in buckets.

Read: Thomas Kejser, Ascending key problem

 

Thursday, 11-08-2018 sessions: 

Session 1: Containers, Pods, and Databases- The Future of Infrastructure

Speakers: Joseph D’Antoni

Notes:

Containers have quietly been taking over the world of infrastructure, especially amongst developers and CI/CD practitioners. However, in the database space, container adoption has been lower.

SQL Server 2017 introduced the concept of deploying databases into Docker containers. In this session, you will learn the fundamentals of creating containers, learning about Kubernetes for management, and how to further your learning in this new and emerging space.

Kubernates is still young but robust. MS and Google contributing. It is v 1.0 product, not beta. A code will have to be stateful so it can keep track of which node it ran. Kubernates pos can contain spark + SQL Server. (0% deployment will be on Linux.

A container is a standardized unit of software that includes everything needed to run it. Fast to deploy, self-contained, upgrades are easy, scalable and multi-tenant.

In SQL 2019, AG is also available in Kubernates. Virtualization was not easily accepted in 2003 when it was new. Containers are same in 2018 now. But this is the future of infrastructure.

Containers:

They are isolated but share OS, bins\libraries. VM has full copy of Host OS. Host don’t share OS. With containers you can get more density on less usage utilization. Faster to deploy than VMs. Storage is persisted. Better destroy than VMs. They are licensed just as VMs. Either license separately or license one host and install as many.

Compute and data are separated. Hard to do it in RDBMS. You can scale them independently.

Demo: SQL server 2019 and Docker

  1. Docker pull (path)
  2. Sudo Docker run

Worked in SQL operational Studio. Not a replacement for SSMS.

 Kubernates:

Initially developed by Google. Runs containerized application with HA. Greek word meaning a person who veers the ship. Frequently deployed in conjunction with Docker using contained service and Docker registry.

SQL Server and Kubernates:

Nodes per Kubernates. Each sql server and agent in its pod and in its node. There will be a master node in the cluster. Master runs the cluster. Nodes are individual hosts like VMs.

 

 

Session 3: Principle of Least Privilege: The Key to Strong Security

Speaker: Ed Leighton-Dick

Until now we have been putting more emphasis on network security so nobody can come inside and get our data. This is not enough anymore. 95% data breach happens with phishing. Next high is crypto jacking. This is not going to stop. Not only bad people outside but Inside Company who are malicious or going to work for competitors or just plain naïve. Till 2-3 years ago security was afterthought. Now it is forefront. In 2017, there were about 1500 reported data breach incidents in the US.

GDPR assumes that you will implement security and privacy built in the product not added afterwards. Requires huge investment.

Ed took 10 minutes, to explain why it is important and then started with what we can do. Gave an example of his experience where he was consulting, about a failure of security because everyone had full rights.

Watch for application service account with higher permissions.

Ideal security goal is that all accounts should be given permissions need to do the job and not more.

Applying least privilege for applications:

  • Require granular permissions, find out what it does

Applying least privileges for Developers:

  • They should not have access to production.
  • Should not have access to administrative rights.
  • Non-prod environment they should not have access to prod data.

Applying least permissions for DBA:

DBA should audit their own activity and ideal to have the least account and high account.

Ideally, there should be a security officer, not DBA and not a developer but manages keys. Even system administrator also should not have access to data.

Step 1: Analyze

Find out what vendor app does, what your app do, what your regulation requires you to do. Find out what is needed to be done not wanted to be done. What your users need to do. You may find them not to align.

Do Regular internal and external audits.

Step 2: Implementation

  • SQL server 2014+ – custom server roles
  • SQL server 2005 -2012 (separation of duties framework on CodePlex Archive)

[First demo of the session. Get the script from Ed’s blog.]

Execute as user

And run sys.fn_my_permissions.

Db_owner on DB can create sysadmin if trustworthy on db =1

Step 3: Repeat

It is not a onetime process, you have to periodically on a regular basis.

 

 

Friday, 11-09-2018 sessions:

Session 1: T-SQL Habits and Practices That Can Kill Performance

Speaker: Vern Rabe

 

  1. Union vs Union ALL

Union makes query unique by doing Sorts which is expensive. Run query with both to see execution plans.

  1. Using superfluous ORDER BY

If a query is only using columns in select that is covered by clustered index then list is already sorted. Run without it. See result of both. You will find order by was unnecessary.

  1. Not Joining on Keys

Usually, you want to join on keys. If not you may get duplicate. Then you will use duplicate which is a performance hit. The right way to do it is to use all the key (PK) columns in join predicate.

  1. Having too many JOINS

Optimizer figures out with an execution plan and comes up with good enough plans.

One way to solve the optimizer time out of big joins, break down into separate queries and store results in the temp tables and join the temp tables.

  1. Over-using or mis-using views

Views hide the complexity of join queries underneath. It starts with somebody writing a simple view then somebody comes and adds a few extra columns from another table. Make sure what is in the view before using the column in your query, maybe directly querying base table will be cheaper. Do not reference view in a view. Bad idea. Join view with a view increases complexity.

  1. User defined function in a predicate

Example month(F.orderdate). A month is not user-defined but is system defined.

Create a UDF using month(F.orderdate) built-in function.

Run both queries. Both plans are the same. Both are doing clustered index scan but hover over row arrow shows different numbers showing the second one is doing lot more work in the beginning and then filter out. Don’t rely on execution percentage for both queries for comparison.

 

  1.  Ignoring the transaction logs

Insert and update touches tlog. Anything you can do touch the diskless will improve the performance. Example. Insert into temp table from a regular table and check using sys.fn_dblog and then in next query run an update on temp table and verify using fn_dblog. Out of 200 rows only 5 updated, the first query will update all rows. The second query will only update matching the criteria and hence less written to tlog.

His stored procedure creates where clauses very easily. His script contains this stored proc. Using EXCEPT results in the same output.

  1. Ignoring data types

Mis-matching data types in predicate will cause implicit and explicit conversions. He used a number table with 0.5 million rows. In one table column is int and another table column is char. Now compare sp_spaceused of both tables. The second table will be bigger and size will affect the query performance. Duration of both queries will differ as you can see the Select properties (Query time stats) in execution plans.

  1. Not using OPTION (RECOMPILE)

It has special uses. When the first time you execute SP, optimizer comes up with a plan. The second time it uses the plan, good thing because it saves time. Recompile will ask for a new plan each time. If a cached plan is small values but next time your passing parameter with large number of rows, a previous plan will work badly. Analyze the parameter value range and decide if you want to use recompile. Every single statement in the SP will get a plan. Not all need to be a recompile. Sp_recompile and With recompile will recompile the entire SP. The nested loop works great when one of the join tables has small no. of rows.

  1. Not making predicates SARGABLE (Able to search by Argument).

Helps optimizer to select indexes.

Where number < @i+1 (Sargable)

Where number -1 < @I (32K pages read vs 3 pages)

 

 

Session 2: Columnstore Clustered vs Nonclustered, When to Choose What

Speaker: Niko Neugebauer

Index Types:  Clustered, physical structure and all columns are sorted. NC can be 999 per table.

Columnstore indexes: Rowstore is horizontal (clustered + nonclustered). Page and row compressions available. CS compresses data for each column hence can get more compression. Only one ColumnStore (CS) index allowed either clustered or non-clustered.

OLTP Architecture: insert, update and delete load, no CS index should be created because it will not benefit.

Operational analytics: need 1 columnstore index. 1 Non-clustered (NCCI) is needed.

DW: – 1 Clustered Columnstore Index (CCI)

BI: – hybrid, you have to think and do trial and error.

SQL 2012 had NCCI for DW. Could not update them unless you are switching the partition. No OLTP support

In SQL 2014, got CCI with updatable. Batch execution mode available. But still for DW only. Cannot use any rowstore indexes with CCI only with NCCI.

In SQL 2016, NCCI becomes updatable. More restriction for DW, CDC, CT supported only got NCCI. NCCI can be filtered. Support for replication, indexed views (not documented) added. Got OLTP NCCI, in-memory (Hekaton). Not CCI. Natively compiled SP does not support CCI

In SQL 2017, LOBs data type support is finally included CCI. Computed column supported.

SQL Server 2019 included an online rebuild of CCI. Expect 2 to 3 times execution time difference because optimizer will use delta store. If it takes 1 hour to rebuild offline, It will take 3 or 4 hours online.

 

Session 4: Lightning Talks: DBA

 

This Lightning Talk session will be comprised of five 10-minute talks on SQL Server, DevOps, TempDB Performance, Troubleshooting and more! Sessions and Speakers will include:

  • “SQL Server Performance Troubleshooting: Bottom to Top… or Top to Bottom?” Andy Yun
  • “How High Is Your Availability?” Allan Hirt
  • “The Top Five Things DBAs Need to Know About DevOps” Hamish Watson
  • “Evil Indexes Reducing Performance of SELECT” Pinal Dave
  • “SQL Server’s Most Wanted: Five Common Vulnerabilities” Ed Leighton-Dick

Allan Hirt:

Fedex example: Due to virus cost 300 million dollars. Hurricane mother nature outages. 3 short stories to make a point of what he is going to say.

It’s about the requirement. He asks the business what you are trying to achieve. He wants to make sure why they want something is clear to them.

Technology can provide business continuity but should not drive it. How much risk business is ready to take? This answer should be on paper in terms of RPO and RTO.

Keep systems simple, recovery will be faster. Do not implement anything you are not ready to monitor and support. Automation breaks down so have a plan B.

His main point was define availability requirements first.

Andy Yun:

What is a proper performance method? Driving to work analogy.

He used a lot of his driving emotional pictures and made a connection that who like to be stuck in a traffic and similarly we don’t like to be stuck behind slow queries. What is the root cause of congestion? Sp_whoisactive he used. This is one way to look at the query. Look at it differently like look at the traffic from a helicopter to see what is causing it. Take top down approach. Look at server level what is going on. Such as resource monitor or perf monitor. Use tools like SentryOne. It tells you the high level wait type and then looks at closer to a database. Narrow it down to wait types and queries.

Hamish Watson:

We don’t trust that’s why we do manual things. We do not use source control consistently. Application changes can break our database.

In SSMS, it is hard to use Source Control but VS does. tSQLT is a free unit testing tool. SSDT allows you to incorporate TSQL unit testing.

Continuous Integration (CI) – each check-in is verified by an automated process. This allows for continuous delivery (CD).

Culture is very important. Because this process needs partnership not silos.

Pinal Dave:

He showed an example where Select query improves when too many indexes created. But results in slow down when Deletes, insert and updates happen. He says SELECT query will not only run slower, if there is an index and optimizer does not choose it for some reason but unused indexes actually adversely affect the execution compared to if there was no index at all.

Ed Leighton_Dick:

Patching, logging, auditing all actions are good and needed. The best defense against SQL Injection is to use safeguarded stored procedures.

 

I hope you find these notes useful. I apologize for lot of grammatical and shorthand mistakes in my notes.

 

Speaking at SQL Saturday events:

June 11, 2016, Iowa City, IA http://www.sqlsaturday.com/523/Sessions/Schedule.aspx
September 24, 2016, Kansas City, MO http://www.sqlsaturday.com/548/Sessions/Schedule.aspx
January 14, 2017, Nashville, TN http://www.sqlsaturday.com/581/Sessions/Schedule.aspx
March 11, 2017, Chicago, IL http://www.sqlsaturday.com/600/Sessions/Schedule.aspx
April 08, 2017, Madison, WI http://www.sqlsaturday.com/604/Sessions/Schedule.aspx
August 05, 2017, Louisville, KY http://www.sqlsaturday.com/643/Sessions/Schedule.aspx
September 16, 2017, Wausau, WI http://www.sqlsaturday.com/631/Sessions/Schedule.aspx
July 21, 2018, Louisville, KY http://www.sqlsaturday.com/729/Sessions/Schedule.aspx
July 28, 2018, Columbus, OH http://www.sqlsaturday.com/736/Sessions/Schedule.aspx
August 11, 2018, Indianapolis, IN http://www.sqlsaturday.com/745/Sessions/Schedule.aspx
August 18, 2018, Sioux Falls, SD http://www.sqlsaturday.com/787/Sessions/Schedule.aspx
January 12, 2019, Nashville, TN https://www.sqlsaturday.com/815/Sessions/Schedule.aspx
February 02, 2019, Cleveland, OH https://www.sqlsaturday.com/821/Sessions/Schedule.aspx
April 06, 2019, Madison, WI https://www.sqlsaturday.com/842/Sessions/Schedule.aspx

 

I am hearing a lot about GDPR (General Data Protection Regulation) in past few months. I decided to do some digging up to know how it affects us in the United States. Those of you who directly want to go to the source of my information and avoid my 2 cents are welcome to check out these 2 links. This Law is explained in very easy terms for a layperson.

https://www.red-gate.com/simple-talk/opinion/opinion-pieces/questions-gdpr-shy-ask/?utm_source=simpletalk&utm_medium=pubemail&utm_content=20171121-slota2&utm_term=simpletalkmain

https://www.csoonline.com/article/3202771/data-protection/general-data-protection-regulation-gdpr-requirements-deadlines-and-facts.html

Those of you who are short on time and want to just get the skinny version, then keep on reading.

  1. This law is about protecting European Citizens from data breaches of companies inside and outside Europe
  2. This Law may affect us if and when we store data of EU citizens.
  3. This is for EU consumer safety.
  4. In case of a data breach, the companies even outside EU can become part of the lawsuit.
  5. To avoid the lawsuit, a USA based company have to provide the EU country representative with a notification within 3 days of a data breach.
  6. Companies can be fined if the governing body finds the proof that adequate data protection measures were not taken by the company whose data was breached.

It is good to keep this EU law on our radar even if our company is not storing EU consumer data. I think something like this law will become the norm in next 5 years across the global communities. After all, it is the responsibility of the company who collects and stores consumer data to provide safeguards to protect the data. The basic rationale of this law is simple. Since companies make a profit by using a consumer information, it is fair that they should spend a part of that profit to protect that information. This provides clear accountability on one hand and provides peace of mind to the citizens on the other hand. In my opinion, this is a win-win situation.

I would appreciate your comments on this topic.

What you like to LEARN?

Here I am putting down a simple method of removing SQL Server 2016 named or default instance. I am assuming here that there are no user databases on this instance. I stopped all SQL  Server services before I started this uninstall process.

  1. To begin the uninstall process, go to Control Panel and then Programs and Features.
  2. Select Microsoft SQL Server 2016 and select Uninstall. Then click Remove. This starts the Remove SQL Server 2016 Wizard.

                                                              

  1. Setup Support Rules runs to verify your computer configuration. Since there is only one instance of SQL server 2016 on this machine, wizard selects it for you. Otherwise, you can use the drop-down menu to select your instance for removal. To continue, click Next.

4. After hitting NEXT, you will get Ready to Remove dialog box showing the summary of Engine and features that are going to be removed. Click Remove. The removal process will begin and it may take several minutes depending on your installation.

  1. If everything goes smoothly, you will receive a success message. You can go to the hyperlinked text file to see the report of the whole process of removal. Click Close to complete the removal of SQL Server 2016 instance.

In the end, you can refer to this TechNet document for more detailed instructions. https://technet.microsoft.com/en-us/windows/ms143412(v=sql.90)

 

IS IT ONE MORE BORING BLOG?

I am writing this blog to remind myself how the query processing work. It is a forgettable item because it is different from the way you write a query. To begin with let me write a simple query.

Select * from AdventureWorks2012.Person.Person

SIMPLE ANALYSIS

We have 3 elements in this simple query. First is the keyword “SELECT”. This keyword means you are retrieving data from the table (as oppose to inserting or updating data in a table). A Second element is “*” which denotes the list of columns in a table. The third element is the “FROM” clause which tells you to retrieve data from a particular table.

When SQL Server Query Optimizer receives this query it processes it in a logical order instead of a physical order. Most logical and cost-effective thing is to first check if the table exists or not. So the “FROM” clause will be processed first. Next, it checks for columns requested also exist or not. If a table exists and the columns requested also exist then Query Optimizer looks at “SELECT” keyword and retrieves the data for you.

Look at the direction of a blue arrow in the execution plan of this query. It shows, that Query Optimizer scans the table first (using the FROM clause) and then do the SELECT.

Logical query processing 1

This is how the flow diagram will look

Flow Diagram 1

Flow Diagram 1

Let’s add some more spice to this query.

Select * from AdventureWorks2012.Person.Person where Lastname = ‘Ford’

This query returns only 3 rows. “WHERE” clause in the query is filtering the table data on Lastname column. Query optimizer first checks the physical objects (tables and columns) exist and then see the filter and uses the index if there is one on the filtering column. Note the key_lookup operator in the below execution plan.  The reason is that the Non-Clustered index chosen in the Seek operation is not the covering index (remember we are using * ). In that case clustered index of the table is used to point to the data which generates key_lookup operator. It is an important consideration when you are doing query tuning. But for our purpose here it is not important. Another point to note is the size of arrows. Arrows are thin in this execution plan because the query is returning only 3 rows compared to previous plan where the query was returning about 20K rows.

Logical Query Processing 2

Logical Query Processing 2

Flow Diagram 2

Flow Diagram 2

Next Step:

Having a clear concept of how the query is logically processed will help you write more efficient and Optimizer friendly queries. In my next blog, I will take the learning done here and extend it to other query clauses such as GROUP BY and ORDER BY.

Yesterday I attend this session after work and took some notes. Sian is a great presenter and possesses wealth of information.

Topic: How Mind Works Under Stress

Presenter: Sian Beilock (Author of books Choke, How the body knows the mind)

Venue: Heller Auditorium Francis Parker School.

Notes

What happens when we fail to reach our potential? Do we feel like we are choking under the pressure? Why we fail to perform under pressure? Sian’s research is about finding techniques to overcome this choking effect and perform better.

Frontal cortex, front part of brain, is much bigger in humans than any other primates. It defines us as humans. This defines what we are capable of. FC communicates with other parts of the brain. Under stress FC stops this communication and results in weird consequences. During stressful situations, FC goes haywire but it does that in a predictable way. We have to find that pattern; Paralysis by Analysis.

What happens under stress situation is that we second guess our ability and pay detail attention to things that should be on autopilot. We become very conscious of ourselves. We start breaking down the details of our skills. Focusing on the outcome instead will give better results. All we want to do is sail rather than fail.

A Study was done at Cornell University with medical students. They divided the group of medical students of the same age and level into 2 groups. One who is preparing to take their board exams and one not. They measured the brain activities showing how the brain gets ready for board exam compared with other people not having exam. What changes in the brain? Study showed people who were excessively worried about the exam performed bad at critical thinking problems. Study also concludes that stress in one area of life effects other parts of the lives.

Now let us talk about what can we do help our brain perform at its peak.

  • People are more likely to get the answers if they walk away from the difficult problem for a while. This is taking a step back and view the problem differently. Mind resets and you get the connection back.
  • Sleep is an important for our life. Mind does not stop working even when we are sleeping but it works on its on things. It makes new connections from the information we gathered in our waking hours.
  • Talking out your problems to someone who does not know the answer will also help you solve the problem on your own.
  • Our emotions also messes with Frontal Cortex’s functions. When we are under pressure, FC does not keep the negative emotions away and gets us. In teens, FC is under developed, it completes in 20s.

fMRI, Functional MRI (Magnetic resonance Imaging) is one way of testing brain activities. It shows functioning of brain in real time. People who are nervous about Math problems were told they are going to solve a math problem. The part of brain that lights up when we are physically harmed lighted up for these people who were fearing Math problems. To them anticipating Math was as painful as they have been pricked by a needle. This is caused by lack of blood supply to Neurons in the brain. If you tell them ahead about the problem, fear shuts down FC and they don’t perform better.

What can we do about this negativity? Every time negative emotion comes, practice changing it with one thing that you can change. The practice is important. Behavior will change gradually. How we think about ourselves matters very much. If we think we are better at something, chances are we will do better in that in that task. Asian girls did better on Math problems when they thought they are Asian and therefore better at Math than when they thought they are girls and therefore not good at Math.

  • Taking walks in nature is beneficial for brain and helps focus our attention. Even looking at pictures of nature will give some of the benefits.
  • Practice being in the stressful situation and when actual stressful condition happens you are more ready to cope and perform better in it. Simulating the actual situation will help.
  • Your body language send other messages weather you are ready or not. It also gives message to your own brain. So keeping your posture upright also keeps away negative emotions.
  • Journaling is best way to get the negativity out. This opportunity to download you thoughts on paper helps you give perspective.