A closer look at everything related to SQL Server


TempDB Bottlenecks and More

There have been many articles, documents, and blog posts written about various TempDB problems and their resolutions. This is my effort at understanding and gathering the relevant information about TempDB bottlenecks in one place, for myself and for any audience of this blog.

INTRODUCTION OF TEMPDB

Tempdb is a SQL Server temporary system database. Every time SQL Server restarts, a new tempdb database is created and whatever data was stored in it is lost. When tempdb gets recreated, its data and log files reset to their last configured size. SQL Server records only enough information in the tempdb transaction log to roll back a transaction, not to redo transactions during database recovery. This increases the performance of INSERT statements in tempdb. Tempdb is always in the SIMPLE recovery model; if you try to change it to FULL, you will receive an error message.

If you try to back up TempDB, you will get another error saying backup and restore operations are not allowed on the TempDB database.
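For reference, both attempts fail by design. A quick sketch (the backup path is hypothetical, and the message text is paraphrased and may vary by version):

```sql
-- Changing tempdb's recovery model is not allowed:
ALTER DATABASE tempdb SET RECOVERY FULL;
-- Msg 5058: Option 'RECOVERY' cannot be set in database 'tempdb'.

-- Neither is backing it up (path shown is hypothetical):
BACKUP DATABASE tempdb TO DISK = N'X:\Backup\tempdb.bak';
-- Msg 3147: Backup and restore operations are not allowed on database tempdb.
```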

This system database is used by all the other databases on a SQL Server instance for their temporary work. It is the busiest database on almost all production SQL Servers. Here is a short list of a few of the features that use tempdb.

  • Temporary tables, table variables, and table-valued functions.
  • Work tables associated with cursors, ORDER BY, GROUP BY, and hash query plans.
  • Version stores used by Read Committed Snapshot Isolation and Snapshot Isolation
  • Index creation, Online index creation
  • DBCC CHECKDB
  • User-defined functions

Objects stored in TempDB fall into the following categories:

USER OBJECTS – such as local and global temporary tables, table variables, table-valued functions, etc.

INTERNAL OBJECTS – such as work tables created to support indexing operations and query execution.

VERSION STORES – created when transactions use the RCSI or SI isolation levels.
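A quick way to watch user objects land in tempdb; the object names here are made up for illustration:

```sql
-- A local temp table and a table variable both live in tempdb
CREATE TABLE #demo_orders (id INT);
DECLARE @demo_totals TABLE (total MONEY);

-- The temp table appears in tempdb's catalog under a padded name
SELECT name, type_desc, create_date
FROM tempdb.sys.objects
WHERE name LIKE N'#demo_orders%';
```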

The following T-SQL code uses the DMV sys.dm_db_file_space_usage to get TempDB space usage, broken out by the three categories above.

-- tempdb data file usage
SELECT GETDATE() AS runtime,
SUM(user_object_reserved_page_count) * 8 AS usr_obj_kb,
SUM(internal_object_reserved_page_count) * 8 AS internal_obj_kb,
SUM(version_store_reserved_page_count) * 8 AS version_store_kb,
SUM(unallocated_extent_page_count) * 8 AS freespace_kb,
SUM(mixed_extent_page_count) * 8 AS mixedextent_kb
FROM sys.dm_db_file_space_usage;

TEMPDB PROBLEMS

Problem 1: TempDB database grows out of disk space and only option left is to restart the SQL Server service.

There could be many ways to approach a resolution, but I believe monitoring space usage is the best weapon. Here is a great post by William Durkin, who creates an Extended Events session to monitor space usage of the data and log files of the TempDB database.

-- XEvent Session to track tempdb file growths
CREATE EVENT SESSION TempDBFileGrowth
ON SERVER
ADD EVENT sqlserver.database_file_size_change(
SET collect_database_name = ( 0 )
ACTION (sqlserver.client_app_name, sqlserver.client_hostname,
sqlserver.database_id, sqlserver.session_nt_username,
sqlserver.sql_text)
WHERE ([database_id] = ( 2 )))
ADD TARGET package0.event_file
( SET filename = N'TempDBFileGrowth.xel',
max_file_size = ( 10 ), max_rollover_files = ( 5 ))
WITH (
MAX_MEMORY = 16384KB,
EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS,
MAX_DISPATCH_LATENCY = 1 SECONDS,
MEMORY_PARTITION_MODE = PER_NODE,
TRACK_CAUSALITY = OFF,
STARTUP_STATE = ON
);
GO
ALTER EVENT SESSION TempDBFileGrowth ON SERVER STATE = START;
GO

Create this Extended Events session to monitor tempdb space usage, but be careful: as written, it is for troubleshooting only. You can leave it running continuously, but first change MAX_DISPATCH_LATENCY to 30 seconds, which is the default. Dispatch latency controls how often buffered event data is flushed and written to disk. If this frequency is very low on a busy system, it can affect the performance of SQL Server.
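If you do decide to leave the session running, a sketch of that change (using the session name from the script above):

```sql
-- Raise dispatch latency back to the 30-second default so the
-- session flushes its buffers less aggressively on a busy system
ALTER EVENT SESSION TempDBFileGrowth ON SERVER
WITH (MAX_DISPATCH_LATENCY = 30 SECONDS);
```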

Problem 2: Tempdb PFS and SGAM page contentions.

Imagine a SQL Server instance processing many concurrent queries, each one creating a tiny temp table, using it, and then dropping it. These temporary tables are stored in the tempdb database, and the PFS page in each tempdb data file keeps track of all the page allocations and deallocations. Only one thread at a time can make changes to a PFS page, so it becomes a bottleneck; this behavior is known as PFS page contention. The issue can be reduced by having multiple tempdb data files. The guidance is to start with one data file per logical CPU core until you reach 8 files, then add more files in chunks of 4 if you are still seeing PFS page contention. You also need to enable trace flag 1117 to have these multiple files grow at the same rate. This is a server-wide setting, so it changes the behavior of all databases on the instance, not only TempDB.
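Adding a file might look like this; the file names, path, and sizes are hypothetical, and all files should be sized identically so the proportional-fill algorithm spreads allocations evenly:

```sql
-- Match the size/growth of the existing file (name assumed to be tempdev)
ALTER DATABASE tempdb
MODIFY FILE (NAME = tempdev, SIZE = 4096MB, FILEGROWTH = 512MB);

-- Add an additional, identically sized data file
ALTER DATABASE tempdb
ADD FILE (NAME = tempdev2,
          FILENAME = N'T:\TempDB\tempdev2.ndf',
          SIZE = 4096MB, FILEGROWTH = 512MB);
```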

Sometimes tempdb reports contention on the SGAM page. This happens when SQL Server scans the SGAM page searching for mixed pages to allocate to a newly created object. With many concurrent mixed-page allocations underway, the SGAM page sees severe contention; deallocation activity also modifies this page and adds to it. In this situation, trace flag 1118 can be enabled to make SQL Server allocate full extents to new objects. This too is a server-wide setting, so it changes the behavior of all databases on the instance, not only TempDB. Monitor this behavior using the DMV sys.dm_os_waiting_tasks: you will notice PAGELATCH waits with resource_description values such as 2:1:1 and 2:1:3 (the PFS and SGAM pages of TempDB's file 1).
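On instances older than SQL Server 2016, the two trace flags can be enabled until the next restart with DBCC TRACEON; to persist them across restarts, add -T1117 and -T1118 as startup parameters in SQL Server Configuration Manager:

```sql
-- Global (-1) trace flags; these are lost at the next service restart
DBCC TRACEON (1117, 1118, -1);

-- Verify they are active
DBCC TRACESTATUS (1117, 1118);
```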

SELECT
[owt].[session_id], [owt].[exec_context_id],
[owt].[wait_duration_ms], [owt].[wait_type],
[owt].[blocking_session_id], [owt].[resource_description],
CASE [owt].[wait_type]
WHEN N'CXPACKET' THEN
RIGHT ([owt].[resource_description],
CHARINDEX (N'=', REVERSE ([owt].[resource_description])) - 1)
ELSE NULL
END AS [Node ID],
[es].[program_name], [est].text,
[er].[database_id], [eqp].[query_plan],
[er].[cpu_time]
FROM sys.dm_os_waiting_tasks [owt]
INNER JOIN sys.dm_exec_sessions [es] ON
[owt].[session_id] = [es].[session_id]
INNER JOIN sys.dm_exec_requests [er] ON
[es].[session_id] = [er].[session_id]
OUTER APPLY sys.dm_exec_sql_text ([er].[sql_handle]) [est]
OUTER APPLY sys.dm_exec_query_plan ([er].[plan_handle]) [eqp]
WHERE
[es].[is_user_process] = 1
ORDER BY
[owt].[session_id], [owt].[exec_context_id];
GO

The above query is from Paul Randal. Another useful blog about Tempdb is from Pamela Lahoud of Microsoft.

You will see later in this blog post that trace flags 1117 and 1118 are not needed starting with SQL Server 2016, because their behavior is now the default for the TempDB database.

Problem 3: Tempdb Metadata contentions

The third type of bottleneck we see in tempdb comes from its system tables. Known as metadata contention, it became more common as hardware improvements enabled the creation and deletion of temporary objects at ever faster rates. That speed is a good thing, because we want TempDB to be fast, but it created a new contention point: keeping track of all those metadata changes in the system tables. SQL Server 2019 has largely resolved this issue by providing an option to place the TempDB system tables in memory-optimized tables. Memory-optimized tables do not take latches or locks, since they live in RAM rather than on disk. And since these system tables contain only tempdb metadata, their footprint in SQL Server memory is small.

You can use the following script to view all the sessions that are waiting on page-related wait types and get information about the objects those pages belong to. sys.fn_PageResCracker is a new function in SQL Server 2019 that takes the page_resource value from the sys.dm_exec_requests DMV and gives you the details of locks and latches at the page level.

USE master
GO

SELECT
er.session_id, er.wait_type, er.wait_resource,
OBJECT_NAME(page_info.[object_id],page_info.database_id) as [object_name],
er.blocking_session_id,er.command,
SUBSTRING(st.text, (er.statement_start_offset/2)+1,
((CASE er.statement_end_offset
WHEN -1 THEN DATALENGTH(st.text)
ELSE er.statement_end_offset
END - er.statement_start_offset)/2) + 1) AS statement_text,
page_info.database_id,page_info.[file_id], page_info.page_id, page_info.[object_id],
page_info.index_id, page_info.page_type_desc
FROM sys.dm_exec_requests AS er
CROSS APPLY sys.dm_exec_sql_text(er.sql_handle) AS st
CROSS APPLY sys.fn_PageResCracker (er.page_resource) AS r
CROSS APPLY sys.dm_db_page_info(r.[db_id], r.[file_id], r.page_id, 'DETAILED') AS page_info
WHERE er.wait_type like '%page%'
GO

Problem 4: Version Store growth in read-only secondary replica

If you have an Availability Group set up for your SQL Server instance, make sure to monitor version store growth in the tempdb database on the secondary replica. In addition to the following query, you can also monitor this with performance counters such as Version Store Size (KB), Version Cleanup rate (KB/s), and Version Generation rate (KB/s). Here is an interesting write-up about this problem on AG secondaries.

-- Version store usage
SELECT DB_NAME(database_id) AS [database_name], reserved_page_count, reserved_space_kb
FROM sys.dm_tran_version_store_space_usage
ORDER BY reserved_space_kb DESC;
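The performance counters mentioned above are also exposed through a DMV, so they can be polled with T-SQL instead of PerfMon:

```sql
-- Version store counters from the Transactions performance object
SELECT counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE N'%Transactions%'
  AND counter_name IN (N'Version Store Size (KB)',
                       N'Version Generation rate (KB/s)',
                       N'Version Cleanup rate (KB/s)');
```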

IMPROVEMENTS IN TEMPDB

  1. Optimizing SQL Server Eager Writes

One long-standing issue with TempDB was poor disk I/O performance when you performed a SELECT INTO a temporary table. This was resolved in SQL Server 2014 RTM and back-ported to SQL Server 2012 SP2 CU1. SQL Server has a concept of eager writes, which prevents flooding the buffer pool with newly created pages from bulk activities that still need to be written to disk. The SQL Server 2014 enhancement relaxes the need to flush these pages to the tempdb data files as quickly as older versions did, because it recognizes that objects created in tempdb are temporary.

2. Tempdb Installation Defaults

Having multiple tempdb data files to remove PFS and SGAM page contention has been a long-standing recommendation from Microsoft. Prior to SQL Server 2016, DBAs would manually add the additional TempDB data files after installing SQL Server and enable trace flags 1118 and 1117 to grow these files evenly. SQL Server 2016 setup now includes this step: it detects the number of CPU cores and, based on that, offers to create multiple files. You can change the number of files, file locations, and so on. There is also no need for the previously mentioned trace flags, as their behavior has now become the default for the TempDB database.

3. Tempdb system tables move to In-Memory OLTP

Once again, SQL Server 2019 brought a huge improvement to TempDB performance. You can enable the "Memory-Optimized TempDB Metadata" feature, which puts all the TempDB metadata (system tables) in memory. This takes away the PAGELATCH_XX wait types. With temporary objects being created and removed at a super fast pace thanks to huge improvements in hardware, these metadata objects now reside in RAM, and disk is no longer the bottleneck for these operations.
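Enabling the feature is a one-line server configuration change; it requires a service restart to take effect:

```sql
ALTER SERVER CONFIGURATION
SET MEMORY_OPTIMIZED TEMPDB_METADATA = ON;

-- After the restart, confirm it is active (returns 1 when enabled)
SELECT SERVERPROPERTY('IsTempdbMetadataMemoryOptimized');
```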

TEMPDB BEST PRACTICES

  1. Configure multiple tempdb data files. Start with 8 if you have 8 or more logical cores, then add more in chunks of 4 if you still see PFS contention. Multiple files are beneficial even if they are all on the same drive, dedicated to TempDB.
  2. Enable trace flag 1118 as a startup trace flag prior to SQL Server 2016. This prevents SGAM contention.
  3. Enable trace flag 1117 as a startup trace flag prior to SQL Server 2016. This makes the multiple tempdb files grow equally.
  4. Always put the tempdb data and log files on the fastest disk available. Keep them local to the machine rather than on a SAN; that keeps your SAN path free for user database traffic.
  5. Put the TempDB database on a separate volume from your user databases.
  6. Pre-allocate the tempdb files as large as possible, with enough room to give you time to react to unforeseen scenarios such as a long-running query.
  7. Do not explicitly drop temp tables at the end of a stored procedure; they will get cleaned up when the session that created them ends.
  8. Do not alter temp tables after they have been created.
  9. Do not truncate temp tables.
  10. Move index creation statements on temp tables to the inline index creation syntax that was introduced in SQL Server 2014.
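For the last point, a sketch of the inline index syntax on a hypothetical temp table:

```sql
-- Index declared with the table (SQL Server 2014+), avoiding a
-- separate CREATE INDEX that would alter the temp table afterwards
CREATE TABLE #recent_orders
(
    order_id   INT  NOT NULL,
    order_date DATE NOT NULL,
    INDEX ix_order_date NONCLUSTERED (order_date)
);
```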

SUMMARY

The key to keeping your TempDB healthy is proactive, continuous monitoring with DMVs and XEvents. Set up proper alerting to be notified when space usage reaches a certain threshold. Document your TempDB configuration in your SOP (Standard Operating Procedure) documentation so it is consistent across all your SQL Servers. Keep an eye on version store usage, and be careful about its growth on read-only secondary replicas. Consider using the Memory-Optimized TempDB Metadata feature of SQL Server 2019.

Automatic Soft NUMA

NUMA stands for Non-Uniform Memory Access; it is a memory and CPU architecture inside the machine. The purpose of NUMA is to give schedulers (CPUs) faster access to memory. Each CPU talks to a memory controller to get at memory, and when there are more than about 8 CPUs, a single memory controller becomes a bottleneck. This was a scalability issue. NUMA is a hardware configuration built to scale beyond the processor and memory limitations of traditional Symmetric Multi-Processing (SMP) systems. If there are 16 CPUs, there will typically be 2 NUMA nodes (node 0 and node 1); each node has its own memory controller and serves up to 8 CPUs. CPU and memory are partitioned at the hardware level to improve access to memory. SQL Server is a NUMA-aware application; you don't have to enable anything. Almost all modern server hardware has hardware NUMA.

I would highly recommend anyone new to NUMA to read this article written by SQLSkills team.

Following is an image of an SMP configuration, where all processors have equal access to memory and I/O. As you can imagine, increasing the number of CPUs in this architecture creates a bottleneck because there is only one memory controller and one I/O controller.

Fig 1

In Fig 2 below there are 2 NUMA nodes; CPU and memory are partitioned, and each NUMA node gets its own memory and I/O controller.

Fig 2

Automatic Soft NUMA:

Modern processors have multiple cores per socket. Each socket is represented, usually, as a single NUMA node. The SQL Server database engine partitions various internal structures and partitions service threads per NUMA node. With processors containing 10 or more cores per socket, using software NUMA to split hardware NUMA nodes generally increases scalability and performance. Prior to SQL Server 2014 (12.x) SP2, software-based NUMA (soft-NUMA) required you to edit the registry to add a node configuration affinity mask, and was configured at the host level, rather than per instance. Starting with SQL Server 2014 (12.x) SP2 and SQL Server 2016 (13.x), soft-NUMA is configured automatically at the database-instance level when the SQL Server Database Engine service starts. Please read this  documentation and this documentation for more understanding.

Note 1: Hot-add processors are not supported by soft-NUMA

Note 2: Starting with SQL Server 2014 (12.x) SP2, use trace flag 8079 to allow SQL Server to use Automatic Soft-NUMA. Starting with SQL Server 2016 (13.x) this behavior is controlled by the engine and trace flag 8079 is not required.

The benefits of soft-NUMA include reduced I/O and lazy writer bottlenecks on computers with many CPUs. Soft-NUMA allows you to further divide schedulers into smaller groups. You can read a really good description of how NUMA works here.

NUMA can match memory with CPUs through specialized hardware (hardware NUMA) or by configuring SQL Server memory (soft-NUMA). During startup, SQL Server configures itself based on the underlying operating system and hardware configuration, or on the soft-NUMA setting. For both hardware and soft-NUMA, when SQL Server starts in a NUMA configuration, the SQL Server error log records a multi-node configuration message for each node, along with the CPU mask.
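You can confirm what was recorded by searching the current error log for those NUMA node messages:

```sql
-- 0 = current log, 1 = SQL Server error log, filter on 'NUMA'
EXEC sys.xp_readerrorlog 0, 1, N'NUMA';
```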

Here are a couple of useful queries I use to find out about the NUMA layout of my customers' SQL Servers. I am using Glenn's performance queries, modified on September 13, 2020.

-- Get instance-level configuration values for instance  (Query 4) (Configuration Values)

SELECT name, value, value_in_use, minimum, maximum, [description], is_dynamic, is_advanced FROM sys.configurations WITH (NOLOCK) ORDER BY name OPTION (RECOMPILE);

In the above query result, focus on these settings:

  1. automatic soft-NUMA disabled (should be 0 in most cases)
  2. backup checksum default (should be 1)
  3. backup compression default (should be 1 in most cases)
  4. clr enabled (only enable if it is needed)
  5. cost threshold for parallelism (depends on your workload)
  6. lightweight pooling (should be zero)
  7. max degree of parallelism (depends on your workload and hardware)
  8. max server memory (MB) (set to an appropriate value, not the default)
  9. optimize for ad hoc workloads (should be 1)
  10. priority boost (should be zero)
  11. remote admin connections (should be 1)
  12. tempdb metadata memory-optimized (0 by default, some workloads may benefit by enabling)

The following query gives you some useful information about the composition and relative load of your NUMA nodes. You want to see an equal number of schedulers on each NUMA node. Watch out if SQL Server 2019 Standard Edition has been installed on a physical or virtual machine with more than four sockets or more than 24 physical cores, as Standard Edition is limited to the lesser of the two.

-- SQL Server NUMA Node information  (Query 12) (SQL Server NUMA Info)

SELECT node_id, node_state_desc, memory_node_id, processor_group, cpu_count, online_scheduler_count, idle_scheduler_count, active_worker_count, avg_load_balance, resource_monitor_state FROM sys.dm_os_nodes WITH (NOLOCK) WHERE node_state_desc <> N'ONLINE DAC' OPTION (RECOMPILE);

Automatic Soft-NUMA is enabled by default. It is recommended best practice to leave it enabled.

To disable Automatic Soft-NUMA, run the following command.

USE master; 

GO 

EXEC sp_configure 'automatic soft-NUMA disabled', '1';

RECONFIGURE WITH OVERRIDE; 

My Experience – PASS Summit 2019

I attended PASS Summit 2019 in November last year, six months ago. The nice thing was that I took all my notes in OneNote, which I had just recently started using. Then life got in my way and I totally forgot about them. Then COVID-19 happened, and I still did not remember until today. So here it is.

This time, I didn't purchase any precon. It is not because there were no good ones but the opposite: there were so many that I worried I would pay the steep price for one and it would not be as expected, or I would wish later that I had attended a different one. So I made the cowardly decision and went to none.

But I did enjoy spending time with several friends and met a lot of people from my previous workplaces. No breakfast at PASS this time was a bummer; they gave us a $25 Starbucks gift card in lieu of three days of breakfasts. Just to let the organizers know, if they are reading this: a modest breakfast in Seattle was priced around $10 in 2019. Another suggestion: if prices for PASS merchandise were lowered, for example a $60 t-shirt down to $30 (without sacrificing quality), I think more people would buy them and take souvenirs home from PASS Summit, and the net profit in the end would come out the same. I will keep the rest of the suggestions to myself.

Disclaimer: These are my notes taken while attending sessions and any inaccuracies are my own.

Session: SQL Server 2019 new tools

Speaker: Vicky Harp (SQL Server tools team in Microsoft)

This was all demos, no slides session.

Since 2016, the connection between tools and engine was broken, when SSMS became available as a separate install from the database engine installation.

Azure Data Studio (ADS) is a new tool to communicate with the SQL Server engine and is more powerful than SSMS in some ways. But SSMS is going to stay; it is purely for SQL Server on the Windows platform and is a 32-bit application.

ADS supports hybrid environments and platforms. Your job is to secure your data and provide HA and DR for it, not to learn every new technology that comes your way; ADS serves this purpose very well. It is a one-stop-shop kind of tool. You can connect to Windows and Linux, and can even connect from a non-Windows client such as a MacBook. Besides T-SQL, it natively supports PowerShell, Python, Scala, Spark, Java, etc.

Notebooks in ADS are another great feature. You can run a query typed in a notebook, and the results and comments are saved in the same notebook. You can create a complete solution in it and email the notebook to a colleague, or open it in other compatible tools. You can create SOP documents for your helpdesk team. It is not compatible with OneNote yet, but discussions on this subject are in progress.

There are a lot of extensions available in ADS which further improve its usefulness, such as SQL Server Agent, Admin Pack for SQL Server, and SQL Server Profiler, to name a few. Each extension adds multiple tasks.

SQL Server Edge running on IoT devices can analyze IoT data on the device itself instead of sending it to some server for analysis. Stream Analytics is used for this purpose.

Session: Working with SQL Server Containers

Speaker: Carlos Robles

Demos: C:\Users\amlalani\OneDrive\Training\PASS Summit 2019\SQL Server Containers

Why containers?

  • Containers are a standard unit of software.
  • They provide security and isolation.
  • They are lightweight, portable, and cross-platform.
  • They are consistent and predictable, with no surprises.
  • They enable Continuous Integration (CI) and Continuous Delivery (CD).
  • They eliminate shared environments and resource contention.

SQL Server as Application:

Before containers: difficult to integrate, operational complexity.

Now: Docker Compose uses a YAML file for deployment.

CI – Commit -> Trigger -> Notify -> Test

CD – Deliver -> Deploy

Jenkins is an open-source automation server. Pipelines are sets of jobs.

Session: Intelligent Query and Performance

Speaker: Pedro Lopez and Joe Sack

  1. Intelligent DB vision:
    1. Anxiety free upgrades
    2. Don’t worry about regressions
    3. Query tuning assistance tools
  2. Performance IQP
    1. Resource Governor
    2. Automatic soft-NUMA, default since SQL Server 2016
    3. Many new improvements in newer versions are not advertised, so just by upgrading you get benefits out of the box.
  3. Management by default
    1. Accelerated Database Recovery ADR
    2. Online Resumable index creation
    3. Lightweight Profiling
    4. Automatic SQL
    5. Threat Protection
    6. GDPR questions can be answered with classification and vulnerability assessment tools

Memory Grant Feedback (MGF):

Available in compatibility mode 140+

Memory misestimation can result in spills to disk, and overestimation hurts concurrency.

If the grant is too small and the query spills to disk, MGF will fix the next execution by adding the needed memory plus some buffer.

In 2017, batch mode was available for columnstore indexes only.

In 2019, rowstore batch mode is available for all index types, row and column.

  1. Does the query optimizer have to do a lot of work to first check whether a query is eligible for batch mode execution?

MGF has to learn again if the plan is kicked out of the cache for some reason.
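MGF itself needs no setup under compatibility level 140/150, but it can be switched off per database if it causes trouble; a sketch:

```sql
-- Batch mode memory grant feedback (SQL Server 2017+)
ALTER DATABASE SCOPED CONFIGURATION
SET BATCH_MODE_MEMORY_GRANT_FEEDBACK = OFF;

-- Row mode memory grant feedback (SQL Server 2019+)
ALTER DATABASE SCOPED CONFIGURATION
SET ROW_MODE_MEMORY_GRANT_FEEDBACK = OFF;
```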

Batch Mode Columnstore:

Built for analytical workloads and I/O-bound queries. Columnstore provides effective compression compared with traditional rowstore.

Rowstore Batch Mode:

If you cannot add a columnstore index for some reason (maybe your vendor does not support it, or the table has a trigger), then rowstore batch mode comes in handy. It adds a new scan operator, the same for heaps and B-tree tables. Great for CPU-bound queries.

Interleaved Execution for MSTVFs:

The query processor treats a multi-statement table-valued function (MSTVF) as a black box because it materializes only at runtime, so the QP cannot make informed decisions about the plan. SQL Server 2016 and prior versions guesstimated 100 rows, which is inefficient if 1,000 or 10,000 rows are actually returned. SQL Server 2017+ materializes the MSTVF first, gets the actual row count, and then adjusts the plan. Downstream operations benefit.
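For context, the kind of function this applies to; the table and columns here are made up for illustration:

```sql
-- A multi-statement TVF: the optimizer cannot see through the body,
-- so before SQL Server 2017 it had to guess the row count.
CREATE FUNCTION dbo.BigOrders (@min_total MONEY)
RETURNS @result TABLE (order_id INT, total MONEY)
AS
BEGIN
    INSERT INTO @result (order_id, total)
    SELECT order_id, total
    FROM dbo.Orders            -- hypothetical table
    WHERE total >= @min_total;
    RETURN;
END;
```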

Batch Mode Adaptive Join (AJ):

If the cardinality estimate is wrong, the wrong join operator will be selected in the plan, which will then be cached.

AJ defers the choice of hash or nested loops join until after the first join input has been scanned.

Resource: aka.ms/IQP

Session: Data Classification

Speaker: Grant Fritchey

Security and compliance requirements are undeniable. They should be first and foremost.

Regulatory Landscape:

GDPR: European. Marriott and British Airways were fined. Half of UK firms are still not GDPR compliant.

FERPA: A Department of Education law in the US that protects adult students, even from their parents. Schools don't share any information, even if the student is a dependent and in hospital or sick.

HIPAA: A health regulation for the medical industry. A person who has knowingly shared PII data could go to jail.

CCPA: California's flavor of the GDPR regulation, enforced starting January 1, 2020.

All of them have room for interpretation and none of them is perfect.

Session: Successfully communicate with Customer

Speaker: Dennise McInevney

Results of bad communication:

  • Project fails
  • Career hurt or ended
  • Frustration and stress
  • Time and money wasted

  1. Empathy: Walk in your customer's shoes. See their pain through their eyes. See how they use your product and where you can help them.
  2. Us-vs-them mentality: Be one team. Make sure the customer understands you are on their side through your language and attitude.
  3. Talk/Speech: Be concise, clear, and to the point in written and verbal communication.
  4. Writing Skills:

Who are your customers? Identify them and identify their customers. Understand the goals of both groups.

Requirements (the how of things) ↔ Goals (the what of things)

  1. Understand customer goals first.
  2. Desired outcome second
  3. Understand business context

Don't / Do:

  • Don't: Jump to a solution. Do: Listen carefully.
  • Don't: Make assumptions. Do: Ask questions.
  • Don't: Start designing. Do: Be precise.
  • Don't: Start diagramming/coding. Do: Identify the approver first.

Write down everything in a place that is easily accessible.

Don't / Do:

  • Don't: Email. Do: Wiki/SharePoint.
  • Don't: Chat. Do: Agile tools like Jira, which keep the history.
  • Don't: Hallway conversations. Do: Shared documents.

Get Alignment:

  • Get sign-off
  • Ensure clarity
  • Don't skip this step
  • Pick up the phone when necessary
  • Joint accountability
  • Record the outcome in writing

If you are bilingual and can speak the customer's language, that is a great skill.

What can you do?

  • Practice Active listening
  • Don’t multitask
  • Check for understanding
  • Summarize
  • Review the facts

Change is inevitable.

Don't / Do:

  • Don't: Stress out. Do: Take a deep breath.
  • Don't: Get mad. Do: Ask questions.
  • Don't: Cast blame. Do: Update the requirement.

Managing Expectations:

After things are signed off, we need to manage expectations: how often we will have meetings, how project status will be reported, and so on.

Reporting Status:

  • Consistent format and time
  • If you want others to follow the reporting tool, make sure you are seriously using it yourself.
  • No surprises
  • Alert early if things are not working as expected, or if dependencies or durations slip.

Show as you Go:

Sketches, diagrams, demos at regular cadence and be open to feedback.

Deadlines:

Keep calm, communicate progress, know what is okay to release, negotiate; trade-offs can happen.

Session: Adaptive QP to Intelligent QP

Speaker: Hugo Kornelis

SQLServerFast.com

http://sqlserverfast.com/blog

@Hugo_Kornelis

Query Processing (QP)

  • Phase 1 – Optimization (cardinality estimates, execution plan)
  • Phase 2 – Execution (actual row counts)

A merge join is better if the data is properly sorted. A hash join puts everything from the first input in memory.

A nested loops join reads a row from the first input and starts finding matching rows in the second input.

AJ is a new join type and it has 3 inputs.

The only problem is that AJ requires batch mode, which used to require a columnstore index.

Interleaved Execution:

MSTVF – the optimizer doesn't know how big the result is going to be.

A hard-coded guess of 1 or 100 rows can never be right. In 2017, as soon as the QP detects an MSTVF, it pauses execution and starts another process that materializes the MSTVF first. You get the number of rows, and the QP can then give you the correct plan. But this plan is then cached, and the next time the MSTVF comes with different parameters it may not work efficiently. So part of the problem remains.

QP in SQL Server 2019:

Memory Grant Feedback (MGF):

The grant is determined at compile time based on the cardinality estimate. If it underestimates, the query spills to disk (tempdb). If it overestimates, memory is wasted and concurrency suffers.

First execution of MGF improves 2nd execution.

  • Adjusts down if > 50% memory unused the first time
  • Adjusts up after any SPILL + some buffer.

MGF stores its adjustment with the last execution plan and forgets it after a recompile. If the grant keeps oscillating beyond a certain threshold, MGF stops adapting.

Extended events are available to track this info.

*** Check this demo out and put it in PTO workshop ***

UDF Inlining:

FROID is a framework that plugs into the optimizer before optimization takes place. It looks at all the statements in the UDF, converts them into an equivalent relational expression, and sends that to the QP, which then generates a single plan.
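In SQL Server 2019 you can check which scalar UDFs qualify for inlining, and opt out per database; a sketch:

```sql
-- Scalar UDFs the engine considers inlineable (SQL Server 2019+)
SELECT OBJECT_NAME(object_id) AS udf_name, is_inlineable
FROM sys.sql_modules
WHERE is_inlineable = 1;

-- Disable FROID-based inlining for the current database if needed
ALTER DATABASE SCOPED CONFIGURATION
SET TSQL_SCALAR_UDF_INLINING = OFF;
```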

Session: Batch Mode on Rowstore

Speaker: Nikko Neuberger

Speaker Tip #1: Telling a story and showing pictures, cartoons, and bold words on screen worked very well with the audience. He got laughs and attention (5 minutes).

Speaker Tip #2: If a demo fails, keep talking while working, and maybe refer to the Demo GOD.

Read an execution plan right to left, and also left to right.

The query in an execution plan executes from left to right.

Nikko acted out a drama script of row-mode execution. He got three attendees and had piles of pages in different colors.

Sort operator asks for data page from table one at a time.

In batch mode, rows are handed to the Sort operator in batches instead of one page at a time, saving time.

Right-click an operator and compare Actual Execution Mode with Estimated Execution Mode: this shows row mode.

*** Ask MS to remove estimated execution mode, because it is not adding value and clutters the GUI.

Right-click again to see Batch execution mode with storage shown as Rowstore.

He made an interesting analogy comparing batch mode and row mode: fast cars like a Ferrari or Mustang versus a bicycle. Batch mode needs a little more space to work, just as a fast car needs a highway to go fast.

In Demo 1, compatibility level 150 vs 140, timing was 12 sec and 18 sec respectively. CPU was high in both executions, but under 140 there was more I/O.

In Demo 2, batch mode on rowstore took 11 sec and batch mode on columnstore took 6.8 sec. Fantastic! Since the table had a columnstore index, batch mode kicked in by default.

Demo 3 was about batch mode heuristics. You can see batch_mode_scan_on_rowstore in an Extended Events session when you run the following queries:

  • Select count of rows in a table
  • Query table of 1 GB size
  • Query table of 10 GB size
  • Disable Batch mode and run query
  • Run query with TOP(20) and order by

In 2019, you will still want a columnstore index because:

  • Batch mode without a columnstore index does not give you compression.
  • You miss advanced storage engine features, which makes a difference in logical reads; columnstore does the aggregation in memory instead of reading pages and bringing them to the query processor.
  • A columnstore index will always result in optimization level = FULL.
  • Hekaton disables batch mode scans when columns with LOB types are present.

Session: Hash Match, the Operator (2.5 hour session)

Speaker: Hugo Kornelis

Speaker tip: He engaged the audience just like Nikko did in his session. The dramatization at the beginning really catches the audience's attention. He had pictures of roller coaster rides and a story to go with them, and he gave credit for all the pictures he used. A nice way to introduce the topic.

He said he was going to discuss 500-level deep-dive internals of the operators, some of which are not documented: hashing, the hashing function, the hash table, etc.

*** Download demos from fastsqlserver.com

Hashing is like laundry that needs sorting. But in the query optimizer world, hashing is not sorting; it is categorizing or bucketizing similar things.

  • Input  — Any
  • Output — Integer 0 – 9999
  • Deterministic — same input gets same output values
  • Hash Collision — Different inputs getting same output values.

Hash table — organized by hash value into buckets. It is stored entirely in memory (when it fits).

  1. There is a structure of 7 pointers for each hash value (bucket).
  2. Up to 6 pointers in each structure point to rows.
  3. When a structure is full, the 7th pointer creates and connects a new 7-pointer structure.

3 phases of a Hash Match:

  1. Build — always used; processes the top input; builds the hash table.
  2. Probe — not always used; processes the second input if there is one; finds matches and detects hash collisions.
  3. Final — returns data from the hash table.

Look at the Hash Match operator properties in the execution plan and you will see:

Probe Residual — will appear only if hash collisions are possible. For integer data of up to 4 bytes, SQL Server guarantees correct results without a Probe Residual. It is needed when any operation is done on the join column, when any non-integer data is present, and when integer data is longer than 4 bytes. The performance impact of the Probe Residual is not expensive; it is just one extra test.
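To see this for yourself, a join on a character key should show a Probe Residual in the Hash Match properties, while a join on an int key of 4 bytes or less typically does not. This is an illustrative sketch; the table and column names are made up.

```sql
-- Force a hash join on a varchar key and inspect the actual execution plan:
-- the Hash Match operator should expose a Probe Residual property, because
-- character data can produce hash collisions.
SELECT c.city, o.order_id
FROM dbo.customer AS c
JOIN dbo.orders   AS o
  ON o.city = c.city            -- varchar join key -> expect a Probe Residual
OPTION (HASH JOIN);             -- force a Hash Match join for the demo
```

Repeating the same experiment with an int surrogate key (customer_id) as the join column should make the Probe Residual disappear.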

Build Phase — assuming no spills to disk and the whole hash table fits in memory.

  • Inner Join — the simplest. Computes the hash, goes into the bucket, finds the match, returns a joined row, and moves on to the next row.
  • Rt. Outer Join — if no matching row is found, it returns the unmatched row from the probe input anyway.
  • Both join types block during the build phase. There is no Final phase for them. Row order is mostly retained, but not guaranteed in the real world where spills may occur.

Probe Phase

  • Lt. Outer Join / Full Outer Join — marks the matched rows and returns the unmatched rows from the hash table. Adds a Final phase for these left-over rows. No order is retained for them.
  • Lt. Semi Join (EXISTS) — returns all the marked (matched) rows from the hash table.
  • Lt. Anti Semi Join (NOT EXISTS) — returns all the unmarked (unmatched) rows. Both semi and anti semi joins block in the build and probe phases; all their output is produced in the Final phase.
  • Rt. Semi Join — returns rows from the probe input; stops searching as soon as the first match is found.
  • Rt. Anti Semi Join — returns the probe row only when no match is found; no additional search is done. No Final phase.

Hash match can also have 1 input.

Build Phase: (for aggregate)

Instead of simply adding the row to a bucket, it first checks whether a row with the same key already exists and, if so, only updates the aggregate in that bucket. If no row is found, it adds a row and initializes the aggregation. This row is a placeholder.

Final Phase — returns all aggregated rows from the hash table. There is no Probe phase for aggregates.

Example:

  • SELECT DISTINCT city FROM customer — hash match flow for DISTINCT
  • SELECT city, COUNT(*) FROM customer GROUP BY city — does a hash aggregate

Hash Match Scalar Aggregate:

Official name is Hash Match (Aggregate).

Stream Aggregate is more suitable for SELECT COUNT(*) FROM table-style queries and is not supported in Batch mode. There is no Build phase.

Hash match partial Aggregate:

Labelled differently in the parallel parts of an execution plan. There is no Build phase. You can see this type of aggregate in compatibility level 120; it is no longer visible in modern SQL Server versions.

Hash Memory Usage:

Memory is requested at the beginning of execution. The engine reserves at least the requested memory; if the requested memory is not available, the query waits. It tries to reserve up to the requested memory before beginning execution. There is no way to request additional memory during execution, hence when memory is insufficient, operators spill to tempdb on disk.

  • SQL Server 2016: batch mode may get extra memory during execution.
  • SQL Server 2017: batch mode on columnstore may get extra memory during execution.
  • SQL Server 2019: all execution modes may get extra memory during query execution.
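One way to watch these grants in action is the documented memory grants DMV. A minimal sketch (run it from a second session while one of the hash queries executes; the WHERE clause just filters out your own session):

```sql
-- Live view of query memory grants: what was requested, what was granted,
-- and the high-water mark actually used during execution.
SELECT session_id,
       requested_memory_kb,
       granted_memory_kb,
       used_memory_kb,
       max_used_memory_kb
FROM sys.dm_exec_query_memory_grants
WHERE session_id <> @@SPID;
```

If granted_memory_kb is far above max_used_memory_kb, the grant was overestimated; if the operator spills, max_used_memory_kb hits the grant ceiling.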

Dynamic De-staging:

Happens when hash operator spills to disk.

  1. In-memory HJ — never short of memory; everything fits nicely.
  2. Hybrid HJ — combines in-memory and Grace HJ. When out of memory, it partitions the buckets; if things get complex, then after the first build and probe phases it does the same as Grace HJ.
  3. Grace HJ — used when the amount of data is known up front to exceed memory. Divides the set of buckets into partitions, e.g., buckets 1 – 20,000 processed 5,000 buckets at a time (hash partition function 1 – 4), with 2 files created for each partition: 1 file for build, 1 file for probe. SQL Server uses tempdb for these; they are actually structures in tempdb. It reads both inputs, stores rows in the appropriate files, then processes each partition's files individually.
  4. Recursive HJ — used when Grace HJ mis-estimates a partition's size.
  • With dynamic de-staging, HM starts in memory.
  • Bail-out happens when a single bucket is larger than the memory available; HM abandons further processing and falls back to Nested Loops or Sort plus Merge Join. Both are inefficient in this case.
  • Bit Vector filtering — is good, and is always used when spills happen. It is an in-memory bitmap for the hash values of spilled partitions.
    1. Build: set bit for hash result of rows
    2. Probe: Check if bit is set for hash
      1. If set – write to file on disk
      2. If not – handle unmatched, don’t write to file

Memory Grant Computation:

Memory grant = estimated size (row count × row size) of the build input

Note: for (var)char data types, the estimate assumes columns are 50% full.

Plus the estimated size (row count × row size) of the probe input.

An execution plan can contain multiple Hash Match operators, each contributing to the grant.

Session: Query Store In-Depth

Speaker: Ola Hallengren

Query Store (QS) has been available since SQL Server 2016. It is like a flight recorder for query execution.

DBCC CLONEDATABASE includes query store Data.

Force a last known good plan after changing the compatibility level or cardinality estimator.
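The steps above can be sketched in T-SQL. This is my own minimal sketch, not from the session: the database name MyDb is a placeholder, and the query_id/plan_id values are placeholders you would look up in the Query Store catalog views.

```sql
-- Enable Query Store before changing the compatibility level, so it records
-- the "last known good" plans under the old level.
ALTER DATABASE MyDb SET QUERY_STORE = ON;

ALTER DATABASE MyDb SET COMPATIBILITY_LEVEL = 150;

-- Look for regressed queries and their plans in the Query Store catalog views.
SELECT qsq.query_id, qsp.plan_id, qsrs.avg_duration
FROM sys.query_store_query          AS qsq
JOIN sys.query_store_plan           AS qsp  ON qsp.query_id = qsq.query_id
JOIN sys.query_store_runtime_stats  AS qsrs ON qsrs.plan_id = qsp.plan_id
ORDER BY qsrs.avg_duration DESC;

-- Pin the plan that performed well before the change (IDs are placeholders).
EXEC sp_query_store_force_plan @query_id = 42, @plan_id = 17;
```

SSMS offers the same actions through the "Regressed Queries" and "Queries With Forced Plans" reports under the Query Store node.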

Disclaimer: These are my notes taken while attending sessions and any inaccuracies are my own.

GDPR – A new European law to protect EU citizens from data breaches

I have been hearing a lot about the GDPR (General Data Protection Regulation) in the past few months. I decided to do some digging to learn how it affects us in the United States. Those of you who want to go directly to the sources of my information and skip my 2 cents are welcome to check out these 2 links. The law is explained in very easy terms for a layperson.

https://www.red-gate.com/simple-talk/opinion/opinion-pieces/questions-gdpr-shy-ask/?utm_source=simpletalk&utm_medium=pubemail&utm_content=20171121-slota2&utm_term=simpletalkmain

https://www.csoonline.com/article/3202771/data-protection/general-data-protection-regulation-gdpr-requirements-deadlines-and-facts.html

Those of you who are short on time and just want the skinny version, keep on reading.

  1. This law is about protecting European citizens from data breaches at companies both inside and outside Europe.
  2. This law may affect us if and when we store data of EU citizens.
  3. It is for EU consumer safety.
  4. In case of a data breach, companies even outside the EU can become part of a lawsuit.
  5. To avoid a lawsuit, a USA-based company has to notify the relevant EU country representative within 3 days of a data breach.
  6. Companies can be fined if the governing body finds proof that adequate data protection measures were not taken by the company whose data was breached.

It is good to keep this EU law on our radar even if our company is not storing EU consumer data. I think something like this law will become the norm across global communities in the next 5 years. After all, it is the responsibility of the company that collects and stores consumer data to provide safeguards to protect it. The basic rationale of this law is simple: since companies make a profit by using consumer information, it is fair that they should spend a part of that profit to protect that information. This provides clear accountability on one hand and peace of mind to the citizens on the other. In my opinion, this is a win-win situation.

I would appreciate your comments on this topic.

What would you like to LEARN?

Sneak-peek of Microsoft SQL Server 2016 running on SanDisk flash

 10-07-2015 

Brief Recap of the session:

This technical briefing’s main objective was to demonstrate the latest and greatest features of SQL Server 2016 and how running it on SanDisk hardware multiplies the benefits of new features.

This demo was conducted in Microsoft Technology Center in Chicago and was jointly presented by Microsoft and SanDisk representatives. It was a 300-level advanced preview of the hardware and software without any sales pitch.

Their Fusion ioMemory card goes directly into a PCI slot on the server's motherboard, and they have parity NAND chips on the card for fault tolerance. The price has gone down considerably for the same product since April 2015 for two reasons: first, their Shanghai, China NAND chip manufacturer has become more efficient at producing high-quality chips, and second, partnerships with Microsoft and other hardware vendors such as Dell and HP.


These are some of the numbers we got with SanDisk flash when tables are stored normally (i.e., on SanDisk without using the columnstore feature of SQL Server).


SQL Server 2016 has opened the clogged artery, meaning flash drive performance is off the charts. They talked in terms of IOPS, throughput, and latency. SQL Server 2016 (compared to SQL Server 2014) using SanDisk or any flash drive can deliver IOPS in the millions, throughput in GBs, and latencies in microseconds. The enhancement is due to the improved columnstore feature in SQL Server 2016. The IO bottleneck for OLTP or OLAP loads almost vanishes.


These improvements directly affect HA and DR environments, for example AlwaysOn Availability Groups. Log transfer from primary to secondary is much faster because the redo process on the secondary server has been separated out. An AG in synchronous mode waits for confirmation from the secondary server. Before SQL Server 2016, a log record had to be written to disk on the secondary and applied to the database before confirmation was sent to the primary node. Now confirmation is sent as soon as the record is written to disk; the redo process then runs in the background on the secondary to apply the changes to the database. SQL Server 2016 AGs also include a load-balancing feature.


Another benefit is cost savings on SQL Server licenses. Most of the time a SQL workload is IO-bound and CPU hardly goes above 25% on average. These flash drives will make the CPUs work for what they are worth, so only 2 CPUs may be needed for a server load as big as a 10 TB data warehouse.

My impressions:

Since our databases are not even 1 TB, all the critical production databases could be stored on these flash drives. Or we could put our log files and tempdb files on flash drives. There are a lot of options. And when these cards are purchased through VM vendors, a warranty is included that replaces a card when it fails at no cost to us.

They also have a caching feature. I think this is a separate software-based product that needs to be purchased separately and may be useful for making application-level processes run faster.

Another point is that we are skipping the upgrade to SQL Server 2014, so we will naturally upgrade to SQL Server 2016 in 2017. I think we can plan in 2016 for testing and purchasing these flash drives.

Additionally, these will also make our HA and DR very reliable in terms of failover.

In my opinion, we can take advantage of flash drives for production load in our VMs when we upgrade SQL Server to 2016.

Everything you need to know about FileStream feature of SQL Server 2012

FileStream directory

Technically, the FILESTREAM network path is: \\SQL Server Name\Network Share Name\Database FILESTREAM Directory Name\<File Table Name>

  1. SQL Server Name – (starting with \\)
  2. Windows Share Name – It is visible in SQL Server Configuration tool -> SQL Server properties -> FileStream tab.
  3. Database File stream Directory Name – It can be seen in Database Properties -> Options -> FILESTREAM Directory Name
  4. The last part is the file table name.

Some helpful FileStream queries

Exploration Queries:

SELECT DB_NAME(database_id) AS DatabaseName,non_transacted_access_desc, directory_name  FROM sys.database_filestream_options WHERE directory_name IS NOT NULL

SELECT directory_name,is_enabled,filename_collation_name FROM sys.filetables;

SELECT * FROM sys.tables WHERE is_filetable = 1

SELECT object_id, OBJECT_NAME(object_id) AS 'Object Names' FROM sys.filetable_system_defined_objects

— To get the root level UNC path of a file table

SELECT FileTableRootPath()

SELECT FileTableRootPath(N'file_table_name')

Enable Disable Queries:

https://msdn.microsoft.com/en-us/library/gg492061.aspx#HowToDisable

— Disable Non-Transactional write access.

ALTER DATABASE database_name     SET FILESTREAM ( NON_TRANSACTED_ACCESS = READ_ONLY );

— Disable non-transactional access.

ALTER DATABASE database_name     SET FILESTREAM ( NON_TRANSACTED_ACCESS = OFF );

— Enable full non-transactional Access

ALTER DATABASE database_name     SET FILESTREAM ( NON_TRANSACTED_ACCESS = FULL );

Find locks held by filestream queries:

Open handles to the files stored in a FileTable can prevent the exclusive access that is required for certain administrative tasks. To enable urgent tasks, you may have to kill open file handles associated with one or more FileTables.

SELECT * FROM sys.dm_filestream_non_transacted_handles;

— To identify open files and the associated locks

SELECT opened_file_name    FROM sys.dm_filestream_non_transacted_handles

WHERE fcb_id IN        ( SELECT request_owner_id FROM sys.dm_tran_locks );

— Kill all open handles in all the filetables in the database.

EXEC sp_kill_filestream_non_transacted_handles;

— Kill all open handles in a single filetable.

EXEC sp_kill_filestream_non_transacted_handles @table_name = 'filetable_name';

— Kill a single handle.

EXEC sp_kill_filestream_non_transacted_handles @handle_id = integer_handle_id;

Code Re-usability:

— Use the relative path for portable code

https://msdn.microsoft.com/en-us/library/gg492087(v=sql.110).aspx

DECLARE @root nvarchar(100);

DECLARE @fullpath nvarchar(1000);

SELECT @root = FileTableRootPath();

SELECT @fullpath = @root + file_stream.GetFileNamespacePath() FROM EDMS

WHERE name = N'document_name';

PRINT @fullpath;

How does FILESTREAM work with some of the High Availability features of SQL Server 2012?

 AlwaysOn Availability Groups and FileTables

https://msdn.microsoft.com/en-us/library/gg492086.aspx#alwayson

When the database that contains FILESTREAM or FileTable data belongs to an AlwaysOn availability group:

  • FileTable functionality is partially supported by AlwaysOn Availability Groups. After a failover, FileTable data is accessible on the primary replica, but FileTable data is not accessible on readable secondary replicas.
  • The FILESTREAM and FileTable functions accept or return virtual network names (VNNs) instead of computer names.
  • All access to FILESTREAM or FileTable data through the file system APIs should use VNNs instead of computer names.

Replication and FileTables

Replication and related features (including transactional replication, merge replication, change data capture, and change tracking) are not supported with FileTables.

Setting up Log Shipping

I have been doing log shipping since SQL Server 2000, and I also did it with SQL Server 2005. But for the 4 years that I worked with SQL Server 2008 R2, I did not work with log shipping. Nothing has changed in SQL Server 2012, but to refresh my memory, here I am writing down simple steps for setting up log shipping.

  1. Make sure database is in Full recovery mode. In this case primary server is PriSQL\Prod01 and primary database is PriDB.
  2. Pre-initialize the database at the secondary. That means taking a full backup of the primary database PriDB plus a transaction log backup, and restoring them on the secondary SecSQL\DR01 as database SecDB in Standby mode.
  3. Right click the primary database, go to properties and select “Enable this as a primary database in a log shipping configuration”.
  4. Select how frequently you are going to do log backups in “Backup Settings” button. Also provide the path where log backup goes. Some other settings on this page are set as follows:
    1. Delete files older than: 72 hours
    2. Alert if no backup occurs within: 2 hours
    3. Backup schedule: Every 15 minutes
    4. Backup Path: E:\SQLBackups\PriDB_log
    5. Job name: LSBackup_PriDB
    6. Backup compression: Use default server setting.
  5. Add the secondary server. In this case it is SecSQL\DR01. Secondary database is SecDB.
    1. Tab- Initialize secondary Database:
      • Select database is already initialized.
    2. Tab- Copy files:
      • Destination folder for copied file F:\SQLBackups\LogShip
      • Set 72 hours for Delete copied files.
      • Copy job: LSCopy_PriSQL\prod01_PriDB
      • Copy Schedule: Every 15 minutes.
    3. Tab- Restore Transaction Logs
      • Restore job: LSRestore_PriSQL\prod01_PriDB
      • Restore Schedule: Every 15 minutes
      • Delay restoring: 1 minute
      • Alert if no restore occurs: 2 hours
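The pre-initialization in step 2 can be sketched in T-SQL. This is my own hedged sketch, not a script from the original setup: the backup file names, data file paths, and the logical file names ('PriDB', 'PriDB_log') are assumptions you would verify with RESTORE FILELISTONLY.

```sql
-- On the primary (PriSQL\Prod01): take a full backup and a log backup of PriDB.
BACKUP DATABASE PriDB TO DISK = N'E:\SQLBackups\PriDB_full.bak';
BACKUP LOG PriDB TO DISK = N'E:\SQLBackups\PriDB_log\PriDB_init.trn';

-- On the secondary (SecSQL\DR01), after copying the files over:
-- restore WITH STANDBY so the database is read-only between restores.
RESTORE DATABASE SecDB FROM DISK = N'F:\SQLBackups\LogShip\PriDB_full.bak'
    WITH MOVE 'PriDB'     TO N'D:\SQLData\SecDB.mdf',
         MOVE 'PriDB_log' TO N'D:\SQLData\SecDB_log.ldf',
         STANDBY = N'F:\SQLBackups\LogShip\SecDB_undo.dat';

RESTORE LOG SecDB FROM DISK = N'F:\SQLBackups\LogShip\PriDB_init.trn'
    WITH STANDBY = N'F:\SQLBackups\LogShip\SecDB_undo.dat';
```

Once this restore chain is in place, the LSCopy and LSRestore jobs configured above keep applying subsequent log backups in the same Standby mode.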

Value of Organizing

Whether you are starting a new job or settled in your existing one, it is never too late to organize your SQL Server database shop. As somebody famously said, "Organize, don't agonize." Having a clear idea of what needs to be done already puts you ahead of the task. As we all know, some DBAs are either doing nothing or fighting fires; we do not want to fall into that category. The best use of slow time is to get organized. Here are some tips to help you achieve that Zen-like balance.

Create a list of SQL Server information site:
You cannot keep everything there is to know about SQL Server in your head, so what to do about it? Write it down. Either create a spreadsheet, save it in a table, or post it on your own blog. Whatever tool you choose, it is important that you feel comfortable using it. Save the addresses of MVP blogs, SQL Server online forums, and other tips and tricks. After a while it will become an overwhelming list if you do not set up some type of classification or tagging system.

Note: I will be adding more here later on.

2014 in review

The WordPress.com stats helper monkeys prepared a 2014 annual report for this blog.

Here’s an excerpt:

A San Francisco cable car holds 60 people. This blog was viewed about 480 times in 2014. If it were a cable car, it would take about 8 trips to carry that many people.

Click here to see the complete report.