A closer look at everything related to SQL Server


My Experience – PASS Summit 2019

I attended PASS Summit 2019 in November last year, six months ago. The nice thing was that I took all my notes in OneNote, which I had just recently started using. Then life got in the way and I totally forgot about them. Then COVID-19 happened and I still did not remember until today. So here it is.

This time, I didn't purchase any pre-con. It is not because there were no good ones but the opposite: there were so many that I worried I would pay the steep price for one and it would not live up to expectations, or that I would later wish I had attended a different one. So I made the cowardly decision and went to none.

But I did enjoy spending time with several friends and met a lot of people from my previous workplaces. No breakfast at PASS this time was a bummer. They gave us a $25 Starbucks gift card in lieu of three days of breakfasts. Just to let the organizers know, if they are reading this: a modest breakfast in Seattle was priced around $10 in 2019. Another suggestion: if prices for PASS merchandise were lowered, for example a $60 t-shirt down to $30 (without sacrificing quality), I think more people would buy them and take souvenirs home from PASS Summit, and the net profit in the end would come out the same. I will keep the rest of the suggestions to myself.

Disclaimer: These are my notes taken while attending sessions and any inaccuracies are my own.

Session: SQL Server 2019 new tools

Speaker: Vicky Harp (SQL Server tools team in Microsoft)

This was an all-demos, no-slides session.

Since 2016, the tools and the engine have been decoupled: SSMS became available as a separate install from the database engine installation.

Azure Data Studio (ADS) is a new tool to communicate with the SQL Server engine and is more powerful than SSMS in some ways. But SSMS is going to stay. SSMS is purely for SQL Server on the Windows platform and is a 32-bit application.

ADS supports hybrid environments and platforms. Your job is to secure your data and provide HA and DR for it, not to learn every new technology that comes your way. ADS serves this purpose very well. It is a one-stop-shop kind of tool. You can connect to Windows and Linux, and can even connect from a non-Windows client such as a MacBook. Besides T-SQL, it natively supports PowerShell, Python, Scala, Spark, Java, etc.

Notebooks in ADS are another great feature. You can run a query typed in a notebook, and the results and comments are saved in the same notebook. You can create a complete solution in it and email the notebook to a colleague. You can open a notebook in other compatible tools such as SSMS. You can create SOP documents for your helpdesk team. It is not compatible with OneNote yet, but discussions on this subject are in progress.

There are a lot of extensions available in ADS which further improve its usefulness. Extensions such as SQL Server Agent, Admin Pack for SQL Server and SQL Server Profiler, to name a few, are available. Each extension adds multiple tasks.

SQL Server Edge running on IoT devices can analyze IoT data on the device itself instead of sending it to some server for analysis. Stream Analytics is used for this purpose.

Session: Working with SQL Server Containers

Speaker: Carlos Robles

Demos: C:\Users\amlalani\OneDrive\Training\PASS Summit 2019\SQL Server Containers

Why containers?

  • Containers are a standard unit of software.
  • They provide security and isolation.
  • They are lightweight, portable and cross-platform.
  • They are consistent and predictable, with no surprises.
  • They enable Continuous Integration (CI) and Continuous Delivery (CD).
  • They eliminate shared environments and resource contention.

SQL Server as Application:

Before: difficult to integrate, operational complexity.

Now: Docker Compose uses a YAML file for deployment.

CI – Commit -> Trigger -> Notify -> Test

CD – Deliver -> Deploy

Jenkins is an open-source automation server. Pipelines are sets of jobs.

Session: Intelligent Query and Performance

Speaker: Pedro Lopez and Joe Sack

  1. Intelligent DB vision:
    1. Anxiety free upgrades
    2. Don’t worry about regressions
    3. Query tuning assistance tools
  2. Performance IQP
    1. Resource Governor
    2. Soft NUMA became default in SQL Server 2019
    3. Many new improvements in newer versions are not advertised, so just by upgrading you get benefits out of the box.
  3. Management by default
    1. Accelerated Database Recovery ADR
    2. Online Resumable index creation
    3. Lightweight Profiling
    4. Automatic SQL
    5. Threat Protection
    6. GDPR questions can be answered with classification and vulnerability assessment tools

Memory Grant Feedback MGF:

Available at compatibility level 140+.

Memory misestimation can result in spills to disk, while overestimation hurts concurrency.

If less than 50% of the memory needed is granted, resulting in spills, then MGF will fix it and add memory plus some buffer.

In 2017, Batch Mode was introduced and was available for columnstore indexes only.

In 2019, rowstore Batch Mode is available for all index types, both row and column.

  1. Does the query optimizer have to do a lot of work to first check whether a query is eligible for Batch Mode execution?

MGF has to learn again if the plan is evicted from the cache for some reason.
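Memory grant feedback can be switched on or off per database through database scoped configuration options. A minimal sketch, assuming SQL Server 2019 option names and a made-up database name:

USE MyDatabase;  -- hypothetical database
GO
-- Batch mode MGF (compatibility level 140+) and row mode MGF (150)
ALTER DATABASE SCOPED CONFIGURATION SET BATCH_MODE_MEMORY_GRANT_FEEDBACK = ON;
ALTER DATABASE SCOPED CONFIGURATION SET ROW_MODE_MEMORY_GRANT_FEEDBACK = ON;
-- Set either one to OFF to compare behavior while testing.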

Batch Mode Columnstore:

Built for analytical workloads and IO-bound queries. Columnstore provides effective compression compared to traditional rowstore.

Rowstore Batch Mode:

If you cannot add a columnstore index for some reason, maybe because the vendor does not support it or the table has a trigger, then Rowstore Batch Mode comes in handy. It adds a new scan operator and works the same for heaps and B-tree tables. Great for CPU-bound queries.
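A minimal sketch of how Batch Mode on rowstore is controlled, assuming SQL Server 2019; the database and table names are made up:

-- Batch Mode on rowstore requires compatibility level 150
ALTER DATABASE MyDatabase SET COMPATIBILITY_LEVEL = 150;   -- hypothetical database
GO
-- It can be turned off per database while testing ...
ALTER DATABASE SCOPED CONFIGURATION SET BATCH_MODE_ON_ROWSTORE = OFF;
GO
-- ... or disallowed for a single query with a hint
SELECT COUNT(*)
FROM dbo.SalesOrders                                       -- hypothetical table
OPTION (USE HINT('DISALLOW_BATCH_MODE'));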

Interleaved Execution for MSTVFs:

The query processor treats an MSTVF as a black box because it materializes only at runtime, so the QP cannot make good decisions about the plan. SQL Server 2016 and prior versions guesstimated 100 rows, which is inefficient if 1,000 or 10,000 rows are actually returned. SQL Server 2017+ assumes 500 rows for the first run and then adjusts. Downstream operations benefit.
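To make the black-box problem concrete, here is a minimal sketch of a multi-statement table-valued function of the kind interleaved execution targets; the table and function names are made up:

-- Hypothetical MSTVF: the optimizer cannot see inside the body,
-- so before interleaved execution it had to guess its row count.
CREATE OR ALTER FUNCTION dbo.ufn_RecentOrders (@CustomerID int)
RETURNS @Result TABLE (OrderID int, OrderDate datetime2)
AS
BEGIN
    INSERT INTO @Result (OrderID, OrderDate)
    SELECT OrderID, OrderDate
    FROM dbo.Orders                                        -- hypothetical table
    WHERE CustomerID = @CustomerID;
    RETURN;
END;
GO
-- Under compatibility level 140+, the first execution pauses optimization,
-- materializes the function, and uses the actual row count downstream.
SELECT o.OrderID, ol.Quantity
FROM dbo.ufn_RecentOrders(42) AS o
JOIN dbo.OrderLines AS ol ON ol.OrderID = o.OrderID;       -- hypothetical table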

Batch Mode Adaptive Join (AJ):

If the cardinality estimate is wrong, the wrong join operator will be selected in the plan, and that plan will be cached.

AJ defers the choice of Hash or Nested Loops join until after the first join input has been scanned.
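Adaptive joins can also be toggled for testing; a minimal sketch, assuming SQL Server 2019 option and hint names, with made-up table names:

-- Per database
ALTER DATABASE SCOPED CONFIGURATION SET BATCH_MODE_ADAPTIVE_JOINS = OFF;
GO
-- Per query, to compare plans while testing (tables are hypothetical)
SELECT c.CustomerID, SUM(o.TotalDue)
FROM dbo.Customers AS c
JOIN dbo.Orders AS o ON o.CustomerID = c.CustomerID
GROUP BY c.CustomerID
OPTION (USE HINT('DISABLE_BATCH_MODE_ADAPTIVE_JOINS'));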

Resource: aka.ms/IQP

Session: Data Classification

Speaker: Grant Fritchey

Security and compliance requirements are undeniable. They should be first and foremost.

Regulatory Landscape:

GDPR: European. Marriott and British Airways were fined. Half of UK firms are still not GDPR compliant.

FERPA: A US Department of Education law that protects adult students' records, even from parents. Schools don't share any information even if the student is a dependent and is in the hospital or sick.

HIPAA: Health regulation for the medical industry. If a person has knowingly shared PII data, they could go to jail.

CCPA: California's flavor of the GDPR regulation; it will be enforced starting Jan 1, 2020.

All of them have room for interpretation and none of them is perfect.
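On the SQL Server side, classification itself is plain T-SQL in SQL Server 2019 and Azure SQL. A minimal sketch, with made-up table and column names:

-- Tag a column with a sensitivity label and information type (SQL Server 2019+)
ADD SENSITIVITY CLASSIFICATION TO dbo.Customers.Email      -- hypothetical column
    WITH (LABEL = 'Confidential', INFORMATION_TYPE = 'Contact Info', RANK = MEDIUM);
GO
-- Review what has been classified so far
SELECT SCHEMA_NAME(o.schema_id) AS schema_name, o.name AS table_name,
       c.name AS column_name, sc.label, sc.information_type
FROM sys.sensitivity_classifications AS sc
JOIN sys.objects AS o ON o.object_id = sc.major_id
JOIN sys.columns AS c ON c.object_id = sc.major_id AND c.column_id = sc.minor_id;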

Session: Successfully communicate with Customer

Speaker: Dennise McInevney

Results of bad communication:

  • Project fails
  • Career is hurt or ends
  • Frustration and stress
  • Time and money wasted

  1. Empathy: Walk in your customer's shoes. See their pain through their eyes. See how they use your product. See where you can help them.
  2. Us vs. them mentality: Be one team. Make sure the customer understands that you are on their side through your language and attitude.
  3. Talk/Speech: Be concise, clear and to the point in written and verbal communication.
  4. Writing skills:

Who are your customers? Identify them and identify their customers. Understand the goals of both groups.

Requirements (the how of things)  <->  Goals (the what of things)

  1. Understand customer goals first.
  2. Desired outcome second
  3. Understand business context

Don't / Do:

  • Don't: Jump to a solution. Do: Listen carefully.
  • Don't: Make assumptions. Do: Ask questions.
  • Don't: Start designing. Do: Be precise.
  • Don't: Start diagramming/coding. Do: Identify the approver first.

Write down everything in a place that is easily accessible.

Don't / Do:

  • Don't: Email. Do: Wiki/SharePoint.
  • Don't: Chat. Do: Agile tools like Jira, which keep the history.
  • Don't: Hallway conversations. Do: Shared documents.

Get alignment:

  • Get sign-off
  • Ensure clarity
  • Don't skip it
  • Pick up the phone when necessary
  • Joint accountability
  • Record the outcome in writing

If you are bilingual and can speak the customer's language, that is a great skill.

What can you do?

  • Practice Active listening
  • Don’t multitask
  • Check for understanding
  • Summarize
  • Review the facts

Change is inevitable

Don't / Do:

  • Don't: Stress out. Do: Take a deep breath.
  • Don't: Get mad. Do: Ask questions.
  • Don't: Cast blame. Do: Update the requirements.

Managing Expectations:

After things are signed off, we need to manage expectations, such as how often we will have meetings and how project status will be reported.

Reporting Status:

  • Consistent format and time
  • If you want others to follow the reporting tool, make sure you are seriously using it.
  • No surprises
  • Alert if things are not working as expected, or if dependencies or durations are missed.

Show as you Go:

Sketches, diagrams, demos at regular cadence and be open to feedback.

Deadlines:

Keep calm, communicate progress, decide what is okay to release, and negotiate; trade-offs can happen.

Session: Adaptive QP to Intelligent QP

Speaker: Hugo Kornelis

SQLServerFast.com

http://sqlserverfast.com/blog

@Hugo_Kornelis

Query Processing (QP)

  • Phase 1 – Optimization (cardinality estimates, execution plan)
  • Phase 2 – Execution (actual row counts)

Merge join is better if the data is properly sorted. Hash join puts everything from the first input in memory.

Nested Loops reads the first row from the first input and starts finding matching rows in the second input.

AJ is a new join operator and it has three inputs.

The only problem is that AJ requires Batch Mode, which used to be available only with a columnstore index.

Interleaved Execution:

MSTVF – the optimizer doesn't know how big the result is going to be.

1 or 100 is a hard-coded value and can never be right. In 2017, as soon as the QP detects an MSTVF, it pauses execution and starts another process that materializes the MSTVF first. You get the number of rows, and now the QP can give you the correct plan. But this plan is then cached, and the next time the MSTVF comes with a different parameter, it may not work efficiently. So the problem remains.

QP in SQL Server 2019:

Memory Grant Feedback (MGF):

The memory grant is determined at compile time based on the cardinality estimate. If it underestimates, the query spills to disk (tempdb). If it overestimates, memory is wasted and concurrency is impacted.

The first execution with MGF improves the second execution.

  • Adjusts down if more than 50% of the granted memory was unused the first time.
  • Adjusts up after any spill, plus some buffer.

The feedback is stored with the last execution plan and is forgotten after a recompile. If the grant keeps adapting beyond a certain threshold (for example, x adjustments), MGF stops adapting.

Extended events are available to track this info.

*** Check this demo out and put it in PTO workshop ***

UDF Inlining:

FROID is a framework that plugs into the optimizer before optimization takes place. It looks at all the statements in the UDF, converts them into a relational expression and hands that to the QP, which then generates the plan for the query as a whole.
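A minimal sketch of checking and controlling scalar UDF inlining in SQL Server 2019; the function name is made up:

-- See which scalar UDFs the engine considers inlineable
SELECT OBJECT_NAME(object_id) AS udf_name, is_inlineable
FROM sys.sql_modules
WHERE is_inlineable = 1;

-- Inlining can be turned off per database while testing ...
ALTER DATABASE SCOPED CONFIGURATION SET TSQL_SCALAR_UDF_INLINING = OFF;
GO
-- ... or per function
ALTER FUNCTION dbo.ufn_CustomerDiscount (@CustomerID int)  -- hypothetical UDF
RETURNS decimal(5,2)
WITH INLINE = OFF
AS
BEGIN
    RETURN 0.05;
END;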

Session: Batch Mode on Rowstore

Speaker: Nikko Neuberger

Speaker Tip #1: Telling a story and showing pictures, cartoons and bold words on screen worked very well with the audience. He got laughs and attention (5 minutes).

Speaker Tip #2: If a demo fails, keep talking while working, and maybe refer to the Demo Gods.

Read the execution plan right to left and also left to right.

The execution plan query executes from left to right.

Nikko displayed a drama script of row execution. He got three attendees on stage and had piles of pages in different colors.

Sort operator asks for data page from table one at a time.

In batch mode, all data pages are given at the same time to sort operator saving time.

Right-click + Actual Execution rows + Estimated Execution rows -> shows Row Mode.

*** Ask MS to remove estimated execution rows because they are not adding value and are cluttering the GUI.

Right-click -> see Batch execution mode and storage as Rowstore.

He made an interesting analogy comparing Batch Mode vs. Row Mode: fast cars like a Ferrari or Mustang vs. a bicycle. Batch Mode needs a little more space to work, just as a fast car needs a highway to go fast.

In Demo 1, comparing compatibility levels 150 vs. 140, timings were 12 and 18 seconds respectively. CPU was high in both executions, but under 140 there was more IO.

In Demo 2, Batch Mode on rowstore took 11 seconds and Batch Mode on columnstore took 6.8 seconds. Fantastic! Since the table had a columnstore index, Batch Mode kicks in by default.

Demo 3 was about Batch Mode heuristics. You can see batch_mode_scan_on_rowstore in an Extended Events session when you run the following queries:

  • Select count of rows in a table
  • Query table of 1 GB size
  • Query table of 10 GB size
  • Disable Batch mode and run query
  • Run query with TOP(20) and order by

In 2019, you will still want columnstore because:

  • Batch Mode without a columnstore index will not have compression.
  • No advanced storage engine features. This makes a difference in logical reads: columnstore does the aggregation in memory instead of reading and bringing pages to the query processor.
  • A columnstore index will always result in optimization level = FULL.
  • Hekaton disables Batch Mode scan when columns with LOB types are present.

Session: Hash Match, the Operator (2.5 hour session)

Speaker: Hugo Kornelis

Speaker tip: He engaged the audience just like Nikko did in his session. The dramatization in the beginning really catches the audience's attention. He had a roller coaster ride picture and a story to go with it, and he gave credit for all the pictures he used. A nice way to introduce the topic.

He said he was going to discuss 500-level deep-dive internals of the operator, some of which are not documented, such as hashing, hash functions and hash tables.

*** Download demos from fastsqlserver.com

Hashing … like laundry that needs sorting. But in the query optimizer's world, hashing is not sorting; it is categorizing or bucketizing similar things.

  • Input  — Any
  • Output — Integer 0 – 9999
  • Deterministic — same input gets same output values
  • Hash Collision — Different inputs getting same output values.

Hash table — organized by hash value into buckets. It is stored entirely in memory.

  1. There are 7 pointers for each hash value.
  2. Store 6 pointers in each structure
  3. Create and connect new 7-pointer structure

3 phases for hash Table match

  1. Build — always used; processes the top input and builds the hash table.
  2. Probe — not always used; processes the second input if there is one, matches rows, detects hash collisions.
  3. Final — returns data from the hash table.

Look at the hash match operator properties in the execution plan and you will see:

Probe residual — will appear only if there can be hash collisions. For up to 4 bytes of data, SQL Server guarantees a match without a probe residual. It is needed when any operation on the column is done, when any non-integer data is present, or when integer data is larger than 4 bytes. The impact of a probe residual on performance is not expensive; it is just one extra test.

Build Phase — assuming no spills to disk and the whole hash table fits in memory.

  • Inner Join — the simplest. Computes the hash, goes into the bucket, finds the match, returns a joined row and moves on to the next row.
  • Right Outer Join — if no matching row is found, it still returns the rows from the outer table.
  • Both join types cause blocking in the build phase. There is no Final phase for them. The order of rows is mostly retained, but not guaranteed in the real world where spills may occur.

Probe Phase

  • Left Outer Join / Full Outer Join — marks the matched rows and returns the unmatched rows from the hash table. Adds a Final phase for the left-over rows. No order is retained for these rows.
  • Left Semi Join (EXISTS) — returns all the marked, matched rows from the hash table. No Final phase.
  • Left Anti Semi Join (NOT EXISTS) — returns all the unmarked, unmatched rows. Both semi and anti semi joins incur blocking in the build and probe phases. All the output is produced in the Final phase.
  • Right Semi Join — returns rows from the probe input. Will not search for more rows once a first match is found.
  • Right Anti Semi Join — a probe row is returned when no match is found, and no additional search is done. No Final phase.

Hash match can also have 1 input.

Build Phase: (for aggregate)

Instead of adding a row to the bucket, it first searches whether the row already exists and, if so, only updates the bucket. If no row is found, it adds a row and updates the aggregation. This row is a placeholder.

Final Phase — returns all aggregated rows from the hash table. There is no Probe phase for it.

Example:

  • SELECT DISTINCT(city) FROM customer — hash match flow of DISTINCT
  • SELECT city, COUNT(*) FROM customer GROUP BY city — does a hash aggregate

Hash Match Scalar Aggregate:

Official name is Hash Match (Aggregate).

A Stream Aggregate is more suitable for SELECT COUNT(*) FROM table type queries and is not supported in Batch Mode. There is no Build phase.

Hash match partial Aggregate:

Labelled differently in the parallel parts of an execution plan. There is no Build phase. In compatibility level 120 you can see this type of aggregate; it is no longer visible in modern SQL Server versions.

Hash Memory Usage:

Memory is requested at the beginning of execution. The engine reserves at least the requested memory; if the requested memory is not available, the query waits. It tries to reserve up to the requested memory before beginning execution. There is no way to request additional memory during execution, so when memory is insufficient, the operator spills to tempdb on disk.

  • SQL Server 2016: Batch Mode may get extra memory during execution.
  • SQL Server 2017: Batch Mode on columnstore may get extra memory during execution.
  • SQL Server 2019: all execution modes may get extra memory during query execution.

Dynamic De-staging:

Happens when hash operator spills to disk.

  1. In-memory HJ — never short of memory; everything fits nicely.
  2. Hybrid HJ — combines in-memory and Grace HJ. When out of memory, it partitions the buckets. If it is complex, then after the first build and probe phases it does the same as Grace HJ.
  3. Grace HJ — the algorithm used when the amount of data is larger than the available memory. It divides the set of buckets into partitions, e.g. buckets 1 – 20,000, 5,000 buckets at a time (hash partition function 1 – 4), and creates 2 files for each partition. SQL Server uses tempdb; they are actually structures in tempdb. It reads both inputs and stores rows in the appropriate file: 1 file for the build, 1 file for the probe. Each file pair is then processed individually.
  4. Recursive HJ — when Grace HJ mis-estimates the partition size.
  • Dynamic de-staging: HM starts in memory.
  • Bail-out happens when a bucket is larger than the available memory. HM abandons further processing and falls back to Nested Loops or Sort plus Merge join. Both are inefficient in this case.
  • Bit vector filtering — is good. Always used when spills happen. An in-memory bitmap for hash values of the spilled partitions.
    1. Build: set the bit for the hash result of each row.
    2. Probe: check whether the bit is set for the hash.
      1. If set – write the row to the file on disk.
      2. If not – handle it as unmatched, don't write it to the file.

Memory Grant Computation:

Estimated size (row count × row size) of the build input.

Note: var(char) data types are assumed to be 50% full.

Estimated size (row count × row size) of the probe input.

An execution plan can contain multiple hash match operators.

Session: Query Store In-Depth

Speaker: Ola Hallengren

Query Store (QS) has been available since SQL Server 2016. It is like a flight recorder for query execution.

DBCC CLONEDATABASE includes query store Data.

Force a last known good plan after changing the compatibility level or cardinality estimation.
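A minimal sketch of turning Query Store on and forcing a plan; the database name and the IDs are made up:

-- Enable Query Store and keep roughly a month of history
ALTER DATABASE MyDatabase SET QUERY_STORE = ON;            -- hypothetical database
ALTER DATABASE MyDatabase SET QUERY_STORE
    (OPERATION_MODE = READ_WRITE, CLEANUP_POLICY = (STALE_QUERY_THRESHOLD_DAYS = 30));
GO
-- Force (or later unforce) the last known good plan for a regressed query
EXEC sp_query_store_force_plan @query_id = 42, @plan_id = 17;    -- hypothetical IDs
-- EXEC sp_query_store_unforce_plan @query_id = 42, @plan_id = 17;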

Disclaimer: These are my notes taken while attending sessions and any inaccuracies are my own.

GDPR – A new European Law to protect their citizens from Data breaches

I have been hearing a lot about GDPR (General Data Protection Regulation) in the past few months. I decided to do some digging to find out how it affects us in the United States. Those of you who want to go directly to the source of my information and skip my 2 cents are welcome to check out these 2 links. The law is explained in very easy terms for a layperson.

https://www.red-gate.com/simple-talk/opinion/opinion-pieces/questions-gdpr-shy-ask/?utm_source=simpletalk&utm_medium=pubemail&utm_content=20171121-slota2&utm_term=simpletalkmain

https://www.csoonline.com/article/3202771/data-protection/general-data-protection-regulation-gdpr-requirements-deadlines-and-facts.html

Those of you who are short on time and want to just get the skinny version, then keep on reading.

  1. This law is about protecting European citizens from data breaches at companies inside and outside Europe.
  2. This law may affect us if and when we store data of EU citizens.
  3. It is for EU consumer safety.
  4. In case of a data breach, companies even outside the EU can become part of a lawsuit.
  5. To avoid a lawsuit, a USA-based company has to notify the EU country representative within 3 days of a data breach.
  6. Companies can be fined if the governing body finds proof that adequate data protection measures were not taken by the company whose data was breached.

It is good to keep this EU law on our radar even if our company is not storing EU consumer data. I think something like this law will become the norm across global communities in the next 5 years. After all, it is the responsibility of the company that collects and stores consumer data to provide safeguards to protect that data. The basic rationale of this law is simple: since companies make a profit by using consumer information, it is fair that they should spend a part of that profit to protect that information. This provides clear accountability on one hand and peace of mind to citizens on the other. In my opinion, this is a win-win situation.

I would appreciate your comments on this topic.

What would you like to LEARN?

Sneak-peek of Microsoft SQL Server 2016 running on SanDisk flash

 10-07-2015 

Brief Recap of the session:

This technical briefing’s main objective was to demonstrate the latest and greatest features of SQL Server 2016 and how running it on SanDisk hardware multiplies the benefits of new features.

This demo was conducted in Microsoft Technology Center in Chicago and was jointly presented by Microsoft and SanDisk representatives. It was a 300-level advanced preview of the hardware and software without any sales pitch.

Their Fusion ioMemory card goes directly into the PCI slot on the server's motherboard, and they have parity NAND chips on that card for fault tolerance. The price has gone down considerably for the same product since April 2015, for two reasons: first, their Shanghai, China NAND chip manufacturing has become more efficient at producing high-quality chips, and secondly, partnerships with Microsoft and other hardware vendors such as Dell and HP.


These are some of the numbers we get with SanDisk flash when tables are stored as normal (i.e., on SanDisk without using the columnstore feature of SQL Server).


SQL Server 2016 has opened the clogged artery, meaning flash drive performance is off the charts. They were talking in terms of IOPS, throughput and latency. SQL Server 2016 (compared to SQL Server 2014), using SanDisk or any flash drive, can deliver IOPS in the millions, throughput in GBs and latencies in microseconds. The enhancement is due to the improved columnstore feature in SQL Server 2016. The IO bottleneck for OLTP or OLAP loads almost vanishes.


These improvements directly affect HA and DR environments, for example AlwaysOn Availability Groups. Log transfer to the secondary is much faster because the redo process has been separated out on the secondary server. An AG in synchronized mode waits for confirmation from the secondary server. Before SQL Server 2016, a record had to be written to disk on the secondary and applied to the database before confirmation was sent to the primary node. Now confirmation is sent as soon as the record is written to disk; the redo process then runs in the background on the secondary server to apply the changes to the database. AGs in SQL Server 2016 also include a load-balancing feature.


Another benefit is cost savings on SQL Server licenses. Most of the time the SQL load is IO bound and CPU hardly goes above 25% on average. These flash drives will make the CPUs work for what they are worth, so only 2 CPUs are needed for a server load as big as a 10 TB data warehouse.

My impressions:

Since our databases are not even 1 TB, all of the critical production databases could be stored on these flash drives, or we could put our log files and tempdb files on them. There are a lot of options. And when these cards are purchased with VM vendors, a warranty is included which replaces a card when it fails at no cost to us.

They also have a caching feature. I think this is a separate software-based product that needs to be purchased separately and may be useful for making application-level processes run faster.

Another point is that we are skipping the upgrade to SQL Server 2014, so we are naturally going to upgrade to SQL Server 2016 in 2017. I think we can plan in 2016 for testing and purchasing these flash drives.

Additionally, these will also make our HA and DR very reliable in terms of failover.

In my opinion, we can take advantage of flash drives for production load in our VMs when we upgrade SQL Server to 2016.

Everything you need to know about FileStream feature of SQL Server 2012

FileStream directory

The FILESTREAM network path is technically this: \\<SQL Server Name>\<Windows Share Name>\<Database FILESTREAM Directory Name>\<File Table Name>

  1. SQL Server Name – (starting with \\)
  2. Windows Share Name – visible in SQL Server Configuration Manager -> SQL Server properties -> FILESTREAM tab.
  3. Database FILESTREAM Directory Name – can be seen in Database Properties -> Options -> FILESTREAM Directory Name.
  4. The last part is the file table name.
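For reference, a minimal sketch of how that directory structure comes into being; the database, directory and table names are made up, and the database is assumed to already have a FILESTREAM filegroup and file:

-- Allow non-transactional (file system) access and name the database-level directory
ALTER DATABASE MyDocsDB                                    -- hypothetical database
    SET FILESTREAM (NON_TRANSACTED_ACCESS = FULL, DIRECTORY_NAME = N'MyDocsDB');
GO
USE MyDocsDB;
GO
-- Create a FileTable; its directory becomes the last part of the UNC path
CREATE TABLE dbo.DocumentStore AS FILETABLE
    WITH (FILETABLE_DIRECTORY = 'DocumentStore',
          FILETABLE_COLLATE_FILENAME = database_default);
GO
-- Resulting share: \\<SQL Server Name>\<Windows Share Name>\MyDocsDB\DocumentStore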

Some helpful FileStream queries

Exploration Queries:

SELECT DB_NAME(database_id) AS DatabaseName,non_transacted_access_desc, directory_name  FROM sys.database_filestream_options WHERE directory_name IS NOT NULL

SELECT directory_name,is_enabled,filename_collation_name FROM sys.filetables;

SELECT * FROM sys.tables WHERE is_filetable = 1

SELECT object_id, OBJECT_NAME(object_id) AS 'Object Names' FROM sys.filetable_system_defined_objects

-- To get the root level UNC path of a file table

SELECT FileTableRootPath()

SELECT FileTableRootPath(N'file_table_name')

Enable Disable Queries:

https://msdn.microsoft.com/en-us/library/gg492061.aspx#HowToDisable

-- Disable non-transactional write access.

ALTER DATABASE database_name

SET FILESTREAM ( NON_TRANSACTED_ACCESS = READ_ONLY );

-- Disable non-transactional access.

ALTER DATABASE database_name     SET FILESTREAM ( NON_TRANSACTED_ACCESS = OFF );

-- Enable full non-transactional access.

ALTER DATABASE database_name     SET FILESTREAM ( NON_TRANSACTED_ACCESS = FULL );

Find locks held by filestream queries:

Open handles to the files stored in a FileTable can prevent the exclusive access that is required for certain administrative tasks. To enable urgent tasks, you may have to kill open file handles associated with one or more FileTables.

SELECT * FROM sys.dm_filestream_non_transacted_handles;

-- To identify open files and the associated locks

SELECT opened_file_name    FROM sys.dm_filestream_non_transacted_handles

WHERE fcb_id IN        ( SELECT request_owner_id FROM sys.dm_tran_locks );

-- Kill all open handles in all the filetables in the database.

EXEC sp_kill_filestream_non_transacted_handles;

-- Kill all open handles in a single filetable.

EXEC sp_kill_filestream_non_transacted_handles @table_name = 'filetable_name';

-- Kill a single handle.

EXEC sp_kill_filestream_non_transacted_handles @handle_id = integer_handle_id;

Code Re-usability:

-- Use the relative path for portable code

https://msdn.microsoft.com/en-us/library/gg492087(v=sql.110).aspx

DECLARE @root nvarchar(100);

DECLARE @fullpath nvarchar(1000);

SELECT @root = FileTableRootPath();

SELECT @fullpath = @root + file_stream.GetFileNamespacePath() FROM EDMS

WHERE name = N'document_name';

PRINT @fullpath;

How does FileStream work with some of the high availability features of SQL Server 2012?

 AlwaysOn Availability Groups and FileTables

https://msdn.microsoft.com/en-us/library/gg492086.aspx#alwayson

When the database that contains FILESTREAM or FileTable data belongs to an AlwaysOn availability group:

  • FileTable functionality is partially supported by AlwaysOn Availability Groups. After a failover, FileTable data is accessible on the primary replica, but FileTable data is not accessible on readable secondary replicas.
  • The FILESTREAM and FileTable functions accept or return virtual network names (VNNs) instead of computer names.
  • All access to FILESTREAM or FileTable data through the file system APIs should use VNNs instead of computer names.

Replication and FileTables

Replication and related features (including transactional replication, merge replication, change data capture, and change tracking) are not supported with FileTables.

Setting up Log Shipping

I have been doing log shipping since SQL Server 2000, and I did it with SQL Server 2005. But for the 4 years that I worked with SQL Server 2008 R2, I did not work with log shipping. Nothing has changed in SQL Server 2012, but to refresh my memory, here I am writing down simple steps for setting up log shipping.

  1. Make sure the database is in Full recovery mode. In this case the primary server is PriSQL\Prod01 and the primary database is PriDB.
  2. Pre-initialize the database at the secondary. That means taking a full backup and a transaction log backup of the primary database PriDB and restoring them on the secondary SecSQL\DR01 as SecDB in Standby mode (see the sketch after this list).
  3. Right click the primary database, go to properties and select “Enable this as a primary database in a log shipping configuration”.
  4. Select how frequently you are going to do log backups in “Backup Settings” button. Also provide the path where log backup goes. Some other settings on this page are set as follows:
    1. Delete files older than: 72 hours
    2. Alert if no backup occurs within: 2 hours
    3. Backup schedule: Every 15 minutes
    4. Backup Path: E:\SQLBackups\PriDB_log
    5. Job name: LSBackup_PriDB
    6. Backup compression: Use default server setting.
  5. Add the secondary server. In this case it is SecSQL\DR01. Secondary database is SecDB.
    1. Tab- Initialize secondary Database:
      • Select database is already initialized.
    2. Tab- Copy files:
      • Destination folder for copied file F:\SQLBackups\LogShip
      • Set 72 hours for Delete copied files.
      • Copy job: LSCopy_PriSQL\prod01_PriDB
      • Copy Schedule: Every 15 minutes.
    3. Tab- Restore Transaction Logs
      • Restore job: LSRestore_PriSQL\prod01_PriDB
      • Restore Schedule: Every 15 minutes
      • Delay restoring: 1 minute
      • Alert if no restore occurs: 2 hours
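Here is the sketch referenced in step 2: pre-initializing the secondary with T-SQL instead of the wizard. The paths and logical file names are made up; only the server and database names come from the walkthrough above:

-- On PriSQL\Prod01: full and log backup of the primary database
BACKUP DATABASE PriDB TO DISK = N'E:\SQLBackups\PriDB_full.bak' WITH INIT;
BACKUP LOG PriDB TO DISK = N'E:\SQLBackups\PriDB_log\PriDB_init.trn';

-- On SecSQL\DR01: restore in Standby so the database stays readable between restores
RESTORE DATABASE SecDB
    FROM DISK = N'F:\SQLBackups\LogShip\PriDB_full.bak'
    WITH MOVE N'PriDB' TO N'F:\SQLData\SecDB.mdf',         -- logical names assumed
         MOVE N'PriDB_log' TO N'F:\SQLLogs\SecDB_log.ldf',
         STANDBY = N'F:\SQLBackups\LogShip\SecDB_undo.dat';
RESTORE LOG SecDB
    FROM DISK = N'F:\SQLBackups\LogShip\PriDB_init.trn'
    WITH STANDBY = N'F:\SQLBackups\LogShip\SecDB_undo.dat';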

Value of Organizing

Whether you are starting a new job or staying at your existing one, it is never too late to organize your SQL Server database shop. As somebody famously said, "Organize, don't agonize." Having a clear idea of what needs to be done already puts you ahead of the task. As we all know, some DBAs are either doing nothing or fighting fires; we do not want to fall into that category. The best use of slow time is to get organized. Here are some tips to help you achieve that Zen-like balance.

Create a list of SQL Server information sites:
You cannot keep everything there is to know about SQL Server in your mind, so what to do about it? Write it down. Either create a spreadsheet and save it in a table, or save it on your own blog. Whatever tool you choose, it is important that you feel comfortable using it. Save the addresses of MVPs' blogs, SQL Server online forums and other tips and tricks. After a while it will become an overwhelming list if you do not set up some type of classification or tagging system.

Note: I will be adding more here later on.

2014 in review

The WordPress.com stats helper monkeys prepared a 2014 annual report for this blog.

Here’s an excerpt:

A San Francisco cable car holds 60 people. This blog was viewed about 480 times in 2014. If it were a cable car, it would take about 8 trips to carry that many people.

Click here to see the complete report.

Buffer pages and checkpoint

Buffer pages and checkpoints are among the most frequently written and talked about topics in SQL Server. Here I will provide my understanding, thoughts and research on these two topics.

Buffer cache is the RAM allocated to store data pages requested by the system. This avoids making a trip to the disk subsystem and saves disk IO. Performance Monitor, a Windows tool for monitoring server performance, has a counter called "Buffer Cache Hit Ratio" (BCHR) which shows how SQL Server utilizes the buffer cache. It gives the ratio of data pages found in and read from the SQL Server buffer cache to all data page requests. Ideally the value should be above 95 for OLTP systems. Pages that are not found in the buffer cache are read from disk, which is significantly slower and affects performance. But according to Jonathan Kehayias's article https://www.simple-talk.com/sql/database-administration/great-sql-server-debates-buffer-cache-hit-ratio/, BCHR is not a reliable counter: in various tests, even when SQL Server is under memory pressure, it remains above the threshold, so relying solely on it could be misleading. It is recommended that a DBA never base his or her decision about poor performance on one performance monitor counter, but look at other counters such as "Page Life Expectancy" (PLE) and "Readahead pages" to get the full picture.
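These counters can also be read from inside SQL Server through a DMV; a minimal sketch (the object_name prefix varies between default and named instances):

-- Buffer Manager counters as seen by SQL Server itself
SELECT counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Buffer Manager%'
  AND counter_name IN ('Buffer cache hit ratio',
                       'Buffer cache hit ratio base',
                       'Page life expectancy');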

Whenever the data in a SQL Server database changes (through inserts, updates or deletes), the change is written to the transaction log. As soon as these records are written out, the transaction can commit. This implies that the speed of SQL Server transactions depends on the speed of transaction log writes. Pages that were changed in memory are written to the storage subsystem by the "checkpoint" process. By default this process runs about every minute and requests all dirty pages from the buffer manager. A dirty page is a page that was changed in memory but hasn't yet been written to storage. As soon as a dirty page is written out to storage, it is marked as a clean page.
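A minimal sketch of watching this at work: count clean versus dirty pages currently in the buffer pool, then issue a manual checkpoint for the current database:

-- Dirty vs. total pages currently in the buffer pool, per database
SELECT DB_NAME(database_id) AS database_name,
       SUM(CASE WHEN is_modified = 1 THEN 1 ELSE 0 END) AS dirty_pages,
       COUNT(*) AS total_pages
FROM sys.dm_os_buffer_descriptors
GROUP BY database_id;

-- Flush dirty pages of the current database to storage on demand
CHECKPOINT;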

SQL 2012 Identity column Unwanted Feature

(Applies to SQL Server 2012 Enterprise Edition only)

Problem:
A feature (or bug) in SQL Server 2012 results in unexpected behavior for identity columns. Technically, the SQL specification does not dictate that identity columns be numerically contiguous, only that they are steadily increasing. In previous versions of SQL Server, identity columns had been implemented by Microsoft to be contiguous, but in 2012 Microsoft changed the behavior so that identity columns cache values differently and may no longer be contiguous.

In practice what tends to happen is that when the SQL Server service is restarted, fails over, hits a log checkpoint, or possibly under other circumstances, you may see a jump in identity column numbers, for example:

1, 2, 3, 4, (restart occurs), 1001, 1002, 1003, 1004, 1005, (failover occurs), 2000, 2001, 2002. . .

If an application is expecting contiguous numbering, this may result in unexpected issues and can cause serious data integrity problems.

Solution:
There is a trace flag, 272, that can be enabled to force the contiguous numbering behavior of previous versions of SQL Server. The downside is slightly slower identity value generation on insert, which should merely match the identity column performance of previous versions. You can read the comments SQL Server community leaders have sent to Microsoft on this issue.
http://connect.microsoft.com/SQLServer/feedback/details/739013/failover-or-restart-results-in-reseed-of-identity
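Trace flag 272 is typically added as a startup parameter (-T272) through SQL Server Configuration Manager; a minimal sketch of verifying it afterwards:

-- Check whether trace flag 272 is active globally
DBCC TRACESTATUS(272, -1);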

Test:
I reproduced the behavior using SQL Server 2012 on my laptop. Anybody can use the script below on a test server to see the issue.

-- Example is using AdventureWorks2012.Production.WorkOrder table
-- Check if table has identity column using sp_help [production.workorder]

-- First check the latest value of the identity column
select max(workorderID) from production.workorder -- 72591

-- Now insert a row
INSERT INTO [Production].[WorkOrder]
([ProductID]
,[OrderQty]
,[ScrappedQty]
,[StartDate]
,[EndDate]
,[DueDate]
,[ScrapReasonID]
,[ModifiedDate])
VALUES
(1
,1
,1
,getdate()
,Sysdatetime()
,getdate() +1
,1
,Sysdatetime())
GO

-- Check the value of identity column
select max(workorderID) from production.workorder -- 72592

--------*************** Restart SQL Server service ***************--------
-- Insert another row
INSERT INTO [Production].[WorkOrder]
([ProductID]
,[OrderQty]
,[ScrappedQty]
,[StartDate]
,[EndDate]
,[DueDate]
,[ScrapReasonID]
,[ModifiedDate])
VALUES
(2
,2
,2
,getdate()
,Sysdatetime()
,getdate() +1
,2
,Sysdatetime())
GO

-- Finally check the identity column value again. This time it jumped to 73592, an increase of 1000.
select max(workorderID) from production.workorder -- 73592

Result:

After enabling trace flag -T272, I repeated the previous steps and the identity column increased by 1. I repeated this test on Developer Edition of SQL Server 2012 and got an increase of 2 after enabling the trace flag, so at this point this trace flag only helps in Enterprise Edition.

WorkOrderID ProductID OrderQty StockedQty ScrappedQty StartDate
72591 531 26 26 0 7/3/08 0:00
72592 1 1 0 1 6/16/14 14:36
73592 2 2 0 2 6/16/14 14:37
73593 3 3 0 3 6/16/14 14:37
74592 4 4 0 4 6/16/14 14:38
75593 4 5 0 5 6/16/14 14:42
75594 4 6 0 6 6/16/14 14:43
75595 4 7 0 7 6/16/14 14:44
75596 4 8 0 8 6/16/14 14:46