A closer look at everything related to SQL Server

Today, I explained to my internal IT team and a DBA on another team the advantages of using an AG for a Disaster Recovery scenario. They had a lot of questions, and I had prepared myself with these documents. I drew the scenario on my whiteboard and took a picture of it. I modified a presentation I got from one of the SQL Saturdays I attended, and I downloaded a Word document on this topic from the MSDN site.

Building a High Availability and Disaster Recovery Solution using AlwaysOn Availability Groups

I think people are hesitant to change because they do not want to come out of their comfort zone, and not knowing enough about the new technology makes them less confident. They like to stick with log shipping because it works. Another uncertainty was about geo-clustering, or an AG spanning different subnets for DR. I lack knowledge here too, so I pointed out that this would be a good opportunity for all of us to learn, and that we would have one more DR option in case the log shipping feature is not available in future SQL Server releases. Then there was the bad experience the team had several years ago with Failover Clustering, when it used to take a long time for the SQL Server services to come up. I know I have a lot to do to change the mindset, and so little time. In the end, I got agreement that we will create one DR box in the second data center, where I will do log shipping and also include it in my current Availability Group configuration. Let us see how it goes.

Friday 04/10/2015
SQL Saturday Madison Pre-Con: Planning a SQL Server High Availability and Disaster Recovery Plan with Ryan Adams

I have read, researched, tested, and implemented many high availability (HA) solutions in production. But things keep changing as new SQL Server features become available, and the hardware side of things (VMs and the Azure cloud) keeps changing too. SQL Server offers many options, but how do you choose the right one to meet your business requirements? In this one-day class, Ryan took us from the basics of backup and restore through Windows clustering and Availability Groups. Along the way we touched on the pros, cons, and gotchas of log shipping, mirroring, replication, and more. Every feature has trade-offs; it would be hard to list them all here.

He gave us a scenario at the beginning of the class, and we were to design our solution as the class progressed and our knowledge increased. The scenario was to configure an HA/DR plan to ensure a company's systems are always up and running and can withstand a technical or natural disaster. There are three data centers: Maine, California, and Arizona. The network connection between Maine and California is excellent, and between Maine and Arizona it is okay. There are three critical databases in Maine that need HA/DR, and reports can be two hours behind. Only one solution was asked for, but I was able to design two, and the instructor was impressed and asked me to present the thought process behind them.

You might ask: why discuss backup and restore in an HA class? The answer is that proper backup and restore strategies are the foundation of any successful HA/DR implementation. I learned, and saw firsthand in demos, that using backup options such as MAXTRANSFERSIZE and BLOCKSIZE (which are not available in the Maintenance Plan utility) can cut your backup and restore time in half. This is not important to us right now, but I learned something new, and I can use these options to plan backups if and when our database sizes reach into the hundreds of gigabytes.
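For my own notes, here is a minimal sketch of what such a backup command might look like. The database name, file path, and parameter values are illustrative only; the right numbers depend on your storage and need testing.

-- Illustrative sketch only: path and values are placeholders, not tuned settings.
BACKUP DATABASE AdventureWorks2012
TO DISK = N'D:\Backups\AdventureWorks2012.bak'
WITH COMPRESSION,
     MAXTRANSFERSIZE = 4194304,  -- 4 MB transfer size (default is 1 MB)
     BLOCKSIZE = 65536,          -- 64 KB physical block size
     BUFFERCOUNT = 50,           -- number of I/O buffers to use
     STATS = 10;                 -- progress message every 10 percent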

I also learned that in my recent Availability Group setup, I had omitted one important step: having two NICs connected to each node of the configuration. I have to talk to our IT guys now.

Saturday 04/11/2015 (FREE TRAINING)

Now I will briefly describe the sessions I attended on Saturday. There were 7 tracks and 7 x 5 = 35 sessions, but unfortunately one person can only be in one session at a time. Each session was 75 minutes. I chose to attend the following 5 sessions.

Session 1: Practical Powershell for DBA
This session was 90 percent demos of scripts the presenter uses himself at work. It showed not only the actual work but also the possibilities at my own job where I can extend the functionality. It was mostly geared toward people who have 30 to 300 SQL Server instances to manage, but I don't see why it cannot be used in our smaller shop. I used PowerShell scripts at a past employer and can use them at my present job for database maintenance work.

Session 2: SSMS Showdown
This was my favorite session of all. Two presenters acted as if they were competing against each other to show us the coolest tips and tricks of SQL Server Management Studio (lovingly called SSMS). There were 5 rounds, so we should have learned the 10 coolest tips, but in order to outsmart the opponent each presenter tried to squeeze more into each round. We, the audience, voted after each round. I am planning to show what I learned here to my development team. It can be really useful in our day-to-day T-SQL scripting and coding in SSMS, and it will improve efficiency and overall quality of work.

Session 3: Collect baseline for all your servers
Baselining is important because before we can answer why the system is slow or what is wrong with SQL Server, we need to know how it behaves under normal circumstances. Everyone has their own tool for collecting server-level and database-level information for trending and troubleshooting purposes, and this presenter showed his own: he used SQL Server Integration Services (SSIS) to collect his baseline metrics. I saw the benefits of this approach and how flexible it becomes when you have to monitor several servers. You only need to change an environment variable in the configuration file, and you can use the same solution across multiple servers from development to production. Since the connection managers use standard connection strings, no linked servers were required.
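The SSIS package itself is hard to reproduce in a snippet, so here is my own minimal T-SQL sketch of the underlying idea: snapshotting a DMV into a history table on a schedule. This is not the presenter's code; the table name is my invention, and scheduling (for example, through a SQL Agent job) is assumed.

-- Illustrative sketch (not the presenter's SSIS solution).
-- Snapshot wait statistics into a history table; dbo.BaselineWaitStats
-- is a made-up name for this example.
IF OBJECT_ID('dbo.BaselineWaitStats') IS NULL
    CREATE TABLE dbo.BaselineWaitStats (
        CaptureTime         datetime2    NOT NULL,
        wait_type           nvarchar(60) NOT NULL,
        waiting_tasks_count bigint       NOT NULL,
        wait_time_ms        bigint       NOT NULL
    );

-- Run this on a schedule; comparing snapshots over time gives the baseline.
INSERT INTO dbo.BaselineWaitStats (CaptureTime, wait_type, waiting_tasks_count, wait_time_ms)
SELECT SYSDATETIME(), wait_type, waiting_tasks_count, wait_time_ms
FROM sys.dm_os_wait_stats;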

Session 4: Administering SQL Servers for Development Teams
There were two presenters: one a DBA and the other a developer. Since this is my first job where I am actually part of the daily sprint meetings, I thought I would attend this session. I did not learn much new here so much as it validated and confirmed that what we are doing is right: our sprint meetings, agile methods, TFS to track bugs, source control to save code, and so on. I wish I could have brought back some new information, but I am glad to know that what we are doing at work is exactly what industry best practice suggests. The session also covered how a DBA can help developers, and again, we are already following that approach here.

Session 5: Automated SQL Installs
I chose this session because it was going to be all demos, while the other sessions could be followed through the PowerPoint slides and scripts that each presenter posts on the SQL Saturday site. I followed about three quarters of the session, then lost the thread because it got too technical for me. At that point I also realized that although it is nice to automate base SQL Server installs, service pack upgrades, cumulative updates, patches, and so on, we are not going to use it here due to security concerns. But the point of choosing sessions is to go a little out of your comfort zone to learn something new, and I now know where to turn if the time comes in my career to use automated SQL Server installs. The presenter used the open-source CodePlex project “FineBuild” to show us the step-by-step process of building install scripts and most, though not all, of the possible options.

Final thoughts:
I know many of the things I mentioned above can be learned from books or by attending online webinars. But the enthusiasm of the speakers, volunteers, and participants cannot be bought or experienced without being there in person. As a SQL Server professional you need that face time with MVPs, to see firsthand their confidence, humility, and spirit of giving back to the community. It inspires us to go the extra mile at our own workplace and to feel the joy of applying a new concept, trick, or shortcut just learned from industry experts. It also keeps us humble, so we never feel ashamed of our shortcomings or of learning from others. It is like a dose of medicine that you need to take periodically throughout the year; therefore I am planning to attend at least four SQL Saturday events each year. Another benefit of attending these community events is that you build new relationships, and maintain existing ones, with presenters and vendors.

Whether you are starting a new job or settling into your existing one, it is never too late to organize your SQL Server database shop. As somebody famously said, “Organize, don't agonize.” Having a clear idea of what needs to be done already puts you ahead of your tasks. As we all know, some DBAs are either doing nothing or fighting fires; we do not want to fall into either category. The best use of slow time is to get organized. Here are some tips to help you achieve that Zen-like balance.

Create a list of SQL Server information sites:
You cannot keep everything there is to know about SQL Server in your head, so what to do about it? Write it down. Either create a spreadsheet, save it in a table, or post it on your own blog; whatever tool you choose, it is important that you feel comfortable using it. Save the addresses of MVPs' blogs, SQL Server online forums, and other tips and tricks. After a while it will become an overwhelming list if you do not set up some type of classification or tagging system.

Note: I will be adding more here later on.

Recently, I had an opportunity to attend Adam Machanic's “Parallel Execution Performance” class, offered as a pre-con to SQLSaturday #291. I hardly need to mention that he is the super-intelligent guy who wrote the widely used stored procedure “sp_whoisactive”, run by DBAs all over the world, but he is also a very good teacher and presenter. Here I will try to share some of the nuggets of wisdom I collected from the class.

He mentioned that the world in general, and technology in particular, has changed, so today's DBAs also need to change how they think about performance and bottlenecks. What was true 10 years ago may be only partially true, or not true at all, today. Adam said that a query execution plan contains two types of zones, parallel zones and serial zones, and it is in the serial zones where the opportunity for performance tuning lies.
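To see the two kinds of zones yourself, you can compare the actual execution plan of the same query with and without a MAXDOP hint. This is a generic illustration of mine (using an AdventureWorks table), not one of Adam's demos:

-- With the hint, the whole plan is one serial zone.
SELECT ProductID, COUNT(*) AS Orders
FROM Production.WorkOrder
GROUP BY ProductID
OPTION (MAXDOP 1);

-- Without the hint, the optimizer may choose a parallel plan;
-- look for Parallelism operators marking the parallel zones.
SELECT ProductID, COUNT(*) AS Orders
FROM Production.WorkOrder
GROUP BY ProductID;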

Although I was familiar with Moore's Law, there were other laws that I was apparently not aware of: one is Amdahl's Law and the other is Gustafson's Law. He gave an example to explain Amdahl's Law. Consider a restaurant chef preparing a dish containing 3 items. Each item takes 10 minutes to prepare, and the final plating takes 10 minutes, so altogether it takes 40 minutes to prepare one plate. She hires 3 cooks; now each cook prepares a single item in parallel and hands it to the chef for plating. So instead of 40 minutes, the dish is ready in 20 minutes: a 2x speedup that cuts the time in half. This relationship is not linear, and it is explained by the following formula, where P is the fraction of the work that can be parallelized and N is the number of workers.

improvement = 1 / ((1 - P) + P/N)

With P = 0.75 and N = 3: improvement = 1 / ((1 - 0.75) + 0.75/3) = 1 / (0.25 + 0.25) = 2. The three cooks work in parallel, plating stays serial, and the 40-minute job finishes in 20 minutes.

Let us now add one more cook: P = 0.75, N = 4.

improvement = 1 / ((1 - 0.75) + 0.75/4) = 1 / (0.25 + 0.1875) ≈ 2.29. The fourth cook adds little, because there are only 3 parallel tasks and plating is still serial.

Let us add one more task as well: P = 0.8, N = 4.

improvement = 1 / ((1 - 0.8) + 0.8/4) = 1 / (0.2 + 0.2) = 2.5. Serially it would have taken the chef 50 minutes (4 tasks plus plating); in parallel the dish is done in 20 minutes.

I hope you get the idea. To me this is very interesting, and I will try to find other areas where I can apply this logic. Going back to the example, we see that when the number of tasks equals the number of cooks, parallelism gives the optimal result, but the serial part, plating the dish, stays constant. Improving that serial part of the process is what yields further performance gains.

2014 in review

The WordPress.com stats helper monkeys prepared a 2014 annual report for this blog.

Here’s an excerpt:

A San Francisco cable car holds 60 people. This blog was viewed about 480 times in 2014. If it were a cable car, it would take about 8 trips to carry that many people.


Buffer pages and checkpoints are one of the often written and talked about topics of SQL Server. Here I will provide my understanding and thoughts and research on these two topics.

The buffer cache is the RAM allocated to store data pages requested by the system; this avoids trips to the disk subsystem and saves disk I/O. Performance Monitor, a Windows tool for monitoring server performance, has a counter called “Buffer Cache Hit Ratio” (BCHR) that shows how SQL Server utilizes the buffer cache: it gives the ratio of data page requests satisfied from the SQL Server buffer cache to all data page requests. Ideally the value should be above 95 for OLTP systems; pages not found in the buffer cache must be read from disk, which is significantly slower and hurts performance. But according to Jonathan Kehayias's article https://www.simple-talk.com/sql/database-administration/great-sql-server-debates-buffer-cache-hit-ratio/, BCHR is not a reliable counter. In various tests, even when SQL Server is under memory pressure, the counter remains above the threshold, so relying solely on it could be misleading. A DBA should never base a decision about poor performance on a single Performance Monitor counter, but should look at others, such as “Page life expectancy” (PLE) and “Read-ahead pages”, to get the full picture.
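For reference, these counters can also be read from inside SQL Server through a DMV; here is a minimal sketch. Note that BCHR is stored as two counters, and the actual ratio is value divided by base:

-- Read PLE and the two BCHR counters from the DMV. The LIKE pattern is
-- used because object_name is prefixed with the instance name.
SELECT [object_name], counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE [object_name] LIKE '%Buffer Manager%'
  AND counter_name IN ('Page life expectancy',
                       'Buffer cache hit ratio',
                       'Buffer cache hit ratio base');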

Whenever data in a SQL Server database changes (through inserts, updates, or deletes), the change is written to the transaction log. As soon as these records are written out, the transaction can commit, which implies that the speed of SQL Server transactions depends on the speed of transaction log writes. Pages that were changed in memory are written to the storage subsystem by the “checkpoint” process. By default this process runs about every minute and requests all dirty pages from the buffer manager. A dirty page is a page that was changed in memory but has not yet been written to storage; as soon as a dirty page is written out, it is marked as a clean page.
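You can see this in action with a couple of standard commands; a quick sketch (run it on a test server):

-- Count dirty (modified-in-memory) pages per database in the buffer pool.
SELECT DB_NAME(database_id) AS DatabaseName,
       COUNT(*) AS DirtyPages
FROM sys.dm_os_buffer_descriptors
WHERE is_modified = 1
GROUP BY database_id;

-- Force a manual checkpoint in the current database; its dirty pages
-- are written to storage and become clean pages.
CHECKPOINT;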

(Applies to SQL Server 2012 Enterprise Edition only)

Problem:
A feature (or bug) in SQL Server 2012 results in unexpected behavior for identity columns. Technically, the SQL specification does not dictate that identity columns be numerically contiguous, only that they be steadily increasing. In previous versions of SQL Server, Microsoft implemented identity columns to be contiguous, but in 2012 the behavior changed: identity columns now cache values differently and may no longer be contiguous.

In practice, what tends to happen is that when the SQL Server service is restarted, fails over, hits a log checkpoint, or possibly under other circumstances, you may see a jump in identity column numbers, for example:

1, 2, 3, 4, (restart occurs), 1001, 1002, 1003, 1004, 1005, (failover occurs), 2000, 2001, 2002. . .

If an application expects contiguous numbering, this can cause unexpected and serious data integrity issues.

Solution:
Trace flag 272 can be enabled to force the contiguous numbering behavior of previous versions of SQL Server. The downside is slightly slower generation of identity values on insert, which should merely match the identity column performance of previous versions. You can read the comments that SQL Server community leaders have sent to Microsoft on this issue:
http://connect.microsoft.com/SQLServer/feedback/details/739013/failover-or-restart-results-in-reseed-of-identity
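As I understand it, -T272 is meant to be added as a startup parameter (through SQL Server Configuration Manager) rather than enabled per session. After restarting the service, you can verify it is active like this:

-- Verify that trace flag 272 is active globally (run after adding -T272
-- to the startup parameters and restarting the SQL Server service).
DBCC TRACESTATUS (272, -1);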

Test:
I reproduced the behavior using SQL Server 2012 on my laptop. Anybody can run this on a test server to see the issue.

-- Example uses the AdventureWorks2012 Production.WorkOrder table
-- Check whether the table has an identity column using sp_help [Production.WorkOrder]

-- First, check the latest value of the identity column
SELECT MAX(WorkOrderID) FROM Production.WorkOrder -- 72591

-- Now insert a row
INSERT INTO [Production].[WorkOrder]
([ProductID]
,[OrderQty]
,[ScrappedQty]
,[StartDate]
,[EndDate]
,[DueDate]
,[ScrapReasonID]
,[ModifiedDate])
VALUES
(1
,1
,1
,getdate()
,Sysdatetime()
,getdate() +1
,1
,Sysdatetime())
GO

-- Check the value of the identity column
SELECT MAX(WorkOrderID) FROM Production.WorkOrder -- 72592

-- *************** Restart the SQL Server service ***************
-- Insert another row
INSERT INTO [Production].[WorkOrder]
([ProductID]
,[OrderQty]
,[ScrappedQty]
,[StartDate]
,[EndDate]
,[DueDate]
,[ScrapReasonID]
,[ModifiedDate])
VALUES
(2
,2
,2
,getdate()
,Sysdatetime()
,getdate() +1
,2
,Sysdatetime())
GO

-- Finally, check the identity column value again. This time it jumped to 73592, an increase of 1000.
SELECT MAX(WorkOrderID) FROM Production.WorkOrder -- 73592

Result:

After enabling trace flag -T272, I repeated the previous steps and the identity column increased by 1. I repeated the test on Developer Edition of SQL Server 2012 and got an increase of 2 even with the trace flag enabled, so at this point the trace flag appears to help only in Enterprise Edition.

WorkOrderID  ProductID  OrderQty  StockedQty  ScrappedQty  StartDate
72591        531        26        26          0            7/3/08 0:00
72592        1          1         0           1            6/16/14 14:36
73592        2          2         0           2            6/16/14 14:37
73593        3          3         0           3            6/16/14 14:37
74592        4          4         0           4            6/16/14 14:38
75593        4          5         0           5            6/16/14 14:42
75594        4          6         0           6            6/16/14 14:43
75595        4          7         0           7            6/16/14 14:44
75596        4          8         0           8            6/16/14 14:46

There is a SQL Server 2012 bug (http://sqlperformance.com/2014/06/sql-indexes/hotfix-sql-2012-rebuilds) that can surface when an index rebuild is done with the ONLINE option.

A related KB article (http://support.microsoft.com/kb/2969896) states that:
The online index rebuild can cause index corruption or data loss when it is used together with concurrent queries that modify many rows. For example, a replication updates many rows on a table when the online index build is running on the same table concurrently.

This issue occurs when the following conditions are true:

  1. The online index build query runs in parallel mode.
  2. Deadlock error and a fatal error such as “lock timeout” occur in a specific order.

Note: To work around this issue, run the online index build in serial mode (DOP = 1).

According to the article linked above, SQL Server 2012 SP2 is out now, but the fix for this bug is not included because the bug was discovered recently. The article therefore suggests that you DO NOT apply SP2 until a CU containing this fix is available.
In the meantime, we have these options:
1. Add MAXDOP = 1 to the index rebuild query when using ONLINE (see the sketch below),
OR
2. Run the defrag with ONLINE = OFF,
OR
3. Apply the hotfix (http://support.microsoft.com/kb/2969896) available for SQL Server 2012 SP1; as with any hotfix, it has not been tested as thoroughly and can introduce new bugs,
OR
4. Wait for SP2 CU2, targeted for release by the end of July, and apply both together to SQL Server 2012 SP1.
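Here is a minimal sketch of option 1; the table and index names are placeholders, not real objects:

-- Sketch of option 1: online index rebuild forced to run serially.
-- dbo.MyTable and IX_MyTable_Col1 are placeholder names.
ALTER INDEX IX_MyTable_Col1 ON dbo.MyTable
REBUILD WITH (ONLINE = ON, MAXDOP = 1);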

Implicit Conversion:

Implicit data type conversion is performed by SQL Server automatically, without notifying the user.
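A tiny example of what that means; SQL Server quietly converts the string before doing the arithmetic:

-- The string '2' is implicitly converted to int before the addition;
-- the result is 3, and no warning is raised.
SELECT 1 + '2' AS ImplicitResult;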

 

Explicit Conversion:
Explicit conversion is done by the user, by calling the CONVERT or CAST function. An example of explicit conversion is converting a date value to the date or time format you want:

SELECT GETDATE(),
       CAST(GETDATE() AS time(7)) AS 'time',
       CAST(GETDATE() AS date) AS 'date',
       CAST(GETDATE() AS smalldatetime) AS 'smalldatetime',
       CAST(GETDATE() AS datetime) AS 'datetime',
       CAST(GETDATE() AS datetime2(7)) AS 'datetime2',
       CAST(GETDATE() AS datetimeoffset(7)) AS 'datetimeoffset';

For more information, see http://msdn.microsoft.com/en-us/library/ms187752.aspx

 


There is so much we miss in this world, and in our lives, simply by not being present in the present. We equate doing nothing with wasting time, when in reality it is quite the opposite. I have experienced it myself, and the feeling is so freeing and exhilarating that you will wonder why you never experienced it before. I think the old saying “an idle mind is the Devil's workshop” has misled us.

We are in this world for a short period of time, and yet 50% of the time we are thinking about something that is not present. Whether we are listening to enticing music, watching a favorite TV show, or reading an interesting book, we would not add these adjectives if we were not focused on what we are doing. Enticing music becomes irritating if some nagging thought keeps bothering you. A favorite TV show becomes boring if you constantly think about what you have to do next. An interesting read becomes a chore if you are feeling anxious and stressed.

Actually, I love all three of the activities I used as examples here, so I have no problem focusing on them. But there are other areas, such as work and social life, where I want to learn to apply this thinking. The benefit would be more satisfying business interactions, which in turn would lead to doing it again and again. It is like a happiness cycle: once you start it, it gains a momentum of its own.
Actually I love all these 3 activities that I have used as an example here. So I have no problem on focusing on these activities. But there are other areas such as work and social life where I want to learn to apply this thinking. The benefit would be more satisfied business interaction which in turn will result in doing it again and again. It is just like a happiness cycle, once you start it it will get the momentum of its own.