A closer look at everything related to SQL Server


Logic of Query Processing – PART 1

IS IT ONE MORE BORING BLOG?

I am writing this blog to remind myself how query processing works. It is easy to forget because the logical processing order is different from the order in which you write a query. To begin, let me write a simple query.

SELECT * FROM AdventureWorks2012.Person.Person;

SIMPLE ANALYSIS

We have 3 elements in this simple query. The first is the keyword “SELECT”, which means you are retrieving data from a table (as opposed to inserting or updating data in a table). The second element is “*”, which denotes all the columns in the table. The third element is the “FROM” clause, which tells SQL Server which table to retrieve the data from.

When the SQL Server Query Optimizer receives this query, it processes it in a logical order rather than the order in which it is written. The most logical and cost-effective first step is to check whether the table exists, so the “FROM” clause is processed first. Next, it checks whether the requested columns exist. If the table and the requested columns both exist, the Query Optimizer processes the “SELECT” keyword and retrieves the data for you.

Look at the direction of the blue arrow in the execution plan for this query. It shows that the Query Optimizer scans the table first (the FROM clause) and then performs the SELECT.

Logical query processing 1

This is how the flow diagram looks:

Flow Diagram 1


Let’s add some more spice to this query.

SELECT * FROM AdventureWorks2012.Person.Person WHERE LastName = 'Ford';

This query returns only 3 rows. The WHERE clause filters the table data on the LastName column. The Query Optimizer first checks that the physical objects (tables and columns) exist, then evaluates the filter and uses an index if one exists on the filtering column. Note the Key Lookup operator in the execution plan below. The reason it appears is that the nonclustered index chosen for the Seek operation is not a covering index (remember, we are using *), so the table's clustered index is used to fetch the remaining columns, which produces the Key Lookup operator. This is an important consideration when you are tuning queries, but it is not important for our purpose here. Another point to note is the size of the arrows: they are thin in this execution plan because the query returns only 3 rows, compared to the previous plan, where the query returned about 20,000 rows.
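As an aside, one way to make the Key Lookup disappear is to request only columns the nonclustered index already contains. A minimal sketch, assuming the index on LastName, FirstName, and MiddleName (IX_Person_LastName_FirstName_MiddleName) that ships with AdventureWorks2012 is present:

```sql
-- Because every requested column lives in the nonclustered index,
-- the index covers the query: the plan shows an Index Seek only,
-- with no Key Lookup back to the clustered index.
SELECT LastName, FirstName, MiddleName
FROM AdventureWorks2012.Person.Person
WHERE LastName = 'Ford';
```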

Logical Query Processing 2


Flow Diagram 2


Next Step:

Having a clear concept of how a query is logically processed will help you write more efficient, Optimizer-friendly queries. In my next blog, I will take what we learned here and extend it to other query clauses such as GROUP BY and ORDER BY.

SQL Saturday Madison, Wisconsin

Friday 04/10/2015
SQL Saturday Madison Pre-Con: Planning a SQL Server High Availability and Disaster Recovery Plan with Ryan Adams

I have read, researched, tested, and implemented many high availability (HA) solutions in production. But things keep changing as new SQL Server features become available, and the hardware side of things (VMs and the Azure cloud) keeps changing too. SQL Server offers many options, but how do you choose the right one to meet your business requirements? In this one-day class, Ryan took us from the basics of backup and restore to Windows clustering and Availability Groups. Along the way we touched on the pros, cons, and gotchas of log shipping, mirroring, replication, and more. Every feature has pros and cons; it would be hard to list them all here.

He gave us a scenario at the beginning of the class, and we were to design our solution as the class progressed and our knowledge increased. The scenario was to configure an HA/DR plan to ensure a company's systems are always up and running and can withstand a technical or natural disaster. There are 3 data centers: Maine, California, and Arizona. The network connection between Maine and California is excellent, and the one between Maine and Arizona is okay. There are 3 critical databases in Maine that need HA/DR, and reports can be 2 hours behind. Only one solution was asked for, but I was able to design 2 solutions; the instructor was impressed and asked me to present the thought process behind them.

You may ask why we discussed backup and restore in an HA class. The answer is that proper backup and restore strategies are the foundation of any successful HA/DR solution. I learned, and saw firsthand in demos, that using backup options such as MAXTRANSFERSIZE and BLOCKSIZE (which are not available in the Maintenance Plan utility) can cut your backup and restore time in half. This is not important to us now, but I learned something new, and I can use these options to plan backups if and when our database sizes reach into the hundreds of gigabytes.
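A rough sketch of what such a backup command looks like (the path and the values here are illustrative, not recommendations):

```sql
-- MAXTRANSFERSIZE and BLOCKSIZE can be set only in T-SQL,
-- not in the Maintenance Plan utility.
BACKUP DATABASE AdventureWorks2012
TO DISK = N'D:\Backups\AdventureWorks2012.bak'
WITH MAXTRANSFERSIZE = 4194304,  -- 4 MB, the maximum allowed
     BLOCKSIZE = 65536,          -- 64 KB
     STATS = 10;                 -- progress message every 10 percent
```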

I also learned that in my recent Availability Group setup I had omitted one important step: having 2 NICs connected to each node of the configuration. I have to talk to our IT guys now.

Saturday 04/11/2015 (FREE TRAINING)

Now I will briefly describe the sessions that I attended on Saturday. There were 7 tracks with 5 time slots each, making 7 x 5 = 35 sessions. But unfortunately, one person can only be in one session at a time. Each session was 75 minutes long. I chose to attend the following 5 sessions.

Session 1: Practical Powershell for DBA
This session was 90 percent demos of scripts the presenter uses at his own job. It showed not only the actual work but also the possibilities for extending the functionality in my own work. It was mostly geared toward people who have 30 to 300 SQL Server instances to manage, but I don't see why it cannot be used in our smaller shop. I used PowerShell scripts at a past employer and can use them at my present job for database maintenance work.

Session 2: SSMS Showdown
This was my favorite session of all. Two presenters competed against each other to show us the coolest tips and tricks of SQL Server Management Studio (lovingly called SSMS). There were 5 rounds, so we should have learned the 10 coolest tips, but to outsmart the opponent each presenter tried to show us more in each round. We, the audience, voted after each round. I am planning to show what I learned to my development team. It can be really useful in our day-to-day T-SQL scripting and coding in SSMS, and it will improve efficiency and the overall quality of our work.

Session 3: Collect baseline for all your servers
Baselining is important because before we can answer why the system is slow or what is wrong with SQL Server, we need to know how it behaves under normal circumstances. Everyone has their own tool for collecting server-level and database-level information for trending and troubleshooting, and this presenter showed us his: he used SQL Server Integration Services (SSIS) to collect his baseline metrics. I saw the benefits of this approach and how flexible it becomes when you have to monitor several servers. You only need to change an environment variable in the configuration file, and you can use the same solution across multiple servers from development to production. Since the connection managers use standard connection strings, no linked servers were required.

Session 4: Administering SQL Servers for Development Teams
There were 2 presenters: one a DBA and the other a developer. Since this is my first job where I am actually part of a daily sprint meeting, I thought I would attend this session. I did not learn much that was new so much as have it validated and confirmed that what we are doing is right: sprint meetings, agile methods, TFS to track bugs, source control to save code, and so on. I wish I could have brought back some new information, but I am glad to know that what we are doing at work matches industry best practice. The session also covered how a DBA can help developers, and again, we are following that approach here.

Session 5: Automated SQL Installs
I chose this session because it was going to be all demos, while the other sessions could be followed through the PowerPoint slides and scripts that each presenter posts on the SQL Saturday site. I followed about three-quarters of the session, then I got lost because it was too technical for me. At that point I also realized that although it is nice to automate base SQL Server installs, service pack upgrades, cumulative updates, patches, and so on, we are not going to use it here due to security concerns. But again, the point of choosing sessions is to go a little bit outside your comfort zone to learn something new. I know where to turn if the time comes in my future career to use automated SQL Server installs. The presenter used the open-source CodePlex project "FineBuild" to show us the step-by-step process of building install scripts, covering most, though not every one, of the possible options.

Final thoughts:
I know many of the things I mentioned above can be learned from books or online webinars. But the enthusiasm of the speakers, volunteers, and participants cannot be bought or experienced without being there in person. As a SQL Server professional you need that face time with MVPs, to see firsthand their confidence, humility, and spirit of giving back to the community. It inspires us to go the extra mile at our own workplace and to feel the joy of applying a new concept, trick, or shortcut just learned from industry experts. It also keeps us humble and never ashamed of our shortcomings or of learning from others. It is like a dose of medicine that you need to take periodically throughout the year. Therefore, I am planning to attend at least 4 SQL Saturday events each year. Another benefit of attending these community events is that you build new relationships, and maintain existing ones, with presenters and vendors.

Parallel or Serial?

Recently, I had the opportunity to attend Adam Machanic's "Parallel Execution Performance" class, offered as a pre-con to SQLSaturday #291. I hardly need to mention that he is the super-intelligent guy who wrote the universal stored procedure sp_whoisactive, used by DBAs all over the world. He is also a very good teacher and presenter. Here I will try to explain some of the nuggets of wisdom I collected from the class.

He mentioned that the world in general, and technology in particular, has changed, so today's DBAs also need to change how they think about performance and bottlenecks. What was true 10 years ago may be only partially true, or not true at all, today. Adam said that a query execution plan contains 2 types of zones, parallel zones and serial zones, and it is in the serial zones where the opportunity for performance tuning lies.

Although I was familiar with Moore's Law, there were other laws that I was not aware of: one is Amdahl's Law and the other is Gustafson's Law. He gave an example to help us understand Amdahl's Law. Consider a restaurant chef preparing a dish containing 3 items. Each item takes 10 minutes to prepare, and the final plating takes 10 minutes, so altogether it takes 40 minutes to prepare one plate. Now the chef hires 3 cooks, each cook prepares a single item, and each hands it to the chef for plating. Instead of 40 minutes, the dish is ready in 20 minutes, twice as fast. This relationship is not linear, and it is explained by the following formula, where P is the fraction of the work that can run in parallel and N is the number of workers:

improvement = 1 / ((1 - P) + P/N)

= 1 / ((1 - .75) + .75/3) = 2 (twice as fast as the fully serial task; there were 3 cooks who could work in parallel, while plating time remained serial)

Let us now add one more cook. P = .75, N = 4

improvement = 1 / ((1 - .75) + .75/4) ≈ 2.29 (a diminishing return: the formula assumes the parallel work divides evenly among the 4 cooks, but there were only 3 tasks, and plating time was still serial)

Let’s add one more task. P = 0.8, N= 4

improvement = 1 / ((1 - .8) + .8/4) = 2.5 (serially it would have taken the chef 50 minutes to finish the dish with 4 items plus plating; in parallel it takes 20 minutes)

I hope you get the idea. To me this is very interesting to note, and I will try to find other areas where I can apply this logic. Back to the example: we see that if the number of tasks equals the number of cooks, parallelism gives the optimal result, but the serial part (plating the dish) stays constant. If we improve that serial part of the process, we will gain even more performance.
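The arithmetic above can be double-checked with a quick T-SQL sketch, plugging the same P and N values from the chef example into the formula:

```sql
-- improvement = 1 / ((1 - P) + P/N)
SELECT P, N,
       1.0 / ((1.0 - P) + P / N) AS improvement
FROM (VALUES (0.75, 3),   -- 3 cooks, plating serial: 2.0
             (0.75, 4),   -- a 4th cook, same tasks:  ~2.29
             (0.80, 4))   -- one more parallel task:  2.5
     AS v(P, N);
```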


Implicit and Explicit Data type conversion

Implicit Conversion:

Implicit data type conversion is done automatically by SQL Server, without notifying the user, whenever an expression mixes compatible data types: the lower-precedence type is converted to the higher-precedence one.
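A minimal example (the result follows from SQL Server's data type precedence rules, where int outranks varchar):

```sql
-- The varchar '5' is implicitly converted to int before the addition,
-- so the result is arithmetic, not string concatenation.
SELECT 10 + '5' AS implicit_sum;   -- returns 15, not '105'
```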


Explicit Conversion:
Explicit conversion is done by the user with the CAST or CONVERT function. An example of explicit conversion is converting date and time data types to the format you want.

SELECT GETDATE(),
       CAST(GETDATE() AS time(7)) AS 'time',
       CAST(GETDATE() AS date) AS 'date',
       CAST(GETDATE() AS smalldatetime) AS 'smalldatetime',
       CAST(GETDATE() AS datetime) AS 'datetime',
       CAST(GETDATE() AS datetime2(7)) AS 'datetime2',
       CAST(GETDATE() AS datetimeoffset(7)) AS 'datetimeoffset';
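CONVERT additionally accepts a style code, which CAST does not. Two common styles, as a quick sketch:

```sql
-- 101 = mm/dd/yyyy (US), 112 = yyyymmdd (ISO basic)
SELECT CONVERT(varchar(10), GETDATE(), 101) AS us_style,
       CONVERT(varchar(10), GETDATE(), 112) AS iso_basic;
```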

For more information, see http://msdn.microsoft.com/en-us/library/ms187752.aspx


When was the last time you did nothing?


There is so much we miss out on in this world and in our lives simply by not being present in the present. We equate doing nothing with wasting time. In reality it is quite the opposite. I have experienced it myself, and the feeling is so freeing and exhilarating that you will wonder why you have not experienced it before. I think the old saying "an idle mind is the Devil's workshop" has misled us.

We are in this world for a short period of time, and yet 50% of the time we are thinking about something that is not present. Whether we are listening to enticing music, watching a favorite TV show, or reading an interesting book, we would not add those adjectives if we were not focused on what we are doing. Enticing music becomes irritating if some nagging thought keeps bothering you. A favorite TV show becomes boring if you constantly think about what you have to do next. An interesting book becomes a chore if you are feeling anxious and stressed.

Actually, I love all 3 of the activities I used as examples here, so I have no problem focusing on them. But there are other areas, such as work and social life, where I want to learn to apply this thinking. The benefit would be more satisfying business interactions, which in turn lead to doing it again and again. It is like a happiness cycle: once you start it, it gains a momentum of its own.


My journey of learning Advanced T-SQL

Who absolutely looks forward to annual performance reviews at work? I do not count myself in that group. I hate the logistics of it: tons of rows to fill in, and questions that are sometimes ambiguous, repetitive, or just plain senseless. Long story short, I got my annual performance review, and let's say it was above average, but my manager suggested one development goal for me: to get training on advanced T-SQL queries.

Do not get me wrong. I have been a DBA for more than 10 years, and I can write T-SQL and keep myself updated with the additions and changes in T-SQL syntax in each new version of SQL Server. I do a lot of reading and research in my areas of expertise (high availability, disaster recovery, and performance tuning) almost daily. But I do not consider myself a development DBA; I have never had to write complex stored procedures, views, triggers, and the like. According to my manager, this is where I can grow. Let's leave the discussion of whether I agree with him for a later time. I decided to take this as a challenge: I have grown in so many areas, why not excel in this one too? The real challenge is to work on something outside your comfort zone and make yourself do it.

So I decided to blog my journey of learning advanced T-SQL. I searched the net but did not find any classroom training that meets my needs; there is, however, a lot of material available (blogs, YouTube videos, books, white papers, BOL, etc.). This is one of the reasons I never concentrated on learning development topics: whenever I needed something, I always found a T-SQL or PowerShell script online, written by MVPs and other very intelligent people of the SQL world (wait, that sounds like the name of my blog!). With a little tweaking I was able to use it, and it definitely served the purpose.

Anyway, this is how I plan to learn and blog about advanced T-SQL. I will select one topic, such as advanced error handling, CTEs, or DMVs, and work on it for a week, so I will write one blog post per week for 4 weeks. Then I will summarize the 4 weeks of learning in a 5th-week post with examples, or maybe a little project or something fun. Then I will start the cycle again with the next T-SQL concept.

For now, my intention is that this blog is for my own eyes only. It will serve as a reminder, repository, and record of what I have learned and achieved. It will also be a tool down the road, when the going gets tough, to remind me how persistent I was, and from that I will find new motivation to move forward.

Cheers to Learning and Blogging!


T-SQL 2012 IIF Function

In a nutshell, the IIF function is shorthand for a two-branch CASE expression. Here is an example to illustrate its use and syntax.

CREATE DATABASE myDB;
GO
USE myDB;
GO
CREATE TABLE Employees (id int, name nvarchar(30), city nvarchar(30), region nvarchar(30), country nvarchar(30));
GO

Insert some records to play with.
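To round out the example, here is a sketch with a couple of made-up rows showing IIF next to the equivalent CASE expression:

```sql
INSERT INTO Employees (id, name, city, region, country)
VALUES (1, N'Ann',  N'Madison', N'WI', N'USA'),
       (2, N'Ravi', N'Pune',    N'MH', N'India');

-- IIF(condition, true_value, false_value) is rewritten internally
-- as a two-branch CASE expression.
SELECT name,
       IIF(country = N'USA', 'Domestic', 'International') AS via_iif,
       CASE WHEN country = N'USA' THEN 'Domestic'
            ELSE 'International'
       END AS via_case
FROM Employees;
```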