Logic of Query Processing – PART 1
IS IT ONE MORE BORING BLOG?
I am writing this blog to remind myself how the query processing work. It is a forgettable item because it is different from the way you write a query. To begin with let me write a simple query.
Select * from AdventureWorks2012.Person.Person
We have 3 elements in this simple query. First is the keyword “SELECT”. This keyword means you are retrieving data from the table (as oppose to inserting or updating data in a table). A Second element is “*” which denotes the list of columns in a table. The third element is the “FROM” clause which tells you to retrieve data from a particular table.
When SQL Server Query Optimizer receives this query it processes it in a logical order instead of a physical order. Most logical and cost-effective thing is to first check if the table exists or not. So the “FROM” clause will be processed first. Next, it checks for columns requested also exist or not. If a table exists and the columns requested also exist then Query Optimizer looks at “SELECT” keyword and retrieves the data for you.
Look at the direction of a blue arrow in the execution plan of this query. It shows, that Query Optimizer scans the table first (using the FROM clause) and then do the SELECT.
This is how the flow diagram will look
Let’s add some more spice to this query.
Select * from AdventureWorks2012.Person.Person where Lastname = ‘Ford’
This query returns only 3 rows. “WHERE” clause in the query is filtering the table data on Lastname column. Query optimizer first checks the physical objects (tables and columns) exist and then see the filter and uses the index if there is one on the filtering column. Note the key_lookup operator in the below execution plan. The reason is that the Non-Clustered index chosen in the Seek operation is not the covering index (remember we are using * ). In that case clustered index of the table is used to point to the data which generates key_lookup operator. It is an important consideration when you are doing query tuning. But for our purpose here it is not important. Another point to note is the size of arrows. Arrows are thin in this execution plan because the query is returning only 3 rows compared to previous plan where the query was returning about 20K rows.
Flow Diagram 2
Having a clear concept of how the query is logically processed will help you write more efficient and Optimizer friendly queries. In my next blog, I will take the learning done here and extend it to other query clauses such as GROUP BY and ORDER BY.