Optimizing for SQL “if exists (…”

| Saturday, February 7, 2009

While perusing some of the code in Reporting Services 2008 i ran into the following code:

IF ( NOT EXISTS (
SELECT TOP 1 1
FROM
...

This was the first time I had seen the “select top 1 1…” usage.  I had always used “select * from…” because of my theory that the db dudes at Microsoft were smart enough to recognize when a query was inside of an “exists” test, and would return true or false immediately when the query had rows, regardless of how many, and regardless of what columns where being queried.  I also liked the “select * from…” syntax since I think it’s easier to recognize a simple existence test with the absence of column names.

The programmer who wrote the code above was thinking that the SQL Server was going to execute the query, return the requested data, then determine if rows existed.  Well, I can finally put this argument to bed.  Jacob Sebastian wrote:

First of all, EXISTS() returns as soon as a row is found. So it does not retrieve all the matching rows. The condition becomes true after the first row is found and EXISTS() function exits immediately.

I could not find any difference between 1, * or column names. Here are some test scripts.

He goes on to show every method (select *, select 1, select column, select top 1 1) uses the exact same query plan. 

I don’t mind the look of “select top 1 1 from…”, but I don’t want any of our programmers confused about special optimization rules for existence testing.

2 comments:

Meat said...
February 7, 2009 at 11:24 AM

Dude, I was using "select column from ..."

You know who swore to me up and down that select * was not optimal. I'm glad you posted this!

Crash said...
February 7, 2009 at 9:26 PM

We should officially name "You Know Who"... I'll think on that and blog about him.

Post a Comment

This blog has an advanced spam detections system. Send us spam and we will shoot a missile into your living room.