Microsoft SQL Server Knowledge Bank: BETWEEN Clause Vs [ >= AND <= ]

Tuesday, 9 October 2012

BETWEEN Clause Vs [ >= AND <= ]

Introduction

In this article I am trying to discuss related to BETWEEN clause and >= AND <= comparisons operators and which one is best.

Most of the junior developer has some bad concepts related to BETWEEN clause and they told, not to use it. In this article I am demon staring the BETWEEN clause and how good it is.

What is the Difference

Let's take an example to understand it properly.

Step-1 [ Create the Base Table ]

CREATE TABLE my_TestTab

(ID INT NOT NULL IDENTITY(1,1) PRIMARY KEY,

VAL DATETIME NOT NULL);

Step-2 [ Inserting Records ]

INSERT INTO my_TestTab

(VAL)

VALUES('05-01-2012'),

('05-07-2012'),

('05-11-2012'),

('05-15-2012'),

('05-22-2012'),

('05-23-2012'),

('05-25-2012'),

('05-27-2012'),

('05-28-2012');

Step-3 [ Now use Between Clause ]

SELECT *

FROM my_TestTab

WHERE VAL BETWEEN '05-01-2012' AND '05-28-2012';

Output:

ID VAL

1 2012-05-01 00:00:00.000

2 2012-05-07 00:00:00.000

3 2012-05-11 00:00:00.000

4 2012-05-15 00:00:00.000

5 2012-05-22 00:00:00.000

6 2012-05-23 00:00:00.000

7 2012-05-25 00:00:00.000

8 2012-05-27 00:00:00.000

9 2012-05-28 00:00:00.000

Step-4 [ Now using >= AND <= ]

SELECT *

FROM my_TestTab

WHERE VAL >='05-01-2012' AND VAL<='05-28-2012';

Output:

ID VAL

1 2012-05-01 00:00:00.000

2 2012-05-07 00:00:00.000

3 2012-05-11 00:00:00.000

4 2012-05-15 00:00:00.000

5 2012-05-22 00:00:00.000

6 2012-05-23 00:00:00.000

7 2012-05-25 00:00:00.000

8 2012-05-27 00:00:00.000

9 2012-05-28 00:00:00.000

To observe the difference between Step-3 and Step-4 just execute the SQL of Step-3 Again and see the actual execution plan.

SELECT *

FROM my_TestTab

WHERE VAL BETWEEN '05-01-2012' AND '05-28-2012';

If we see the execution plan we find that

SELECT * FROM my_TestTab WHERE [VAL] >= @1 AND [VAL]<=@2

So there is no difference between Step-3 and Step-4. Actually internally the BETWEEN clause is converted to >= and <= logical operators.

Summary

As per me using BETWEEN clause is much easier then the >= and <= operators and it looks great in complex query.

It actually depends on developer and there writing style of T-SQL.

Posted by: MR. JOYDEEP DAS

6 comments:

Anonymous10 October 2012 at 19:31
It also depends on having a clustered index on the column you are comparing. Notice that your query plan shows a clustered index scan (effectively a table scan).

If you had a clustered index on the column being compared, you would have an index seek, not a scan. This is one of many reasons to NOT have a clustered index on your primary key, but to instead use clustered indexes where the engine would most benefit from it.

Your example does benefit from having the dates inserted in pre-sorted order. In an OLTP system, it is highly unlikely that your dates will be entered in pre-sorted order.
ReplyDelete
Replies
Nakul Vachhrajani10 October 2012 at 21:54
A blast from the past :). This one made me recollect one of my older posts: http://beyondrelational.com/modules/2/blogs/77/posts/11353/date-comparison-dissecting-between-use-between-or-comparison-greater-thanless-than-operators-with-eq.aspx
ReplyDelete
Replies
Unknown11 October 2012 at 19:58
This question seemed to grab my attention so I set up a real world test it and wanted to share my results.

The following test scenario is using a query to select created dates for users in one my databases.

The table is about 0.969 MB and has a good load of data for this example with 4750 rows spanning 5 years

The tables index set up is 1 clustered on the PK and 5 non-clustered indexes exist but none on the create_timestamp column and exactly as Marc pointed out in real world it is unsorted data.

Query tuning reflected that SQL shows this query missing an index on the date column. For performance the addition of the non-clustered index on this column is recommended. I did not add it but I wanted to point out that it is a recommended change and here is the details of the impact.

Missing Index Details -
The Query Processor estimates that implementing the following index could improve the query cost by 94.1297%.

CREATE NONCLUSTERED INDEX []
ON [dbo].[user_mstr] ([create_timestamp])

select create_timestamp from user_mstr
where create_timestamp between '2007-02-19 11:01:32.727' and '2011-01-28 11:55:42.313'

This returned 3683 rows.

Profiler showed the following.
CPU: 15
Reads: 132
Write: 0
Duration: 4 <=====

Same query.

This returned 3683 rows.

select create_timestamp from user_mstr
where create_timestamp >= '2007-02-19 11:01:32.727'
and create_timestamp <= '2011-01-28 11:55:42.313'

Profiler showed the following.
CPU: 0
Reads: 132
Write: 0
Duration: 6 <=====

It’s important to note that behavior was only significantly different on the first execution (prior to caching) afterward, the execution of the two were virtually the same.

Here are 4 subsequent executions.

Using Between
CPU: 0
Reads: 126
Write: 0
Duration: 2 <=====

Using >=/<=
CPU: 0
Reads: 126
Write: 0
Duration: 3 <=====

Using Between
CPU: 0
Reads: 126
Write: 0
Duration: 3 <=====

Using >=/<=
CPU: 0
Reads: 126
Write: 0
Duration: 2 <=====

In the end I agree with Nakul as using between as a best practice for readability and asthetics but with the above results it seems that Between is also a better choice for performance.
ReplyDelete
Replies

Add comment