Date time best practise in SQL Server

Nice article for DBA's

When can you use arithmetic operators in date/time calculations in SQL Server? When are the date/time functions provided by Microsoft the better option? Follow a few scenarios that demonstrate when arithmetic operators are safe and when they are risky.

few months ago, I received a message from quite an experienced DBA and database programmer asking me whether I knew a nice trick for manipulating date/time data without using Microsoft SQL Server's date/time functions—something like GETDATE() + 10. It reminded me of a few internet forum discussions about the same subject in which participants were irked by the fact that Microsoft missed such a convenient feature. I decided that the topic deserved a deeper discussion. This article is the result.

How Is This Possible?
Try running the following example with the SQL Server DATEADD() function:


SELECT GETDATE();
SELECT DATEADD(dd, 1, GETDATE());
SELECT GETDATE() + 1;

Results:

2007-03-15 16:21:41.630
2007-03-16 16:21:41.630
2007-03-16 16:21:41.630

As you can see, both the DATEADD() function and an addition operation that uses a plus sign (+) produce exactly the same result: they add one day to a current day and time. You'd expect that from DATEADD(), but why does an addition operation produce the same result? For the answer, you must explore the SQL Server date/time storage internals.

As you know, the datetime data type occupies eight bytes: four to store the number of days before or after January 01, 1900 and four more for the number of 3.33ms clock ticks since midnight. The smalldatetime data type needs only four bytes as it is less precise than datetime; it stores the number of days since January 01, 1900 and the number of minutes since midnight. All the numbers are stored as integers. So when you run SELECT GETDATE() + 1, you actually add datetime data to the integer number. Since the datetime data type has higher precedence, the integer number 1 (as an addend with lower precedence) has to be implicitly converted to a data type of the addend with higher precedence.

The following example converts "1" to datetime and then treats it as one day since Jan 01, 1900:


SELECT CAST(1 as datetime);

Results:

1900-01-02 00:00:00.000

When you run SELECT GETDATE() + 1, you add two datetime values that internally are interpreted as integers. As a result an addition operation that uses a plus sign (+) becomes perfectly valid. For example, all the following statements will be correct:


SELECT GETDATE();
SELECT DATEADD(dd, 1, GETDATE());
SELECT GETDATE() + 1;
SELECT GETDATE() + 'Jan 02, 1900';

Results:

2007-03-16 23:01:37.420
2007-03-17 23:01:37.420
2007-03-17 23:01:37.420
2007-03-17 23:01:37.420

Notice that usage of the addition operator (+) in date/time calculations requires at least one datetime addend, which also has to be an addend with the highest precedence.

Try the following example:


SELECT DATEADD(dd, 1, 'Mar 17, 2007');
SELECT 'Mar 17, 2007' + 1;
SELECT 'Mar 17, 2007' + 'Jan 02, 1900';
SELECT GETDATE() + 1 + 'Jan 02, 1900';

As you can see, the first SELECT that uses SQL Server's DATEADD() function recognizes dates written as varchar and produces the correct result. The second SELECT fails with the error: "Conversion failed when converting the varchar value 'Mar 17, 2007' to data type int." This happens because varchar addend has lower precedence and must be implicitly converted to the integer data type. In this case such a conversion is impossible.

The third SELECT works, but doesn't make any sense. It just concatenates two varchar expressions. The forth SELECT is very interesting. Since second and third addends have lower precedence than the first one, they need to be implicitly converted to a datetime data type. Each of them will be interpreted as one day. Therefore two days will be added to the reult of the GETDATE() function.



First Pitfalls
At this point, you may start to believe that using addition or subtraction operators in date/time calculations is completely safe. It isn't. For example try to calculate the interval between two dates (in days) using the first and second approaches as follows:


DECLARE @dt1 datetime, @dt2 datetime;
SELECT @dt1 = 'Mar 17, 2007 09:09:00', @dt2 = 'Mar 17, 2007 22:09:00';
SELECT DATEDIFF(dd, @dt1, @dt2);
SELECT CAST((@dt2 - @dt1) as int);

Results:

0
1

The results calculated by SQL Server's DATEDIFF() function and those using the subtraction operator are different. The first (0) is correct, the second (1) is wrong. Why? Well, when you convert a datetime value to the integer, the result will be rounded to the nearest integer. How it will be rounded depends on time portion of the datetime value. Consider the following example:


DECLARE @dt3 datetime, @dt4 datetime;
SELECT @dt3 = 'Mar 17, 2007 11:59:59.994', @dt4 = 'Mar 17, 2007 11:59:59.997';
SELECT CAST(@dt3 AS int);
SELECT CAST(@dt4 AS int)
SELECT CAST(CAST(@dt3 AS int) AS datetime);
SELECT CAST(CAST(@dt4 AS int) AS datetime);

Results:

39156
39157
2007-03-17 00:00:00.000
2007-03-18 00:00:00.000

As you can see, two time stamps with a 3ms interval were taken just before noon. If you convert the two date values to integers and then reconvert the results back to date values, you'll see a difference of one day. Similarly, two datetime values representing different days (Mar 16 and Mar 17) can be incorrectly converted to the same date, like this:


DECLARE @dt3 datetime, @dt4 datetime;
SELECT @dt3 = 'Mar 16, 2007 12:00:01.000', @dt4 = 'Mar 17, 2007 11:59:59.994';
SELECT CAST(@dt3 AS int);
SELECT CAST(@dt4 AS int)
SELECT CAST(CAST(@dt3 AS int) AS datetime);
SELECT CAST(CAST(@dt4 AS int) AS datetime);

Results:

39156
39156
2007-03-17 00:00:00.000
2007-03-17 00:00:00.000

Now, consider one more example. Suppose you have a table where you store sales transactions that each have an ID, time, and amount. You want to find the total amount and number of transactions per day, which is a very common task. Since your table is very large, you decide to speed up your query by grouping your transactions by date interval, not by date. But instead of using the SQL Server DATEDIFF() function, you decide to use an "advanced" approach: converting datetime data to integer or calculating the interval using simple arithmetic subtraction.

Here's what can happen in this scenario:


SET NOCOUNT ON;
IF OBJECT_ID('sales', 'U') IS NOT NULL
DROP TABLE sales

CREATE TABLE sales(
transactionID int,
transactionTime datetime,
amount decimal(4,2));

INSERT INTO sales VALUES(1, 'Mar 17, 2007 08:00:23', 24.34);
INSERT INTO sales VALUES(2, 'Mar 17, 2007 10:33:23', 88.54);
INSERT INTO sales VALUES(3, 'Mar 17, 2007 12:00:44', 12.12);
INSERT INTO sales VALUES(4, 'Mar 17, 2007 14:23:23', 43.25);
INSERT INTO sales VALUES(5, 'Mar 17, 2007 16:45:22', 76.34);
INSERT INTO sales VALUES(6, 'Mar 17, 2007 17:11:22', 51.11);
INSERT INTO sales VALUES(7, 'Mar 17, 2007 19:45:23', 30.99);

SELECT COUNT(*) AS #Trans, SUM(amount) AS totalAmount
FROM sales
GROUP BY DATEDIFF(dd, 0, transactionTime);

SELECT COUNT(*) AS #Trans, SUM(amount) AS totalAmount
FROM sales
GROUP BY CAST(transactionTime AS int);

SELECT COUNT(*) AS #Trans, SUM(amount) AS totalAmount
FROM sales
GROUP BY CAST((transactionTime - 0) AS int);

Results:

#Trans totalAmount
----------- ---------------------------------------
7 326.69

#Trans totalAmount
----------- ---------------------------------------
2 112.88
5 213.81

#Trans totalAmount
----------- ---------------------------------------
2 112.88
5 213.81

SQL Server DATEDIFF() produces the correct result because it works with dates like a FLOOR() function. It cuts the time portion of each date and manipulates only days. If you chose the other approach of converting to integers or direct subtraction, each datetime value would be rounded to the nearest integer, taking into account the time portion of the datetime value. This is why the second and third SELECT statements produce the wrong results.



Arithmetic Operations on Time Portion of Date/Time Data
So far this article has discussed the arithmetic operations on the date portion of data/time values. So when you try to run SELECT GETDATE() + 1, the second addend is implicitly considered the number of days that have to be added to the value returned by the GETDATE() function. But what if you want to add one hour or one minute to the current datetime value? Using the SQL Server DATEADD() function you can do that very easily, but using addition or substraction operators will be tricky.

Let's examine how to do that. One day consists of 24 hours or 1,440 minutes or 86,400 seconds. So if you run the following statement:


SELECT GETDATE() + 1/24

You probably will get the current datetime plus one hour. In reallity, however, you cannot use just 1/24 because both dividend and divisor are integers, and the result of dividing them will be zero. You need to convert the integers to decimal or float data types as follows in order to get the correct result:


-- How to add one hour
---------------------------------------
SELECT GETDATE()
SELECT DATEADD(hh, 1, GETDATE())
SELECT GETDATE () + CAST(1 AS dec(9,4))/ CAST (24 AS dec(9,4))
SELECT GETDATE () + CAST(1 AS dec)/ CAST (24 AS dec)
SELECT GETDATE () + CAST (1 AS float)/ CAST (24 AS float)

Results:

2007-03-25 20:31:13.870
2007-03-25 21:31:13.870
2007-03-25 21:31:13.867
2007-03-25 21:31:13.870
2007-03-25 21:31:13.870


-- How to add one minute
---------------------------------------
SELECT GETDATE()
SELECT DATEADD(mi, 1, GETDATE())
SELECT GETDATE () + CAST(1 AS dec(9,4))/ CAST (1440 AS dec(9,4))
SELECT GETDATE () + CAST(1 AS dec(18,9))/ CAST (1440 AS dec(18,9))
SELECT GETDATE () + CAST (1 AS float)/ CAST (1440 AS float)

Results:

2007-03-25 20:35:15.127
2007-03-25 20:36:15.127
2007-03-25 20:36:15.123
2007-03-25 20:36:15.127
2007-03-25 20:36:15.127


-- How to add one second
---------------------------------------
SELECT GETDATE()
SELECT DATEADD(ss, 1, GETDATE())
SELECT GETDATE () + CAST(1 AS dec(9,4))/ CAST (86400 AS dec(9,4))
SELECT GETDATE () + CAST(1 AS dec(18,9))/ CAST (86400 AS dec(18,9))
SELECT GETDATE () + CAST(1 AS dec(24,18))/ CAST (86400 AS dec(24,18))
SELECT GETDATE () + CAST (1 AS float)/ CAST (86400 AS float)

Results:

2007-03-25 20:42:26.617
2007-03-25 20:42:27.617
2007-03-25 20:42:27.613
2007-03-25 20:42:27.613
2007-03-25 20:42:27.613
2007-03-25 20:42:27.617


-- How to add one second, using variables
------------------------------------------

DECLARE @dec1 dec(24,18), @dec2 dec(24,18), @dec3 dec(24,18), @dt datetime
SELECT @dec1 = 1, @dec2 = 86400, @dt = GETDATE();
SELECT @dec3 = @dec1 / @dec2;

SELECT @dt
SELECT DATEADD(ss, 1, @dt)
SELECT @dt + @dec3
SELECT @dt + CAST (1 AS float)/ CAST (86400 AS float)

Results:

2007-03-25 20:49:16.817
2007-03-25 20:49:17.817
2007-03-25 20:49:17.813
2007-03-25 20:49:17.817

As you can see from the last example, the SQL Server function DATEADD() works perfectly, but an addition operator may cause a problem. For example, when you try to add one hour or one minute, you need to find a sufficient precision for decimal data type. Otherwise, the result will be slightly inaccurate. However, when you try to add one second, applying an addition operator and decimal conversion, you won't be able to get the exact result at all.

On the other hand, the float conversion looks precise and safe for the time calculations with an addition operator, but if you start to use it you may run into a problem: duplicates and missing values. To illustrate and understand the problem, create and populate an auxillary table:


SET NOCOUNT ON;
DECLARE @max int, @cnt int;
SELECT @cnt = 10000;

IF EXISTS(SELECT * FROM sysobjects
WHERE ID = (OBJECT_ID('sequence')) AND xtype = 'U')
DROP TABLE sequence;

CREATE TABLE sequence(num int NOT NULL);
INSERT INTO sequence VALUES(1);
SELECT @max = 1;
WHILE(@max <= @cnt)
BEGIN
INSERT INTO sequence
SELECT @max + num FROM sequence;
SELECT @max = MAX(num) FROM sequence;
END

When you run this script, it will insert 16,384 sequential numbers into the table sequence. (The number 16,384 doesn't have any special meaning. It was selected for illustration purposes only.)

Now, generate the sequence of hours using the auxillary table and SQL Server's date/time function as follows:


IF EXISTS(SELECT * FROM sysobjects
WHERE id = OBJECT_ID('test'))
DROP TABLE test;

SELECT num, DATEADD(hh, num, 'Dec 31, 2006 23:00:00') dt
INTO test
FROM sequence;

SELECT * FROM test;

Results:

num dt
----------- -----------------------
1 2007-01-01 00:00:00.000
2 2007-01-01 01:00:00.000
3 2007-01-01 02:00:00.000
. . . . . . . . . . . . . . . . . .
3099 2007-05-10 02:00:00.000
3100 2007-05-10 03:00:00.000
. . . . . . . . . . . . . . . . . .
16381 2008-11-13 12:00:00.000
16382 2008-11-13 13:00:00.000
16383 2008-11-13 14:00:00.000
16384 2008-11-13 15:00:00.000

The function DATEADD() works as expected and generates a sequence of datetime values with one-hour intervals.

Now, try to roll up the sequence you just generated using the same the SQL Server DATEADD() function:


SELECT DISTINCT DATEADD(hh, -num, dt) FROM test

Results:

2006-12-31 23:00:00.000

The last result proves that SQL Server's date/time function generates date/time values properly. In order to test the solution that uses an arithmetic operator (+), run the following example:


DECLARE @time float
SELECT @time = CAST(1 as float)/CAST(24 as float)

IF EXISTS(SELECT * FROM sysobjects
WHERE id = OBJECT_ID('test'))
DROP TABLE test;

SELECT num, (CAST('Dec 31, 2006 23:00:00' AS datetime) + @time * num) dt
INTO test
FROM sequence;

SELECT * FROM test;

num dt
----------- -----------------------
1 2007-01-01 00:00:00.000
2 2007-01-01 01:00:00.000
3 2007-01-01 02:00:00.000
4 2007-01-01 03:00:00.000
5 2007-01-01 03:59:59.997
6 2007-01-01 05:00:00.000
7 2007-01-01 05:59:59.997
8 2007-01-01 07:00:00.000
9 2007-01-01 08:00:00.000
10 2007-01-01 08:59:59.997
. . . . . . . . . . . . . . . . . .
16380 2008-11-13 11:00:00.000
16381 2008-11-13 11:59:59.997
16382 2008-11-13 12:59:59.997
16383 2008-11-13 14:00:00.000
16384 2008-11-13 14:59:59.997

You will find that an addition operator produces inaccurate results. Sometimes they differ from the expected ones by 3 ms. If you try to roll up the sequence of generated date/time values, you will get more than one date/time "seed" value (as in the following example) and that is incorrect:


SELECT DISTINCT DATEADD(hh, -num, dt) FROM test

Results:

2006-12-31 22:59:59.997
2006-12-31 23:00:00.000

You may say that +/- 3 ms precision is acceptable for most applications, but look how that seemingly tiny problem can produce a bigger one:


SELECT CONVERT(varchar(100), dt, 100)
FROM test
ORDER BY num

Results:

Jan 1 2007 12:00AM
Jan 1 2007 1:00AM
Jan 1 2007 2:00AM
Jan 1 2007 3:00AM
Jan 1 2007 3:59AM
Jan 1 2007 5:00AM
Jan 1 2007 5:59AM
. . . . . . . . . .
Nov 13 2008 5:59AM
Nov 13 2008 6:59AM
Nov 13 2008 8:00AM
Nov 13 2008 8:59AM
Nov 13 2008 9:59AM
Nov 13 2008 11:00AM
Nov 13 2008 11:59AM
Nov 13 2008 12:59PM
Nov 13 2008 2:00PM
Nov 13 2008 2:59PM

This example uses a CONVERT() function to produce a different date/time format. As a result, inaccuracies in the generated values increased from 3 ms to 1 minute and became unacceptable. However, this is not the only problem. If you try to generate the sequences of minutes or seconds, things become even worse. Look at this example:


DECLARE @time float
SELECT @time = cast(1 as float)/cast(1440 as float)

IF EXISTS(SELECT * FROM sysobjects
WHERE id = OBJECT_ID('test'))
DROP TABLE test;

SELECT num, (CAST('Dec 31, 2006 23:59:00' AS datetime) + @time * num) dt
INTO test
FROM sequence;

SELECT * FROM test;

Results:

num dt
----------- -----------------------
1 2007-01-01 00:00:00.000
2 2007-01-01 00:01:00.000
. . . . . . . . . . . . . . . . . .
1579 2007-01-02 02:17:59.997
1580 2007-01-02 02:19:00.000
1581 2007-01-02 02:19:59.997
. . . . . . . . . . . . . . . . . .
16382 2007-01-12 09:01:00.000
16383 2007-01-12 09:01:59.997
16384 2007-01-12 09:03:00.000

As you can see, there are inaccuracies in the generated values again. In addition, when you try to convert these values to another format as follows, you will get duplicated or missing dates:


SELECT CONVERT(varchar(100), dt, 100)
FROM test
ORDER BY num

Results:

Jan 1 2007 12:00AM
Jan 1 2007 12:01AM
. . . . . . . . . .
Jan 2 2007 12:00AM
Jan 2 2007 12:00AM
Jan 2 2007 12:02AM
Jan 2 2007 12:02AM
Jan 2 2007 12:04AM
Jan 2 2007 12:04AM
. . . . . . . . . .
Jan 12 2007 9:00AM
Jan 12 2007 9:01AM
Jan 12 2007 9:01AM
Jan 12 2007 9:03AM

For instance, there are two values "Jan 02, 2007 12:02AM", but the value "Jan 02, 2007 12:03AM" is missing.

If you want to see the list of all duplicates, you can run the following query:


SELECT COUNT(*), CONVERT(varchar(100), dt, 100)
FROM test
GROUP BY CONVERT(varchar(100), dt, 100)
HAVING COUNT(*) > 1
ORDER BY 2

Finally, you can generate the sequence of seconds using the same approach:


DECLARE @time float
SELECT @time = cast(1 as float)/cast(86400 as float)

IF EXISTS(SELECT * FROM sysobjects
WHERE id = OBJECT_ID('test'))
DROP TABLE test;

SELECT num, (CAST('Dec 31, 2006 23:59:59' AS datetime) + @time * num) dt
INTO test
FROM sequence;

SELECT * FROM test;

Results:

num dt
----------- -----------------------
1 2007-01-01 00:00:00.000
2 2007-01-01 00:00:01.000
3 2007-01-01 00:00:02.000
4 2007-01-01 00:00:03.000
5 2007-01-01 00:00:03.997
6 2007-01-01 00:00:05.000
7 2007-01-01 00:00:06.000
8 2007-01-01 00:00:07.000
9 2007-01-01 00:00:08.000
10 2007-01-01 00:00:08.997
11 2007-01-01 00:00:09.997
. . . . . . . . . . . . . . . . . .
16382 2007-01-01 04:33:00.997
16383 2007-01-01 04:33:02.000
16384 2007-01-01 04:33:03.000


SELECT CONVERT(varchar(100), dt, 120)
FROM test
ORDER BY num

Results:

2007-01-01 00:00:00
2007-01-01 00:00:01
2007-01-01 00:00:02
2007-01-01 00:00:03
2007-01-01 00:00:03
2007-01-01 00:00:05
. . . . . . . . . .
2007-01-01 04:32:55
2007-01-01 04:32:56
2007-01-01 04:32:56
2007-01-01 04:32:58
2007-01-01 04:32:59
2007-01-01 04:33:00
2007-01-01 04:33:00
2007-01-01 04:33:02
2007-01-01 04:33:03

This last example has the same problems as the previous one. In addition, using arithmetic operators for date/time manipulations can lead to other errors, weird results, degradation in performance, and more problems than are discussed here. You can avoid all these problems by using the SQL Server's date/time functions.



Bonus Benefits Beyond Bad Results
SQL Server's date/time functions not only help you avoid the errors, but they also make the programmer's life much easier and more convenient. If, for example, you need to generate the fifth day of each month in the year 2007, you can do that easily using the DATEADD() function:


SELECT DATEADD(mm, num - 1, 'Jan 05, 2007') AS [5th_Day]
FROM sequence
WHERE num <= 12

Results:

5th_Day
-----------------------
2007-01-05 00:00:00.000
2007-02-05 00:00:00.000
2007-03-05 00:00:00.000
2007-04-05 00:00:00.000
2007-05-05 00:00:00.000
2007-06-05 00:00:00.000
2007-07-05 00:00:00.000
2007-08-05 00:00:00.000
2007-09-05 00:00:00.000
2007-10-05 00:00:00.000
2007-11-05 00:00:00.000
2007-12-05 00:00:00.000

The beauty of this query is in its convenience. You do not need to know how many days are in each specific month or in the year. The logic and calculations are already included into the SQL Server functions and are transparent to you. Trying to use the arithmetic operator (+) for the same task would require implementing that logic yourself, which would increase the development and debugging time and potentially raise the chance of errors.

So should you use arithmetic operators in date/time calculations in SQL Server? No. Although it's possible to use arithmetic operators with dates for very simple operations such as GETDATE() + 1—because SQL Server will convert the 1 to a date internally—you're better off avoiding using arithmetic operators with dates altogether. If you use them in more complex date/time calculations, you will need to be very careful and conscientious.