Archive for the ‘TSQL’ Category

The Problem

I had a list of different routes and stops along the route (for the benefit of this example they are called seq). What I wanted to do was get the start time and the end time for each journey and so work out the journey times.
Each journey – here called route – may have different start and stop points (seq values)

The Solution

In the real case I found the min and max seq for each journey and added these as columns to my base table. This was all done in SQL 2014

[JourneyID] [int] NOT NULL,
[Route] [int] NOT NULL,
[seq] [int] NOT NULL,
[tme] [smalldatetime] NOT NULL,
[maxSeq] [int] NOT NULL,
[minSeq] [int] NOT NULL);

Values(1056975,20,2,’2015-06-03 09:34:00′,99,2),
(1056975,20,5,’2015-06-03 09:38:00′,99,2),
(1056975,20,6,’2015-06-03 09:39:00′,99,2),
(1056975,20,99,’2015-06-03 09:44:00′,99,2),
(1056975,20,99,’2015-06-03 09:45:00′,99,2),
(1056975,20,99,’2015-06-03 09:49:00′,99,2),
(1056975,20,99,’2015-06-03 09:53:00′,99,2),
(1056975,20,99,’2015-06-03 09:56:00′,99,2),
(1056975,20,99,’2015-06-03 09:57:00′,99,2),
(2471362,1,1,’2015-06-06 07:48:00′,99,1),
(2471362,1,1,’2015-06-06 07:49:00′,99,1),
(2471362,1,2,’2015-06-06 07:56:00′,99,1),
(2471362,1,5,’2015-06-06 07:57:00′,99,1),
(2471362,1,5,’2015-06-06 07:59:00′,99,1),
(2471362,1,8,’2015-06-06 08:05:00′,99,1),
(2471362,1,11,’2015-06-06 08:08:00′,99,1),
(2471362,1,14,’2015-06-06 08:15:00′,99,1),
(2471362,1,15,’2015-06-06 08:21:00′,99,1),
(2471362,1,99,’2015-06-06 08:23:00′,99,1)

;WITH journeyStart(JourneyID, [Route], seq, tme, RN)
as (
SELECT JourneyID, [Route], seq, tme, RN=row_number()
FROM #Journey WHERE seq = [minSeq])

SELECT JS.[Route], JS.seq as TrainSequenceStart, JS.tme as StartTime,

RJ2.seq as TrainSequenceEnd, RJ2.tme as EndTime, JT.JourneyTime as
[JourneyTime (minutes)]
FROM journeyStart as JS

CROSS APPLY (select top (1) RJe.*
FROM #Journey as RJe
where RJe.JourneyID = JS.JourneyID
AND RJe.[Route] = JS.[Route]
AND RJe.Seq = RJe.maxSeq
order by RJe.JourneyID, RJe.seq) as RJ2

CROSS APPLY (SELECT DateDiff(mi,JS.tme,RJ2.tme) as journeyTime) as JT


DROP TABLE #Journey;

The important point it in the 1st cross apply, RJe.Seq = RJe.maxSeq, which just says, get me the max seq value.
I added the second cross apply for the date calculation just to make the code easier to read.

Using the Apply operator

Posted: October 28, 2015 in Apply, SQL Bits, SQL Video, TSQL

I started this blog to remind myself how I solved certain problems or to dump code snippets. I haven’t been using it much but am going to get it going again. From now on I’m also going to include links to interesting SQL resources (mainly videos) I watch.

To start this off, I watched this one today: Boost your T-SQL with the Apply Operator. Certainly worth a watch if you are doing any T-SQL coding. It’s presented by Itzik Ben-Gan who also wrote the book T-SQL Querying, which I happen to be reading at the moment.

The Problem

I had a table which represented clusters of an incident. I then wanted 1 row for every instance of that incidence. So,

DayName InstanceCount
Day 1 2
Day2 3
Day3 1

Would become:


The Solution

At first I used a cursor and entered each row into a new table.
I wasn’t happy with the cursor so looked into other answers and came up with this:

DECLARE @ii as int;
DECLARE @i as int = 0;
SELECT @ii = max(InstanceCount) + 10
FROM [dbo].[SourceTable];

DECLARE @NumRows as Table (num int not null);
WHILE @i < @ii
INSERT INTO @NumRows(num)
VALUES (@i);
SET @i = @i + 1;

FROM [dbo].[SourceTable] as J
JOIN @NumRows as n
ON n.num < CONVERT(int, LEFT(CAST(InstanceCount AS INT ),20));

First get the highest instance count and add a few for luck. Then join this numbers table to your base table. This got the results I wanted. (I still had to use a cursor because I needed to randomise an element in each row, but that’s another story).

I watched a video by Steve Stedman ( and he gave a demo of a numbers table using a CTE. So I changed my code to this:

DECLARE @ii as int;
SELECT @ii = max(InstanceCount) + 10
FROM [dbo].[SourceTable];

;WITH numbers(num) as
SELECT 1 + num
FROM numbers
WHERE num < @ii
FROM [dbo].[SourceTable] as J
JOIN numbers as n
ON n.num < CONVERT(int, LEFT(CAST(InstanceCount AS INT ),20));

Annoyingly, MAXRECURSION wouldn’t accept a variable as a value else I would have used @ii

Slow Linked Server Query.

Posted: November 24, 2011 in Linked Servers, OpenQuery

A BI Developer reported a problem when using a linked server. Queries which normally executed at a reasonable speed were now taking 20-30 minutes.

I logged on to the linked server and ran a simple select

select * from sys.databases;

This returned the results set in less than 1 millisecond.

I then ran the developer’s query and it came back with the results in 5-6 minutes (this is roughly the time I was told it would take). So there was nothing wrong with the remote server itself.

I took the same command and converted the From clauses to run through the linked server using 4 part naming: servername.database.schea.object

I limited this result set to TOP 100. This came back in about 6 minutes.
As it ran there was a wait on the linked server for CXPacket. After this there was a wait for Async_Network_IO.

I ran the same query again, using the 4 part name, but this time I removed the TOP 100 clause. I watched the Activity Monitor on the remote server again. There was the same CXPacket wait – which took about the same amount of time as before. Then it switched to Async_Network_IO. I killed the query after 20 minutes.

After a bit of fiddling with the query I ran it a third time, using open query and the results came back in 2.5 minutes – this was around 44.5k rows. (The query had already been run once on the local server so this query took advantage of the pre-cached execution plan.)


This type of query (one which selects data from very large tables) is not suitable for linked servers using the four part naming convention.

When a query compiles, using this convention, it assumes that there is no difference between using the linked server and using the local server.

The way SQL returns data is to grab data from all the tables in the FROM clause (in order left to right) and then filter them on the joins then filter on the columns selected and finally filter on the WHERE clause. This is all done in memory and the final results returned. Using the linked server it can’t all be done in memory – the data is selected on the remote server and then sent back to the local server to begin filtering – so instead of waiting for a memory function we are waiting for a network connection.

When the same query is run using openquery the query is passed to the linked server for execution and the smaller result set is returned.

The Problem

In reporting services a new report is created. The report region/Language is set to English(UK). A date parameter is created.
When the report is run the date parameters are showing up in mm/dd/yyyy format 

The Fix

A parameter will take a label and a value field.
Change your query to have a label formatted in the way you like and the value left raw:
Select Right(‘0′ + Cast(Day(dateRange) as Varchar(2)),2) +’-‘+ Right(‘0’ + Cast(Month(dateRange)as Varchar(2)),2) + ‘-‘+ cast(Year(dateRange) as char(4)) as label, dateRange as value
FROM dbo.vwDateRange
Order by dateRange desc 

Where daterange is a smalldatetime 

Set the report parameters to display label as the label and value as value

Pivot or Sum

Posted: August 2, 2010 in Pivot, SQL Server 2005, TSQL

A developer was working on a report. The data was originally in a SQL 2000 Server database. The data being used was then transferred into a SQL 2005 database. The original report used SUM and Group BY. Now it was in 2005 Pivot could be used. He wrote a new script and compared results. His question give the same results, is it wrong to assume that SQL Pivot newness means betterness?

Here are the two queries:

Select [Reviewed] + [Logged] as [Queue], [Open] As Development, [Testing], [Pending Imp],
[Open] + [Testing] + [Pending Imp] as [Total In Progress],
[Reviewed] + [Logged] + [Open] + [Testing] + [Pending Imp] as [Grand Total]
from (Select [state] FROM dbo.tblCombinedSnapshto) Datatable
For [state]
In ([Reviewed], [Logged], [Open], [Testing], [Pending Imp])
) Pivottable;

Select Sum(case when [state] = ‘Reviewed’ or [state] = ‘Logged’ then 1 else 0 end) as [Queue],
Sum(case when [state] = ‘Open’ then 1 else 0 end) as Development,
Sum(case when [state] = ‘Testing’ then 1 else 0 end) as Testing,
Sum(case when [state] = ‘Pending Imp’ then 1 else 0 end) as [Pending Imp],
Sum(case when [state] = ‘Open’ or[state] = ‘Testing’
or [state] = ‘Pending Imp’ then 1 else 0 end) as [Total In Progress],
Count(*) as [Grand Total]
from dbo.tblCombinedSnapshto;

The results were:

PIVOT timing
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 38 ms.
IO Table ‘tblCombinedSnapshto’. Scan count 1, logical reads 8, physical reads 1, read-ahead reads 48, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

2000 timing
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 50 ms.
IO Table ‘tblCombinedSnapshto’. Scan count 1, logical reads 8, physical reads 1, read-ahead reads 48, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

So for disk usage they were equal. For speed Pivot wins out.

Only a small scale test but worth doing now and then.

IsNull and Dates

Posted: May 10, 2010 in TSQL

The following code was written expecting the results to be -1,-1

DECLARE @x as datetime;
set @x = NULL;
SELECT IsNull(@x,-1) AS Col_1, IsNull(NULL,-1) AS Col_2;

The actual result was 1899-12-31 00:00:00.000,-1

Why did it happen?

The reason it does this is because dates can be given INT values where 0 = 1900-01-01

All other dates are relative to that so:
1 = 1900-01-02
-1 = 1899-12-31

SELECT IsNull(@x,-1) AS Col_1;
Is the same as set @x = -1;
Which is the same as
set @x = ‘1899-12-31 00:00:00.000’;

So it did assign the value -1 to the date which then translated to Dec 31st 1899