Archive for the ‘OpenQuery’ Category

Slow Linked Server Query.

Posted: November 24, 2011 in Linked Servers, OpenQuery

A BI Developer reported a problem when using a linked server. Queries which normally executed at a reasonable speed were now taking 20-30 minutes.

I logged on to the linked server and ran a simple select

select * from sys.databases;

This returned the results set in less than 1 millisecond.

I then ran the developer’s query and it came back with the results in 5-6 minutes (this is roughly the time I was told it would take). So there was nothing wrong with the remote server itself.

I took the same command and converted the From clauses to run through the linked server using 4 part naming: servername.database.schea.object

I limited this result set to TOP 100. This came back in about 6 minutes.
As it ran there was a wait on the linked server for CXPacket. After this there was a wait for Async_Network_IO.

I ran the same query again, using the 4 part name, but this time I removed the TOP 100 clause. I watched the Activity Monitor on the remote server again. There was the same CXPacket wait – which took about the same amount of time as before. Then it switched to Async_Network_IO. I killed the query after 20 minutes.

After a bit of fiddling with the query I ran it a third time, using open query and the results came back in 2.5 minutes – this was around 44.5k rows. (The query had already been run once on the local server so this query took advantage of the pre-cached execution plan.)

Conclusion

This type of query (one which selects data from very large tables) is not suitable for linked servers using the four part naming convention.

When a query compiles, using this convention, it assumes that there is no difference between using the linked server and using the local server.

The way SQL returns data is to grab data from all the tables in the FROM clause (in order left to right) and then filter them on the joins then filter on the columns selected and finally filter on the WHERE clause. This is all done in memory and the final results returned. Using the linked server it can’t all be done in memory – the data is selected on the remote server and then sent back to the local server to begin filtering – so instead of waiting for a memory function we are waiting for a network connection.

When the same query is run using openquery the query is passed to the linked server for execution and the smaller result set is returned.