I have often seen a recommendation that using DISTINCT instead of
GROUP BY is better from a performance perspective. DISTINCT is
definitely a more readable alternative, but I wanted to test for myself
as to what the truth was.
To compare the performance of the two, I wrote the following simple queries.
The query below helps in identifying the number of records that SQL Server will need to process. On my machine, when executed without the GROUP BY or the DISTINCT clause, the query generates records in excess of 380,000.
The queries below fetch the same information – one uses the GROUP BY whereas the other query uses a DISTINCT.
Before executing the two queries (one with the GROUP BY and the other
with the DISTINCT), I had turned on the actual execution plan.
Upon studying the execution plan, the following observations can be made:
To compare the performance of the two, I wrote the following simple queries.
The query below helps in identifying the number of records that SQL Server will need to process. On my machine, when executed without the GROUP BY or the DISTINCT clause, the query generates records in excess of 380,000.
1.USE AdventureWorks2012;2.GO3.SELECT COUNT(so1.object_id)4.FROM sys.objects AS so15.CROSS JOIN sys.objects AS so2 ;6.GO01.USE AdventureWorks2012;02.GO03.--Clean the buffers and procedure cache04.DBCC DROPCLEANBUFFERS;05.DBCC FREEPROCCACHE;06.GO07. 08.--Evaluate GROUP BY09.SELECT so1.object_id,10.so1.name11.FROM sys.objects AS so112.CROSS JOIN sys.objects AS so213.GROUP BY so1.object_id, so1.name ;14.GO15. 16.--Clean the buffers and procedure cache17.DBCC DROPCLEANBUFFERS;18.DBCC FREEPROCCACHE;19.GO20. 21. 22.--Evaluate DISTINCT23.SELECT DISTINCT24.so1.object_id,25.so1.name26.FROM sys.objects AS so127.CROSS JOIN sys.objects AS so2 ;28.GOUpon studying the execution plan, the following observations can be made:
- When the SQL server database engine detected that the 2nd query did not have any aggregation function, it was smart enough to convert the GROUP BY into a DISTINCT when the query plan is generated
- There is no difference in the execution plan across both the queries
DISTINCT is not faster than GROUP BY. In fact, Both DISTINCT and GROUP BY demonstrate equal performance impact.
When aggregations are not being used, DISTINCT is more readable when compared to GROUP BY.
No comments:
Post a Comment