There has always been this rift in the CF community that has bothered me since I typed my first SQL statement into a <cfquery>, whether or not to use “SELECT *” to perform a query of all columns on a table. A lot of people have expressed their opinions from time to time with some really qualified answered and some not so qualified answers. Here are two of the usual ones I come across:

1) By naming the columns you get better performance.
2) It makes the code harder to read.

I have problems with both these answers; let’s examine them. The first one obviously is something that if you’re going to say, you need to back it up with some sort of SQL profile or SQL trace to see what is going on in your database world. Simply stating that it gives you better performance means nothing. So many people make broad claims like this and they have nothing to back it up. Now I will tell you that selecting all the columns in a table when you are only going to use two is a waste of network chatter and database performance. Is it enough to bring down your entire web site? I doubt it unless your web site is getting massive amounts of traffic (I’ll talk about this in a later blog post). However we’re not talking about just selecting two rows here, are we? What we want to know is by using “SELECT *” if we decrease performance on our database server. Well for that we will need to use trusty old SQL Query Analyzer (SQA) and run some queries against a database. I’ll be using [Northwind].[Customers] table for this article since it is installed with SQL Server by default.

select_all_ep.jpg

select_rows_ep.jpg

AS you can see from the Execution Plans for both queries, there is no difference between the two. They both have the exact same I/O and CPU costs.

select_all_trace.jpg

select_rows_trace.jpg

Not much of a difference here at all. Both queries generated 19 reads to the disk. This is really nothing surprising since we got this same exact information from the Execution Plan before.

So as of right now we have seen that both queries affect the CPU and hard drive the same, but what about network traffic? Let’s see below.

select_all_stats.jpg

select_rows_stats.jpg

AH-HA!!!! There is a difference!!! As you can see from the statistics that network traffic by using “SELECT *” is lower than the calling all the columns. To be honest, I expected this. Since you’re sending more text across the pipe, you’re obviously going to have more network traffic. Really this shouldn’t be looked at, but I wanted to show you that, yes, there is a difference some where. Is it anything to be concerned about? Not at all.

So as you can see that by using the Execution Plan, Trace and Statistics displays of SQA there is no difference between using “SELECT *” and writing out the column names, NOTHING, NATHAN, NADA, ZILTCH, ZIP, ZERO. So…. like I said in the blog comment, the only reason people say this is to express their own opinion and nothing else. There is no evidence to back it up.

On to the next answer 2!

This is TOTALLY a matter of opinion and I can’t even give screenshots for this and with that I will give my opinion: TYPE OUT YOUR COLUMN NAMES. I can’t tell you how many times I’ve looked back at a “SELECT *” query I wrote months earlier only to curse myself because I didn’t remember what columns were in the table. Especially now that a lot of people put their queries in CFCS, it more important than ever to type out all your column names. But like everything else in this article, it’s a matter of opinion.

Back online

July 7, 2006

I’m back blogging since my departure a year ago. Since I’ve been offline a lot has happened in the community.

I see that Model-Glue has taken off and is now the leader in frameworks for CF development, surpassing even FuseBox (thank the lord).

BlueDragon has gained tremendous ground as an alternative CFML engine. I wouldn’t be surprised if BD soon surpasses Adobe in sales for CFML and takes over as the leading CFML engine. People will say that I’m on dope for saying something like that, but by reading Vince’s blog and looking at the road that BD is heading, it looks like BD is concentrating on what the community really wants as opposed to Adobe which is just adding more and more useless features.

Duck-typing is now the talk of the town. We can thank our wonderful community leaders for letting this ignorant idea out of the bag. No offense guys (well take offense, I care less), but why in the world did you even mention the supposed performance benefits. I still don’t believe it, and that’s because no one has posted any solid proof of it and I could bet no one will.

Some things I guess will never change.

I’m still reading blog posts about which is better to use: SELECT * or SELECT [columnlist]. Guys, there is no difference between the two, get over it already.

I’ll be posting more things soon, but that’s all for now.