To “SELECT *” or not. It’s a matter of opinion.
There has always been this rift in the CF community that has bothered me since I typed my first SQL statement into a <cfquery>, whether or not to use “SELECT *” to perform a query of all columns on a table. A lot of people have expressed their opinions from time to time with some really qualified answered and some not so qualified answers. Here are two of the usual ones I come across:
1) By naming the columns you get better performance.
2) It makes the code harder to read.
I have problems with both these answers; let’s examine them. The first one obviously is something that if you’re going to say, you need to back it up with some sort of SQL profile or SQL trace to see what is going on in your database world. Simply stating that it gives you better performance means nothing. So many people make broad claims like this and they have nothing to back it up. Now I will tell you that selecting all the columns in a table when you are only going to use two is a waste of network chatter and database performance. Is it enough to bring down your entire web site? I doubt it unless your web site is getting massive amounts of traffic (I’ll talk about this in a later blog post). However we’re not talking about just selecting two rows here, are we? What we want to know is by using “SELECT *” if we decrease performance on our database server. Well for that we will need to use trusty old SQL Query Analyzer (SQA) and run some queries against a database. I’ll be using [Northwind].[Customers] table for this article since it is installed with SQL Server by default.
AS you can see from the Execution Plans for both queries, there is no difference between the two. They both have the exact same I/O and CPU costs.
Not much of a difference here at all. Both queries generated 19 reads to the disk. This is really nothing surprising since we got this same exact information from the Execution Plan before.
So as of right now we have seen that both queries affect the CPU and hard drive the same, but what about network traffic? Let’s see below.
AH-HA!!!! There is a difference!!! As you can see from the statistics that network traffic by using “SELECT *” is lower than the calling all the columns. To be honest, I expected this. Since you’re sending more text across the pipe, you’re obviously going to have more network traffic. Really this shouldn’t be looked at, but I wanted to show you that, yes, there is a difference some where. Is it anything to be concerned about? Not at all.
So as you can see that by using the Execution Plan, Trace and Statistics displays of SQA there is no difference between using “SELECT *” and writing out the column names, NOTHING, NATHAN, NADA, ZILTCH, ZIP, ZERO. So…. like I said in the blog comment, the only reason people say this is to express their own opinion and nothing else. There is no evidence to back it up.
On to the next answer 2!
This is TOTALLY a matter of opinion and I can’t even give screenshots for this and with that I will give my opinion: TYPE OUT YOUR COLUMN NAMES. I can’t tell you how many times I’ve looked back at a “SELECT *” query I wrote months earlier only to curse myself because I didn’t remember what columns were in the table. Especially now that a lot of people put their queries in CFCS, it more important than ever to type out all your column names. But like everything else in this article, it’s a matter of opinion.