When CF8 was released I was really excited, since it looked like my dream would finally come true with CFCOMPONENT now supporting OnMissingMethod and implicit creation for structures and arrays. OnMissingMethod works like a champ, but we all know that implicit creation is broken, and it doesn’t look like it’s getting fixed anytime soon (Ben had a post saying that it would be fixed soon, but we haven’t heard anything and he didn’t respond to my last post on the matter).
Because of this, I basically decided to continue development with the old code of ICEGen. After working on it for the last month I managed to get all the bugs out, and it works really well. Thing is, though, this isn’t the project I envisioned it to be, and I’m having a hard time coming to terms with that. It’s actually kind of sad when I really sit down to look at it.
So rather than continuing to work on and support a project that I can’t stand, I’ve decided to let it die. Hopefully once the implicit creation bugs get worked out, I’ll finally be able to create the ORM that I want. Sorry to anyone who wanted to see 2.0 get released, but it’s not happening.
Ben started a whirlwind of ideas with his recent post about optimizing SQL queries for better performance. I was getting too carried away with commenting and decided that any further tips I had should be put down in my own post. Below are the ideas that I commented on, along with some new ones.
1) Create indexes on foreign key(s)
I think, no wait, I know that this is the most overlooked design aspect on almost every database. Far too often people forget to create an index for the foreign key(s) that are on a table. I guess the reasoning goes: since almost every database server automatically creates an index on the columns that make up the primary key, why wouldn’t it do the same for foreign keys? Well, most of them don’t, and this is why it’s the most overlooked and biggest culprit of performance loss in a database.
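Here’s a minimal sketch of the idea, using Python’s built-in sqlite3 module as a stand-in for whatever database server you run. The table, column and index names are made up for illustration, and the query plan wording is SQLite’s own; other servers have their own syntax and plan output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE companies (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE employees (
        id         INTEGER PRIMARY KEY,
        company_id INTEGER NOT NULL REFERENCES companies(id),
        name       TEXT NOT NULL
    );
""")

def plan(sql):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

query = "SELECT name FROM employees WHERE company_id = 1"

# Declaring the foreign key did not create an index, so this is a full scan.
plan_before = plan(query)

# The index has to be created explicitly.
conn.execute("CREATE INDEX idx_employees_company_id ON employees (company_id)")

# Now the planner can search the index instead of scanning the table.
plan_after = plan(query)

print(plan_before)
print(plan_after)
```

The same check works on a real schema: look at the plan for your most common join or filter on a foreign key before and after adding the index.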
2) Denormalization
How many times have you seen a statement that joins 10 different tables, all to grab a single value from each table? I’m so guilty of this, I should be thrown to the wolves. Again, this is a HUGE performance hog and can be avoided by sitting down and taking the time to denormalize parts of your database. What is denormalization, you ask? It’s the idea that some data that lives in one table can be copied into another table to prevent a join from occurring. I guess an example I can give is:
You have 2 tables named employees and companies. Companies has a one-to-many relation to employees. When you write a query to retrieve an employee’s company name, you create a join between the employees table and the companies table and retrieve the company name.
It’s simple and we all do it, and it’s probably not the greatest example, but I wanted something simple. Now if you wanted to denormalize this, you would create a column on the employees table called company_name and copy the value from the companies table into that column. That saves you from having to create a join when retrieving the company name, which improves the performance of the query. Keeping the copy up to date can be accomplished at the application level or by using triggers within the database.
Now I wouldn’t use denormalization in this situation; again, I wanted something simple to explain the concept. With that said, when you should start and how you determine what to denormalize all depends on the data and the time the queries are taking to retrieve it. Most of the time, though, I only denormalize simple data that isn’t changed too often. For everything else I use views.
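To make the trigger option concrete, here’s a rough sketch using Python’s sqlite3 module with the employees and companies tables from the example. The trigger names are invented, and the trigger syntax is SQLite’s, so treat it as an illustration of the idea rather than something to paste into another database server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE companies (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE employees (
        id           INTEGER PRIMARY KEY,
        company_id   INTEGER NOT NULL REFERENCES companies(id),
        name         TEXT NOT NULL,
        company_name TEXT            -- copied from companies.name
    );

    -- Fill in the copy when an employee is added.
    CREATE TRIGGER employees_set_company_name
    AFTER INSERT ON employees
    BEGIN
        UPDATE employees
           SET company_name = (SELECT name FROM companies
                                WHERE id = NEW.company_id)
         WHERE id = NEW.id;
    END;

    -- Keep the copy in step when a company is renamed.
    CREATE TRIGGER companies_sync_company_name
    AFTER UPDATE OF name ON companies
    BEGIN
        UPDATE employees
           SET company_name = NEW.name
         WHERE company_id = NEW.id;
    END;
""")

conn.execute("INSERT INTO companies (id, name) VALUES (1, 'Acme')")
conn.execute("INSERT INTO employees (id, company_id, name) VALUES (1, 1, 'Pat')")

# No join needed to read the company name.
lookup = "SELECT company_name FROM employees WHERE id = 1"
name_at_insert = conn.execute(lookup).fetchone()[0]

conn.execute("UPDATE companies SET name = 'Acme Corp' WHERE id = 1")
name_after_rename = conn.execute(lookup).fetchone()[0]

print(name_at_insert, name_after_rename)
```

Note that this sketch doesn’t handle an employee moving to a different company; that would need a third trigger on employees.company_id. That extra bookkeeping is exactly the maintenance cost that comes with copying data around.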
3) Views and Indexed Views
Yes they’re different and indexed views aren’t supported on all database platforms so this doesn’t pertain to everyone.
Denormalization is tough to deal with and maintain. So how do we go about increasing the performance of our database without copying columns and data everywhere? We use views!
Views are basically queries that you can save and then use like a normal table when writing other queries. The idea is that instead of writing the same sub-queries or derived tables over and over throughout your application, you can move those statements into views and then interact with them as you would with any other table.
Views are a great place to house the business logic of your application when you don’t want to or can’t use stored procedures. I use them a lot in applications to calculate expiration dates of memberships, total up line items in a shopping cart, or do just about any other calculation that the application needs.
Imagine trying to maintain a set of complicated calculations and business logic in an application that is copied over and over in queries scattered everywhere? You can’t!
Now I understand that most people argue that calculations like these should be moved into classes within the application, but I’ve found that that’s not always a smart thing to do and can sometimes cause a bigger performance loss than you’d realize. Imagine, if you would, that you have a method in your application that calculates the expiration date of a member and you need access to this information at the database level. Well, if the calculation is performed at the application level, that means you will need to copy and translate the logic to your database. By moving this calculation to a view you now have access to the information from both your database and the application, and it’s all in one place! Also, since the calculation is now performed at the database level, you write less code in your application and don’t have to run the calculation yourself for each record you return.
Ok great… so what is an indexed view? An indexed view is basically the same thing as a regular view, only you have the option of creating indexes on it, whereas on a regular view you can’t. This can greatly increase the performance of queries that use views or span across multiple views. BE FOREWARNED though: there are STRICT guidelines that you must follow in the creation and use of these views, and they differ between database servers. You must consult your database server’s documentation when attempting to use them.
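As a rough sketch of the membership example, here’s a view that owns the expiration date calculation, again using Python’s sqlite3 module. The members table, its columns and the "joined date plus term in months" rule are all assumptions made up for the illustration, and the date function is SQLite’s.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE members (
        id          INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        joined_on   TEXT NOT NULL,      -- ISO date, e.g. 2008-01-15
        term_months INTEGER NOT NULL
    );

    -- The expiration rule lives here and nowhere else.
    CREATE VIEW member_expirations AS
    SELECT id,
           name,
           date(joined_on, '+' || term_months || ' months') AS expires_on
      FROM members;
""")

conn.execute("INSERT INTO members VALUES (1, 'Pat', '2008-01-15', 12)")
conn.execute("INSERT INTO members VALUES (2, 'Sam', '2008-03-01', 6)")

# The application queries the view like any other table.
rows = conn.execute(
    "SELECT name, expires_on FROM member_expirations ORDER BY id"
).fetchall()

print(rows)
```

Other queries, reports and views in the database can join against member_expirations too, so the rule only has to change in one place.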
Examples on using and creating indexed views in MSSQL 2000 can be found here.
4) Clustered Indexes
At the beginning of this article I mentioned that placing indexes on foreign keys can be a big help and that almost all database servers automatically create indexes for primary keys. Well, expanding on that is the use of clustered indexes versus regular indexes.
When you create a regular old index on a table, your database server basically creates a separate structure with information about where everything is within that table. Nothing happens to the data within the original table. Not so when you create a clustered index. A clustered index tells the database server how to physically order the data for this table on the server’s disk. This is the reason why you can only have one clustered index on a table, and it’s one of THE MOST important decisions you can make when talking about performance.
Remember the advice at the beginning about placing indexes on foreign keys? Well, to expand on that, you should also determine if your foreign key should be used as, or be part of, the clustered index on the table.
Let’s look back at the scenario I gave with the employees and the companies. After we’ve written our application, we notice that throughout it there are many times when we query to list or find employees that are part of a particular company. Further investigation reveals that the only time we’re ever querying the employees table directly is when we’re authenticating a login or retrieving a profile.
Looking at this scenario, it would probably make sense to include the company foreign key within the clustered index on the employees table and make it the first column of the index. The reason is that this will dramatically speed up the seek time of getting all the employees for a company, since they will all be located next to each other.
Now let me pull the reins back a little. I’m not saying that you should go about doing this on every table in your database. There are very specific situations where this makes sense, and they don’t occur often. The only way to be completely sure is to load test your database in a testing environment with the change. Another HUGE WARNING! Doing this will cause the physical restructuring of data on the disk, and as such it can take an incredibly long time to complete this change on large tables. Again, only testing can determine if a change like this is worth making.
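For a feel of what this looks like, here’s a sketch in Python’s sqlite3 module. SQLite has no CREATE CLUSTERED INDEX statement, so this swaps in its closest equivalent: a WITHOUT ROWID table, which stores rows ordered by the primary key. Putting company_id first in that key is the assumption that stands in for “make it the first column of the clustered index”; on MSSQL and others the syntax is different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Rows are stored in primary key order, so every employee of a
    -- company sits next to the others from the same company.
    CREATE TABLE employees (
        company_id INTEGER NOT NULL,
        id         INTEGER NOT NULL,
        name       TEXT NOT NULL,
        PRIMARY KEY (company_id, id)
    ) WITHOUT ROWID;
""")

conn.executemany(
    "INSERT INTO employees (company_id, id, name) VALUES (?, ?, ?)",
    [(2, 1, "Pat"), (1, 2, "Sam"), (2, 3, "Lee"), (1, 4, "Kim")],
)

query = "SELECT name FROM employees WHERE company_id = 2"

# The lookup by company goes straight to the table's own key order.
plan = " | ".join(
    row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query)
)
names = sorted(row[0] for row in conn.execute(query))

print(plan)
print(names)
```

One trade-off to notice: with this key, id on its own is no longer guaranteed unique, so the login and profile lookups would want their own unique index on id.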
5) Dropping, Rebuilding and Defragging Indexes and General Maintenance
When was the last time you rebuilt or defragged your indexes?
Have you updated the statistics on the database lately?
How about checked the integrity of your data?
When was the last time you backed up your data?
Do you have any idea what I’m saying?
If you aren’t continually performing proper maintenance on your database, none of the ideas I’ve talked about are worth doing. Without proper maintenance your database will continue to degrade in performance no matter what you do. Almost all aspects of maintenance can be automated, and it’s so simple there’s no reason not to do it. MSSQL is especially easy since it has the Database Maintenance Plan wizard to guide you through it all. Check out the documentation that came with your database server to see what maintenance options or wizards it comes with.
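To show how little there is to it, here’s a sketch of the checklist above run against SQLite from Python’s sqlite3 module. The commands are SQLite’s versions of these tasks, and the sample table is made up; your database server will have its own commands or wizard for each step.

```python
import sqlite3

# Autocommit mode, since VACUUM cannot run inside a transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, company_id INTEGER, name TEXT)"
)
conn.execute("CREATE INDEX idx_employees_company_id ON employees (company_id)")
conn.executemany(
    "INSERT INTO employees (company_id, name) VALUES (?, ?)",
    [(i % 5, "emp%d" % i) for i in range(100)],
)

conn.execute("REINDEX")   # rebuild the indexes
conn.execute("ANALYZE")   # update the statistics the query planner uses

# Check the integrity of the data; a healthy database reports 'ok'.
integrity = conn.execute("PRAGMA integrity_check").fetchone()[0]

conn.execute("VACUUM")    # rebuild the database file to remove fragmentation

# Back up to a second database (another in-memory one here, for the demo).
backup = sqlite3.connect(":memory:")
conn.backup(backup)
copied = backup.execute("SELECT COUNT(*) FROM employees").fetchone()[0]

print(integrity, copied)
```

Put something like this on a schedule and you never have to think about it again.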
More from the website that emulates the true personality of the internet (sheds a tear).
And the winner this week:
Below is a link to George Noory’s interview with Ron Paul last night on Coast to Coast. Give a listen:
So we had to install an SSL cert for one of our clients. Lo and behold, the client goes to print out some badges and the images don’t want to show up. First words out of my mouth were…. “MOTHER FUCKER, GOD DAMN CFDOCUMENT!!!”
Yes kiddies, that’s right CFDOCUMENT doesn’t like SSL.
The solution (which sucks) is to make sure you put the full NON-SSL URL inside the document for every image. Make sure that you also do this for any CSS that you are including, because that will break too.
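If you’d rather not hand-edit every link, the rewrite can be done on the HTML before it goes into the document. Below is a rough sketch of that idea in Python, not CFML; the function name, the example.com host and the regex are all assumptions made up for the illustration, and it only looks at quoted src and href attributes.

```python
import re

def force_plain_http(html, host):
    """Rewrite src/href values to absolute http:// URLs.

    `host` is the site's host name, used for root-relative paths.
    """
    def fix(match):
        attr, quote, url = match.group(1), match.group(2), match.group(3)
        if url.startswith("https://"):
            # Swap the scheme on absolute SSL links.
            url = "http://" + url[len("https://"):]
        elif url.startswith("//"):
            # Protocol-relative link: pin it to plain http.
            url = "http:" + url
        elif url.startswith("/"):
            # Root-relative path: make it a full non-SSL URL.
            url = "http://" + host + url
        return "%s=%s%s%s" % (attr, quote, url, quote)

    return re.sub(r"\b(src|href)=([\"'])(.*?)\2", fix, html)

badge_html = (
    '<link rel="stylesheet" href="https://example.com/css/print.css">'
    '<img src="/images/badge.png">'
)
fixed = force_plain_http(badge_html, "example.com")
print(fixed)
```

Be aware that this also rewrites ordinary links, not just images and stylesheets, so only run it on the markup that is headed for the generated document.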
Happy International Bunny Day everyone!!!!!
Some know it as Easter, but I tell it like it is. Go enjoy the day with your families and remember to have rabbit for dinner tonight. Yeah I know I’m a morbid fuck, but rabbit is soooooooooooooo good.
Again BASH is slow….. sigh. So I had to not only get the latest and greatest, but reach into the archives, the deep cockles, the colon even, of BASH to meet the quota. Enjoy!
and this week’s winner….
In case you missed it, BlueDragon (the J2EE version) is going open source, as announced by New Atlanta earlier this week. This announcement has single-handedly fired up the CF community. I’ve been hoping for so long that the Smith Project would take off and become the front runner of CFML engines, since it was the only open source one at the time. Alas, the project has been plagued with little enthusiasm from the CF community, since it doesn’t support a lot of features and many just don’t want to contribute. BD on the other hand is feature rich and ready to go.
But let’s take a step back and look at what this could mean for other languages and frameworks out there. What could the others get from BD going open source?
For one thing, I’ve always wanted to just ditch CF and get into Rails, but there are some features in CF that I just can’t get away from. I think the biggest is document generation with CFDOCUMENT. I can’t tell you how many clients I have using this feature to print invoices, attendee badges and other things. CFDOCUMENT makes it so easy to do. Try using PDF::Writer in Rails to print an invoice; you’ll probably shoot yourself. This is one feature that the Rails community could look at, port to Ruby and ultimately include in Rails.
Personally I think the open sourcing of BD was a very bold move by New Atlanta and will undoubtedly help get CF in front of more people. Kudos to Vince and the team for stepping up!
Can’t wait for the download.
Sucks man…. BASH for some reason is just so slow lately. Well anywho, guess that means I have to reach into the archives once again…. but who gives a fuck, it’s BASH!
Runners up: These made me crack up!
Winner this week: This one had me crying for 5 minutes.
Visit my brother’s blog right now!