Rip's Domain

Optimizing your database with indexes

Posted in Microsoft, SQL, TechSupport by rip747 on December 3, 2007

While browsing Reddit, I came across an article about why you should index your database.

In short, the author finally figures out that using an index can decrease the execution time of a query dramatically. An incredibly uninformative article, since there is no mention of what he indexed or any advice as how you should go about doing it yourself.

Well being a database guy at heart I thought that I would give some hints in this subject since it’s a fairly common one. Many a times in my career I’ve been asked by people about what they should be indexing in their databases. I don’t know why, but people seem scared to place an index on a table. Let me just tell you that you aren’t going to destroy your database by experimenting with indexes, you might slow it down a bit, but you won’t mess up any data. So don’t be afraid to start playing around.

So what should we index? As a rule of thumb you should be indexing the following:

  1. Any foreign keys linking table together (you will be amazed that this is the most overlooked index to create)
  2. Any columns that you will use to create joins between table (usually these are the same as the foreign keys, but they could differ sometimes)
  3. Any columns that you will be filtering against in your queries (these are columns in your queries that you reference in your where clauses)

By going down those 3 items, you will most likely see a performance gain in your database. Now there are some gotchas with this:

  1. Be careful to not index a column more then once. This could be common if you place an index on a foreign key and then include that foreign key within another index. There are obviously exceptions to the rule.
  2. You don’t want to place an index on every column in a table even if you do use even column in where clauses throughout your application. By doing so you will be killing the performance of the table. Try to use judgment in selecting the most important columns within the table.

Can you have too many indexes on a table? Of course you can! One of the ways that you know that you have too many indexes is if you see a lag when altering data on a table (inserting, updating and deleting). Remember that your database server has to maintain and do housecleaning on all indexes affected by altering the data on a table. Because of this, if you place too many indexes on a table, it will slow down the write and updates of the indexes which will cause lags.

For people just getting started with indexes and using SQL Server, a great way to learn and practice is to use the Index Tuning Wizard that is included with SQL Server. To practice, place a slow running query or a query that you want to optimize in Query Analyzer and execute it. Next, try figuring out on your own which columns on which tables you should index in order to make the query perfomrm better and write them down. Run the Index Tuning Wizard and see if your suggestions matches the ones that the Index Tuning Wizard comes up with.

Good luck optimizing your tables with indexes. If you have any questions, suggestions or comments, leave them below.

3 Responses

Subscribe to comments with RSS.

  1. Victor said, on December 6, 2007 at 10:27 am

    Good article!

    In the databases we can create indexes on multiple columns. When we have to do this and when we not? What are merits and demerits of multiple column indexes?

    Thanks

  2. Andy Yates said, on December 6, 2007 at 11:15 am

    Multiple column indexes can only be used by the database engine in certain situations. For example we’ve got a table with the fields NAME, SEX and AGE and we apply a multiple column index in the same order now these queries should use the index:

    select * from tab where name =? and sex=? and age>?
    select * from tab where name =? and sex=?
    select * from tab where name =?

    However this query:

    select * from tab where name =? and age>?

    Will be very very happy to use the index for name but will then do a scan of the rows for the age rather than skipping into the age portion of the index.

    You should use multiple column indexes when you are accessing the data from a table in a particular manner (actually in Oracle you have one if you’re defining a unique constraint over multiple fields). Otherwise Oracle experience has taught me that most of the time single indexes will perform nearly as well.

  3. […] couple of days ago, I found dzone after someone took one of my article and threw it up there. Now I’ve replace my digg bookmarklet with a dzone one and […]


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: