When upgrading our Oracle database to 11g we received the following error when executing previously working SSIS packages on our MSSQL2008R2 server:
Executed as user: <redacted> Microsoft (R) SQL Server Execute Package Utility Version 10.50.1600.1 for 64-bit Copyright (C) Microsoft Corporation 2010. All rights reserved. Started: 3:10:34 PM
Error: 2013-08-16 15:10:37.14 Code: 0xC02020F6 Source: <redacted> OLE DB Source Description: Column "<redacted>" cannot convert between unicode and non-unicode string data types. End Error
Error: 2013-08-16 15:10:37.15 Code: 0xC004706B Source: <redacted> SSIS.Pipeline Description: "component "OLE DB Source" (1)" failed validation and returned validation status "VS_ISBROKEN". End Error
Error: 2013-08-16 15:10:37.15 Code: 0xC004700C Source: <redacted> SSIS.Pipeline Description: One or more component failed validation. End Error
Error: 2013-08-16 15:10:37.15 Code: 0xC0024107 Source: <redacted> Description: There were errors during task validation. End Error
DTExec: The package execution returned DTSER_FAILURE (1). Started: 3:10:34 PM Finished: 3:10:37 PM Elapsed: 2.308 seconds. The package execution failed. The step failed.
Turns out the fix was very simple. Basically all we had to do was edit the package in Notepad and replace all occurrences of the old data source value, following this advice:
Most likely this is an issue with the lookup for your alias. A quick workaround is to set the Data Source and Provider Strings to the following:
<FQ hosting name>:<port>/<alias>
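For example, here is a hedged sketch of what the filled-in connection properties might look like; the host, port, and service name below are made up, so substitute your own:

```
Data Source=dbhost.example.com:1521/PRODDB
Provider String=dbhost.example.com:1521/PRODDB
```

The key point is to use the fully qualified host name and port plus the service name directly, rather than relying on the tnsnames alias lookup.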
Thanks to bkgroups<AT>yahoo<DOT>com for posting the answer on sqlmonster.com all the way back in 2007.
I was smacking my head against the wall trying to figure out why in the world a scheduled task I wrote was timing out on my production server, but not on my development server.
I started by looking at the database first, since I know that's where 99% of performance tuning can be done. I fired up Query Analyzer, popped in the SQL statement, and began combing through the tables and views that made up the query.
To make a long story short (too late), I noticed that one of the views returned a count aggregate. In my development database every row returned a value, but in my production database one row was returning a NULL. I patched the problem with a quick isnull() and reran the query. IT WORKED!!!!
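The post doesn't show the actual view, but here is a minimal hypothetical reproduction of the failure mode, where an aggregate reached through an outer join comes back NULL and a quick isnull() patches it (the companies/orders tables and columns are made up for illustration):

```sql
-- The bug: a LEFT JOIN plus an aggregate yields NULL
-- for any parent row that has no matching child rows.
SELECT c.company_id,
       SUM(o.total) AS order_total      -- NULL when a company has no orders
FROM   companies c
LEFT JOIN orders o ON o.company_id = c.company_id
GROUP BY c.company_id;

-- The one-line patch: coalesce the NULL away at the source.
SELECT c.company_id,
       ISNULL(SUM(o.total), 0) AS order_total
FROM   companies c
LEFT JOIN orders o ON o.company_id = c.company_id
GROUP BY c.company_id;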
The lessons I learned: double-check your work, always account for NULLs in joins and aggregates, and never write business logic views at 4 in the morning.
Ben started a whirlwind of ideas with his recent post about optimizing SQL queries for better performance. I was getting too carried away with commenting and decided that the further tips I had should be put down in my own post. Below are the ideas I commented on, along with some new ones.
1) Create indexes on foreign key(s)
I think, no wait, I know that this is the most overlooked design aspect in almost every database. Far too often people forget to create an index for the foreign key(s) on a table. I guess the reasoning is: since every database server automatically creates an index on the columns that make up the primary key, why wouldn't it do the same for foreign keys? Well, it doesn't, and that's why this is the most overlooked and biggest culprit of performance loss in a database.
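A quick sketch of the idea, using hypothetical employees/companies tables: the primary key gets an index automatically, but the foreign key column does not, so you create that one yourself.

```sql
-- Hypothetical table: the PK is indexed automatically,
-- the FK column company_id is not.
CREATE TABLE employees (
    employee_id INT NOT NULL PRIMARY KEY,
    company_id  INT NOT NULL
        REFERENCES companies (company_id),
    name        VARCHAR(100) NOT NULL
);

-- Create the index on the foreign key yourself,
-- so joins and lookups on company_id can seek instead of scan.
CREATE INDEX IX_employees_company_id
    ON employees (company_id);
```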
2) Denormalization

How many times have you seen a statement that joins 10 different tables just to grab a single value from each? I'm so guilty of this, I should be thrown to the wolves. Again, this is a HUGE performance hog, and it can be avoided by sitting down and taking the time to denormalize your database. What is denormalization, you ask? It's the idea that data in one table can be copied into another table to prevent a join from occurring. I guess an example I can give is:
You have 2 tables named employees and companies. Companies has a one-to-many relation to employees. When you write a query to retrieve an employee's company name, you create a join between the employees table and the companies table and retrieve the company name.
It's simple, we all do it, and it's probably not the greatest example, but I wanted something simple. Now if you wanted to denormalize this, you would create a column on the employees table called company_name and copy the value from the companies table into that column, thus preventing you from having to create a join when retrieving the company name in a query, which improves the performance of the query. This can be accomplished at the application level or by using triggers within the database.
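As a hedged T-SQL sketch of the trigger approach (the table and column names here are hypothetical), you add the copied column and keep it in sync whenever the source table changes:

```sql
-- Add the denormalized column to the child table.
ALTER TABLE employees ADD company_name VARCHAR(100) NULL;

-- Keep the copy in sync whenever a company is renamed.
CREATE TRIGGER trg_companies_sync_name
ON companies
AFTER UPDATE
AS
BEGIN
    UPDATE e
    SET    e.company_name = c.company_name
    FROM   employees e
    JOIN   inserted c ON c.company_id = e.company_id;
END;
```

You would also need to populate company_name on insert (of either row), which is the maintenance burden the next section alludes to.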
Now, I wouldn't use denormalization in this situation; again, I wanted something simple to explain the concept. With that said, when you should start and what you should denormalize all depend on the data and on how long your queries are taking to retrieve it. Most of the time, though, I only denormalize simple data that isn't changed too often. For the rest, I use views.
3) Views and Indexed Views
Yes, they're different, and indexed views aren't supported on all database platforms, so this doesn't pertain to everyone.
Denormalization is tough to deal with and maintain. So how do we go about increasing the performance of our database without copying columns and data everywhere? We use views!
Views are basically queries that you can save and then use like a normal table when writing other queries. The idea is that instead of writing the same sub-queries or derived tables over and over throughout your application, you can move those statements into views and then interact with them as you would with any other table.
Views are a great place to house the business logic of your application when you don’t want to or can’t use stored procedures. I use them a lot in applications to calculate expiration dates of memberships, totaling up line items in a shopping cart or just about any other calculations that the application needs.
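A hedged sketch of the membership-expiration use case mentioned above (the members table, its columns, and the one-year term are all made up for illustration):

```sql
-- Centralize the expiration calculation in one view.
CREATE VIEW active_members AS
SELECT m.member_id,
       m.name,
       DATEADD(year, 1, m.joined_on) AS expires_on
FROM   members m
WHERE  m.cancelled = 0;

-- Any query, report, or application can now use the same logic:
SELECT member_id, name
FROM   active_members
WHERE  expires_on < GETDATE();
```

If the expiration rule ever changes, you change it in the view once instead of hunting down every query that repeats the calculation.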
Imagine trying to maintain a set of complicated calculations and business logic that is copied over and over in queries scattered throughout an application. You can't!
Now I understand that most people argue that calculations like these should be moved into classes within the application, but I've found that that's not always a smart thing to do and can sometimes cause a bigger performance loss than you realize. Imagine, if you will, that you have a method in your application that calculates a member's expiration date, and you need access to this information at the database level. If the calculation is performed at the application level, that means you will need to copy and translate the logic into your database. By moving the calculation into a view, you have access to the information from both the database and the application, and it's all in one place! And since the calculation is performed at the database level, you write less code in your application and don't have to run the calculation for each record you return.
Ok great… so what is an indexed view? An indexed view is basically the same thing as a regular view, except that you have the option of creating indexes on it, whereas on a regular view you can't. This can greatly increase the performance of queries that use views or span multiple views. BE FOREWARNED though: there are STRICT guidelines that you must follow in the creation and use of these views, and they differ between database servers. You must consult your database server's documentation when attempting to use them.
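To give a flavor of those guidelines, here is a hedged T-SQL sketch (table and index names are hypothetical): in SQL Server, indexed views require SCHEMABINDING, two-part object names, and COUNT_BIG(*) when aggregating, among other restrictions; check your server's documentation for the full list.

```sql
-- The view must be schema-bound and reference tables
-- by two-part names.
CREATE VIEW dbo.company_headcount
WITH SCHEMABINDING
AS
SELECT company_id,
       COUNT_BIG(*) AS employee_count   -- COUNT_BIG is required here
FROM   dbo.employees
GROUP BY company_id;

-- The first index on a view must be unique and clustered;
-- this is what materializes the view's results.
CREATE UNIQUE CLUSTERED INDEX IX_company_headcount
    ON dbo.company_headcount (company_id);
```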
Examples on using and creating indexed views in MSSQL 2000 can be found here.
4) Clustered Indexes
At the beginning of this article I mentioned that placing indexes on foreign keys can be a big help, and that almost all database servers automatically create indexes for primary keys. Expanding on that is the use of clustered indexes versus regular indexes.
When you create a regular old index on a table, your database server basically creates a separate structure with information about where everything is within that table; nothing happens to the data in the original table. Not so when you create a clustered index. A clustered index tells the database server how to physically order the table's data on the server's disk. This is the reason you can only have one clustered index on a table, and it's THE MOST important indexing decision you can make when talking about performance.
Remember the advice at the beginning about placing indexes on foreign keys? To expand on that: you should also determine whether your foreign key should be, or be part of, the clustered index on the table.
Let's look back at the scenario with the employees and companies tables. After we've written our application, we notice that there are many times throughout the application when we query to list or find the employees that are part of a particular company. Further investigation reveals that the only time we ever query the employees table directly is when we're authenticating a login or retrieving a profile.
Looking at this scenario, it would probably make sense to include the company foreign key in the clustered index on the employees table and make it the first column of the index. The reason is that this will dramatically speed up the seek time when getting all the employees for a company, since they will all be stored next to each other.
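A hedged sketch of that change in T-SQL (index and table names are hypothetical, and this assumes the primary key was declared NONCLUSTERED or its default clustered index has been dropped first, since a table can have only one clustered index):

```sql
-- Lead the clustered index with the foreign key so each company's
-- employees are physically stored together on disk; employee_id is
-- appended to make the key unique.
CREATE UNIQUE CLUSTERED INDEX IX_employees_company
    ON employees (company_id, employee_id);
```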
Now let me pull the reins back a little. I'm not saying that you should go and do this on every table in your database. There are very specific situations where this action makes sense, and they don't occur often. The only way to be completely sure is to load test the change against your database in a testing environment. Another HUGE WARNING! Doing this physically restructures the data on disk, so the change can take an incredibly long time to complete on large tables. Again, only testing can determine whether a change like this is worth making.
5) Dropping, Rebuilding and Defragging Indexes and General Maintenance
When was the last time you rebuilt or defragmented your indexes?
Have you updated the statistics on the database lately?
How about checked the integrity of your data?
When was the last time you backed up your data?
Do you have any idea what I’m saying?
If you aren't continually performing proper maintenance on your database, none of the ideas I've talked about are worth doing. Without proper maintenance your database will continue to degrade in performance no matter what you do. Almost every aspect of maintenance can be automated, and it's so simple there's no reason not to do it. MSSQL makes it especially easy since it has the Database Maintenance Plan wizard to guide you through it all. Check out the documentation that came with your database server to see what maintenance options or wizards it comes with.
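For reference, the questions above map onto a handful of T-SQL commands; this is a hedged sketch (the database, table, index names, and backup path are hypothetical), and in practice you would schedule these through a maintenance plan or an Agent job rather than run them by hand:

```sql
-- Rebuild a fragmented index:
ALTER INDEX IX_employees_company_id ON employees REBUILD;

-- Refresh the optimizer's statistics for a table:
UPDATE STATISTICS employees;

-- Check the integrity of the data:
DBCC CHECKDB ('MyDatabase');

-- And back it all up:
BACKUP DATABASE MyDatabase TO DISK = 'D:\Backups\MyDatabase.bak';
```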