Saturday, May 26, 2012

SSIS: How to resolve Excel Import 255 character Truncation issue?

You might have seen this topic headline (Excel import issue) discussed in many forums, blogs etc. This issue puts a serious limitation to Ms-Excel to be used as a data source. In this post we are going to see conceptually; why the experts are saying so? I recently learnt this and putting it here so that other people can learn.

Let us take an example – we have following data in Ms-Excel spreadsheet. We have 8 rows (from Row # 2 to Row # 9). We have three columns - EmpName, EmpComments and ColumnLength. The ColumnLength field contains the length of characters in field EmpComments. The highest number of character length is in Row #9 – 686.

I created a SQL table which will store the data exported from above Ms-Excel spreadsheet.

CREATE TABLE [dbo].[Employee](
    [EmpName] [varchar](50) NULL,
    [EmpComment] [varchar](1000) NULL
)

The table has two columns as my intention is to export only two columns – EmpName and EmpComment.

I build up a SSIS package with two connection manager – Excel Connection manager which points to our Ms-Excel spreadsheet and one OLE DB Connection which points to our SQL Server database.

In the package we have a Data Flow Task. Under Data Flow tab; we have three components. The first component is Excel Source which reads the rows from Ms-Excel Spreadsheet. The Derived Column component simply converts the columns EmpName and EmpComment data type from Unicode to String. The OLE DB Destination component is to load the values it received from Derived Column to table Employee under SQL Server database.

The derived column converts the unicode valaues to string as our data type in SQL table is varchar.

After setting up the SSIS package, I pressed the run button and it runs smoothly. All 8 rows were exported to SQL table.

Till this point all went fine for us. Now let us try to repeat the same scenario with 9 records (from Row # 2 to Row # 10). We have added a new Row at Row #9.

After setting the Ms-Excel Spreadsheet with 9 rows, I tried to run the SSIS package and this time it failed.

The Progress tab reported that there is a Truncation error at column EmpComments. Why? We did not change anything except adding a new row in our source spreadsheet.

If you handle the truncation error in your SSIS package it will insert only 255 characters in columns and truncate rest of the characters (EmpComments in our case).

The reason for this is that Ms-Excel connection manager actually works on a guess. Yes, that is right it works on the assumptions. It reads a registry key called "TypeGuessRows". The default value for this registry key is 8. This mean excel will scan the first 8 row to determine the data type and data length of the column.

The valid value for the key “TypeGuessRows” can be between 0 to 16. Setting the value to 0 tell Ms-Excel to scan all the rows before determining the data length and data type of column. This will cause a small performance hit especially if the data volume is large.

So in our first scenario all went fine, the Ms-Excel scanned the 8 rows and determined that the length of the column Empcomments would be equal to at least 686 characters.

In our second example, we inserted a row in between and as you can see the first 8 rows has maximum 77 characters length in column EmpComments. The 9th row has 686 characters length so the Package failed to insert and return truncation error.

So what is the solution? All we need to tell Ms-Excel is to scan some more rows before assuming the data type and data length of the columns. Right?

Yes, that is exactly what we have to do. I went to the registry and changed the value to 0. When the TypeGuessRows values become 0 it scans the 16,384 rows before assuming the data type and data length of the column.

After changing the TypeGuessRows values the SSIS package executed successfully. All 9 rows were imported from Ms-Excel spreadsheet.

So in a nutshell, if your data volume is extremely large and you are not sure about the position of rows with maximum length characters in the spreadsheet; the Ms-Excel may fail to achieve the purpose. It will fail or truncate the data. The solution is to use dump your data into a text file and use the text file instead as source of your data load.

I hope you like the post.
Please feel free to add comment to make this post more usable.

Related Article:

Popular Posts

Real Time Web Analytics