SSIS is Case-Sensitive

Introduction

SSIS is case-sensitive even if the database is case-insensitive.

Imagine…

 … you work in an ETL shop where someone who believes in natural keys won the Battle of the Joins. Imagine one of your natural keys is a string. (I know it’s a stretch… play along!).

Let’s build some tables to sketch it out. If you do not have a TestDB database, why not? Build one! You’ll use it often.

Use TestDB
go

Create Table SSIS1
(StrID char(5)
,Name varchar(15)
,Value int)

Insert Into SSIS1
Select ‘abc’,‘One’,1
union all
Select ‘def’,‘Two’,2
union all
Select ‘ghi’,‘Three’,3
union all
Select ‘jkl’,‘Four’,4

Create Table SSIS2
(STRID char(5) constraint PK_SSIS2_STRID Primary Key
,Name varchar(15)
,Value int)

Insert Into SSIS2
Select ‘abc’,‘One’,1
union all
Select ‘def’,‘Two’,2
union all
Select ‘ghi’,‘Three’,3
union all
Select ‘JKL’,‘Four’,4

Cool – now you have a couple tables we can use.

The SSIS Project

Now let’s build an SSIS project. After you create the project, add a Data Flow Task to the Control Flow and open it for editing. Drag an OLE DB Source Adapter onto the canvas and configure a connection to the SSIS1 table in your TestDB:

 

Next, add a Lookup Transformation and connect a Data Flow Path from the OLE DB Source Adapter to the Lookup. Configure the Lookup to use a SQL Query that pulls data from the SSIS2 table in your TestDB:

Click the Configure Error Output button and change the Error column for the Lookup Output buffer from the default (Fail Component) to Ignore Failure. This action allows the Lookup to return rows (populated with NULLs) if it does not find a match between Data Flow data and data returned by the SQL Query:

Configure the Columns in the Lookup as shown:

Close the Lookup Transformation editor and drag a Conditional Split Transformation onto the Data Flow canvas. Open the editor and add an Output buffer to detect New Rows as shown:

Close the Conditional Split Transformation editor. Next, add an OLE DB Destination to the canvas and connect a Data Flow Path from the Lookup to it. Open the editor and configure as shown:

Close the OLE DB Destination editor and let’s run this! It fails as shown:

Why? That’s an excellent question! Let’s add some Data Viewers to find out

First, SSIS2.STRID is the primary key. Since the database is case-insensitive it believes ‘jkl’ is the same as ‘JKL’ and therefore we are trying to load a duplicate primary key into the table. The technical term for this is “bad” (like crossing the streams in GhostBusters).

The Data Viewers reveal there are four rows flowing in from the Source Adapter, but only one is making it through the Conditional Split (Filter) – the ‘jkl’ row. Why is the Lookup matching on the other three rows and not this row?

SSIS is case-sensitive and the database is not.

The Lookup Transformation is accomplishing an operation similar to (but not exactly like) a Left Join. If we execute the following T-SQL in SSMS we’ll see something similar to the results of the Lookup:

select *
from SSIS1
left join SSIS2 on SSIS1.StrID = SSIS2.STRID

The results in SSMS are:

StrID Name            Value       STRID Name            Value
—– ————— ———– —– ————— ———–
abc   One             1           abc   One             1
def   Two             2           def   Two             2
ghi   Three           3           ghi   Three           3
jkl   Four            4           JKL   Four            4

Note the join works between the keys ‘jkl’ and ‘JKL’ in T-SQL, but not in SSIS.

How Do You Fix It?

Great question. There are a couple approaches – this one works for me.

In the OLE DB Source Adapter I add a field to the query – Upper(StrID) as JoinField:

Why not simply apply the Upper() function to the StrID column in the original query? I want to preserve the case of the actual data during the load. I see no need to alter the case of the data to accomplish a join. I want that data – as is – loaded. And I want my Lookup to work. And I want a pony, but that’s another post…

I perform a similar change to the SQL Query in the Lookup: 

The major change is accomplished on the Columns tab of the Lookup where I use the newly added JoinField to match rows from the SSIS1 table (already in the Data Flow’s data stream) to rows returned from the SSIS2 table:

Why don’t I return the value of JoinField from the Available Lookup Columns? Again, I’m not loading that data. I need that column only to make my join work.

Executing now brings the desired results:

Conclusion

This is one way to work around the case-sensitivity of SSIS when loading case-insensitive data. Do you have a different (or better) way? Please let me know!

:{> Andy

Andy Leonard

andyleonard.blog

Christian, husband, dad, grandpa, Data Philosopher, Data Engineer, Azure Data Factory, SSIS guy, and farmer. I was cloud before cloud was cool. :{>

5 thoughts on “SSIS is Case-Sensitive

  1. Excellent post!  I like the idea of a join field.  I have not done that, but will likely make the change as I work with SSIS in the
    future.

  2. Hi,
    I have done the lookup transformation by using upper() function. but by doing so, i lost the original case of the value.
    With your idea, the original case can be retained. thanks for sharing the idea.

  3. Hello Andy.  SQL Server case insensitivity has me spoiled, and, consequently, made the rookie mistake of not respecting the case sensitive nature of SSIS.  Thanks for the reminder.  -john-  

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.