SSIS Design Pattern – Collect Enterprise SQL Server Database Metadata With SSIS

Introduction

A few months back I received an email from a nice lady who was beginning to work with SSIS. She was trying to use SSIS to automate metadata collection in her enterprise and she had made a good start in the package she attached to the email. She had a few questions so I opened the package and began poking around some to answer them. I realized it would take me a lot longer to type out answers to her questions than to simply build the SSIS package for her (or at least make a good start). So I did. This post is about that package.

I’ve built similar stuff in the past to monitor the schemas of data warehouse sources. “Why would you do such a thing, Andy?” I know this will come as a shock to you: Sometimes developers and DBAs make changes to the schemas of data warehouse sources and don’t tell the data warehouse people. Ok, I made that part up – that never happens. But think about how you could address the issue if it ever happened. Hypothetically.

First Things First

There are a couple approaches to collecting SQL Server database metadata. My preference is to automate the process to the point that I can drop it almost anywhere and get results with a minimal amount of tinkering and tweaking. For me, this means my development process is iterative because I never get things right the first time. Ever.

This Version

The current version of this package is striped 0.4. The version I write about here is version 0.1. It serves to introduce the principles and functionality in a less complex way – allowing you the Reader to take the project in any direction you deem cool. But that means there’s stuff missing from this SSIS package that I would normally include. Stuff like error handling and logging, for example.

A Place For My Data 

Before I started I decided to store the data in a SQL Server database. You don’t need the database for the SSIS part of the project described in this post. You can simply skip the tasks that write the metadata to the database. But if you skip them and I decide to write about version 0.2, you will come back and want the database scripts. You can download them here.

In this post, I am not going to focus on the database I designed. Like the SSIS package, this is version 0.1 of the database. There is no attempt at referential integrity in this version. It is a simple, single-pass effort at designing a landing zone for this data.

A. The Servers

Servers? Raise Your Hands Please

Create a new SSIS project and name it EnterpriseDBMetadata. Name the package EnterpriseDBMetadata.dtsx.

Create the following package-scoped variables (in alphabetical order):

  • ColumnDataType (String)
  • ColumnName (String)
  • ColumnNames (Object)
  • ColumnObjectID (Int32)
  • DatabaseId (Int32)
  • DatabaseName (String)
  • DatabaseNames (Object)
  • DatabaseObjectId (Int32)
  • saSQLServers (Object)
  • SchemaId (Int32)
  • SchemaName (String)
  • SchemaNames (Object)
  • SchemaObjectId (Int32)
  • SQLServerId (Int32)
  • SQLServerInstanceName (String)
  • TablesAndViewsId (Int32)
  • TablesAndViewsName (String)
  • TablesAndViewsNames (Object)
  • TablesAndViewsObjectId (Int32)
  • TablesAndViewsType (String)

 

Add a Script Task to the Control Flow and rename it Enumerate SQL Server Instances. Editing the Script Task, add saSQLServers to the ReadWriteVariables property:

 

Click the Design Script button to open the Visual Studio for Applications (VSA) script designer and, in Project Explorer, right-click References to add references to the Microsoft.SqlServer.Smo, Microsoft.SqlServer.SmoEnum, and Microsoft.SqlServer.SqlEnum assemblies:

 

Next add the following VB.Net code to the script designer:

Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime
Imports Microsoft.SqlServer.Management.Smo
Imports Microsoft.SqlServer.Management.Common

Public Class ScriptMain

   Public Sub Main()

      Dim oSmo As New SmoApplication
      Dim dtSQLServers As DataTable = oSmo.EnumAvailableSqlServers()
      Dim iSQLServerCount As Integer = dtSQLServers.Rows.Count – 1
      Dim i As Integer = 0
      Dim saSQLServers() As String
      Dim sSQLServerInstanceName As String

      ReDim saSQLServers(iSQLServerCount)

      For Each row As DataRow In dtSQLServers.Rows
         sSQLServerInstanceName = row(“Name”).ToString
         ‘MsgBox(sSQLServerInstanceName)
         saSQLServers(i) = sSQLServerInstanceName
         i += 1
      Next

      Dts.Variables(“saSQLServers”).Value = saSQLServers

      Dts.TaskResult = Dts.Results.Success

   End Sub

End Class

This is a handy piece of code. It basically creates a list of all the SQL Server instances it can locate on your network and shoves that list into an SSIS variable called saSQLServers.

Shredding The List

Steve Jones – my friend, fellow SQL Server MVP, and editor of SQL Server Central – was kind enough to publish a recent article I wrote about using the Foreach Loop Container to shred object variables. You can read more about that here. Shredding is just a fancy word (we call them “$3 words” here in Farmville) for reading individual items in a collection.

Next drop a Foreach Loop Container onto the Control Flow and connect the Script Task to it with a Success Precedence Constraint:

 

Double-click the Foreach Loop Container to open the editor. On the General page rename it ForEach SQL Server Instance. Select the Foreach From Variable Enumerator on the Collection page, and then select the saSQLServers variable:

 

On the Variable Mappings page, map the SQLServerInstanceName variable to Index 0. There’s only one column of items in my saSQLServers variable – it’s an SSIS object variable, but it contains a single-dimension String array:

 

Click the Ok button to close the Foreach Loop Container editor.

Let’s take a moment to talk about what we did. We have a variable named saSQLServers that is populated by the Script Task with a list of all the SQL Server instances it can identify on the network. The Foreach Loop Container shreds this variable, walking through each value – each SQL Server instance name, in this case – one at a time. Each pass, it pushes the value of one of those SQL Server instance names into a variable called SQLServerInstanceName.

In and of itself, this is pretty cool.

The Destination Connection

Now we need to connect our SSIS package to SQL Server so we can store the name of the servers we can reach. Right-click inside the Connection Managers window at the bottom of the Control Flow and select New OLE DB Connection:

 

When the Configure OLE DB Connection Manager form displays, click the New button to display the Connection Manager editor. In the Server Name dropdown, enter the name of the SQL Server instance to which you deployed the EnterpriseDBMetadata scripts. In the “Select or enter a database name” dropdown, select EnterpriseDBMetadata:

I always recommend testing the connection. Click the Test Connection button to verify connectivity, then click the Ok button to close the Connection Manager editor. Then click the Ok button again to close the Configure OLE DB Connection Manager.

Rename this connection “EnterpriseDBMetadata.OLEDB”.

The Source Connection

I can hear you thinking “Gosh Andy, you did that backwards! Most people start at the source and then talk about the destination.” Yeah, I know. But I’m an engineer at heart and engineers are not normal people.

How many source Connection Managers do we need?

“That’s a trick question Andy, I know it! But I think we need one for each SQL Server instance out there.”

You are partially correct. It is a trick question – you’re right about that. We do need one source Connection Manager for each SQL Server instance, but we are only going to be connected to one SQL Server instance at a time as we iterate through our Foreach Loop Container. So we can get away with one source Connection Manager if we can only find a way to aim it at the individual SQL Server instance we wish, when we wish it.

So let’s do that.

Create another OLE DB Connection Manager as you did before. This time configure it to connect to a local or developement instance to which you can connect using Windows Authentication. Configure the database name as before and click the Ok buttons to end the Connection Manager creation.

Rename this Connection Manager “SQLServerInstance.OLEDB”.

Right-click the SQLServerInstance.OLEDB Connection Manager and select Properties. Click on the Expressions Property and then click the ellipsis in the Value textbox for the Expressions Property:

 

When the Property Expressions form displays, select the ServerName property from the Property dropdown:

Click on the ellipsis on the Expression textbox to display the Expression Builder form: 

Expand the Variables list and drag the SQLServerInstanceName variable into the Expression textbox as shown above. I recommend always checking the value of an expression. To do so, click the Evaluate Expression button. Click the Ok button to close the Expression Builder, then click the Ok button again to close the Property Expressions editor.

You’ve just done something else that’s pretty cool, so let’s reflect: We have a variable named saSQLServers that is populated by the Script Task with a list of all the SQL Server instances it can identify on the network. The Foreach Loop Container shreds this variable, walking through each value – each SQL Server instance name, in this case – one at a time. Each pass, it pushes the value of one of those SQL Server instance names into a variable called SQLServerInstanceName. And each time it changes the value of the SQLServerInstanceName variable, the SQLServerInstance.OLEDB Connection Manager’s connection properties are updated to now point to that instance of SQL Server.

How cool is that?

Show Me!

I love lamp. And I like Script Tasks. I use Script Tasks a lot (lamps too!). One cool use of them is to show me the value of a variable that’s being changed by a Foreach Loop Container. Since we have a Foreach Loop Container that’s changing a variable, allow me to demonstrate.

Drag a Script Task into the Foreach Loop Container and rename it “Show SQL Server Instance Name”. Open the editor and add SQLServerInstanceName to the ReadOnlyVariables property:

 

Add these two lines of code to Sub Main():

Dim sSQLServerInstanceName As String = Dts.Variables(“SQLServerInstanceName”).Value.ToString
MsgBox(“SQL Server Instance Name: ” & sSQLServerInstanceName)

All this script does is read the value currently stored in SQLServerInstanceName and display it in a message box:

Close the VSA editor and click the Ok button to close the Script Task editor.

Test Run Time!

You can run the package now to see which servers it finds. I did. That’s how I generated the last image.

Store It

Next, drag an Execute SQL Task into the ForEach SQL Server Instance Container and connect the Script Task to it using a Success precedence constraint. Name the Execute SQL Task “Add the SQL Server name” and double-click it to open the editor.

Select the EnterpriseDBMetadata.OLEDB connection in the Connection property dropdown. Click the SQLStatement property and then the ellipsis in the Value. When the Enter SQL Query form displays, paste this T-SQL into the textbox:

If Not Exists(Select SQLServerInstanceName
              from dbo.SQLServerInstances
              where SQLServerInstanceName = ?)
 begin
  Insert Into dbo.SQLServerInstances
  (SQLServerInstanceName)
  Output inserted.Id
  Values(?)
 end
Else
 begin
  Select Id
  from dbo.SQLServerInstances
  where SQLServerInstanceName = ?
 end

See the question marks? This is a parameterized query. The question marks are placeholders for parameters. The collection of parameters comprises an array, and the array is zero-based. This means the first question mark – that one in the first If statement – maps to parameter 0. The second maps to parameter 1, the third to parameter 2. There’s a pattern emerging here… You get the picture.

To map these parameters in the Execute SQL Task, click on the Parameter Mapping page:

 

Map parameters by clicking the Add button to add a parameter mapping. Since we have three question marks in our T-SQL, you need three parameters. The first parameter maps to the SQLServerInstanceName variable, so I select that variable from the Variable Name dropdown. I am supplying the variable to the query, so it’s an Input (Direction). It’s a String data type and Strings map to the VARCHAR Data Type in the OLEDB provider I’m using in the ConnectionType property. I supply the ordinal of the parameter to which I am mapping for Parameter Name. In other words, this is the first question mark, so the Parameter Name is 0. I leave the Parameter Size set to the default (-1).

Now it just so happens that all three of our question marks map to the same SSIS variable value: SQLServerInstanceName. That isn’t always the case. But since it is here, I configure the remaining two parameters identically to the first, the only exception is the Parameter Name –> ordinal values.

The T-SQL in this Execute SQL Task returns a value. If the dbo.SQLServerInstances table does not contain a value for the current SQLServerInstanceName, the current SQLServerInstanceName is added. Did you catch the Output clause in the Insert statement? Output inserted.Id returns the value of the newly inserted Id, which is an Identity(1,1) column in the dbo.SQLServerInstances table. The Output clause rocks. It’s new in T-SQL 2005 and allows us to write a single statement to perform the insert and return the inserted row’s identity value. Note that if the SQLServerInstanceName exists in the dbo.SQLServerInstances table, the Id for that row is returned.

Either way, you get an Id value back from this statement. What do you do with it? Return to the General page of the Execute SQL Task editor and change the ResultSet Property value from “None” to “Single row”. This tells the Execute SQL Task to expect a row back from the T-SQL statement.

We want to capture this Id value for later (patience! You’ll see…). Now that we’ve told the Execute SQL Task to expect a single row from the statement’s execution, click on the Result Set page to configure it.

Similar to the parameters collection, single-row resultsets are a zero-based array and use ordinals to map various columns in the row to SSIS variables. Click the Add button to add a new ResultSet. Change the Result Name to 0 and select the SQLServerId variable:

Let’s review: We configured the Execute SQL Task to find or insert a SQLServerInstanceName into the dbo.SQLServerInstances table and return the Id of the row to us. We used a parameterized query for this. Why? Because it’s inside a Foreach Loop Container and the SQLServerInstanceName value is going to change with each iteration (pass) through the loop. We want to add them all to the dbo.SQLServerInstances table, one at a time. Make sense? Good. We also want to store the Id value for some reason Andy won’t tell us yet. Bad Andy.

The General page of the configured Execute SQL Task looks like this:

Click the Ok button to close the Execute SQL Task editor.

A Pattern Emerges…

What we’ve built so far is actually the template for everything else in this package. The package is a lot more complex than this, but it will help you immensely if you realize that this piece of functionality is merely repeated with minor tweaking to build the remainder of the package.

Now. You can copy and paste a lot here if you want to. And I don’t mind if you do so long as you promise to build really good tests. You should build really good tests anyway but testing SSIS’s a topic for another post. My point is simple: You can introduce a host of errors – some of them very interesting (which is another way of saying “hard to find”) – cutting and pasting. So be careful.

The pattern looks like this:

 

That’s it really. Looks simple when you look at it this way. That’s because it is simple. It just takes work to implement this in SSIS. Or in any other language or platform. Surprise! We call this “software development”. Smile

B. The Databases

Next drag another Execute SQL Task into the ForEach SQL Server Instance Loop Container. Connect the “Add the SQL Server name” Execute SQL Task to the new one using a Success precedence constraint and rename the new Execute SQL Task “Get DatabaseNames”:

Double-click the new Execute SQL Task to open the editor. Set the Connection property to the SQLServerInstance.OLEDB Connection Manager. Set the SQLStatement property to the following T-SQL statement:

select name, Database_Id
from master.sys.databases
where name not in (‘master’,’msdb’,’model’,’tempdb’)

This statement queries the master.sys.databases table for a list of databases and their Database_Id values, explicitly excluding system databases by name. Now there are other ways to explicitly exclude system databases and you are welcome to add them to the comments for this post. I love learning new stuff – teach me something! (In Farmville, we’d say “Learn me something”. We talk funny. I know.)

We do not use any parameters in this statement. It is dynamically aimed at the current instance of SQL Server by the dynamic expression we defined for the ServerName property or the SQLServerInstance.OLEDB Connection Manager. So this query will return a list of database on each SQL Server Instance identified by the “Enumerate SQL Server Instances” Script Task, as saSQLServers is shredded by the “ForEach SQL Server Instance” Loop Container, pushing new values into the SQLServerInstanceName variable, re-aiming the SQLServerInstance.OLEDB Connection Manager. Got it? Good.

What to do with this list of databases… I know – let’s push them into a variable.

Unlike last time, we’re returning a couple columns and several rows. The Single Row ResultSet simply will not do – we need a “Full result set” so set the ResultSet property of the Execute SQL Task to this value. On the Reuslt Set page, click the Add button and name the new Result 0. Select DatabaseNames from the Variable Name column:

So now, the list of database names and IDs are going to be pushed into the DatabaseNames object variable.

Object variables are cool. You can use them for anything. They are like Variants in old VB. They hold scalars or collections, or collections of collections. In this case we’re pushing a full ADO.Net Dataset into the DatabaseNames object variable.

How cool is that?

Click the Ok button to close the Execute SQL Task editor.

A-Shreddin’ We Will Go

Next drag a Foreach Loop Container into the “ForEach SQL Server Instance” Loop Container. Connect the Get DatabaseNames Execute SQL Task to the new Foreach Loop Container using a Success precedence constraint and rename the new Foreach Loop Container “Foreach Database”.

Double-click the Foreach Database Loop Container and navigate to the Collection page. Select the Foreach ADO Enumerator and the DatabaseNames variable in the “ADO object source variable” dropdown. ADO.Net datasets can hold multpile tables but ours only has one table, so accept the default Enumeration mode of “Rows in the first table”:

 

Click the Variable Mappings page and add a couple variable mappings.

Remember: the DatabaseNames object variable contains an ADO.Net dataset that has one DataTable. That table is populated with the results of the T-SQL query in the “Get DatabaseNames” Execute SQL Task. The T-SQL shaped the DataTable contained inside the DatabaseNames variable. That query had two columns in the Select clause. We map the data in the DatabaseNames variable to scalar SSIS variables using the ordinal of the columns in that T-SQL Select clause. Since it read select name, Database_Id, we map name to the DatabaseName SSIS variable using ordinal 0 and Database_Id to the DatabaseObjectId SSIS variable using ordinal 1. Make sense? Good.

Click the Ok button to close the Foreach Loop Container editor.

Let’s look at what we have here. Once we’ve established a connection to a particular instance of SQL Server, we query the master database for a list of database names and Ids. We then shred that list in a nested Foreach Loop Container.

Before we move on, we need to add some more dynamic functionality to our SQLServerInstance.OLEDB Connection Manager. So right click the SQLServerInstance.OLEDB Connection Manager and select Properties. Click the Expressions property and then click the ellipsis in the Value textbox. When the Property Expressions Editor displays select InitialCatalog in the Property dropdown. Click the ellipsis in the Expression textbox:

 

In the Expression Builder, add the DatabaseName variable to the Expression:

This makes the SQLServerInstance.OLEDB Connection Manager even more dynamic. It now is aimed at individual databases as the Foreach Database Loope Container iterates.

Show Me The Databases!

Drag a Script Task into the Foreach Database Loop Container. Rename it “Show Database Name” and double-click to open the editor. On the Script page add the SQLServerInstanceName, DatabaseName, DatabaseObjectId variables to the ReadOnlyVariables property:

Click the Design Script button and add the following code to Sub Main():

Dim sSQLServerInstanceName As String = Dts.Variables(“SQLServerInstanceName”).Value.ToString
Dim sDatabaseName As String = Dts.Variables(“DatabaseName”).Value.ToString
Dim iDatabaseObjectId As Integer = Convert.ToInt32(Dts.Variables(“DatabaseObjectId”).Value)

Dim sMsg As String = “SQL Server Instance Name: ” & sSQLServerInstanceName & vbCrLf & _
             “Database Name: ” & sDatabaseName & vbCrLf & _
             “Database Id: ” & iDatabaseObjectId.ToString

MsgBox(sMsg)

Close the VSA editor and click the Ok button to close the Script Task editor.

This script builds a string containing the values of the SQLServerInstanceName, DatabaseName, and DatabaseObjectId variables; then displays them in a message box:

 

Store The Database Metadata

Drag an Execute SQL Task into the Foreach Database Loop Container. Connect the Show Database Name Script Task to it using a Success precedence constraint and rename the Execute SQL Task “Add the Database Name”. Double-click it to open the editor. Set the following properties:

  • ResultSet: Single row 
  • Connection: EnterpriseDBMetadata.OLEDB
  • SQLStatement:

    If Not Exists(Select DatabaseName
                  from dbo.Databases
                  where DatabaseName = ?
                    and SQLServerInstancesTableId = ?)
     begin
      Insert Into dbo.Databases
      (DatabaseObjectId
      ,SQLServerInstancesTableId
      ,DatabaseName)
      output inserted.Id
      Values(?, ?, ?)
     end
    Else
     begin
      Select Id
      from dbo.Databases
      where DatabaseName = ?
      and SQLServerInstancesTableId = ?
     end

There are a few more parameters this time. Map them as shown in the following list and image:

  • DatabaseName, Input, VARCHAR, 0
  • SQLServerId, Input, LONG, 1
  • DatabaseObjectId, Input, LONG, 2
  • SQLServerId, Input, LONG, 3
  • DatabaseName, Input, VARCHAR, 4
  • DatabaseName, Input, VARCHAR, 5
  • SQLServerId, Input, LONG, 6

 

Click the Result Set page and add a new resultset. Set the Result Name to 0 and assign the result to the variable DatabaseId:

Click the Ok button to close the Execute SQL Task editor.

Is it starting to come together now? Can you see where we’re headed? Do you smell what Andy’s cooking? Wink

C. The Schemas 

Add another Execute SQL Task to the Foreach Database Loop Container and rename it “Get SchemaNames”. Add a Success precedence constraint from the “Add the Database Name” Execute SQL Task to the Get SchemaNames Execute SQL Task, and double-click Get SchemaNames to open the editor. Set the following properties:

  • ResultSet: Full result set
  • Connection: SQLServerInstance.OLEDB
  • SQLStatement:

    select name, Schema_Id
    from sys.schemas
    where schema_id < 16384
    and name not in (‘guest’, ‘INFORMATION_SCHEMA’, ‘sys’)

This statement returns dbo and user schemas and their respective Ids.

Configure the Result Set page with the following resultset properties:

  • Result Name: 0
  • Variable Name: SchemaNames

Click the Ok button to close the Execute SQL Task editor.

Shred The Schemas

Drag a new Foreach Loop Container into the Foreach Database Loop Container and rename it “Foreach Schema”. Connect the “Get SchemaNames” Execute SQL Task to it using a Success precedence constraint. Double-click Foreach Schema to open the editor and set the following properties:

  • Collection page, Enumerator: Foreach ADO Enumerator
  • Collection page, Ado object source variable: SchemaNames
  • Collection page, Enumeration mode: Rows in first table
  • Variable Mappings page
    • Variable: SchemaName; Index: 0
    • Variable: SchemaObjectId; Index: 1

Click the Ok button to close the Foreach Loop Container editor.

Show Me The Schemas!

Drag a Script Task into the Foreach Schema Loop Container and rename it “Show Schema Name”. Double-click to open the editor and add the following SSIS variables to the ReadOnlyVariables property: SQLServerInstanceName, DatabaseName, DatabaseObjectId, SchemaName, SchemaObjectId. Click the Design Script button to open the VSA editor and add the following code to Sub Main():

Dim sSQLServerInstanceName As String=Dts.Variables(“SQLServerInstanceName”).Value.ToString
Dim sDatabaseName As String = Dts.Variables(“DatabaseName”).Value.ToString
Dim iDatabaseObjectId As Integer = Convert.ToInt32(Dts.Variables(“DatabaseObjectId”).Value)
Dim sSchemaName As String = Dts.Variables(“SchemaName”).Value.ToString
Dim iSchemaObjectId As Integer = Convert.ToInt32(Dts.Variables(“SchemaObjectId”).Value)

Dim sMsg As String = “SQL Server Instance Name: ” & sSQLServerInstanceName & vbCrLf & _
                     “Database Name: ” & sDatabaseName & vbCrLf & _
                     “Database Id: ” & iDatabaseObjectId.ToString & vbCrLf & _
                     “Schema Name: ” & sSchemaName & vbCrLf & _
                     “Schema Id: ” & iSchemaObjectId.ToString

MsgBox(sMsg)

Close the VSA editor and click the Ok button to close the Script Task editor.

Store Schema Metadata

Drag an Execute SQL Task into the Foreach Schema Loop Container and connect the “Show Schema Name” Script Task to it using a Success precedence constraint. Rename the Execute SQL Task “Add the Schema Name” and double-click it to open the editor. Set the following properties:

  • ResultSet: Single row 
  • Connection: EnterpriseDBMetadata.OLEDB
  • SQLStatement:

    If Not Exists(Select SchemaName
                  from dbo.Schemas
                  where SchemaName = ?
                    and DatabasesTableId = ?)
     begin
      Insert Into dbo.Schemas
      (SchemaObjectId
      ,DatabasesTableId
      ,SchemaName)
      output inserted.Id
      Values(?, ?, ?)
     end
    Else
     begin
      Select Id
      from dbo.Schemas
      where SchemaName = ?
      and DatabasesTableId = ?
     end

Map the parameters as shown in the following list and image:

  • SchemaName, Input, VARCHAR, 0
  • DatabaseId, Input, LONG, 1
  • SchemaObjectId, Input, LONG, 2
  • DatabaseId, Input, LONG, 3
  • SchemaName, Input, VARCHAR, 4
  • SchemaName, Input, VARCHAR, 5
  • DatabaseId, Input, LONG, 6

Click the Result Set page and add a new resultset. Set the Result Name to 0 and assign the result to the variable SchemaId:

Click the Ok button to close the Execute SQL Task editor.

D. The Tables and Views

Add another Execute SQL Task to the Foreach Schema Loop Container and rename it “Get Table and View Names”. Add a 
Success precedence constraint from the “Add the Schema Name” Execute SQL Task to the “Get Table and View Names” Execute SQL Task, and double-click Get SchemaNames to open the editor. Set the following properties:

  • ResultSet: Full result set
  • Connection: SQLServerInstance.OLEDB
  • SQLStatement:

    select name, Object_Id, type_desc
    from sys.tables
    where schema_id = ?
    union
    select name, Object_Id, type_desc
    from sys.views
    where schema_id = ?

This statement returns tables and views, their respective Ids, and type descriptions (table or view). 

Map the following parameters:

  • SchemaObjectId, Input, LONG, 0, -1
  • SchemaObjectId, Input, LONG, 1, -1

Configure the Result Set page with the following resultset properties:

  • Result Name: 0
  • Variable Name: TablesAndViewsNames

Click the Ok button to close the Execute SQL Task editor.

Shred The Tables and Views

Drag a new Foreach Loop Container into the Foreach Schema Loop Container and rename it “Foreach Table and View”. Connect the “Get Table and View Names” Execute SQL Task to it using a Success precedence constraint. Double-click Foreach Table and View to open the editor and set the following properties:

  • Collection page, Enumerator: Foreach ADO Enumerator
  • Collection page, Ado object source variable: TableAndViewsNames
  • Collection page, Enumeration mode: Rows in first table
  • Variable Mappings page
    • Variable: TableAndViewsName; Index: 0
    • Variable: TableAndViewsObjectId; Index: 1
    • Variable: TableAndViewsType; Index: 2

Click the Ok button to close the Foreach Loop Container editor.

Show Me The Table And Views!

Drag a Script Task into the Foreach Table and View Loop Container and rename it “Show Table and View Name”. Double-click to open the editor and add the following SSIS variables to the ReadOnlyVariables property: SQLServerInstanceName, DatabaseName, DatabaseObjectId, SchemaName, SchemaObjectId, TablesAndViewsName, TablesAndViewsObjectId, TablesAndViewsType. Click the Design Script button to open the VSA editor and add the following code to Sub Main():

Dim sSQLServerInstanceName As String = Dts.Variables(“SQLServerInstanceName”).Value.ToString
Dim sDatabaseName As String = Dts.Variables(“DatabaseName”).Value.ToString
Dim iDatabaseObjectId As Integer = Convert.ToInt32(Dts.Variables(“DatabaseObjectId”).Value)
Dim sSchemaName As String = Dts.Variables(“SchemaName”).Value.ToString
Dim iSchemaObjectId As Integer = Convert.ToInt32(Dts.Variables(“SchemaObjectId”).Value)
Dim sTableAndViewName As String = Dts.Variables(“TablesAndViewsName”).Value.ToString
Dim iTableAndViewObjectId As Integer = Convert.ToInt32(Dts.Variables(“TablesAndViewsObjectId”).Value)
Dim sTableAndViewType As String = Dts.Variables(“TablesAndViewsType”).Value.ToString

Dim sMsg As String = “SQL Server Instance Name: ” & sSQLServerInstanceName & vbCrLf & _
“Database Name: ” & sDatabaseName & vbCrLf & _
“Database Id: ” & iDatabaseObjectId.ToString & vbCrLf & _
“Schema Name: ” & sSchemaName & vbCrLf & _
“Schema Id: ” & iSchemaObjectId.ToString & vbCrLf & _
“Table / View Name: ” & sTableAndViewName & vbCrLf & _
“Table / View Id: ” & iTableAndViewObjectId.ToString & vbCrLf & _
“Table / View Type: ” & sTableAndViewType
MsgBox(sMsg)

Close the VSA editor and click the Ok button to close the Script Task editor.

Store Tables And Views Metadata

Drag an Execute SQL Task into the Foreach Tables and Views Loop Container and connect the “Show Table and View Name” Script Task to it using a Success precedence constraint. Rename the Execute SQL Task “Add the Table and View Name” and double-click it to open the editor. Set the following properties:

  • ResultSet: Single row 
  • Connection: EnterpriseDBMetadata.OLEDB
  • SQLStatement:

    If Not Exists(Select ObjectName
                  from dbo.TablesAndViews
                  where ObjectName = ?
                    and ObjectType = ?
                    and SchemasTableId = ?)
     begin
      Insert Into dbo.TablesAndViews
      (TablesAndViewsObjectId
      ,SchemasTableId
      ,ObjectName
      ,ObjectType)
      output inserted.Id
      Values(?, ?, ?, ?)
     end
    Else
     begin
      Select Id
      from dbo.TablesAndViews
      where ObjectName = ?
        and ObjectType = ?
        and SchemasTableId = ?
     end

Map the parameters as shown in the following list and image:

  • TablesAndViewsName, Input, VARCHAR, 0
  • TablesAndViewsType, Input, VARCHAR, 1
  • SchemaId, Input, LONG, 2
  • TablesAndViewsObjectId, Input, LONG, 3
  • SchemaId, Input, LONG, 4
  • TablesAndViewsName, Input, VARCHAR, 5
  • TablesAndViewsType, Input, VARCHAR, 6
  • TablesAndViewsName, Input, VARCHAR, 7
  • TablesAndViewsType, Input, VARCHAR, 8
  • DatabaseId, Input, LONG, 9

 

Click the Result Set page and add a new resultset. Set the Result Name to 0 and assign the result to the variable TablesAndViewsId:

Click the Ok button to close the Execute SQL Task editor.

E. The Tables and Views

Add another Execute SQL Task to the Foreach Table and View Loop Container and rename it “Get Column Names and Metadata”. Add a Success precedence constraint from the “Add the Table and View Name” Execute SQL Task to the “Get Column Names and Metadata” Execute SQL Task, and double-click “Get Column Names and Metadata” to open the editor. Set the following properties:

  • ResultSet: Full result set
  • Connection: SQLServerInstance.OLEDB
  • SQLStatement:

    select distinct
      c.name
     ,c.column_id
     ,t.name
    from sys.columns c
    inner join sys.types t on t.user_type_id = c.user_type_id
    where c.object_id = ?

This statement returns columns, their respective Ids, and type names. 

Map the following parameters:

  • TablesAndViewsObjectId, Input, LONG, 0, -1

Configure the Result Set page with the following resultset properties:

  • Result Name: 0
  • Variable Name: ColumnNames

Click the Ok button to close the Execute SQL Task editor.

Shred The Columns

Drag a new Foreach Loop Container into the Foreach Table and View Loop Container and rename it “Foreach Column”. Connect the “Get Column Names and Metadata” Execute SQL Task to it using a Success precedence constraint. Double-click Foreach Column to open the editor and set the following properties:

  • Collection page, Enumerator: Foreach ADO Enumerator
  • Collection page, Ado object source variable: ColumnNames
  • Collection page, Enumeration mode: Rows in first table
  • Variable Mappings page
    • Variable: ColumnName; Index: 0
    • Variable: ColumnObjectId; Index: 1
    • Variable: ColumnDataType; Index: 2

Click the Ok button to close the Foreach Loop Container editor.

Show Me The Columns!

Drag a Script Task into the Foreach Column Loop Container and rename it “Show Column Name”. Double-click to open the editor and add the following SSIS variables to the ReadOnlyVariables property: SQLServerInstanceName, DatabaseName, DatabaseObjectId, SchemaName, SchemaObjectId, TablesAndViewsName, TablesAndViewsObjectId, TablesAndViewsType, ColumnName, ColumnObjectId, ColumnDataType. Click the Design Script button to open the VSA editor and add the following code to Sub Main():

Dim sSQLServerInstanceName As String = Dts.Variables(“SQLServerInstanceName”).Value.ToString
Dim sDatabaseName As String = Dts.Variables(“DatabaseName”).Value.ToString
Dim iDatabaseObjectId As Integer = Convert.ToInt32(Dts.Variables(“DatabaseObjectId”).Value)
Dim sSchemaName As String = Dts.Variables(“SchemaName”).Value.ToString
Dim iSchemaObjectId As Integer = Convert.ToInt32(Dts.Variables(“SchemaObjectId”).Value)
Dim sTableAndViewName As String = Dts.Variables(“TablesAndViewsName”).Value.ToString
Dim iTableAndViewObjectId As Integer = Convert.ToInt32(Dts.Variable
(“TablesAndViewsObjectId”).Value)
Dim sTableAndViewType As String = Dts.Variables(“TablesAndViewsType”).Value.ToString
Dim sColumnName As String = Dts.Variables(“ColumnName”).Value.ToString
Dim iColumnObjectId As Integer = Convert.ToInt32(Dts.Variables(“ColumnObjectId”).Value)
Dim sColumnDataType As String = Dts.Variables(“ColumnDataType”).Value.ToString

Dim sMsg As String = “SQL Server Instance Name: ” & sSQLServerInstanceName & vbCrLf & _
“Database Name: ” & sDatabaseName & vbCrLf & _
“Database Id: ” & iDatabaseObjectId.ToString & vbCrLf & _
“Schema Name: ” & sSchemaName & vbCrLf & _
“Schema Id: ” & iSchemaObjectId.ToString & vbCrLf & _
“Table / View Name: ” & sTableAndViewName & vbCrLf & _
“Table / View Id: ” & iTableAndViewObjectId.ToString & vbCrLf & _
“Table / View Type: ” & sTableAndViewType & vbCrLf & _
“Column Name: ” & sColumnName & vbCrLf & _
“Column Id: ” & iColumnObjectId.ToString & vbCrLf & _
“Column DataType: ” & sColumnDataType

MsgBox(sMsg)

Close the VSA editor and click the Ok button to close the Script Task editor.

Store Columns And Metadata

Drag an Execute SQL Task into the Foreach Column Loop Container and connect the “Show Column Name” Script Task to it using a Success precedence constraint. Rename the Execute SQL Task “Add the Column Name” and double-click it to open the editor. Set the following properties:

  • ResultSet: Single row 
  • Connection: EnterpriseDBMetadata.OLEDB
  • SQLStatement:

    If Not Exists(Select ColumnName
                  from dbo.Columns
                  where ColumnName = ?
                    and TablesAndViewsTableId = ?)
     begin
      Insert Into dbo.Columns
      (ColumnsObjectId
      ,TablesAndViewsTableId
      ,ColumnName
      ,ColumnDataType)
      Values(?, ?, ?, ?)
     end

Map the parameters as shown in the following list and image:

  • ColumnName, Input, VARCHAR, 0
  • TablesAndViewsId, Input, LONG, 1
  • ColumnObjectId, Input, LONG, 2
  • TablesAndViewsId, Input, LONG, 3
  • ColumnName, Input, VARCHAR, 4
  • ColumnDataType, Input, VARCHAR, 5

Click the Ok button to close the Execute SQL Task editor.

Conclusion

You’re done!

When you’re done your SSIS package should look something like this:

 

You can get the SSIS package here.

:{> Andy

Andy Leonard

andyleonard.blog

Christian, husband, dad, grandpa, Data Philosopher, Data Engineer, Azure Data Factory, SSIS guy, and farmer. I was cloud before cloud was cool. :{>

16 thoughts on “SSIS Design Pattern – Collect Enterprise SQL Server Database Metadata With SSIS

  1. Great article! But a lot of the screenshots (not all though) appear ‘blacked out’.  Is this just me?

  2. Hi alphatross and Olivier,
      I’m not sure what’s happening with the images. I can see them on all my machines (and I have a few).
      If anyone has any ideas on what’s hindering the display of these images and how to fix it, please let me know.
    Thanks,
    Andy

  3. Great Article.  I wonder if it would be possible to retrieve the Server Name and Instance Name as 2 separate columns in the "Enumerate SQL Server Instances" script task.  I would like to pass Server Name and Instance Name to the For Each Loop.  Right now, it only passes in the SERVER\Instance as 1 column.
    Thanks again for the article.

  4. I downloaded the package and tested it and I get an error
    Error: ForEach Variable Mapping number 1 to variable “User::SQLServerInstanceName” cannot be applied.
    and..
    Error: The type of the value being assigned to variable “User::SQLServerInstanceName” differs from the current variable type. Variables may not change type during execution. Variable types are strict, except for variables of type Object.
    Anyone else run into this?

  5. Andy, thanks for the great post! I was exactly looking for this information and was happy to see that this was your post.

  6. This is awesome!  Is there an easy way that I can use this example and add IP Address to what is collected from the server when it gets the instance name?

  7. Hi guys.. Do you know what causes this exception when using Nested For Each Loop..
    Error: Variable "User::xNPU_NM_Files" does not contain a valid data object

  8. Great post, but my question is why not simple use information schema to retrieve all this information, I mean why reinvent the wheel.
    Cheers,
    Moeen K.

  9. Andy, this is great article.  I am assumming that the reason that you wouldn’t want to go the information schema route is due to the flexibility of encompassing this code into a package. If you write t-sql and place in stored procedures it will not be very flexible.

  10. Hi, Andy.  I know that it has been a while since anyone has commented on this article, but I had a quick question.
    Any reason that you are setting the Dts variables to local variables, before using them later in the code?  It is possible to just use the Dts variables when defining the string to be displayed in the message box.  I was just curious as to the reason that you are doing it the way that you are.
    Thanks.

  11. Hi Andy.  Thanks for this.  I was trying to nest a foreach – the outer to loop over like tables, and the inner to loop over distinct report dates within the table.  I had that part, but was having difficulty using the inner loop variable in my query within the inner loop.  Don’t know what it was, but reading through your article a light bulb went off, and now it works.  Thanks.

  12. Hi Andy, Great post. I have a question about using the smo to iterate through the domain to identify SQL Servers. Will the SMOapplication iterate across multiple domains in an enterprise or must it be run in each domain. Thanks,  Blake

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.