• Home
  • Modeling for BI
  • DBA
  • Azure
  • SSIS
  • SSAS
  • SSRS - Reporting
  • PBI - Analytics
  • Consulting
  • About
Microsoft Data & AI

SQL Server Integration Services

Error Handling in SSIS, Redirecting Error Rows, Report and Pass

7/12/2017

0 Comments

 
I classify SSIS errors into two groups:
  1. Report and pass -- non-fatal errors that should be reported but not stop the ETL process like (my opinion) truncation or data type mismatch from a text file.
  2. Report and fail -- fatal errors that should fail a package, like "cannot acquire connection from connection manager".  (Honestly, no package is going anywhere after this one!)

This blog post is about "Report and Pass" which I first talked about in Top 5 Best Practices for SSIS.  There are hundreds of SSIS potential errors, and to start with, I STRONGLY recommend BI xPress from Pragmatic Works.  SSIS package deployment and monitoring aside, however, what I'd like to talk to you about here is error handling inside of a data flow so that we are not sacrificing the permanent, the completion of your master-master package, on the alter of the immediate, a dateTime data type that came through last night in a text file's integer column space.

Last month I dusted off an old SQL 2008 SSIS package written about that same time.  I was hoping that error handling in SSIS through VS 2015  had some cool new task to "report and pass".   Not finding anything, I was hoping that
Task Factory, a second "don't deploy without it" SSIS tool, had picked up the slack.  Nothing doing, as some say.  Here then is a review of an old proven SSIS methodology for recording data of redirected errors in a SSIS data flow. 
​

Process Overview
First we need a SQL Server data table to hold our error rows.  You might consider something like this:
CREATE TABLE [etl].[SSISErrorHandlingLog](
[RowID] [int] IDENTITY(1,1) NOT NULL,
[SSISPackageId] [varchar](50) NOT NULL,
[SSISTaskName] [varchar](50) NOT NULL,
[SSISTableName] [varchar](50) NULL,
[ErrorCode] [varchar](25) NULL,
[ErrorColumnID] [varchar](10) NULL,
[ErrorColumnName] [varchar](250) NULL,
[ErrorDescription] [varchar](1200) NOT NULL,
[RowData] [varchar](1500) NULL,
[InsertedDateTime] [datetime] NOT NULL,
Picture
CONSTRAINT [PK_SSISErrorHandlingLog] PRIMARY KEY CLUSTERED 
([RowID] ASC) ON [PRIMARY]) ON [PRIMARY]
GO
ALTER TABLE [etl].[SSISErrorHandlingLog] ADD  CONSTRAINT [DF_SSISErrorHandlingLog_InsertedDateTime]  DEFAULT (getdate()) FOR [InsertedDateTime]
GO
Picture
​We then need to add three or four tasks to each data flow: a union all (optional), a derived column, a script task (optional), and finally an OLE DB destination to our etl.SSISErrorHandlingLog table.  All things completed, a data flow will look something like this (on the left):

​The script component has two inputs: ErrorCode and ErrorColumn.  It has one output: ErrorText.  (You can also add addition C3 for ErrorColumnName.)  Other than that, the script component exposes the error description.
    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
Row.ErrorDescription= ComponentMetaData.GetErrorDescription(Row.ErrorCode);
    }


The DC (derived column) task has an expression for [RowData].
"My Table's PK=\"" + REPLACENULL((DT_WSTR,50)IntKey," ") + "\"" + " My User BK=\"" + REPLACENULL(MyStringColumn," ") + "\""
Additional derived columns might be as shown below: (I did not do myself any favors by making my table's string data types VARCHAR() instead of NVARCHAR(), so please follow your own DB data type standards.)
Picture
If you are redirecting error from multiple places, like both your SRC and DST, use a Union All task before the script task making sure to only include needed columns -- you may get SSIS error when trying to union large string values, so only union needed PKs, FKs and BKs consumed by the RowData expression.

IMPORTANT:  All things in moderation!  Here is an error handling example of what NOT to do.
Picture
Where Things Go Wrong:
  1. Your error handling was NOT done in moderation
  2. The RowData expression is incorrect.  Work in small pieces building the script one small part at a time.
  3. The data types of the derived columns or script output do not match the etl.SSISErrorHandlingLog.  Either convert your SSIS output columns or change or destination data types.
  4. Your script component follows a DOT NET component and there is no error description to be had.  The script task will "hang".
  5. The binary code for the script is not found.  I'm still trying to figure this out as it seem intermittent.  Open the script component, edit the script, save the script, save the package, open and close the package.  If you have a more technical solution, I'd love to hear from you.

Conclusion:
There is more than one way to trap and report SSIS errors, but in the end, we are all traditionally using a package event handler or redirecting errors in the data flow.  I like to see the data that caused the problem, so I tend to go a little overboard with the information concatenated for the etl.SSISErrorHandlingLog.RowData column.  I also hook up my error table to a SSRS report and SSRS subscription in an effort to actually act on the data collected, not just store it.
0 Comments



Leave a Reply.

    Categories

    All
    Best Practices
    Design
    Error Handling
    SCD2
    SSIS

    RSS Feed

    Tweets by @SqlServerSME
Powered by Create your own unique website with customizable templates.
  • Home
  • Modeling for BI
  • DBA
  • Azure
  • SSIS
  • SSAS
  • SSRS - Reporting
  • PBI - Analytics
  • Consulting
  • About