• Home
  • Modeling for BI
  • DBA
  • Azure
  • SSIS
  • SSAS
  • SSRS - Reporting
  • PBI - Analytics
  • Consulting
  • About
Microsoft Data & AI

All Things Azure

ADW PolyBase Error: Error converting RCFile Type string to Sql type NVARCHAR

8/10/2018

0 Comments

 
When testing Azure Data Lake (ADL) to Azure Data Warehouse (ADW) file ingestion, this error continued to come up on various external table SELECTs.  The confusion was that the ADL only contained parquet files.  There was only one external file format defined, and that too was obviously for parquet.  From where was an RCFile error originating?  The bottom line, in this particular engagement scenario, was that this error is actually a truncation error.

Things to Verify:
  1. The external file column list is defined in the right order.   If all your parquet file columns are defined as string, then this will not be evident until you look at the returned values from your SELECT query.  However, if you have defined actual data types in the file store, those data types must align with SQL data types in your external table definition.
  2. All NVARCHAR() columns in the external table definition are long enough.

Solution:
  1. Align your external table columns properly with the file store
  2. Enlarge the NVARCHAR() column length in your external table definition.  External tables do not store data, they just point to an external data source.  Therefore, consider going with NVARCHAR(4000) to avoid this error in the future.  The ELT data transforms is where you should CAST() all data to correct formats and lengths anyway.

Supporting t-sql Scripts
If you are new to PolyBase and external tables in SQL Server environments, here are a three t-sql scripts that are supporting references to the error resolution given above.

Example CREATE EXTERNAL DATA SOURCE t-SQL.  Click here for more information.

CREATE EXTERNAL DATA SOURCE [MyDataSourceName]
WITH (TYPE = HADOOP,
LOCATION = N'adl://MyDataLakeName.azuredatalakestore.net',
CREDENTIAL = [MyCredential])
GO


Example CREATE EXTERNAL FILE FORMAT t-SQL.  Click here for more information.
CREATE EXTERNAL FILE FORMAT [MyFileFormatName]
WITH (FORMAT_TYPE = PARQUET
, DATA_COMPRESSION = N'org.apache.hadoop.io.compress.SnappyCodec')
GO


Example CREATE EXTERNAL TABLE t-sql script.  Click here for more information.
BEGIN TRY DROP EXTERNAL TABLE [ext].[MyExternalTableName] END TRY BEGIN CATCH END CATCH
 
CREATE EXTERNAL TABLE [ext].[MyExternalTableName]
(
 [ColumnName1] bigint NULL  
,[ColumnName2] nvarchar(4000) NULL   -- if this value is too small, you will get the conversion error
,[ColumnName3] bit NULL  
,[ColumnName4] datetime NULL  
,[ADLcheckSum] nvarchar(64) NULL   
-- if this value is too small, you will get the conversion error
,[ADFIngestionId] nvarchar(64) NULL   -- if this value is too small, you will get the conversion error
)
WITH (DATA_SOURCE = [MyDataSourceName]
, LOCATION = N'/Folder1/Folder2/'
, FILE_FORMAT = [MyFileFormatName]
, REJECT_TYPE = VALUE
,REJECT_VALUE = 0)
GO 

0 Comments

    Categories

    All
    Agile Methodology
    Azure Blob Storage
    Azure Data Factory
    Azure Data Lake
    Azure Data Warehouse
    Azure SQL Database
    Cosmos DB
    Data Architecture
    Databricks
    Elastic Query
    External Tables
    Linked Services
    Migrating To The Cloud
    Parameters
    PolyBase
    Project Management

    RSS Feed

Powered by Create your own unique website with customizable templates.
  • Home
  • Modeling for BI
  • DBA
  • Azure
  • SSIS
  • SSAS
  • SSRS - Reporting
  • PBI - Analytics
  • Consulting
  • About