<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" >

<channel><title><![CDATA[Microsoft Data & AI - Modeling for BI]]></title><link><![CDATA[https://www.delorabradish.com/modeling-for-bi]]></link><description><![CDATA[Modeling for BI]]></description><pubDate>Sun, 07 Dec 2025 18:16:42 -0800</pubDate><generator>Weebly</generator><item><title><![CDATA[Data Architecture for Azure BI Programs]]></title><link><![CDATA[https://www.delorabradish.com/modeling-for-bi/data-architecture-for-azure-bi-programs]]></link><comments><![CDATA[https://www.delorabradish.com/modeling-for-bi/data-architecture-for-azure-bi-programs#comments]]></comments><pubDate>Fri, 27 Jul 2018 11:00:00 GMT</pubDate><category><![CDATA[Architecture]]></category><category><![CDATA[BI Blueprint]]></category><category><![CDATA[Modeling]]></category><guid isPermaLink="false">https://www.delorabradish.com/modeling-for-bi/data-architecture-for-azure-bi-programs</guid><description><![CDATA[A Bit of IntroIf I recall correctly, I completed the first version of this data architecture diagram in 2012 when we used terms like "road map" and "blueprint"&nbsp; Back&nbsp; then, along with different terms, we were also using traditional SSIS, SSAS-MultiD and SSRS tools.&nbsp; Now we live in the world of cloud everything, although we are still driving from SRC-to-DST (source to destination).&nbsp; I'm up for whatever terminology you want to use, but can we agree that we are surely on a diffe [...] 
]]></description><content:encoded><![CDATA[<div class="paragraph"><font color="#2a2a2a"><strong>A Bit of Intro</strong><br />If I recall correctly, I completed the first version of this data architecture diagram in 2012 when we used terms like "road map" and "blueprint".&nbsp; Back then, along with different terms, we were also using traditional SSIS, SSAS-MultiD and SSRS tools.&nbsp; Now we live in the world of cloud everything, although we are still driving from SRC-to-DST (source to destination).&nbsp; I'm up for whatever terminology you want to use, but can we agree that we are surely on a different highway?&nbsp; For my classical BI Blueprint, click <a href="https://www.delorabradish.com/modeling-for-bi/your-bi-blueprint-road-to-a-successful-bi-implementation" target="_blank">here</a>, but to see an Azure road map for BI, please take a look below.<br /><br /><strong>Disclaimer: </strong>I create a different diagram for every engagement, so think of this as a suggestion, not a mold.</font></div>  <div><div class="wsite-image wsite-image-border-none " style="padding-top:10px;padding-bottom:10px;margin-left:0;margin-right:0;text-align:center"> <a> <img src="https://www.delorabradish.com/uploads/5/3/4/3/53431729/data-architecture-for-azure-bi-programs_1_orig.jpg" alt="Picture" style="width:auto;max-width:100%" /> </a> <div style="display:block;font-size:90%"></div> </div></div>  <div class="paragraph"><strong><font color="#2a2a2a">Azure Data Architecture BI Talking Points:</font></strong><ol><li><font color="#2a2a2a">Start thinking "event", "file based ingestion", "streaming" and "near real time" in place of the former batch-mode thought process.</font></li><li><font color="#2a2a2a">Adopt an "I can, but I won't" methodology as you reach for optimal Azure solutions.&nbsp; Identify the intended purpose for each Azure tool and stick with it.&nbsp; There shouldn't be data transforms happening in every column, or standalone 
semantic layers growing in every Power BI report.</font></li><li><font color="#2a2a2a">Simple repeatability&nbsp;is the key to successful CI/CD (continuous integration, continuous delivery).&nbsp; Data enters ABS (Azure Blob Storage) in different ways, but <em>all data moves through the remainder of the ingestion pipeline in a uniform process</em>.&nbsp;&nbsp;</font></li><li><font color="#2a2a2a">Consider hiring&nbsp;a former web developer.&nbsp; More and more Azure offerings are coming with a GUI, but many will always require .NET, R, Python, Spark, PySpark, and JSON developer skills (just to name a few).&nbsp; You will need these skills for columns #2 and #5 above.</font></li><li><font color="#2a2a2a">Be prepared to replace SSIS functionality with ADFv2, v3, v4 (eventually),&nbsp;and SQL Server user stored procedures.&nbsp; I think of Azure Data Factory v2 as an "orchestrator" right now, but ADF Data Flows is Microsoft's next step in replacing SSIS.</font></li><li><font color="#2a2a2a">Build current and future state data architectures.&nbsp; This foresight&nbsp;helps to ensure a solid foundation as you build your BI house in increments.&nbsp;&nbsp;</font><span style="color:rgb(42, 42, 42)">It is generally a misstep to plan a Taj Madashboard that needs information from every system in your company as your first deliverable.&nbsp; This is also contrary to the Agile Manifesto.&nbsp; Keep something in your back pocket.&nbsp; Deliver&nbsp;business value in consistent increments.</span></li><li><font color="#2a2a2a">Model your data stores around reporting and security requirements, not what is easiest for data ingestion.</font></li><li><font color="#2a2a2a">Start small and scale up with all your Azure resources.&nbsp; &nbsp;This is the premise of a cost-effective Azure solution.</font></li></ol><br /><font color="#2a2a2a"><strong>BI Advice from the University of Hard Knocks:</strong></font><ol><li><font color="#2a2a2a">Every decision point should be what is best 
for reporting and analytics, not data transform and load.&nbsp; Please see my <a href="https://www.delorabradish.com/modeling-for-bi/the-big-picture-what-is-in-the-center-of-your-bi-wheel" target="_blank">BI Wheel</a> for success.&nbsp; The wheel spokes change with Azure, but the theory does not.</font></li><li><font color="#2a2a2a">Require uniformity and avoid one-off creative solutions.&nbsp; Make your first exception on the last day of the 5th year after you have gone to production.</font></li><li><font color="#2a2a2a">Don't complicate your data architecture just because it's new, challenging and fun.&nbsp; Always produce a finished product that can be handed off to an entry-level developer.&nbsp; More than one person in the BI team should be able to service each part of the architecture.</font></li><li><font color="#2a2a2a">Don't design your data architecture around 10% of your user base, i.e., the number of people who may hold a master's degree in statistics at your company.</font></li><li><font color="#2a2a2a">Design a BI solution without end-user input, and they will not come.&nbsp; Attitude is 50% of the success of your BI solution.&nbsp; Give strategic company users an investment in the project, and <em>then</em> they will adopt it.</font></li></ol><br /><font color="#2a2a2a"><strong>Conclusion of the Matter:&nbsp;</strong>&nbsp;I am not explaining every column in the data architecture because the columns in the above diagram are not applicable to everyone.&nbsp; For example, almost everyone needs a semantic layer, but not everyone needs a logical data store for operational reporting.&nbsp; &nbsp;Column #5 can be done in Spark as well as Databricks; instead of my telling you what the best solution is, let's talk about it.&nbsp; For every column there is a good, better and best solution, and good heavens (!) 
not everyone needs a thirteen-point data architecture!&nbsp; All things in moderation, right?&nbsp;&nbsp;<br /><br />I am asking, if you have taken the time to read this, please <em>start planning before you start building</em>!&nbsp; Opening Power BI and mashing up data from three different sources is generally not a scalable solution.&nbsp;&nbsp;Get started with a data architecture diagram and <a href="https://www.delorabradish.com/modeling-for-bi/the-big-picture-a-bi-project-is-like-building-a-house" target="_blank">build a better BI house</a>!</font></div>]]></content:encoded></item><item><title><![CDATA[Denormalizing Snowflake 3NF Dimension Tables]]></title><link><![CDATA[https://www.delorabradish.com/modeling-for-bi/denormalizing-snowflake-3nf-dimension-tables]]></link><comments><![CDATA[https://www.delorabradish.com/modeling-for-bi/denormalizing-snowflake-3nf-dimension-tables#comments]]></comments><pubDate>Mon, 29 May 2017 19:00:04 GMT</pubDate><category><![CDATA[Uncategorized]]></category><guid isPermaLink="false">https://www.delorabradish.com/modeling-for-bi/denormalizing-snowflake-3nf-dimension-tables</guid><description><![CDATA[I have a PPTX slide that I use when speaking about data modeling for BI. &nbsp;(You can find my Pragmatic Works webinar here.) &nbsp;The slide is OLTP vs OLAP and is a 10K-foot view of an actual denormalized ERD.         I'm sharing the above slide for those that are new to denormalization, but I think more can be said about how to handle dimensions that "snowflake" or daisy chain to each other. &nbsp;You can see this happen in AdventureWorks between the Product, ProductSubcategory and Pr [...] ]]></description><content:encoded><![CDATA[<div class="paragraph"><font color="#2a2a2a">I have a PPTX slide that I use when speaking about data modeling for BI. 
&nbsp;(You can find my Pragmatic Works <u><a href="http://pragmaticworks.com/Training/Details/Build-Your-BI-House-with-a-Blueprint-Structuring-Your-Data-Warehouse-for-BI" target="_blank">webinar here</a></u>.) &nbsp;The slide is OLTP vs OLAP and is a 10K-foot view of an actual denormalized ERD.</font></div>  <div><div class="wsite-image wsite-image-border-none " style="padding-top:10px;padding-bottom:10px;margin-left:0;margin-right:0;text-align:center"> <a> <img src="https://www.delorabradish.com/uploads/5/3/4/3/53431729/denormalization1_1_orig.png" alt="Picture" style="width:auto;max-width:100%" /> </a> <div style="display:block;font-size:90%"></div> </div></div>  <div class="paragraph"><font color="#2a2a2a">I'm sharing the above slide for those who are new to denormalization, but I think more can be said about how to handle dimensions that "snowflake" or daisy chain to each other. &nbsp;You can see this happen in AdventureWorks between the Product, ProductSubcategory and ProductCategory tables. &nbsp;When life is simple and all fact tables relate to Product on ProductKey, denormalization is easy to model.&nbsp;</font><br /><br /><strong><font color="#c2743b">Option #1: Combine the parent and child tables into a single subject area dimension.</font></strong></div>  <div><div class="wsite-image wsite-image-border-none " style="padding-top:10px;padding-bottom:10px;margin-left:0;margin-right:0;text-align:center"> <a> <img src="https://www.delorabradish.com/uploads/5/3/4/3/53431729/denormalization2_orig.png" alt="Picture" style="width:auto;max-width:100%" /> </a> <div style="display:block;font-size:90%"></div> </div></div>  <div class="paragraph"><font color="#2a2a2a">In the above solution, all three product tables were joined together into a single DimProductDenormalized. &nbsp;The new table contains columns and keys from all three original tables. &nbsp;This works well until a second fact table has only a ProductSubcategoryKey and no ProductKey. 
&nbsp;Now we are in a bit of a fix.</font></div>  <div><div class="wsite-image wsite-image-border-none " style="padding-top:10px;padding-bottom:10px;margin-left:0;margin-right:0;text-align:center"> <a> <img src="https://www.delorabradish.com/uploads/5/3/4/3/53431729/denormalization5_orig.png" alt="Picture" style="width:auto;max-width:100%" /> </a> <div style="display:block;font-size:90%"></div> </div></div>  <div class="paragraph"><font color="#2a2a2a">1. &nbsp;SSAS multidimensional cubes are designed to effectively and efficiently handle this exact scenario through attribute and cube relationships.</font><br /><font color="#2a2a2a">2. &nbsp;SSAS tabular&nbsp;models will require&nbsp;a second dimension at the higher, product category, grain.</font><br /><font color="#2a2a2a">3. &nbsp;The SQL database does not support a relationship between the denormalized DIM and the FactSnapshot and in fact will throw a "The columns in table &lsquo;DimProductDenormalized&rsquo; do not match an existing primary key or UNIQUE constraint" error. &nbsp;For me, this is not an issue as I only keep SQL-defined table relationships in a data warehouse for the first year as a second measure of protection for the ETL which should be enforcing referential integrity anyway. &nbsp; &nbsp;<br />4. 
&nbsp;By combining small dimensions into a single subject area dimension, the dimension count in the semantic layer has decreased and natural hierarchies are now available.</font><br /><br /><font color="#2a2a2a"><strong>Key Takeaway: </strong>The above scenario only works well for small code + description tables that can be combined to form a subject area dimension.&nbsp;</font><br /><br /><strong><font color="#c2743b">Option #2: Pull the individual dimension keys into the fact table</font></strong><font color="#2a2a2a">, thereby removing the dimension snowflake and creating a star schema around each fact table.</font></div>  <div><div class="wsite-image wsite-image-border-none " style="padding-top:10px;padding-bottom:10px;margin-left:0;margin-right:0;text-align:center"> <a> <img src="https://www.delorabradish.com/uploads/5/3/4/3/53431729/denormalization6_orig.png" alt="Picture" style="width:auto;max-width:100%" /> </a> <div style="display:block;font-size:90%"></div> </div></div>  <div class="paragraph"><font color="#2a2a2a">1. &nbsp;Snowflaked relationships are no longer used, although they can still exist on disk.</font><br /><font color="#2a2a2a">2. &nbsp;Fact tables of different grains both have a true star schema.</font><br /><font color="#2a2a2a">3. &nbsp;Dimension count in SSAS has increased.</font><br /><font color="#2a2a2a">4. &nbsp;"Subject area dimension" advantage of option #1 is lost.</font><br /><br /><strong><font color="#c2743b">Leaving the Land of AdventureWorks</font></strong><br /><font color="#2a2a2a">How might we implement these ideas in a more complex scenario? &nbsp;What happens when each snowflaked dimension is already a subject area dimension and contains ten, twenty or more dimension attributes? 
&nbsp;Please allow me to jump over to Visio now and bring in a conceptual diagram.</font></div>  <span class='imgPusher' style='float:left;height:0px'></span><span style='display: table;width:515px;position:relative;float:left;max-width:100%;;clear:left;margin-top:0px;*margin-top:0px'><a><img src="https://www.delorabradish.com/uploads/5/3/4/3/53431729/published/denormalization7_1.png?1496087645" style="margin-top: 5px; margin-bottom: 10px; margin-left: 0px; margin-right: 10px; border-width:1px;padding:3px; max-width:100%" alt="Picture" class="galleryImageBorder wsite-image" /></a><span style="display: table-caption; caption-side: bottom; font-size: 90%; margin-top: -10px; margin-bottom: 10px; text-align: center;" class="wsite-caption"></span></span> <div class="paragraph" style="display:block;"><font color="#2a2a2a">1. &nbsp;Multidimensional cubes handle 3NF through dimension design and referenced cube relationships. &nbsp;All is well.<br />2. &nbsp;Tabular models can consume 3NF by default design.<br />3. &nbsp;SSAS, SSRS and Power BI can all handle this ERD effectively. &nbsp;Changing this data model is not a requirement for a data warehouse design. &nbsp;In fact, this is what I think of as a Bill Inmon design (Inmon being the official father of data warehousing).<br />4. &nbsp;If you are familiar with my <u><a href="http://www.delorabradish.com/modeling-for-bi/your-bi-blueprint-road-to-a-successful-bi-implementation" target="_blank">BI Blueprint</a></u>, you will usually find this data warehouse design in column 5.<br />5. &nbsp;This data model is NOT optimized for reporting, but it CAN WORK just fine.</font><br /><br /><font color="#2a2a2a"><strong>Significant Issue:</strong>&nbsp;Type 2 SCD (slowly changing dimensions) can explode row counts if propagated down all referenced relationships. &nbsp; For example, when type 2 DimCostCenter has a change and inserts a new row, DimLineOfService has to react and insert a new row, as does DimCustomer. 
&nbsp;Regardless of 2NF, 3NF or worst normal form, a type 2 data warehouse model with layers of parent and grandparent dimension tables will have this problem. &nbsp;This needs its own blog post. &nbsp;Staying focused on snowflaking dimensions ...</font><br /><br /><strong><font color="#c2743b">Option #2 </font></strong><font color="#2a2a2a">solution shown with larger dimensions that cannot be combined.</font></div> <hr style="width:100%;clear:both;visibility:hidden;"></hr>  <span class='imgPusher' style='float:left;height:0px'></span><span style='display: table;width:515px;position:relative;float:left;max-width:100%;;clear:left;margin-top:0px;*margin-top:0px'><a><img src="https://www.delorabradish.com/uploads/5/3/4/3/53431729/published/denormalization8.png?1496088497" style="margin-top: 5px; margin-bottom: 10px; margin-left: 0px; margin-right: 10px; border-width:1px;padding:3px; max-width:100%" alt="Picture" class="galleryImageBorder wsite-image" /></a><span style="display: table-caption; caption-side: bottom; font-size: 90%; margin-top: -10px; margin-bottom: 10px; text-align: center;" class="wsite-caption"></span></span> <div class="paragraph" style="display:block;"><br /><font color="#2a2a2a">1. &nbsp;This is the same concept as the AdventureWorks star schema, and it gives exceptional query performance because this design is optimized for reporting.<br />2. &nbsp;</font><span style="color:rgb(42, 42, 42)">The dim-to-dim relationships still exist, although not pictured, and are used only for ETL.</span><br /><span style="color:rgb(42, 42, 42)">3. &nbsp;Very Kimball-ish FK (foreign key) heavy fact tables.</span><br /><font color="#2a2a2a">4. 
</font><strong><font color="#c2743b">&nbsp;</font><font color="#2a2a2a">Key Concept:</font></strong><font color="#2a2a2a"> Parent and grandparent dimension FKs are brought into the fact table -- </font><em style="color:rgb(42, 42, 42)">including the many role-playing dimension keys that may exist in one or more dimension layers</em><br /><font color="#2a2a2a">&nbsp; &nbsp; &nbsp;FactTable.CustomerIndustry1Key</font><br /><font color="#2a2a2a">&nbsp; &nbsp; &nbsp;FactTable.CustomerIndustry2Key</font><br /><font color="#2a2a2a">&nbsp; &nbsp; &nbsp;FactTable.CustomerIndustry3Key</font><br /><font color="#2a2a2a">5. &nbsp;Be sure to prefix your role-playing dimension keys, or the cost center associated with the transaction will be confused with the default cost center associated with the customer.<br />6. &nbsp;In my </font><u><a href="http://www.delorabradish.com/modeling-for-bi/your-bi-blueprint-road-to-a-successful-bi-implementation" target="_blank">BI Blueprint</a></u>, <font color="#2a2a2a">you will find this data warehouse design in column 7.<br />7. &nbsp;Type 2 SCD challenges still exist.<br />8. &nbsp;This is a good idea (and my personal preference), but NOT EVERYONE IS DOING IT, and there are very effective 3NF data warehouses that function daily. 
&nbsp; For a company that has an extraordinary number of snowflaking dimensions or strict (or unknown) type 2 requirements, this star schema may become a disadvantage.</font></div> <hr style="width:100%;clear:both;visibility:hidden;"></hr>]]></content:encoded></item><item><title><![CDATA[Substituting Integers for Source System Primary Key Varchar() Data Types]]></title><link><![CDATA[https://www.delorabradish.com/modeling-for-bi/substituting-integers-for-source-system-primary-key-varchar-data-types]]></link><comments><![CDATA[https://www.delorabradish.com/modeling-for-bi/substituting-integers-for-source-system-primary-key-varchar-data-types#comments]]></comments><pubDate>Thu, 09 Feb 2017 21:09:42 GMT</pubDate><category><![CDATA[Architecture]]></category><category><![CDATA[Modeling]]></category><guid isPermaLink="false">https://www.delorabradish.com/modeling-for-bi/substituting-integers-for-source-system-primary-key-varchar-data-types</guid><description><![CDATA[Situation: Many CRM data sources use varchar() GUID-looking values for primary keys. &nbsp;This blog post applies to any source system for a reporting and analytics project that uses text/string/character values to join transactional tables. &nbsp;Below are example PKs from the Salesforce Opportunity table.         If you have the privilege of a data warehouse, the extract, transform and load (ETL) process often, as best practice, replaces source system PKs with data warehouse identity seed integ [...] ]]></description><content:encoded><![CDATA[<div class="paragraph"><font color="#da8044"><strong>Situation:</strong> </font><font color="#2a2a2a">Many CRM data sources use varchar() GUID-looking values for primary keys. &nbsp;This blog post applies to any source system for a reporting and analytics project that uses text/string/character values to join transactional tables. 
&nbsp;Below are example PKs from the Salesforce Opportunity table.</font></div>  <div><div class="wsite-image wsite-image-border-none " style="padding-top:10px;padding-bottom:10px;margin-left:0;margin-right:0;text-align:center"> <a> <img src="https://www.delorabradish.com/uploads/5/3/4/3/53431729/salesforce-pks_orig.png" alt="Picture" style="width:auto;max-width:100%" /> </a> <div style="display:block;font-size:90%"></div> </div></div>  <div class="paragraph"><font color="#2a2a2a">If you have the privilege of a data warehouse, the extract, transform and load (ETL) process often, as best practice, replaces source system PKs with data warehouse identity seed integer values. &nbsp;However, with the trends in data mashups (Excel Power Query and Power BI Query Editor), this may not be happening. &nbsp;Also, some PKs, like the ones pictured above, are often brought forward into a data warehouse as a secondary "business key", and users are pulling them into their report data sources for drill-down / source system lookup capabilities.</font></div>  <div class="paragraph"><strong style="color:rgb(153, 153, 153)"><font color="#da8044">Problem:</font></strong><font color="#2a2a2a">&nbsp;String values do not compress as well as integer values, so when using these varchar() PKs in multidimensional cubes, tabular models, Excel Power Pivot and Power BI (PBI) Desktop,</font><strong style="color:rgb(42, 42, 42)"> file or memory sizes increase dramatically</strong><font color="#2a2a2a">. &nbsp;As of January 2017, PBI in-memory files have a maximum file size of 250MB. &nbsp;This can be highly problematic, as explained by my Pragmatic Works colleague, </font><a href="http://www.rachaelmartino.com/">Rachael Martino</a><font color="#2a2a2a">, in her SQL Saturday presentation </font><em><font color="#2a2a2a">Tips and Techniques for Power BI</font></em><font color="#2a2a2a">. 
&nbsp;(You can find a corresponding blog post from Rachael </font><a href="http://www.rachaelmartino.com/2016/09/performance-tips-and-techniques-for.html">here</a><font color="#2a2a2a">.) With her permission, I have borrowed the following screen print, which clearly shows the problem and the result of the resolution.</font></div>  <div><div class="wsite-image wsite-image-border-none " style="padding-top:10px;padding-bottom:10px;margin-left:0;margin-right:0;text-align:center"> <a> <img src="https://www.delorabradish.com/uploads/5/3/4/3/53431729/salesforce-pks-2_orig.png" alt="Picture" style="width:auto;max-width:100%" /> </a> <div style="display:block;font-size:90%"></div> </div></div>  <div class="paragraph"><font color="#da8044"><strong>Summary Resolution:&nbsp;</strong></font><font color="#2a2a2a">Assign a unique integer value to each varchar() primary&nbsp;key value. &nbsp;This may be easier said than done, but look at the result above. &nbsp;On the left is memory consumption "Before" by a Salesforce varchar() PK. &nbsp;When an integer value was substituted "After", memory size dropped from 25,563.63KB to 0.12KB.</font><br /><br /><font color="#da8044"><strong>Resolution&nbsp;Illustrated:</strong></font><font color="#2a2a2a"> For the next screen print I totally cheated and used the T-SQL ROW_NUMBER() and RANK() functions to illustrate my point and assign a unique integer to each varchar() value. &nbsp;However, there are at least three potential problems here:</font><br /><font color="#2a2a2a">1. &nbsp;NewAccountID and NewRecordTypeID share the same integer value. &nbsp;This may be okay -- it depends on how your ETL is written. &nbsp;</font><br /><font color="#2a2a2a">2. 
&nbsp;If you are working in Excel Power Pivot, SSAS data source or Power BI query editor, you do not have the ETL capabilities that will push these </font><em style="color:rgb(42, 42, 42)">same integer values into </em><em style="color:rgb(42, 42, 42)"><em>multiple</em> child tables</em><font color="#2a2a2a">.</font><br /><font color="#2a2a2a">3. &nbsp;If you are working in Azure DW, as of January 2017 Azure DW did not have auto-incrementing identity seed capabilities, but that is a semi-related topic for another day.</font></div>  <div><div class="wsite-image wsite-image-border-none " style="padding-top:10px;padding-bottom:10px;margin-left:0;margin-right:0;text-align:center"> <a> <img src="https://www.delorabradish.com/uploads/5/3/4/3/53431729/salesforce-pks-3_1_orig.png" alt="Picture" style="width:auto;max-width:100%" /> </a> <div style="display:block;font-size:90%"></div> </div></div>  <div class="paragraph"><font color="#da8044"><strong>Creative Problem Solving, Please: </strong></font><font color="#2a2a2a">When I find myself in a bit of a fix like this, the answer is always the same: <strong>What is best for reporting and analytics (R&amp;A)? </strong>&nbsp;ETL (or ELT) is not the spoke of my </font><em><a href="http://www.delorabradish.com/modeling-for-bi/the-big-picture-what-is-in-the-center-of-your-bi-wheel"><font color="#da8044">BI Wheel</font></a></em><font color="#2a2a2a">. &nbsp;In fact, data transformation frequently writes a check payable to Father Time to make a better R&amp;A experience. &nbsp;This is another one of those instances. &nbsp;You should handle this in your source-to-data warehouse data integration step.<br /><br />As a last resort, you can play ROW_NUMBER() and RANK() games inside your data source views. 
&nbsp;You can also continue to use these varchar() PK values for table relationships inside of SSAS tabular models, but be sure to set 'Hide from Client Tools' so they don't end up being pulled into PBI memory or used as slicers or column values. &nbsp;If you are using tabular models and include these columns in your design, there is no way around paying the memory price in your SSAS processed model. &nbsp;Multidimensional cubes will throw a warning for bad cardinality of a dimension attribute, but if you do not place them inside any *.dim, and only use them for relationships in your DSV, you should be okay. &nbsp;There really is no happy ending here if you cannot get rid of these things from within medium-to-large data sets.</font><br /><br /><font color="#2a2a2a">Let's remember, each MS BI tool is designed for a specific purpose. &nbsp;SSRS is a reporting tool, although it can also provide dashboards. &nbsp;Power BI Desktop is designed for analysis of aggregated data -- not paginated granular reporting. &nbsp;Consequently, if we use each MS BI tool for what it does best, an SSAS Action or Power BI link to a granular SSRS report can be a good solution here. 
&nbsp;"Simply" pass a set of input parameters to SSRS and present to the user only the varchar() values needed.</font><br /></div>]]></content:encoded></item><item><title><![CDATA[Data Modeling for BI: Dimensions vs Facts]]></title><link><![CDATA[https://www.delorabradish.com/modeling-for-bi/data-modeling-for-bi-dimensions-vs-facts]]></link><comments><![CDATA[https://www.delorabradish.com/modeling-for-bi/data-modeling-for-bi-dimensions-vs-facts#comments]]></comments><pubDate>Sat, 07 Nov 2015 03:10:19 GMT</pubDate><category><![CDATA[Architecture]]></category><category><![CDATA[Dimensions]]></category><category><![CDATA[Measures]]></category><guid isPermaLink="false">https://www.delorabradish.com/modeling-for-bi/data-modeling-for-bi-dimensions-vs-facts</guid><description><![CDATA[The intent of this blog post isn't to rewrite the Kimball Group Reader. &nbsp;Below is just my simple summary of what constitutes a subject area dimension vs a fact (a group of measures).        [...] ]]></description><content:encoded><![CDATA[<div class="paragraph" style="text-align:left;"><font color="#2a2a2a">The intent of this blog post isn't to rewrite the</font> <a target="_blank" href="http://www.kimballgroup.com/data-warehouse-business-intelligence-resources/books/kimball-reader/">Kimball Group Reader</a>. 
&nbsp;<font color="#2a2a2a">Below is just my simple summary of what constitutes a<em> subject area</em> dimension vs a fact (a group of measures).</font></div>  <div><div class="wsite-image wsite-image-border-none " style="padding-top:10px;padding-bottom:10px;margin-left:0;margin-right:0;text-align:center"> <a> <img src="https://www.delorabradish.com/uploads/5/3/4/3/53431729/2012977_orig.png" alt="Picture" style="width:auto;max-width:100%" /> </a> <div style="display:block;font-size:90%"></div> </div></div>]]></content:encoded></item><item><title><![CDATA[Data Modeling for BI Analytics vs Reporting]]></title><link><![CDATA[https://www.delorabradish.com/modeling-for-bi/bi-reporting-vs-analytics]]></link><comments><![CDATA[https://www.delorabradish.com/modeling-for-bi/bi-reporting-vs-analytics#comments]]></comments><pubDate>Sat, 07 Nov 2015 02:33:54 GMT</pubDate><category><![CDATA[Analytics]]></category><category><![CDATA[Denormalization]]></category><category><![CDATA[Modeling]]></category><guid isPermaLink="false">https://www.delorabradish.com/modeling-for-bi/bi-reporting-vs-analytics</guid><description><![CDATA[To create data models for business intelligence, you first need to understand your BI Blueprint, then it is pretty critical to truly understand the difference between reporting and analytics (R&amp;A) data models. &nbsp;Modeling data for R&amp;A happens in pipes #5, #7 and sometimes (but not optimally) in the DSVs (data source views) found in pipe #8. &nbsp;Keeping in mind that tables arranged in a circle does not a star schema make, below is a slide that articulates in part the difference betwe [...] 
]]></description><content:encoded><![CDATA[<div class="paragraph" style="text-align:left;"><font color="#2a2a2a">To create data models for business intelligence, you first need to understand your <a href="http://www.delorabradish.com/modeling-for-bi/your-bi-blueprint-road-to-a-successful-bi-implementation" target="_blank">BI Blueprint</a>. &nbsp;Then it is critical to truly understand the difference between reporting and analytics (R&amp;A) data models. &nbsp;Modeling data for R&amp;A happens in pipes #5, #7 and sometimes (but not optimally) in the DSVs (data source views) found in pipe #8. &nbsp;<br /><br />Keeping in mind that tables arranged in a circle do not a star schema make, below is a slide that articulates in part the difference between the two. &nbsp;Think of reporting as a pile of Tinker Toys -- you SELECT tables and JOIN...JOIN...JOIN to a bunch more. &nbsp;Analytics is about flattened "denormalized" data arranged into <em>subject area </em>dimensions and measure groups, preferably with pre-processed totals, stored like a Rubik's Cube.</font></div>  <div><div class="wsite-image wsite-image-border-none " style="padding-top:10px;padding-bottom:10px;margin-left:0;margin-right:0;text-align:center"> <a> <img src="https://www.delorabradish.com/uploads/5/3/4/3/53431729/9082028_orig.png" alt="Picture" style="width:auto;max-width:100%" /> </a> <div style="display:block;font-size:90%"></div> </div></div>  <div class="paragraph" style="text-align:left;"><font color="#2a2a2a">If you are serious about data modeling for business intelligence, dig deep into the following concepts, each of which is worthy of&nbsp;its own&nbsp;blog post.</font><ol><li><font color="#2a2a2a"><em>Subject area </em>dimensions vs tables that contain codes and&nbsp;relate to transactional records</font></li><li><font color="#2a2a2a">Slowly changing dimensions</font></li><li><font color="#2a2a2a">Correct modeling of deleted rows</font></li><li><font color="#2a2a2a">Denormalization 
techniques</font></li><li><font color="#2a2a2a">Degenerate&nbsp;dimensions and techniques for handling large dimensions in cubes</font></li><li><font color="#2a2a2a">Modeling for many-to-many relationships</font></li><li><font color="#2a2a2a">Modeling for predictive analytics</font></li><li><font color="#2a2a2a">Modeling for ABC&nbsp;(audit, balance and control) aka metadata and data verification</font></li></ol><font color="#2a2a2a"><br />When I get my blogging juices on, I'd like to post a bit about each one. &nbsp;In the interim, you can contact me in</font>&nbsp;<a href="http://www.delorabradish.com/about.html" target="_blank">About</a>.<br /><br /></div>]]></content:encoded></item><item><title><![CDATA[Your BI Blueprint: Road to a Successful BI Implementation]]></title><link><![CDATA[https://www.delorabradish.com/modeling-for-bi/your-bi-blueprint-road-to-a-successful-bi-implementation]]></link><comments><![CDATA[https://www.delorabradish.com/modeling-for-bi/your-bi-blueprint-road-to-a-successful-bi-implementation#comments]]></comments><pubDate>Sat, 07 Nov 2015 01:41:18 GMT</pubDate><category><![CDATA[Architecture]]></category><category><![CDATA[BI Blueprint]]></category><guid isPermaLink="false">https://www.delorabradish.com/modeling-for-bi/your-bi-blueprint-road-to-a-successful-bi-implementation</guid><description><![CDATA[Inserted below is a slide I use when talking about data modeling for MS BI. &nbsp;(If you have brought me into your company for MS BI mentoring or training, you already have a version. &nbsp;:-) ). &nbsp;I am posting it here because if you are planning a BI project, you need your own version of one of these! &nbsp;Why? Avoid common mistakes, like jumping from pipe #1 to pipe #9 (that is the&nbsp;roadmap&nbsp;for operational reporting). Make a conscious decision to possibly skip certain pipes vs an [...] 
]]></description><content:encoded><![CDATA[<div class="paragraph" style="text-align:left;"><font color="#2a2a2a">Inserted below is a slide I use when talking about data modeling for MS BI. &nbsp;(If you have brought me into your company for MS BI mentoring or training, you already have a version. &nbsp;:-) ). &nbsp;I am posting it here because if you are planning a BI project, you need your own version of one of these! &nbsp;Why?</font><ol><li><font color="#2a2a2a">Avoid common mistakes, like jumping from pipe #1 to pipe #9 (that is the&nbsp;roadmap&nbsp;for operational reporting).</font></li><li><font color="#2a2a2a">Make a conscious decision to possibly skip certain pipes vs an oversight that might require you to backtrack and redo later on in your project.</font></li><li><font color="#2a2a2a">Provide adequate timelines to administration because you haven't missed planning a critical step.</font></li><li><font color="#2a2a2a">See the need and plan for hardware for each pipeline.</font></li><li><font color="#2a2a2a">Provide a non-technical explanation of BI steps to Administration.</font></li></ol><font color="#2a2a2a"><br />Please allow me to encourage you -- open Visio and get blueprinting!! 
&nbsp;For a deeper dive into a BI blueprint for your company, drop me a note under the <a href="http://www.delorabradish.com/about.html" target="_blank">About</a>&nbsp;section of this site.<br /></font></div>  <div><div class="wsite-image wsite-image-border-none " style="padding-top:10px;padding-bottom:10px;margin-left:0;margin-right:0;text-align:center"> <a> <img src="https://www.delorabradish.com/uploads/5/3/4/3/53431729/8535255_orig.png" alt="Picture" style="width:auto;max-width:100%" /> </a> <div style="display:block;font-size:90%"></div> </div></div>]]></content:encoded></item><item><title><![CDATA[The Big Picture: What is in the Center of Your BI Wheel?]]></title><link><![CDATA[https://www.delorabradish.com/modeling-for-bi/the-big-picture-what-is-in-the-center-of-your-bi-wheel]]></link><comments><![CDATA[https://www.delorabradish.com/modeling-for-bi/the-big-picture-what-is-in-the-center-of-your-bi-wheel#comments]]></comments><pubDate>Mon, 25 May 2015 01:19:45 GMT</pubDate><category><![CDATA[Architecture]]></category><category><![CDATA[BI Blueprint]]></category><guid isPermaLink="false">https://www.delorabradish.com/modeling-for-bi/the-big-picture-what-is-in-the-center-of-your-bi-wheel</guid><description><![CDATA[ What purpose is driving your BI project? &nbsp;(I am talking about true OLAP in this blog post, not OLTP.) &nbsp;When you have an ETL or hardware choice to make, you make your decision based on what is best for what? &nbsp;Please allow me to suggest to you that a BI project should have one (1) central purpose: reporting and analytics (R&amp;A). &nbsp;Period. &nbsp;The end. If this is true, (and I ask this with kindness), how then can hardware, network, your data model, data integration and data  [...] 
]]></description><content:encoded><![CDATA[<span class='imgPusher' style='float:right;height:0px'></span><span style='display: table;width:403px;position:relative;float:right;max-width:100%;;clear:right;margin-top:0px;*margin-top:0px'><a><img src="https://www.delorabradish.com/uploads/5/3/4/3/53431729/published/oledb-vs-ado-net-02.png?1499897587" style="margin-top: 5px; margin-bottom: 10px; margin-left: 0px; margin-right: 10px; none; max-width:100%" alt="Picture" class="galleryImageBorder wsite-image" /></a><span style="display: table-caption; caption-side: bottom; font-size: 90%; margin-top: -10px; margin-bottom: 10px; text-align: center;" class="wsite-caption"></span></span> <div class="paragraph" style="text-align:justify;display:block;"><font color="#2a2a2a">What purpose is driving your BI project? &nbsp;(I am talking about true OLAP in this blog post, not OLTP.) &nbsp;When you have an ETL or hardware choice to make, you make your decision based on what is best for what? &nbsp;Please allow me to suggest to you that a BI project should have one (1) central purpose: </font><font color="#8d2424"><strong>reporting and analytics (R&amp;A)</strong></font><font color="#2a2a2a">. &nbsp;Period. &nbsp;The end.</font><br /><font color="#2a2a2a"><br />If this is true, (and I ask this with kindness), how then can hardware, network, your data model, data integration and data visualization choices be made without R&amp;A clearly defined?<br /><br />For instance, I have often thought that the&nbsp;most critical wheel spoke&nbsp;of a successful BI implementation is the data&nbsp;model. &nbsp;"Build it and they will come" is not a really good catch phrase for a BI project because you may end up building a football stadium when your users intended to play basketball. 
&nbsp;You can&nbsp;retrofit&nbsp;your football field, but&nbsp;wouldn't it have been a lot better (and cheaper) if you had built a basketball court to start?<br /><br /><strong><em>Possible&nbsp;indicators&nbsp;</em>that a BI model was not written with R&amp;A in mind:</strong><br /></font><ol style=""><li style=""><font color="#2a2a2a"><span style="">There are composite keys used to join tables</span><br /></font></li><li style=""><span style=""><font color="#2a2a2a">My time dimension table (like DimDate or DimCalendar) does not have a smart PK (primary key) of YYYYMMDD.</font></span></li><li style=""><font color="#2a2a2a"><span style="">99.9% of the reports have to remember to filter "WHERE IsDeleted = False"</span><br /></font></li><li style=""><span style=""><font color="#2a2a2a">I have to UNION measures together from multiple FACT tables.</font></span></li><li style=""><span style=""><font color="#2a2a2a"><span style="">There is no single-view-per-table design methodology in my EDW, thereby&nbsp;</span>omitting<span style="">&nbsp;a necessary level of protection&nbsp;between my EDW and my consuming applications</span></font></span></li><li style=""><span style=""><font color="#2a2a2a">I have to replicate simple addition and/or subtraction calculations in my SSAS, Excel and SSRS DSVs (data source views)</font></span></li><li style=""><span style=""><font color="#2a2a2a">The majority of my R&amp;A requirements are "current attribute", not "attribute value at time of fact", yet the FKs (foreign keys) in my FACT table(s) all require me to join "WHERE fact.FK = dim.PK and fact.date between dim.EffDate and dim.ExpDate"</font></span></li><li style=""><span style=""><font color="#2a2a2a">My dimension foreign key in my FACT table(s) does not allow me to join to my SCD (slowly changing dimension) table(s) &nbsp;with</font></span></li></ol><font color="#2a2a2a">Let's talk about the infrastructure team for a minute. 
&nbsp;When we told them we were putting up a SQL Server data warehouse, did we provide anticipated data size over the next 12 and 24 months, AND take the time to explain that SQL Server works best when a table is assigned to a filegroup that in turn is dedicated to physical resources, not virtual? &nbsp;Did we draw out our BI blueprint showing separate DEV and PROD environments and sizing all requirements for reporting workloads, taking into consideration time zones and fluctuating business hours? &nbsp;I find that most&nbsp;infrastructure&nbsp;teams are eager to please, but it is my&nbsp;responsibility&nbsp;to look at the big picture and plan hardware&nbsp;and&nbsp;network requirements NOT FOR ETL alone, but&nbsp;ultimately&nbsp;for R&amp;A.<br /></font><br /><font color="#2a2a2a"><strong><em>Possible indicators</em> that hardware was not&nbsp;specified&nbsp;with R&amp;A in mind:</strong></font><br /><ol><li><span style="color: rgb(42, 42, 42); line-height: 1.5; background-color: initial;">There are not separate DEV, QA, prePROD and PROD environments</span><br /></li><li><span style="color: rgb(42, 42, 42); line-height: 1.5; background-color: initial;">SSIS and SSAS are running on the same box&nbsp;</span><br /></li><li><span style="color: rgb(42, 42, 42); line-height: 1.5; background-color: initial;">There is only one SSRS server or SSRS is running on the SharePoint box</span><br /></li><li><span style="color: rgb(42, 42, 42); line-height: 1.5; background-color: initial;">The PROD environment is virtualized</span></li><li><font color="#2a2a2a">The BI team is expected to create an enterprise solution using SQL Server STD edition</font></li></ol><br /><font color="#2a2a2a">I am sure you can come up with actual indicators for your own projects, but the point is this: &nbsp;EVERYTHING we do in our BI projects should take into consideration R&amp;A. 
&nbsp;Pick a wheel spoke -- any wheel spoke -- when you draw your blueprint for that spoke, UNDERSTAND your complex business logic and KNOW your reporting requirements.</font><br /><br /><font color="#2a2a2a">What do you think? &nbsp;Here are a few </font><font color="#8d2424"><strong>talking points</strong></font><font color="#2a2a2a"> for your BI team:<br /></font><ol style=""><li><font color="#2a2a2a"><span style="line-height: 1.5; background-color: initial;">Returning to the top of the post,&nbsp;</span><span style="line-height: 1.5; background-color: initial;">how will you make hardware, network, data model, data integration and data visualization choices without R&amp;A clearly defined?</span></font></li><li><span style="line-height: 1.5; background-color: initial;"><font color="#2a2a2a">Whiteboard your own BI wheel. &nbsp;What are your wheel spokes?</font></span></li><li><font color="#2a2a2a"><span style="line-height: 1.5; background-color: initial;">Now put an inner tube on your wheel called "</span><span style="line-height: 1.5; background-color: initial;">audit, balance and control".</span></font></li><li style=""><font color="#2a2a2a"><span style="">Put a tire on your wheel called "</span><span style="line-height: 1.5; background-color: initial;">performance expectations".</span></font></li><li style=""><span style=""><font color="#2a2a2a">Last, shine up that wheel with "data governance".</font></span></li></ol><br /><font color="#2a2a2a">How do all of your choices support R&amp;A?</font></div> <hr style="width:100%;clear:both;visibility:hidden;"></hr>]]></content:encoded></item><item><title><![CDATA[The Big Picture: A BI Project is Like Building a House]]></title><link><![CDATA[https://www.delorabradish.com/modeling-for-bi/the-big-picture-a-bi-project-is-like-building-a-house]]></link><comments><![CDATA[https://www.delorabradish.com/modeling-for-bi/the-big-picture-a-bi-project-is-like-building-a-house#comments]]></comments><pubDate>Sun, 24 May 2015 
21:15:21 GMT</pubDate><category><![CDATA[Architecture]]></category><category><![CDATA[BI Blueprint]]></category><guid isPermaLink="false">https://www.delorabradish.com/modeling-for-bi/the-big-picture-a-bi-project-is-like-building-a-house</guid><description><![CDATA[ When you build a house, you start from the foundation and work up. &nbsp;When you build a BI solution, it is logical to start from the foundation and build up as well. &nbsp;However, what I see often is someone working on the house roof (reporting) before there is a foundation (data model, integration, security ...). We all understand that some houses are pre-fabricated and the individual pieces are built independently of each other and then somehow come together in one miraculous final push to pr [...] ]]></description><content:encoded><![CDATA[<span class='imgPusher' style='float:right;height:60px'></span><span style='display: table;z-index:10;width:303px;position:relative;float:right;max-width:100%;;clear:right;margin-top:20px;*margin-top:40px'><a><img src="https://www.delorabradish.com/uploads/5/3/4/3/53431729/4455225.jpg?287" style="margin-top: 5px; margin-bottom: 10px; margin-left: 0px; margin-right: 10px; none; max-width:100%" alt="Picture" class="galleryImageBorder wsite-image" /></a><span style="display: table-caption; caption-side: bottom; font-size: 90%; margin-top: -10px; margin-bottom: 10px; text-align: center;" class="wsite-caption"></span></span> <div class="paragraph" style="text-align:justify;display:block;"><font color="#2a2a2a">When you build a house, you start from the foundation and work up. &nbsp;When you build a BI solution, it is logical to start from the foundation and build up as well. 
&nbsp;However, what I see often is someone working on the house roof (reporting) before there is a foundation (data model, integration, security ...).<br /><br />We all understand that some houses are pre-fabricated and the individual pieces are built independently of each other and then somehow come together in one miraculous final push to production. &nbsp;However, in my opinion, that is not the "industry standard" and as a BI consultant, I will rarely recommend any BI build method other than "from the ground up".&nbsp;</font><br /><br /><font color="#2a2a2a">I consider the house roof, reporting and analytics, to be the "fun" part of every BI project because it is the most visible. &nbsp;(I have observed that those who get to write reports and create dashboards often progress quickly to hero status.) However, it is the responsibility of <u>each team member</u> of a BI project to produce a product that has these characteristics:</font><br /><font color="#2a2a2a">&nbsp; &nbsp; 1. &nbsp;<strong>Accurate </strong>(is dependable and truthful)</font><br /><font color="#2a2a2a">&nbsp; &nbsp; 2. &nbsp;<strong>Scalable</strong> (can grow and change in step with business fluctuations)</font><br /><font color="#2a2a2a">&nbsp; &nbsp; 3. &nbsp;<strong>Discoverable</strong> (you can find things you want, such as metadata, measures and attributes)</font><br /><br /><font color="#8d2424"><strong>Talking Points:</strong></font><br /><ol><li><span style="color: rgb(42, 42, 42); line-height: 1.5; background-color: initial;">If you have data transformations happening in SSIS, your EDW views, your SSAS MultiD and tabular model DSVs, in MDX and DAX formulas, and in Excel Power Query and PowerPivot, how accurate do you anticipate your DW information to truly be? &nbsp;Where there is a question and you have to <strong>prove the actual version of the truth</strong>, how many places do you want to 1.) search and then 2.) 
fix?</span><br /></li><li><span style="background-color: initial;"><font color="#2a2a2a"><span style="line-height: 1.5;">If you have star schema, snowflake schema and 3NF (3rd normal form) data models in addition to aggregated data marts and&nbsp;</span>disparate<span style="line-height: 1.5;">&nbsp;data stored only in Excel or SharePoint lists (for possibly very good reasons), how easy will it&nbsp;be to add a new subject area, corporate department, or newly acquired company? &nbsp;&nbsp;</span></font></span><br /></li><li><span style="background-color: initial;"><font color="#2a2a2a"><span style="line-height: 1.5;">Very few companies can build the Taj MaBI solution right out of the gate. &nbsp;Often MDM (master data management) falls prey to budget and time constraints. &nbsp;However, we are&nbsp;</span>building<span style="line-height: 1.5;">&nbsp;a house here and if my house has&nbsp;four kitchens (customer tables), where should we tell the marketing department to eat their lunch (discover the most recent customer data)?</span></font></span></li><li><span style="color: rgb(42, 42, 42); line-height: 1.5; background-color: initial;">In MS BI, there are many ways to get to the same finish line, but where will you decide is <strong>the proper place for ...</strong></span><br /></li></ol><font color="#2a2a2a">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Data integration -- SSIS, SSAS, views, USPs, UDFs, DSVs, Excel</font><br /><font color="#2a2a2a">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Self-Service BI -- PPS (performance point services), Excel, Report Builder</font><br /><font color="#2a2a2a">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Automated Reporting -- SSRS, Excel</font><br /><font color="#2a2a2a">&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Source code control -- TFS (team foundation server), corporate file store</font><br /><font color="#2a2a2a">&nbsp; 
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Documentation -- in-line code, TFS, file store, DMVs, Visio, Word documents</font></div> <hr style="width:100%;clear:both;visibility:hidden;"></hr>]]></content:encoded></item></channel></rss>