Tuesday, 22 April 2014

Power Pivot with DAX Expressions


PowerPivot is an innovative adaptation of the "Business Intelligence" software used by large companies to analyze data.   It is available from Microsoft as a free add-in to Microsoft Excel 2010.
You do not need to be a programmer to use PowerPivot. It was designed for the end user.
Every quarter your team spends countless hours preparing reports for the board meeting. Getting data from different reports, entering data into Excel, preparing exhibits and checking and rechecking formulas. After you get the first set of charts ready then you need even more – to explain this quarters variations. So your team spends more hours slicing and dicing the data, preparing exhibits and checking results.  

Best of all, to update the report for a new month or quarter all you have to do is press a “refresh” button. No more entering data, rebuilding charts of rechecking formulas. When you need more detail to explain the results you can use the “slicer buttons” on the right to instantly change the analysis.

To convince him you prepared a walk thru of the four steps you used to prepare the dashboard. Each to the four steps is covered in an appendix to this document.
Step 1:  Think about the data you want to measure.
This is the most important step in the process. Whatever you want to improve, start by measuring it. An easy to maintain dashboard will help to focus everyone’s attention on improvement.
After deciding what to measure you need to find the data. This might require meeting with a technical resource. In this example you got the data from six interconnected database tables and two Excel worksheets.
Appendix 1 - Think about the data you want to measure.

You want to compare actual sales data with the sales budget and with marketing expenses. 

  • Actual sales data is in the company’s database.
  • The sales budget and marketing expense data are in Excel spreadsheets.
You will want to meet with the database technical support person to identify the tables where the data is stored. If your database is like most you will have to use several tables to get the data you want. This example uses a database sample from Microsoft where the Sales Order Detail contains the sales amount and product code, the Sales Order Header table contains the sales date and territory, the Product table contains the Product Name and Product Sub Category Code, the Product Subcategory contains the Subcategory name and Product Category Code and the Product Category table contains the Category name.
This all sounds complicated but when PowerPivot imports the data from the database it also imports the relationships. There usually is nothing for you to do before you can actually use the data. You don’t need to understand anything about databases!
As in most analyses data you need is not all in the database. In this example the sales budget and marketing expense data are in an Excel spreadsheet. No problem, the PowerPivot database can combine data from multiple sources. Since each row of the Excel data contains the same Territory ID and Product ID codes that are in the database, PowerPivot can combine the data.
Step 2: Connect Excel to the data

PowerPivot has all you need to connect to databases, Excel, Access, text files or sources of data from the web. Once connected PowerPivot stores a copy of the data in Excel. Millions of rows of data can be effectively stored by PowerPivot using a new technology that compresses the data.
When you are ready to update the data you just click on the PowerPivot “Refresh” button and an updated copy of the data in copied into the PowerPivot database.
Appendix 2 – Connect Excel to the data
Starting from a blank workbook, you first open the PowerPivot Window, connect to the sales database and select the sales order header and detail tables, the sales territory table and the product, product sub category and product category tables:
Opening the PowerPivot Window and connecting to the database. 

After connecting to the database, data from the six tables has been imported to the PowerPivot database in the Excel workbook.

Even though PowerPivot has lots of data (121,317 Sales Order Detail records in the above example), workbook size remains manageable.
Pressing the refresh button will reimport an updated copy of the data.
Now you connect to the Excel data. First step is to copy the sales budget into the blank workbook. Make the copied list into a table and create a liked table in PowerPivot.
 
Sales budget copied to Excel.
Add the Excel table to the PowerPivot database.

And the table is included in the PowerPivot database                            .

 
You follow the same steps to copy the Marketing Expenses into Excel and then add to PowerPivot.

One last thing to do in the database is to make a “Date Table”. This is always a good idea for two reasons:
You will want to use a different date measure in your report than exists in the data. In this example you want to report by month and the data is by month, day and year. Also, PowerPivot will sort February before January so it will be helpful for the date table to format the months as yyyy – mm (2011 -12).
You need to filter the actual sales data, budget data and marketing expense data on sales date. Having all three types of data linked to a common date table will speed this process.

 
You build the date table in Excel and link to PowerPivot.
PowerPivot has already imported these relationships from the database:

You need to add the relationships in yellow below so that you can filter across the data month and year.
 
The last three relationships relate the sales budget and the marketing expenses to the sales territory and product tables. These relationships would be automatically added by PowerPivot the first time you produced a report of the data. Since you had to add the date relationships you thought it was easier to add them at the same time.
 

Step 3: Prepare Excel charts to show the measures
You can make Excel pivot tables and pivot charts from the data in the PowerPivot database. Add one pivot table to a worksheet. If you need a pivot chart add the chart to the same worksheet. These worksheets will be hidden in the final dashboard. Use them to get your data and measures right.
Appendix 3 – Prepare Excel charts to show the measures
You request a new pivot table from the PowerPivot window.
The PowerPivot pivot table can combine data for all 9 data tables.
You choose to sum line total and budget sales by category.
Next you add a Pivot Chart

 
 

Step 4: Copy the charts to a single worksheet and add slicers
Copy charts from the pivot table worksheet to your dashboard sheet. Arrange the charts on the dashboard. Then add “Slicers” which are groups of buttons used so filter the data in your charts. Each slicer can be attached to multiple pivot tables. When you click on a year in the slicer on the dashboard then each of the pivot tables attached to that slicer will show data only for that year. In this way slicer buttons on the dashboard worksheet control what data is shown on the dashboard charts.
Your boss is not very fluent in Excel so as you expected his first question was “Why couldn’t you do this with Excel 2003?” You explained that Excel 2010 has three new features that are all part of the new Excel PowerPivot Self-Service BI capability.
Feature 1: Excel PowerPivot allows you to create a simple database within Excel. In the example above you created a database inside of Excel with 9 tables: 6 from the company’s database system and 3 from Excel worksheet data. You were able to specify relationships between the tables so that data could be combined in my charts.
Feature 2: Excel PowerPivot pivot table and charts allowed you to quickly prepare the charts. Unlike the lightweight pivot tables in Excel 2003, these pivot tables can combine data from different tables and come with new formulas that are essential for serious analysis.
At first glance PowerPivot pivot tables and pivot charts are similar to the pivot table and charts that have been in Excel since 1993. But the new PowerPivot pivot table and pivot charts have two very significant differences:
  1. They can combine data from multiple tables. (The old pivot tables required you to get all the data in one table.)
  2. The new Data Analysis Expression (DAX) formula language that makes it easy to do calculations within the pivot table and line up comparative measures like “”Prior Year To Date”.
Feature 3: Excel Slicers provide a way of selecting data so that multiple charts can be shown together and controlled by one set of buttons.
You told your boss that this is the level of BI that fits your company’s current needs. You know it wouldn’t meet the needs of a much larger company with more IT resources but it can make a real difference in what your company is able to accomplish with Excel.
Appendix 4 - Copy the charts to a single worksheet and add slicers
You copy the Pivot Chart and paste to a new blank worksheet. Then you add slicers.
One slicer can attach to multiple pivot tables.
 

Your dashboard is well underway!


Data Analysis Expressions (DAX) is the standard PowerPivot formula language that supports custom calculations in PowerPivot tables and Excel PivotTables. While many of the functions used in Excel are included, DAX also offers additional functions for carrying out dynamic aggregation and other operations with your data. Bill's new series serves as a progressive introduction to PowerPivot and DAX formulas, examining the functions, operators and values involved, and offering a wealth of practical examples.

As we know DAX and MDX expressions can be used to query Tabular Model in Analysis services. Keep in mind that DAX does not work in SSAS Multidimensional Cube for querying data. We will understand basics of producing results from DAX same as MDX. DAX is promisingly giving quick result than MDX.  So if you are using Tabular model cube solution, then prefer writing DAX unless a complex logic came up which can only be resolved through MDX.

Behaviour of DAX expressions:
  • DAX is working as a SQL language with difference of functions and syntaxes
  • We need not to write JOINS and GROUP BY clauses in DAX as it already understand that the purpose is to retrieve data in this form. Relationships among the tables of Tabular Model/ Power Pivot model are used for getting JOINS in the DAX query execution.
  • We need to understand that DAX is not a query language, it’s a query expression. So we are using some table expressions in DAX to retrieve data in table format and further it is used as result-set/ dataset for the SSRS report.
  • To evaluate DAX expressions use MDX query pane in SSMS
Functions

We can now move on to some useful functions for writing DAX as in query language. Please note that list of DAX function is very big but here we are trying to find out the functions which can be used to get data in the form of result set, same as MDX and DAX.
In this series next functions are following:

EVALUATE:
  • First step to start with DAX Query Reference. This function can be used to get the data in result set from a table.
    EVALUATE <<Tabular Expression or Name of the table>>
    Argument to the Evaluate function can only be a table name or any expression that return a table. Evaluate   function is added up with different other functions (like VALUES, SUMMARIZE, ADDCOLUMNS, TOPN etc.) to produce different results as per requirements.
    e.g.  Get table data of Employee table in “AdventureWorks Tabular Model SQL 2012”.
    EVALUATE 'Employee'
    As per behaviour of DAX, this is similar to SQL as:
    SELECT * from [Employee]
    We cannot compare here MDX for this DAX, as there is no way of getting table level attributes in MDX. We need to specify attributes required in MDX SELECT statement.
SUMMARIZE:
  • Function return required totals and counts, based on several groups (of different columns or say attributes from different tables in Tabular Model).
    We use SUMMARIZE along with EVALUATE for getting desired result in SSMS.
    SUMMARIZE (<<Table>>, <<Grouping Column1>>, <<Grouping Column2>>,,,<<Name for desired Aggregation1>>, <<Expression for Aggregation or Calculated MeasureName1>>, <<Name for desired Aggregation2>>, <<Expression for Aggregation or Calculated MeasureName2>>)
    Although it looks like a complex syntax but use is very simple and there are so many things we can do with this function.
    E.g. Let say you want to see Currency, Customer ID and Internet Sales from Adventure Works tabular model. Here a SQL query on database can be like:
    SELECT [CurrencyName], [CustomerID],SUM([InternetTotalSales])
    FROM [InternetSales] A
    LEFT JOIN Currency B ON B.CurrencyID = A.CurrencyID
    LEFT JOIN Customer C ON C.CustomerID = A.CustomerID
    GROUP BY [CurrencyName], [CustomerID]
    Easy, but in MDX it can be as:

    SELECT
    NON EMPTY
    [Currency].[CurrencyName].[CurrencyName].MEMBERS*
    [Customer].[Customer Id].[Customer Id].MEMBERS ON 1
    ,[Measures].[Internet Total Sales] ON 0
    FROM [AdventureWorks_Test]
    Again easy.
    But in DAX, don’t worry it’s again easy. We need to keep in mind SUMMARIZE will need first Primary table (more of table which can be base of relationships among desired columns from different tables). Mostly these primary tables can be fact table, which relates different tables. Other than that JOINS and GROUB BYs are not required as SUMMARIZE take care of creating relationships and aggregating data for you.
    DAX for same situation is:
    EVALUATE
    SUMMARIZE('Internet Sales'
    ,'Currency'[CurrencyName]
    ,'Customer'[Customer Id]
    ,"Internet Sales",'Internet Sales'[Internet Total Sales])
    ORDER BY 'Currency'[CurrencyName], 'Customer'[Customer Id]
    Here 'Internet Sales'[Internet Total Sales] is already a calculated measure, so Name for Measure is given as "Internet Sales". We can replace it as SUM([InternetTotalSales]), which is aggregation that we need.
    ORDER BY is same as SQL ORDER BY.
Most of the complex queries in DAX revolve around SUMMARIZE, so it’s important to understand its basics.
ADDCOLUMNS:
  • The coolest function I’ll say. It gives you a power of creating calculated Column in a table (or tabular expression) at query Run time. I believe there is no way of doing this in MDX (can’t create an Attribute itself in MDX). ADDCOLUMNS(<Table or Tabular Expression>, “<Calculated column Name1>”, <Expression for Calculated column1>,. . . . . . )
    As in SQL we can create a new column while writing query.
    For example (simplest as possible):

    SELECT
    [FirstName], [MiddleName], [LastName]
    , ([FirstName] + ' ' + [MiddleName] + ' ' + [LastName])
    AS FullName
    FROM [dbo].[DimEmployee]
    Part of expected Result is:
    Now in DAX same query will be:
    EVALUATE
    SUMMARIZE(
    ADDCOLUMNS('Employee'
    ,"Full Name"
    ,CONCATENATE('Employee'[First Name], CONCATENATE(" ",
    CONCATENATE('Employee'[Middle Name], CONCATENATE(" ",
    'Employee'[Last Name])
    ))))
    ,[Full Name]
    )
    I know that String manipulation in DAX is quite a job, but there is no other way.
    Here if we do not use SUMMARIZE and EVALUATE on ADDCOLUMNS, it will give all the Columns in table Employee with added column as Full Name.
    But if you just want to see one column out of your new table (with added column) use SUMMARIZE to be specific.
    This function makes your life easy to get what you want at the level of Table, and solved many of my issues in SSRS reporting.
  • Some other functions are
  • Format
  • Search
  • Filter
  • Top N

  • Compare DAX vs MDX

  • Once I was asked to give an opinion on MDX vs DAX. But we know in Software world everything has some pros and cons when compared to other, otherwise if it just have cons it would be thrown to dustbin (say deprecated).
  • Well I tried a lot to look for direct comparison of MDX and DAX and to bring my best opinion as per Project Requirements but as usual,no answer in Google for “Compare MDX vs DAX”.
    I prepared a comparison as per my experience in both the languages. Keep in mind DAX is not a query language but it can be used as one, using DAX Query Syntax Reference.
    So below are some comparison points and I wish these will help you out.
    S.no
    DAX
    MDX
    Prefer
    1
    Performance
    DAX has some part of better performance on Tabular Model. I was looking at CPU Time comparison for same result set given by MDX and DAX. DAX wins all the time. Performance is good but when we compare with DAX (on tabular model), it’s a bit low.
    DAX
    2
    Support:
    Support In-Memory, DirectQuery and Hybrid query modes of Tabular Model No support for DirectQuery modes of Tabular Model.       DAX
    3
    Query Language:
    Not a query language, still we can leverage it by few work around (using tabular expressions):a)      Use it as query language in SSRS by writing it in place of DMXb)      Writing and executing it on MDX pane of SSMS to validate the query and data A well known query language and being used in SSRS and SSMS as separate query option of MDX.
    MDX
    4
    Ease of writing:
    Difficult to write as compare to MDX (although many blogs says it’s easy, with perspective of end user).Reason:
    1. It behaves as SQL query and we expect a Multidimensional Behaviour.
    2. Need use of lot of functions for even few small requirements.
    3. Need to know the appropriate relationship between the tables used (in case of non-related table in query, no result will come)
    4. Difficult to handle Non Empty behaviour through query
    5. Difficult to manipulate filtration in queries. Specially multi-valued filters.
    Easy to write as compared to DAX.
    MDX
    5
    Reporting supports:
    Long way to go for this investigation. (Decide your reporting tools requirement and investigate or query here in blog)It works with SSRS ( in place of DMX) and ADOMD.net (please refer the link) Works with SSRS, PPS and ADOMD.net code and lot many other reporting tools and codes.
    MDX
    6
    Support for Ad-hoc reporting:
    a)      Power View Reporting: No user interaction is required for querying the Tabular Model. So no use of DAX or MDX.b)      Power Pivot: End user needs to know DAX for creating new measure a)      Power View Reporting: No use.  b)      Power Pivot: No use
    Depends on end user, if he/ she know DAX.
    7
    Data Mining:
    Can’t be used for Data Mining purposes even in Excel. MDX is required for Data Mining
    MDX
    Assumptions:
    Project is on SQ Server 2012, BI Semantic Tabular Model. Else DAX will not work on Multidimensional SSAS solution anyhow.
    Conclusion:
    Out of 7 points, 4 support MDX and 2 DAX.
    But along with that, I believe for a long run MDX is better. It’s been used in many legacy system, developers are more conformable and it provides more flexibility on different reporting services (as per my knowledge).
    Note: I will soon post on query behaviors of MDX and DAX. Saying DAX has better performance than MDX is not always true.
  • Slicers


    Excel 2010 makes it possible to insert slicers to quickly and easily filter pivot tables. However, using the report filter gives the exact same result.
    Below you can find a two-dimensional pivot table. Go back to Pivot Tables to learn how to create this pivot table.
    Two-dimensional Pivot Table in Excel
    To insert a slicer, execute the following steps.
    1. Click any cell inside the pivot table. The PivotTable Tools contextual tab activates.
    2. On the Options tab, insert a slicer.
    Insert a Slicer
    3. Check Category and click OK.
    Check Category
    4. For example, Click Fruit to only show the fruit exported to each country.
    Slicer Example
    Note: notice how the report filter changed to Fruit. Hold down CTRL to include fruit and vegetables.
    Did you find this information helpful? Show your appreciation, vote for us.
For more see this



2 comments: