Wednesday, 25 June 2014

SSIS Component Error Outputs

Overview

Error outputs are a secondary path through which the data flow can send rows that do not conform to data type, length, or transformation standards defined by the ETL developer.  That’s a lengthy way to say that it’s where you can send your junk data.  In the SSIS designer, clicking on a source or transformation will often show not one but two possible outputs: the primary output (the “good” data, indicated by the blue line) and the error output (identified by the red line).  As shown on the flat file source below, when selecting a source or transformation, those that have an available error output will appear with both output connectors ready for selection.
Using this design will allow you to handle errors such as:
  • Incorrectly typed data
  • String data longer than its specification
  • Invalid transformation logic
  • Relational constraint errors at the destination
Error outputs provide a great way to handle two things at once: providing a secondary path through which bad data can be either triaged or cleaned up, and preventing the package from failing based on just a few bad rows of data.
When connecting an error output to a downstream component, you’ll be prompted with a configuration window in which you can select the error output options.  The most important thing to remember here is that the default behavior is to fail the component upon any error, even when you connect the source or transformation’s error output.  The default options are shown below.
As mentioned, you must explicitly set the behavior to Redirect row for the column(s) that will be handled by the error output.  You can do this on a column-by-column basis, if you want to handle error conditions on that level, but in most cases I see that all columns are set to either fail or redirect as a group.  In the screenshot below, I’ve set all columns to redirect upon error or truncation.

Practical use

So what happens when we connect a component’s error output?  In most cases, any rows in error would be redirected to that output, which we can use as a secondary path for cleanup or triage.  Assuming we change the error and truncation behavior to Redirect row, errors that are captured by the error output would not result in a failure of that source.
There are several design patterns surrounding the use of error outputs on sources or transformations.  Among the most common:
Cleanse inline.  If possible, the data can be sanitized inline using other SSIS components.  If the ETL can be built to handle most of the known data deficiencies, this is usually the cleanest way to handle error rows.  Very often, data quality tools such as SQL Server Data Quality Services can be useful for inline cleansing.
Send to triage.  If the package does not have cleansing logic, or there are rows that still cannot be cleansed using that logic, the bad data can be sent to a triage table for manual review or cleansing (a minimal sketch of such a table follows these patterns).  Though this pattern does require manual intervention, it also allows us to audit and review the reasons for those failures by preserving the data.
Ignore. It is possible to redirect error rows into the bit bucket, resulting in a loss of data.  In rare cases, this is acceptable; however, in most cases, there should be some measure of data capture to improve the quality of the ETL process.
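To make the triage pattern concrete, the redirected rows are often landed in a simple table together with the ErrorCode and ErrorColumn values that SSIS adds to the error path.  The sketch below is only an illustration: the table and column names are hypothetical, and the single RawRow column reflects the unparsed string that a flat file source’s error output produces (see the Caveats section).

-- Minimal sketch of a triage table for redirected error rows (names are hypothetical).
CREATE TABLE dbo.ETL_ErrorTriage
(
    TriageID    INT IDENTITY(1,1) PRIMARY KEY,
    PackageName NVARCHAR(260) NOT NULL,      -- package that redirected the row
    ErrorCode   INT           NULL,          -- ErrorCode value supplied by the error output
    ErrorColumn INT           NULL,          -- ErrorColumn (lineage ID) supplied by the error output
    RawRow      NVARCHAR(MAX) NULL,          -- unparsed source row, e.g. from a flat file error output
    LoadDate    DATETIME      NOT NULL DEFAULT GETDATE()
);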
Which is the best pattern to use?  As always, it depends.  Factors such as the criticality of the data, the types of errors expected, the frequency of the ETL process, and many others must be used to decide how, if at all, to use the error paths in SSIS.

Caveats

As with any ETL function, there are a few caveats and cautions around using error outputs.  First of all, I don’t recommend adding an error output path to every component that supports it simply because it’s available.  Redirecting error rows is not always appropriate – sometimes it makes sense to fail the package rather than trying to programmatically clean up or triage error data.
Additionally, you’ll need to familiarize yourself with the different types of error outputs generated by each component.  For example, the error output of the flat file source does not break the data apart into individual columns (even for those elements not in error for that row).  Because the error output is a relatively low-level function, what you get is a single string containing all of the data, and if you have processes that consume this data downstream of the error output, you’ll need to do some transformation on the errored rows before you can do any cleanup.
Finally, it is important to note that not all components support error outputs.  Because of the differences in the types of operations performed by each source, destination, or transformation component, some of them do not include an error output path.
http://blog.sqlauthority.com/2014/06/26/sql-server-ssis-component-error-outputs-notes-from-the-field-034/

Tuesday, 6 May 2014

SQL SERVER – Query to Find First and Last Day of Any Month

The following queries run relative to today’s date. They return the first and last day of the current month, the previous month, and the next month.

First Day of Current Month

select dateadd(mm,datediff(mm,0,getdate()),0)

Last Day of Current Month

select dateadd(s,-1,dateadd(mm,datediff(mm,0,getdate())+1,0))

First Day of Last Month

select dateadd(mm,datediff(mm,0,getdate())-1,0)

Last Day of Last Month

select dateadd(s,-1,dateadd(mm,datediff(mm,0,getdate()),0))

First Day of Next Month

select dateadd(mm,datediff(mm,0,getdate())+1,0)

Last Day of Next Month

select dateadd(s,-1,dateadd(mm,datediff(mm,0,getdate())+2,0))
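All of the queries above rely on the same trick: DATEDIFF(mm, 0, GETDATE()) counts the whole months between the base date 0 (1 January 1900) and today, and DATEADD then adds that many months back onto the base date, which always lands on the first day of a month. The commented version below simply restates the queries above; it adds nothing beyond the comments.

-- DATEDIFF(mm, 0, GETDATE())   = whole months between 1900-01-01 (date 0) and today
-- DATEADD(mm, <that count>, 0) = first day of the current month
select dateadd(mm, datediff(mm, 0, getdate()), 0)                      -- first day of current month
select dateadd(s, -1, dateadd(mm, datediff(mm, 0, getdate()) + 1, 0))  -- go one month ahead, then step back one second
select dateadd(mm, datediff(mm, 0, getdate()) - 1, 0)                  -- shift the month count by -1 for last month
select dateadd(mm, datediff(mm, 0, getdate()) + 1, 0)                  -- or by +1 for next month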

Major date functions in SQL Server are:

DateAdd()
DateDiff()
DatePart()
DateName()
GetDate()
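A quick illustration of each (results depend on the current date; the column aliases are just for readability):

select GETDATE()                              as Today             -- current date and time
select DATEADD(dd, 7, GETDATE())              as OneWeekFromToday  -- add an interval to a date
select DATEDIFF(dd, '2014-01-01', GETDATE())  as DaysSinceJan1     -- whole intervals between two dates
select DATEPART(quarter, GETDATE())           as CurrentQuarter    -- numeric part of a date
select DATENAME(month, GETDATE())             as CurrentMonthName  -- name of a date part, e.g. 'June'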


select CONVERT(DATE, dateadd(mm, datediff(mm,0, getdate())+1,-1))   -- last day of the current month, returned as a DATE (no time portion)

select CONVERT(DATE, CAST(YEAR(getdate()) AS VARCHAR(4)) + '/' + CAST(MONTH(getdate()) AS VARCHAR(2)) + '/' + '01', 111)   -- first day of the current month, built as a yyyy/mm/dd string and converted to DATE

Monday, 5 May 2014

How to get difference between two columns from two different tables?

Here you can find the differences between two columns from two different tables
Table t1
id   price
1    12
2    10
3    18
Table t2
id   price
1    22
2    6
3    18
o/p:
itemid   priceint1   priceint2   diff
1        12          22          -10
2        10          6           4
3        18          18          0

ABS is a mathematical function that returns the absolute (positive) value of the specified numeric expression.
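For example, applied to the first row above (12 and 22):

select ABS(12 - 22) as diff   -- returns 10; without ABS the result would be -10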

select coalesce(t1.id,t2.id) as itemid ,t1.price as priceint1,t2.price as priceint2,
abs(coalesce(t1.price,0)-coalesce(t2.price,0)) as [diff]
 from t1 full join t2
on t1.id=t2.id

If you want negative values, remove the ABS function from the query:

select coalesce(t1.id,t2.id) as itemid ,t1.price as priceint1,t2.price as priceint2,
(coalesce(t1.price,0)-coalesce(t2.price,0)) as [diff]
 from t1 full join t2
on t1.id=t2.id

2nd Way
DECLARE @T1 TABLE (Marks CHAR(2), Price SMALLINT);
DECLARE @T2 TABLE (Marks CHAR(2), Price SMALLINT);

INSERT INTO @T1 VALUES('A1',10);
INSERT INTO @T1 VALUES('A2',14);
INSERT INTO @T1 VALUES('A3',21);
INSERT INTO @T2 VALUES('A1',12);
INSERT INTO @T2 VALUES('A2',10);
INSERT INTO @T2 VALUES('A3',24);

SELECT t1.Marks AS ItemID, t1.Price AS PriceInT1
,ABS(t1.Price - t2.Price) AS Diff, t2.Price AS PriceInT2
FROM @T1 AS t1
INNER JOIN @T2 AS t2
ON t1.Marks = t2.Marks

Wednesday, 30 April 2014

Difference between SSIS 2008 and SSIS 2012


Since its initial release in Microsoft SQL Server 2005, Integration Services has had incremental changes in each subsequent version of the product. However, those changes were trivial in comparison to the number of enhancements, performance improvements, and new features introduced in SQL Server 2012 Integration Services.
The first change that you notice as you create a new Integration Services project is that Business Intelligence Development Studio (BIDS) is now a Microsoft Visual Studio 2010 shell called SQL Server Data Tools (SSDT). The Visual Studio environment alone introduces some slight user-interface changes from the previous version of BIDS. These enhancements to the interface help you to learn about the package-development process if you are new to Integration Services, and they enable you to develop packages more easily if you already have experience with Integration Services. If you are already an Integration Services veteran, you will also notice the enhanced appearance of tasks and data flow components with rounded edges and new icons.

Add New Project Dialog Box

To start working with Integration Services in SSDT, you create a new project by following the same steps you use to perform the same task in earlier releases of Integration Services: from the File menu, point to New, and then select Project.

General Interface Changes

After creating a new package, several changes are visible in the package-designer interface:
  • SSIS Toolbox  You now work with the SSIS Toolbox to add tasks and data flow components to a package, rather than with the Visual Studio toolbox that you used in earlier versions of Integration Services. You learn more about this new toolbox in the “SSIS Toolbox” section of this chapter.
  • Parameters  The package designer includes a new tab to open the Parameters window for a package. Parameters allow you to specify run-time values for package, container, and task properties or for variables, as you learn in the ”Parameters” section of this chapter.
  • Variables button  This new button on the package-designer toolbar provides quick access to the Variables window. You can also continue to open the window from the SSIS menu or by right-clicking the package designer and selecting the Variables command.
  • SSIS Toolbox button  This button is also new in the package-designer interface and allows you to open the SSIS Toolbox when it is not visible. As an alternative, you can open the SSIS Toolbox from the SSIS menu or by right-clicking the package designer and selecting the SSIS Toolbox command.
  • Getting Started  This new window displays below the Solution Explorer window and provides access to links to videos and samples you can use to learn how to work with Integration Services. This window includes the Always Show In New Project check box, which you can clear if you prefer not to view the window after creating a new project. You learn more about using this window in the next section, “Getting Started Window.”
  • Zoom control  Both the control flow and data flow design surfaces now include a zoom control in the lower-right corner of the workspace. You can zoom in to a maximum of 500 percent of the normal view or out to a minimum of 10 percent. As part of the zoom control, a button allows you to resize the view of the design surface to fit the window.
Shared Connection Managers
If you look carefully at the Solution Explorer window, you will notice that the Data Sources and Data Source Views folders are missing and have been replaced by a new file and a new folder. The new file is Project.params, which holds project-level parameters, and the new folder is Connection Managers, which holds the project’s shared connection managers.
Undo and Redo
A minor feature, but one you will likely appreciate greatly, is the newly added ability to use Undo and Redo while developing packages in SSDT. You can now make edits in either the control flow or data flow designer surface, and you can use Undo to reverse a change or Redo to restore a change you had just reversed. This capability also works in the Variables window, and on the Event Handlers and Parameters tabs. You can also use Undo and Redo when working with project parameters. To use Undo and Redo, click the respective buttons in the standard toolbar. You can also use Ctrl+Z and Ctrl+Y, respectively. Yet another option is to access these commands on the Edit menu.
Note The Undo and Redo actions will not work with changes you make to the SSIS Toolbox, nor will they work with shared connection managers.

Expression Task
Many of the developer experience enhancements in Integration Services affect both control flow and data flow, but there is one new feature that is exclusive to control flow. The Expression Task is a new item available in the SSIS Toolbox when the control flow tab is in focus. The purpose of this task is to make it easier to assign a dynamic value to a variable. Rather than use a Script Task to construct a variable value at runtime, you can now add an Expression Task to the workflow and use the SQL Server Integration Services Expression Language. When you edit the task, the Expression Builder opens. You start by referencing the variable and including the equals sign (=) as an assignment operator. Then provide a valid expression that resolves to a single value with the correct data type for the selected variable. Figure 6-6 illustrates an example of a variable assignment in an Expression Task.
Note The Expression Builder is an interface commonly used with other tasks and data flow components. Notice in Figure 6-6 that the list on the left side of the dialog box includes both variables and parameters. In addition, system variables are now accessible from a separate folder rather than listed together with user variables.
Execute Package Task
The Execute Package Task has been updated to include a new property, Reference Type, which appears on the Package page of the Execute Package Task Editor. You use this property to specify the location of the package to execute. If you select External Reference, you configure the path to the child package just as you do in earlier versions of Integration Services. If you instead select Project Reference, you then choose the child package from the drop-down list. In addition, the Execute Package Task Editor has a new page for parameter bindings, as shown in Figure 6-7. You use this page to map a parameter from the child package to a parameter value or variable value in the parent package.



Source and Destination Assistants
The Source Assistant and Destination Assistant are two new items available by default in the Favorites folder of the SSIS Toolbox when working with the data flow designer. These assistants help you easily create a source or a destination and its corresponding connection manager.

ODBC Source and Destination
The ODBC Source and ODBC Destination components, shown in Figure 6-8, are new to Integration Services in this release and are based on technology licensed by Attunity to Microsoft. Configuration of these components is similar to that of OLE DB sources and destinations. The ODBC Source supports Table Name and SQL Command as data-access modes, whereas the data-access modes for the ODBC Destination are Table Name – Batch and Table Name – Row By Row.
Embedded qualifiers Another challenge with the Flat File source in previous versions of Integration Services was the use of a qualifier character inside a string encapsulated within qualifiers. For example, consider a flat file that contains the names of businesses. If a single quote is used as a text qualifier but also appears within the string as a literal value, the common practice is to use another single quote as an escape character, as shown here.
ID,BusinessName
404,'Margie''s Travel'
406, 'Kickstand Sellers'

In the first data row in this example, previous versions of Integration Services would fail to interpret the second apostrophe in the BusinessName string as an escape character, and instead would process it as the closing text qualifier for the column. As a result, processing of the flat file returned an error because the next character in the row is not a column delimiter. This problem is now resolved in the current version of Integration Services with no additional configuration required for the Flat File source.

DQS Cleansing Transformation
The DQS Cleansing transformation is a new data flow component you use in conjunction with Data Quality Services (DQS). Its purpose is to help you improve the quality of data by using rules that are established for the applicable knowledge domain. You can create rules to test data for common misspellings in a text field or to ensure that the column length conforms to a standard specification.
To configure the transformation, you select a data-quality-field schema that contains the rules to apply and then select the input columns in the data flow to evaluate. In addition, you configure error handling. However, before you can use the DQS Cleansing transformation, you must first install and configure DQS on a server and create a knowledge base that stores information used to detect data anomalies and to correct invalid data, which deserves a dedicated chapter. We explain not only how DQS works and how to get started with DQS, but also how to use the DQS Cleansing transformation in Chapter 7, “Data Quality Services.”

Collapsible Grouping
Sometimes the data flow contains too many components to see at one time in the package designer, depending on your screen size and resolution. Now you can consolidate data flow components into groups and expand or collapse the groups. A group in the data flow is similar in concept to a sequence container in the control flow, although you cannot use the group to configure a common property for all components that it contains, nor can you use it to set boundaries for a transaction or to set scope for a variable.
To create a group, follow these steps:
1. On the data flow design surface, use your mouse to draw a box around the components that you want to combine as a group. If you prefer, you can click each component while pressing the Ctrl key.
2. Right-click one of the selected components, and select Group. A group containing the components displays in the package designer, as shown here:

3. Click the arrow at the top right of the Group label to collapse the group.

Data Viewer
The only data viewer now available in Integration Services is the grid view. The histogram, scatter plot, and chart views have been removed.
To use the data viewer, follow these steps:
1. Right-click the path, and select Enable Data Viewer. All columns in the pipeline are automatically included.
2. If instead you want to display a subset of columns, right-click the new Data Viewer icon (a magnifying glass) on the data flow design surface, and select Edit.
3. In the Data Flow Path Editor, select Data Viewer in the list on the left.
4. Move columns from the Displayed Columns list to the Unused Columns list as applicable (shown next), and click OK.
Variables
A common problem for developers when adding a variable to a package has been the scope assignment.
If you inadvertently select a task in the control flow designer and then add a new variable in the Variables window, the variable is created within the scope of that task and cannot be changed. In these cases, you were required to delete the variable, clear the task selection on the design surface, and then add the variable again within the scope of the package. Integration Services now creates new variables with scope set to the package by default. To change the variable scope, follow these steps:
1. In the Variables window, select the variable to change and then click the Move Variable button in the Variables toolbar (the second button from the left), as shown here:
2. In the Select New Scope dialog box, select the executable to have scope—the package, an event handler, container, or task—as shown here, and click OK:

Expression Result Length
Prior to the current version of Integration Services, if an expression result had a data type of DT_WSTR or DT_STR, any characters above a 4000-character limit would be truncated. Furthermore, if an expression contained an intermediate step that evaluated a result exceeding this 4000-character limit, the intermediate result would similarly be truncated. This limitation has now been removed.

The SQL Server Integration Services Expression Language now has four new functions:
LEFT You can now more easily return the leftmost portion of a string rather than use the SUBSTRING function:
LEFT(character_expression,number)
REPLACENULL You can use this function to replace NULL values in the first argument with the expression specified in the second expression:
REPLACENULL(expression, expression)
TOKEN This function allows you to return a substring by using delimiters to separate a string into tokens and then specifying which occurrence to return:
TOKEN(character_expression, delimiter_string, occurrence)
 TOKENCOUNT This function uses delimiters to separate a string into tokens and then returns the count of tokens found within the string:
TOKENCOUNT(character_expression, delimiter_string)
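These are SSIS expression-language functions rather than T-SQL, but two of them have familiar T-SQL counterparts. The snippet below is only a side-by-side comparison of the idea, not the SSIS syntax itself; TOKEN and TOKENCOUNT have no direct single-function T-SQL equivalent at this point.

select LEFT('Integration', 4)          -- 'Inte': same idea as the new LEFT expression function
select ISNULL(NULL, 'fallback value')  -- T-SQL analogue of REPLACENULL: swap NULL for a default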

Supported Deployment Models
The latest version of Integration Services supports two deployment models:
 Package deployment model The package deployment model is the deployment model used in previous versions of Integration Services, in which the unit of deployment is an individual package stored as a DTSX file. A package can be deployed to the file system or to the MSDB database in a SQL Server database instance. Although packages can be deployed as a group and dependencies can exist between packages, there is no unifying object in Integration Services that identifies a set of related packages deployed using the package model. To modify properties of package tasks at runtime, which is important when running a package in different environments such as development or production, you use configurations saved as DTSCONFIG files on the file system. You use either the DTExec or the DTExecUI utilities to execute a package on the Integration Services server, providing arguments on the command line or in the graphical interface when you want to override package property values at run time manually or by using configurations.

Project deployment model With this deployment model, the unit of deployment is a project, stored as an ISPAC file, which in turn is a collection of packages and parameters. You deploy the project to the Integration Services catalog, which we describe in a separate section of this chapter. Instead of configurations, you use parameters (as described later in the “Parameters” section) to assign values to package properties at runtime. Before executing a package, you create an execution object in the catalog and, optionally, assign parameter values or environment references to the execution object. When ready, you start the execution object by using a graphical interface in SQL Server Management Studio, by executing a stored procedure, or by running managed code.
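As a rough illustration of the stored-procedure route (the folder, project, package and parameter names below are made up for the example), a project-deployed package can be started from T-SQL against the SSISDB catalog like this:

DECLARE @execution_id BIGINT;

EXEC SSISDB.catalog.create_execution
     @folder_name     = N'Finance',          -- illustrative folder in the catalog
     @project_name    = N'ETLProject',       -- illustrative project name
     @package_name    = N'LoadSales.dtsx',   -- illustrative package name
     @use32bitruntime = 0,
     @execution_id    = @execution_id OUTPUT;

-- Optionally override a parameter for this execution (object_type 30 = package parameter).
EXEC SSISDB.catalog.set_execution_parameter_value
     @execution_id,
     @object_type     = 30,
     @parameter_name  = N'RunDate',
     @parameter_value = N'2014-06-25';

EXEC SSISDB.catalog.start_execution @execution_id;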
For more, see the Microsoft Press eBook Introducing Microsoft SQL Server 2012.

Tuesday, 22 April 2014

Power Pivot with DAX Expressions


PowerPivot is an innovative adaptation of the "Business Intelligence" software used by large companies to analyze data.   It is available from Microsoft as a free add-in to Microsoft Excel 2010.
You do not need to be a programmer to use PowerPivot. It was designed for the end user.
Every quarter your team spends countless hours preparing reports for the board meeting: getting data from different reports, entering data into Excel, preparing exhibits, and checking and rechecking formulas. After you get the first set of charts ready, you need even more – to explain this quarter’s variations. So your team spends more hours slicing and dicing the data, preparing exhibits and checking results.

Best of all, to update the report for a new month or quarter all you have to do is press a “refresh” button. No more entering data, rebuilding charts or rechecking formulas. When you need more detail to explain the results you can use the “slicer buttons” on the right to instantly change the analysis.

To convince your boss, you prepared a walk-through of the four steps you used to prepare the dashboard. Each of the four steps is covered in an appendix to this document.
Step 1:  Think about the data you want to measure.
This is the most important step in the process. Whatever you want to improve, start by measuring it. An easy to maintain dashboard will help to focus everyone’s attention on improvement.
After deciding what to measure you need to find the data. This might require meeting with a technical resource. In this example you got the data from six interconnected database tables and two Excel worksheets.
Appendix 1 - Think about the data you want to measure.

You want to compare actual sales data with the sales budget and with marketing expenses. 

  • Actual sales data is in the company’s database.
  • The sales budget and marketing expense data are in Excel spreadsheets.
You will want to meet with the database technical support person to identify the tables where the data is stored. If your database is like most you will have to use several tables to get the data you want. This example uses a database sample from Microsoft where the Sales Order Detail contains the sales amount and product code, the Sales Order Header table contains the sales date and territory, the Product table contains the Product Name and Product Sub Category Code, the Product Subcategory contains the Subcategory name and Product Category Code and the Product Category table contains the Category name.
This all sounds complicated but when PowerPivot imports the data from the database it also imports the relationships. There usually is nothing for you to do before you can actually use the data. You don’t need to understand anything about databases!
As in most analyses, the data you need is not all in the database. In this example the sales budget and marketing expense data are in an Excel spreadsheet. No problem, the PowerPivot database can combine data from multiple sources. Since each row of the Excel data contains the same Territory ID and Product ID codes that are in the database, PowerPivot can combine the data.
Step 2: Connect Excel to the data

PowerPivot has all you need to connect to databases, Excel, Access, text files or sources of data from the web. Once connected PowerPivot stores a copy of the data in Excel. Millions of rows of data can be effectively stored by PowerPivot using a new technology that compresses the data.
When you are ready to update the data, you just click the PowerPivot “Refresh” button and an updated copy of the data is copied into the PowerPivot database.
Appendix 2 – Connect Excel to the data
Starting from a blank workbook, you first open the PowerPivot Window, connect to the sales database and select the sales order header and detail tables, the sales territory table and the product, product sub category and product category tables:
Opening the PowerPivot Window and connecting to the database. 

After connecting to the database, data from the six tables has been imported to the PowerPivot database in the Excel workbook.

Even though PowerPivot has lots of data (121,317 Sales Order Detail records in the above example), workbook size remains manageable.
Pressing the refresh button will reimport an updated copy of the data.
Now you connect to the Excel data. The first step is to copy the sales budget into the blank workbook. Make the copied list into a table and create a linked table in PowerPivot.
 
Sales budget copied to Excel.
Add the Excel table to the PowerPivot database.

And the table is included in the PowerPivot database.

 
You follow the same steps to copy the Marketing Expenses into Excel and then add to PowerPivot.

One last thing to do in the database is to make a “Date Table”. This is always a good idea for two reasons:
  • You will want to use a different date measure in your report than exists in the data. In this example you want to report by month, and the data is by month, day and year. Also, PowerPivot sorts month names alphabetically (February before January), so it is helpful for the date table to format the months as yyyy - mm (2011 - 12).
  • You need to filter the actual sales data, budget data and marketing expense data on sales date. Having all three types of data linked to a common date table will speed this process.
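If the calendar data happens to live in SQL Server rather than Excel, a small query such as the one below can generate the same kind of date table, including a yyyy - mm key that sorts correctly. This is only an illustrative sketch (the date range and column names are made up); in the walkthrough itself the date table is built in Excel, as described next.

;WITH Dates AS
(
    SELECT CAST('2011-01-01' AS DATE) AS CalendarDate
    UNION ALL
    SELECT DATEADD(dd, 1, CalendarDate) FROM Dates WHERE CalendarDate < '2012-12-31'
)
SELECT CalendarDate,
       YEAR(CalendarDate)            AS CalendarYear,
       DATENAME(month, CalendarDate) AS MonthName,
       CAST(YEAR(CalendarDate) AS VARCHAR(4)) + ' - '
           + RIGHT('0' + CAST(MONTH(CalendarDate) AS VARCHAR(2)), 2) AS YearMonth   -- e.g. 2011 - 12
FROM Dates
OPTION (MAXRECURSION 0);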

 
You build the date table in Excel and link to PowerPivot.
PowerPivot has already imported these relationships from the database:

You need to add the relationships in yellow below so that you can filter across the data by month and year.
 
The last three relationships relate the sales budget and the marketing expenses to the sales territory and product tables. These relationships would be automatically added by PowerPivot the first time you produced a report of the data. Since you had to add the date relationships you thought it was easier to add them at the same time.
 

Step 3: Prepare Excel charts to show the measures
You can make Excel pivot tables and pivot charts from the data in the PowerPivot database. Add one pivot table to a worksheet. If you need a pivot chart add the chart to the same worksheet. These worksheets will be hidden in the final dashboard. Use them to get your data and measures right.
Appendix 3 – Prepare Excel charts to show the measures
You request a new pivot table from the PowerPivot window.
The PowerPivot pivot table can combine data from all 9 data tables.
You choose to sum line total and budget sales by category.
Next you add a Pivot Chart

 
 

Step 4: Copy the charts to a single worksheet and add slicers
Copy charts from the pivot table worksheet to your dashboard sheet. Arrange the charts on the dashboard. Then add “Slicers”, which are groups of buttons used to filter the data in your charts. Each slicer can be attached to multiple pivot tables. When you click on a year in the slicer on the dashboard, each of the pivot tables attached to that slicer will show data only for that year. In this way slicer buttons on the dashboard worksheet control what data is shown on the dashboard charts.
Your boss is not very fluent in Excel so as you expected his first question was “Why couldn’t you do this with Excel 2003?” You explained that Excel 2010 has three new features that are all part of the new Excel PowerPivot Self-Service BI capability.
Feature 1: Excel PowerPivot allows you to create a simple database within Excel. In the example above you created a database inside of Excel with 9 tables: 6 from the company’s database system and 3 from Excel worksheet data. You were able to specify relationships between the tables so that data could be combined in your charts.
Feature 2: Excel PowerPivot pivot table and charts allowed you to quickly prepare the charts. Unlike the lightweight pivot tables in Excel 2003, these pivot tables can combine data from different tables and come with new formulas that are essential for serious analysis.
At first glance PowerPivot pivot tables and pivot charts are similar to the pivot table and charts that have been in Excel since 1993. But the new PowerPivot pivot table and pivot charts have two very significant differences:
  1. They can combine data from multiple tables. (The old pivot tables required you to get all the data in one table.)
  2. The new Data Analysis Expressions (DAX) formula language makes it easy to do calculations within the pivot table and line up comparative measures like “Prior Year To Date”.
Feature 3: Excel Slicers provide a way of selecting data so that multiple charts can be shown together and controlled by one set of buttons.
You told your boss that this is the level of BI that fits your company’s current needs. You know it wouldn’t meet the needs of a much larger company with more IT resources but it can make a real difference in what your company is able to accomplish with Excel.
Appendix 4 - Copy the charts to a single worksheet and add slicers
You copy the Pivot Chart and paste to a new blank worksheet. Then you add slicers.
One slicer can attach to multiple pivot tables.
 

Your dashboard is well underway!


Data Analysis Expressions (DAX) is the standard PowerPivot formula language that supports custom calculations in PowerPivot tables and Excel PivotTables. While many of the functions used in Excel are included, DAX also offers additional functions for carrying out dynamic aggregation and other operations with your data. Bill's new series serves as a progressive introduction to PowerPivot and DAX formulas, examining the functions, operators and values involved, and offering a wealth of practical examples.

As we know, DAX and MDX expressions can be used to query a Tabular Model in Analysis Services. Keep in mind that DAX does not work for querying data in an SSAS Multidimensional cube. We will cover the basics of producing the same results from DAX as from MDX. DAX generally returns results more quickly than MDX, so if you are using a Tabular Model solution, prefer writing DAX unless complex logic comes up that can only be expressed in MDX.

Behaviour of DAX expressions:
  • DAX works much like SQL, with different functions and syntax.
  • You do not need to write JOIN or GROUP BY clauses in DAX, because it already understands that the purpose is to retrieve data in that shape. The relationships among the tables of the Tabular Model / PowerPivot model supply the joins during DAX query execution.
  • Strictly speaking, DAX is not a query language; it is an expression language. So we use table expressions in DAX to retrieve data in table format, which can then be used as the result set/dataset for an SSRS report.
  • To evaluate DAX expressions, use the MDX query pane in SSMS.
Functions

We can now move on to some useful functions for writing DAX as a query language. Please note that the list of DAX functions is very long, but here we focus on the functions that can be used to return data as a result set, in the same way as MDX or SQL.
The following functions are covered in this series:

EVALUATE:
  • The first step in the DAX Query Reference. This function can be used to return the data of a table as a result set.
    EVALUATE <<Tabular Expression or Name of the table>>
    The argument to EVALUATE can only be a table name or any expression that returns a table. EVALUATE is combined with other functions (like VALUES, SUMMARIZE, ADDCOLUMNS, TOPN etc.) to produce different results as per requirements.
    e.g. Get the data of the Employee table in “AdventureWorks Tabular Model SQL 2012”:
    EVALUATE 'Employee'
    As per the behaviour of DAX, this is similar to the SQL:
    SELECT * from [Employee]
    We cannot give a comparable MDX query here, as there is no way of getting all table-level attributes in MDX; the required attributes must be specified in the MDX SELECT statement.
SUMMARIZE:
  • This function returns totals and counts, grouped by one or more columns (attributes, possibly from different tables in the Tabular Model).
    We use SUMMARIZE along with EVALUATE to get the desired result in SSMS.
    SUMMARIZE (<<Table>>, <<Grouping Column1>>, <<Grouping Column2>>, ..., <<Name for desired Aggregation1>>, <<Expression for Aggregation or Calculated MeasureName1>>, <<Name for desired Aggregation2>>, <<Expression for Aggregation or Calculated MeasureName2>>)
    Although the syntax looks complex, it is very simple to use, and there are many things you can do with this function.
    E.g. let’s say you want to see Currency, Customer ID and Internet Sales from the Adventure Works tabular model. A SQL query against the underlying database could look like:
    SELECT [CurrencyName], [CustomerID], SUM([InternetTotalSales]) AS [InternetTotalSales]
    FROM [InternetSales] A
    LEFT JOIN Currency B ON B.CurrencyID = A.CurrencyID
    LEFT JOIN Customer C ON C.CustomerID = A.CustomerID
    GROUP BY [CurrencyName], [CustomerID]
    Easy, but in MDX it can be as:

    SELECT
    NON EMPTY
    [Currency].[CurrencyName].[CurrencyName].MEMBERS*
    [Customer].[Customer Id].[Customer Id].MEMBERS ON 1
    ,[Measures].[Internet Total Sales] ON 0
    FROM [AdventureWorks_Test]
    Again easy.
    But in DAX? Don’t worry, it is again easy. Keep in mind that SUMMARIZE needs a primary table first (the table that forms the base of the relationships among the desired columns from the different tables). Mostly this primary table is a fact table, which relates the other tables. Beyond that, JOINs and GROUP BYs are not required, as SUMMARIZE takes care of following the relationships and aggregating the data for you.
    DAX for same situation is:
    EVALUATE
    SUMMARIZE('Internet Sales'
    ,'Currency'[CurrencyName]
    ,'Customer'[Customer Id]
    ,"Internet Sales",'Internet Sales'[Internet Total Sales])
    ORDER BY 'Currency'[CurrencyName], 'Customer'[Customer Id]
    Here 'Internet Sales'[Internet Total Sales] is already a calculated measure, so the name for the measure is given as "Internet Sales". We could instead supply the aggregation we need as an expression, such as SUM([InternetTotalSales]).
    ORDER BY is same as SQL ORDER BY.
Most of the complex queries in DAX revolve around SUMMARIZE, so it’s important to understand its basics.
ADDCOLUMNS:
  • The coolest function, I’ll say. It gives you the power to create a calculated column in a table (or tabular expression) at query run time. I believe there is no way of doing this in MDX (you can’t create an attribute itself in MDX).
    ADDCOLUMNS(<Table or Tabular Expression>, “<Calculated column Name1>”, <Expression for Calculated column1>, ...)
    As in SQL, we can create a new column while writing the query.
    For example (as simple as possible):

    SELECT
    [FirstName], [MiddleName], [LastName]
    , ([FirstName] + ' ' + [MiddleName] + ' ' + [LastName])
    AS FullName
    FROM [dbo].[DimEmployee]
    Part of the expected result is:
    Now in DAX the same query will be:
    EVALUATE
    SUMMARIZE(
    ADDCOLUMNS('Employee'
    ,"Full Name"
    ,CONCATENATE('Employee'[First Name], CONCATENATE(" ",
    CONCATENATE('Employee'[Middle Name], CONCATENATE(" ",
    'Employee'[Last Name])
    ))))
    ,[Full Name]
    )
    I know that String manipulation in DAX is quite a job, but there is no other way.
    Here, if we do not use SUMMARIZE and EVALUATE on ADDCOLUMNS, it will return all of the columns in the Employee table plus the added Full Name column.
    But if you just want to see one column of your new table (with the added column), use SUMMARIZE to be specific.
    This function makes it easy to get exactly what you want at the table level, and it has solved many of my issues in SSRS reporting.
  • Some other useful functions are:
  • FORMAT
  • SEARCH
  • FILTER
  • TOPN

  • Compare DAX vs MDX

  • Once I was asked to give an opinion on MDX vs DAX. In the software world everything has some pros and cons when compared to an alternative; if something had only cons it would be thrown in the dustbin (that is, deprecated).
  • I tried hard to find a direct comparison of MDX and DAX so I could give my best opinion for the project requirements, but as usual there was no ready answer on Google for “Compare MDX vs DAX”.
    So I prepared a comparison based on my experience with both languages. Keep in mind DAX is not a query language, but it can be used as one via the DAX Query Syntax Reference.
    Below are some comparison points; I hope they help you out.
    1. Performance
       DAX: Better performance on the Tabular Model. Comparing CPU time for the same result set returned by both languages, DAX wins every time.
       MDX: Performance is good, but compared with DAX on the Tabular Model it is a bit lower.
       Prefer: DAX
    2. Support
       DAX: Supports the In-Memory, DirectQuery and Hybrid query modes of the Tabular Model.
       MDX: No support for the DirectQuery mode of the Tabular Model.
       Prefer: DAX
    3. Query language
       DAX: Not a query language, but we can still leverage it with a few workarounds (using tabular expressions): a) use it as a query language in SSRS by writing it in place of DMX; b) write and execute it in the MDX pane of SSMS to validate the query and data.
       MDX: A well-known query language, used in SSRS and SSMS as a separate query option.
       Prefer: MDX
    4. Ease of writing
       DAX: Difficult to write compared to MDX (although many blogs say it is easy from the end-user perspective). Reasons: 1. It behaves like a SQL query while we expect multidimensional behaviour. 2. It needs a lot of functions for even small requirements. 3. You need to know the appropriate relationships between the tables used (if a non-related table appears in the query, no result will come back). 4. Non Empty behaviour is difficult to handle through the query. 5. Filtering, especially multi-valued filters, is difficult to manipulate in queries.
       MDX: Easy to write as compared to DAX.
       Prefer: MDX
    5. Reporting support
       DAX: A long way to go on this investigation (decide your reporting-tool requirements and investigate, or ask here on the blog). It works with SSRS (in place of DMX) and ADOMD.NET (please refer to the link).
       MDX: Works with SSRS, PPS, ADOMD.NET code and many other reporting tools.
       Prefer: MDX
    6. Support for ad-hoc reporting
       DAX: a) Power View reporting: no user interaction is required for querying the Tabular Model, so neither DAX nor MDX is used. b) Power Pivot: the end user needs to know DAX to create a new measure.
       MDX: a) Power View reporting: no use. b) Power Pivot: no use.
       Prefer: Depends on the end user, and whether he or she knows DAX.
    7. Data mining
       DAX: Cannot be used for data-mining purposes, even in Excel.
       MDX: MDX is required for data mining.
       Prefer: MDX
    Assumptions:
    The project is on SQL Server 2012 with a BI Semantic (Tabular) Model; DAX will not work on a Multidimensional SSAS solution anyhow.
    Conclusion:
    Out of the 7 points, 4 favour MDX and 2 favour DAX (one depends on the end user).
    But beyond the tally, I believe MDX is better for the long run. It has been used in many legacy systems, developers are more comfortable with it, and it offers more flexibility across different reporting tools (as far as I know).
    Note: I will soon post on the query behaviour of MDX and DAX. Saying DAX has better performance than MDX is not always true.
  • Slicers


    Excel 2010 makes it possible to insert slicers to quickly and easily filter pivot tables. However, using the report filter gives the exact same result.
    Below you can find a two-dimensional pivot table. Go back to Pivot Tables to learn how to create this pivot table.
    Two-dimensional Pivot Table in Excel
    To insert a slicer, execute the following steps.
    1. Click any cell inside the pivot table. The PivotTable Tools contextual tab activates.
    2. On the Options tab, insert a slicer.
    Insert a Slicer
    3. Check Category and click OK.
    Check Category
    4. For example, Click Fruit to only show the fruit exported to each country.
    Slicer Example
    Note: notice how the report filter changed to Fruit. Hold down CTRL to include fruit and vegetables.
For more see this