未加星标

Create a Cross Tab Query to Summarize Data in SQL Server

字体大小 | |
[数据库(mssql) 所属分类 数据库(mssql) | 发布者 店小二04 | 时间 2017 | 作者 红领巾 ] 0人收藏点击收藏

In this puzzle, we’re going to learn how to create a cross tab query using SQL Server. Cross tabs are a great way to summarize data. And given that Business Intelligence is a hot topic, knowing how to create one is key.

We’ll learn to create a cross tab query. Before you read the entire article, try the puzzle. Any work you do, if it you just get part of the answer, helps to reinforce the concepts you’ll learn.

Solving puzzles is a great way to learn SQL. Nothing beats practicing what you’ve learned. Once you have figured out the puzzle, post your answer in the comments so we all can learn from one another. We also discuss puzzle and more in Essential SQL Learning Group on Facebook. Be sure to find us there!

SQL Puzzle Question

The Adventure Works human resources director is starting a wage study. As part of this study, she would like a summary of annual wages by Department Group name and Life Status.

For our purposes, Life status is defined the combination of marital status and gender (e.g. Married-Male, Married-Female).

Using the AdventureWorks database as the basis for your answer, what SQL would you write to product the following table?


Create a Cross Tab Query to Summarize Data in SQL Server

Note: Here is a great resource if you need help getting started with SQL Server and the Adventure Works database .

Answer Using Cross Tab Query

Before we get into how to create the cross tab query, let’s first define the underlying query we’ll use as a source.

Once you study the adventure works database you’ll notice there are several key tables you’ll need to do your search. To help understand the database relationships, let’s create a database diagram.

These diagrams really help you understand relationships, and come in handy when writing the queries.

Below is the one to use for this puzzle. Notice how the Employee table is central to the query. Related to this are two separate dimensions, the first being pay history, the second, employee department history.


Create a Cross Tab Query to Summarize Data in SQL Server
Cross Tab Query Detail Rows

The first step in putting together the summarization is to construct the query used to pull in the detailed data. After you study the diagram, you’ll see we’ll need several joins per the relationships shown.

To make it easier to read, I color coded the joins related to pay history in red, and those with department group name in blue.

SELECT E.BusinessEntityID,
D.GroupName,
H.EndDate,
E.MaritalStatus,
E.Gender,
P.Rate,
P.PayFrequency,
P.RateChangeDate,
P.Rate * 40 * 52 / P.PayFrequency as AnnualPay
FROM HumanResources.Employee E
INNER JOIN HumanResources.EmployeePayHistory P ON E.BusinessEntityID = p.BusinessEntityID
INNER JOIN HumanResources.EmployeeDepartmentHistory H ON E.BusinessEntityID = H.BusinessEntityID
INNER JOIN HumanResources.Department D ON H.DepartmentID = D.DepartmentID
WHERE E.CurrentFlag = 1;

There are several issues with this result. First, the query is returning more than one row per employee. To avoid double counting summary results, there should be a single row.


Create a Cross Tab Query to Summarize Data in SQL Server

This is due the query returning each employee’s complete pay and departmental history. Multiple records, for one employee are included in the result one for each pay raise or departmental move.

Removing Duplicates from Cross Tab Query Detail Results

The results should contain the latest pay and department for each employee.

The latest department, the once corresponding to the department history with a NULL end date. Adding a filter to the join condition addresses this issue.

To only select the latest pay rate, we’ll partition the data by employee (BusinessEntityID) sorted by RateChangeDate. With the data sorted in descending order, the latest pay is always ROW_NUMBER() = 1.

Wrapping this query in a CTE (Common Table Expression) allows us to query the detailed data and filter on the partitioned data. This means we can select rows where ROW_NUMBER() = 1.

With cte_EmployeePay as
(
SELECT E.BusinessEntityID,
D.GroupName,
H.EndDate,
E.MaritalStatus,
E.Gender,
P.Rate,
P.PayFrequency,
P.RateChangeDate,
P.Rate * 40 * 52 / P.PayFrequency as AnnualPay,
ROW_NUMBER() OVER(PARTITION BY E.BusinessEntityID ORDER BY P.RateChangeDate Desc) RowNumber
FROM HumanResources.Employee E
INNER JOIN HumanResources.EmployeePayHistory P ON E.BusinessEntityID = p.BusinessEntityID
INNER JOIN HumanResources.EmployeeDepartmentHistory H ON E.BusinessEntityID = H.BusinessEntityID AND H.EndDate is NULL
INNER JOIN HumanResources.Department D ON H.DepartmentID = D.DepartmentID
WHERE E.CurrentFlag = 1
)
SELECT *
FROM cte_EmployeePay
WHERE RowNumber = 1;

Using a common table expression makes this easier to manage and we avoid having to create a view of the partitioned data. Now our results show the latest pay and departmental assignments.


Create a Cross Tab Query to Summarize Data in SQL Server

Notice there is just a single row for BusinessEntityID 4.

The final step is to create a cross tab query. I choose to create one using the PIVOT statement.

To prepare my data, I calculated the LifeStatus by combining MaritalStaus with Gender.

You see further on in the pivot statement I explicitly call out each combination. This makes it a static pivot table. In a later puzzle, we’ll learn how construct a dynamic pivot.

Using PIVOT to Create a Cross Tab Query

We’re now able to put together the summary data using a PIVOT. To help you read the query, I’ve color coded the source query gray.

The PIVOT is color coded Blue.

With cte_EmployeePay as
(
SELECT D.GroupName,
IIF(E.MaritalStatus = 'M','Married','Single') + '-' +
IIF(E.Gender = 'M', 'Male', 'Female') as LifeStatus,
P.Rate * 40 * 52 / P.PayFrequency as AnnualPay,
ROW_NUMBER() OVER(PARTITION BY E.BusinessEntityID ORDER BY P.RateChangeDate Desc) RowNumber
FROM HumanResources.Employee E
INNER JOIN HumanResources.EmployeePayHistory P ON E.BusinessEntityID = p.BusinessEntityID
INNER JOIN HumanResources.EmployeeDepartmentHistory H ON E.BusinessEntityID = H.BusinessEntityID AND H.EndDate is NULL
INNER JOIN HumanResources.Department D ON H.DepartmentID = D.DepartmentID
WHERE E.CurrentFlag = 1
)
SELECT *
FROM (SELECT GroupName, LifeStatus, AnnualPay
FROM cte_EmployeePay
WHERE RowNumber = 1) as SourceTable
PIVOT
(
Avg(AnnualPay)
FOR LifeStatus IN ([Married-Male], [Married-Female], [Single-Male], [Single-Female])
) AS AvgSales
ORDER BY GroupName Notice that the PIVOT operator takes the derived table, which is named SourceTable, as

本文数据库(mssql)相关术语:熊片数据库 mssql数据库 oracle数据库 pubmed数据库 access数据库 万方数据库

主题: SQLSQL ServerFacebookTI
分页:12
转载请注明
本文标题:Create a Cross Tab Query to Summarize Data in SQL Server
本站链接:http://www.codesec.net/view/533041.html
分享请点击:


1.凡CodeSecTeam转载的文章,均出自其它媒体或其他官网介绍,目的在于传递更多的信息,并不代表本站赞同其观点和其真实性负责;
2.转载的文章仅代表原创作者观点,与本站无关。其原创性以及文中陈述文字和内容未经本站证实,本站对该文以及其中全部或者部分内容、文字的真实性、完整性、及时性,不作出任何保证或承若;
3.如本站转载稿涉及版权等问题,请作者及时联系本站,我们会及时处理。
登录后可拥有收藏文章、关注作者等权限...
技术大类 技术大类 | 数据库(mssql) | 评论(0) | 阅读(84)