Selecting Records Over Partitions
Selecting Records Over Partitions
Using Groups of Records
One of the most common features of a list is that it may have repeating values. For example, if you have a table of employees where the information of each employee includes the department where he works, you will likely find a number of employees in a certain department and another series of employees in another department.
If you have that type of list, you may be interested in getting a certain piece of information for each group of records. For our example of employees, you may want to know the number of employees in each department, the number of female employees in each department, the minimum salary of employees in each department, etc. This is different from simply calling an aggregate function on the table because the aggregate function would apply to all records. This time, you want to get the information for each group in the list.
In this and the next lesson, we will use the word "group" or the expression "group of records". The actual word to use is "partition"; and in some cases, we will use "partition". The problem is that the word partition has a particular meaning in the database world and it has to do with "Distributed Databases". Therefore, to reduce confusion, we will mostly use "group" or "group of records" or "groups of records". You are free to substitute "group" with "partition". |
Practical Learning: Introducing Record Partitioning
SELECT * FROM Rentals.Tenants t; GO
USE LambdaPropertiesManagement1; GO SELECT props.PropertyNumber, props.Bedrooms, props.MonthlyRate FROM Listing.Properties props ORDER BY props.Bedrooms; GO
USE LambdaPropertiesManagement1; GO SELECT MIN(props.MonthlyRate) "Cheapest Monthly Rate" FROM Listing.Properties props; GO
SELECT COUNT(props.PropertyType) "Total Listing", MIN(props.MonthlyRate) "Cheapest Monthly Rate", MAX(props.MonthlyRate) "Highest Monthly Rate", AVG(props.MonthlyRate) "Average Monthly Rate" FROM Listing.Properties props; GO
Selecting Records Over Aggregates
As mentioned above, the primary reason to analyse groups of records is to get a piece of information about the groups. This is done using an aggregate function combined with at least the OVER keyword. The primary formula to follow is:
SELECT ... AggregateFunction(WhatField) OVER() ... FROM WhatObject
After the SELECT keyword, you can call an aggregate function applied to a column of the table whose records you want to analyze. After calling the aggregate function, you must call the OVER macro as a function. This means that OVER is followed by parentheses. You can leave them empty or pass some of the options we will learn.
Besides calling an aggregate function, you can also access any field of the table. You can do this before or after the aggregate function (of course, we will see many examples).
Remember that when specifying the fields used in a SQL statement, if a column is a foreign key, you can use a join to get a more significant value from the parent table.
Practical Learning: Selecting Records Over an Aggregate Function
SELECT MIN(props.MonthlyRate) OVER() "Cheapest Monthly Rate" FROM Listing.Properties props; GO
SELECT DISTINCT(MIN(props.MonthlyRate) OVER()) "Cheapest Monthly Rate" FROM Listing.Properties props; GO
SELECT props.Bedrooms,
MIN(props.MonthlyRate) OVER() "Cheapest Monthly Rate"
FROM Listing.Properties props;
GO
SELECT props.Bedrooms,
MIN(props.MonthlyRate) OVER() "Cheapest Monthly Rate"
FROM Listing.Properties props
GROUP BY props.Bedrooms, props.MonthlyRate;
GO
Selecting Ordered Records Over
If you simply call the parameter-less OVER, it produces the list of all records as they were created. Here is an example:
This result does not show records in groups. As we know already, to show records in groups, you can arrange them, which is done by adding an ORDER BY clause. In this case, specify the ordering in the parentheses of OVER(). The formula to follow is:
SELECT ... AggregateFunction(WhatField) OVER(ORDER BY Field) ... FROM WhatObject
Remember that you want to analyse records in groups of values. Obviously the records should have similar values. And obviously a column that includes unique values (such as a primary key column) or rarely repeating values (like a last name column) is not a good candidate. This means that you should arrange the records based on a column that has repeating values.
Practical Learning: Selecting Ordered Records Over an Aggregate Function
USE LambdaPropertiesManagement1;
GO
SELECT props.Bedrooms,
MIN(props.MonthlyRate) OVER(ORDER BY props.MonthlyRate) "Cheapest Monthly Rate"
FROM Listing.Properties props;
GO
USE LambdaPropertiesManagement1;
GO
SELECT props.Bedrooms,
MIN(props.MonthlyRate) OVER(ORDER BY props.Bedrooms) "Cheapest Monthly Rate"
FROM Listing.Properties props;
GO
USE LambdaPropertiesManagement1; GO SELECT props.PropertyType, MIN(props.MonthlyRate) OVER(ORDER BY props.PropertyType) "Cheapest Monthly Rate" FROM Listing.Properties props; GO
USE LambdaPropertiesManagement1;
GO
SELECT props.PropertyType,
COUNT(props.PropertyType) OVER(ORDER BY props.PropertyType) "Range in Category"
FROM Listing.Properties props;
GO
In order to analyze records in series, you must create the necessary groups. A group is also referred to as a partition. Therefore, to create a group or partition, you must specify what column will be used; that is, what column holds the repeating values. To do this, pass an expression as PARTITION BY to the parentheses of OVER(). The PARTITION BY expression is followed by the name of the field.
When partitioning the records, if you want, you can arrange the groups based on a field of your choice. To do this, after the column applied to the partition, add an ORDER BY clause that uses the column of your choice.
Practical Learning: Selecting Records Over a Partition
USE LambdaPropertiesManagement1; GO SELECT PropertyType [Property Type], COUNT(PropertyType) OVER(PARTITION BY PropertyType) Total FROM Listing.Properties; GO
SELECT DISTINCT(PropertyType) [Property Type], COUNT(PropertyType) OVER(PARTITION BY PropertyType) Total FROM Listing.Properties; GO
SELECT props.Bedrooms, MIN(props.MonthlyRate) OVER() "Cheapest Monthly Rate" FROM Listing.Properties props; GO
SELECT props.Bedrooms,
MIN(props.MonthlyRate) OVER(PARTITION BY props.Bedrooms) "Cheapest Monthly Rate Based on Number of Bedrooms"
FROM Listing.Properties props;
GO
SELECT props.Bedrooms,
MIN(props.MonthlyRate) OVER(PARTITION BY props.Bedrooms ORDER BY props.PropertyNumber) N'Cheapest Monthly Rate Based on Number of Bedrooms'
FROM Listing.Properties props;
GO
SELECT DISTINCT(props.Bedrooms), MIN(props.MonthlyRate) OVER(PARTITION BY props.Bedrooms) "Cheapest Monthly Rate Based on Number of Bedrooms" FROM Listing.Properties props; GO
SELECT DISTINCT(props.Bedrooms), MIN(props.MonthlyRate) OVER(PARTITION BY props.Bedrooms ORDER BY props.PropertyNumber) N'Cheapest Monthly Rate Based on Number of Bedrooms' FROM Listing.Properties props; GO
SELECT DISTINCT(props.Bedrooms) "Number of Bedrooms", COUNT(props.PropertyNumber) "Number of Houses or Apartments", MIN(props.MonthlyRate) OVER(PARTITION BY props.Bedrooms) "Cheapest Monthly Rate" FROM Listing.Properties props GROUP BY props.Bedrooms, props.MonthlyRate; GO
SELECT DISTINCT(props.Bedrooms), AVG(props.MonthlyRate) OVER(PARTITION BY props.Bedrooms) "Avergage Monthly Rate Based on Number of Bedrooms" FROM Listing.Properties props; GO
SELECT DISTINCT(props.Bedrooms), FORMAT(AVG(props.MonthlyRate) OVER(PARTITION BY props.Bedrooms), N'C') "Avergage Monthly Rate Based on Number of Bedrooms" FROM Listing.Properties props; GO
SELECT DISTINCT(pmts.RegistrationNumber), SUM(pmts.AmountPaid) OVER(PARTITION BY pmts.RegistrationNumber) [Total Payments] FROM Rentals.Payments pmts; GO
SELECT DISTINCT(regs.TenantCode), SUM(pmts.AmountPaid) OVER(PARTITION BY pmts.RegistrationNumber) [Total Payments] FROM Rentals.Payments pmts INNER JOIN Rentals.Registrations regs ON pmts.RegistrationNumber = regs.RegistrationNumber; GO
USE LambdaPropertiesManagement1; GO SELECT DISTINCT(CONCAT(tens.FirstName, N' ', tens.LastName, N', occupying ', LOWER(props.PropertyType), N' #', props.PropertyNumber, N' with a monthly rate of $', props.MonthlyRate)) [Tenant and Registration Information], SUM(pmts.AmountPaid) OVER(PARTITION BY pmts.RegistrationNumber) [Total Payments] FROM Rentals.Payments pmts INNER JOIN Rentals.Registrations regs ON pmts.RegistrationNumber = regs.RegistrationNumber INNER JOIN Rentals.Tenants tens ON tens.TenantCode = regs.TenantCode INNER JOIN Listing.Properties props ON regs.PropertyNumber = props.PropertyNumber; GO