Transact-SQL provides many built-int functions used to get statistics. These functions are used in various circumstances, depending on the nature of the column being investigated. This means that you should first decide what type of value you wand to get, then choose the appropriate function. To call the function in SQL code, start a SELECT statement and pass the column to the function. The minimum formula to follow is: SELECT FunctionName(FieldName) FROM TableName; To visually create an aggregate query, in the Object Explorer, expand the database you want to use. Right-click Views and click New View... On the Add Table dialog box, select the table(s) (or view(s)) and close it. To start a summary query:
This would add a new column titled Group By in the Criteria section. From that column, you can select the function you want to use. Later, we will review what aggregate functions are available.
Probably the most basic piece of information you may want to get about a list is the number of records it has. In statistics, this is referred to as the number of samples. To help you get this information, Transact-SQL provides a function named Count. It counts the number of records in a column and produces the total. This function also counts NULL fields. The syntax of the Count() function is: int COUNT ( { [ [ ALL | DISTINCT ] expression ] | * } )
This function takes one argument. The Count() function returns an int value. Here is an example: USE rosh;
GO
SELECT COUNT(stds.StudentNumber) N'Number of Students'
FROM Registration.Students stds;
GO
This would produce:
To get the count of occurrences of a value, in the Criteria pane, you can select COUNT(*). If you are working on a large number of records, you can call the Count_Big() function. Its syntax is: bigint COUNT_BIG ( { [ ALL | DISTINCT ] expression } | * )
If you have a list of values, you may want to get the
lowest value. For example, in a list of houses of a real estate company
with each property having a price, you may want to know which house is the
cheapest. To let you get this information, Transact-SQL provides a
function named MIN. Its syntax is: DependsOnType MIN ( [ ALL | DISTINCT ] expression ) The return value of the MIN() function depends on the type of value that is passed to it. For example, if you pass a column that is number-based, the function returns the highest number. Here is an example: USE DepartmentStore1;
GO
SELECT MIN(si.UnitPrice) N'Cheapest'
FROM Inventory.StoreItems si;
GO
If you pass a string-based column, the function returns the the last value in the alphabetical order. Here is an example: USE rosh;
GO
SELECT MIN(stds.LastName) [First Student]
FROM Registration.Students stds;
GO
In the same way, you can pass a date/time-based column. Here is an example: USE rosh;
GO
SELECT MIN(stds.DateOfBirth) "Youngest Student"
FROM Registration.Students stds;
GO
Be careful when passing a value to an aggregate function such as MIN(). For example, if the name of a column is processed by a function, the returned value would be used by the aggregate function. Consider the following call: SELECT MIN(FORMAT(Studs.BirthDate, N'D')) [Earliest Birthdate] FROM Studs; GO This would produce:
Notice that the name Friday, as a string, is the one being processed by the MIN() function, instead of the actual date.
The opposite of the lowest is the highest value of a
series. To assist you with getting this value, Transact-SQL provides the
Max() function. Its function is: DependsOnType MAX ( [ ALL | DISTINCT ] expression ) This function follows the same rules as its MIN() counterpart, but in reverse order (of the rules). Here is an example: USE DepartmentStore1;
GO
SELECT MAX(si.UnitPrice) N'Most Expensive'
FROM Inventory.StoreItems si;
GO
The sum of the values of a series is gotten by adding all values. In algebra and statistics, it is represented as follows: ∑x To let you calculate the sum of values of a certain
column of a table, Transact-SQL provides a function named Sum. The
syntax of the Sum() function is: Number SUM ( [ ALL | DISTINCT ] expression ) Unlike the MIN() and the MAX() functions that can receive a column of almost any type, the column passed to the SUM() function must be number-based.
In algebra and statistics, the mean is the average of the numeric values of a series. To calculate it, you can divide the sum by the number of values of the series. It is calculated using the following formula:
From this formula:
To support this operation, Transact-SQL provides the
Avg function. Its syntax is: Number AVG ( [ ALL | DISTINCT ] expression )
Imagine you have a column with numeric values. You already know how to get the sum and the mean. The standard deviation is a value by which the elements vary (deviate) from the mean. The formula to calculate the standard deviation is:
From this formula:
The above formula wants you to first calculate the mean. As an alternative, you can use a formula that does not require the mean. It is:
Instead of creating your own function, Transact-SQL
can assist you. First there are two types of standard deviations. The
sample standard deviation relates to a sample. To let you calculate it,
Transact-SQL provides a function named STDDEV. Its syntax is: float STDEV ( [ ALL | DISTINCT ] expression ) The other standard deviation relates to a population. To help you calculate it, Transact-SQL provides the STDDEVP() function. Its syntax is: float STDEVP ( [ ALL | DISTINCT ] expression )
The variance is the square of the standard deviation. This means that, to calculate it, you can just square the value of a standard deviation. As seen with the standard deviation, there are two types of variances. A sample variance relates to a sample. To help you calculate a sample variance of records, Transact-SQL provides VAR function. Its syntax is: float VAR ( [ ALL | DISTINCT ] expression ) The function used to calculate a population variance is VARP and its syntax is: float VARP ( [ ALL | DISTINCT ] expression )
|
|
|||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||