This section discusses and demonstrates the five aggregation functions where they return more than a single row of data, which involves grouping by non-aggregated columns. This section covers the GROUP BY and HAVING clauses. Like the other sections, you unfold this by clicking on the bold text above.
COUNT function
Counting by Groups: The following example demonstrates how we can count rows or values by groups. The example counts rows by reference and value, but it does so by grouping the return values on the unique string values in the LIST_SET column.
1
2
3
4
5
| SELECT list_set AS grouping_by_column
, COUNT(*)
, COUNT(list_value)
FROM ordinal
GROUP BY list_set; |
It returns the following values:
GROUPING_BY_COLUMN COUNT(*) COUNT(LIST_VALUE)
-------------------- ---------- -----------------
Value SET A 11 10
Value SET B 11 10
The results show that there are eleven rows for each set, and that ten of the rows contain values while one holds a null value.
You can also group on columns that contain null values or sets of columns. The following example groups by two columns, where one contains a null value.
1
2
3
4
5
6
7
8
| SELECT list_set AS grouping_by_not_null
, list_name AS group_by_null_too
, COUNT(*)
, COUNT(list_value)
FROM ordinal
WHERE list_set = 'Value Set A'
GROUP BY list_set
, list_name; |
It returns the following values:
GROUPING_BY_NOT_NULL GROUP_BY_NULL_TOO COUNT(*) COUNT(LIST_VALUE)
-------------------- ----------------- ---------- -----------------
Value SET A Zero 1 1
Value SET A Five 1 1
Value SET A Three 1 1
Value SET A Four 1 1
Value SET A One 1 1
Value SET A Two 1 1
Value SET A Eight 1 1
Value SET A Nine 1 1
Value SET A Seven 1 1
Value SET A Six 1 1
Value SET A 1 0
The last row returned is the one where the grouping by column value and counted column value are null. You should note that the GROUP BY applies to a string column and the return set includes a row grouped by its null value. The count of null column values is always zero.
Another thing that you may have not noticed is that the rows are no longer in ascending numeric order by LIST_NAME or LIST_VALUE columns. There’s a multiple edge sword when you examine why this occurs. Basically with regard to the LIST_NAME column, Oracle and many other databases use advanced sorting algorithms which may leave non-ordered sets as outcomes. The LIST_VALUE column is not in the GROUP BY clause and therefore can’t be used as an ORDER BY key.
When you attempt to use a column outside of the GROUP BY clause to order the return set, like this:
1
2
3
4
5
6
7
8
9
| SELECT list_set AS grouping_by_not_null
, list_name AS group_by_null_too
, COUNT(*)
, COUNT(list_value)
FROM ordinal
WHERE list_set = 'Value Set A'
GROUP BY list_set
, list_name
ORDER BY list_value; |
You would see the following error in an Oracle Database 10g instance:
ORDER BY list_value
*
ERROR at line 9:
ORA-00979: NOT a GROUP BY expression
You can filter this particular set because the conditions are limited, but this wouldn’t be a good idea in many other solution sets. However, it certainly highlights some of the potential for extraordinary sorting of result sets.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
| SELECT list_set AS grouping_by_not_null
, list_name AS group_by_null_too
, COUNT(*)
, COUNT(list_value)
FROM ordinal
WHERE list_set = 'Value Set A'
GROUP BY list_set
, list_name
ORDER BY CASE
WHEN list_name = 'Zero' THEN 0
WHEN list_name = 'One' THEN 1
WHEN list_name = 'Two' THEN 2
WHEN list_name = 'Three' THEN 3
WHEN list_name = 'Four' THEN 4
WHEN list_name = 'Five' THEN 5
WHEN list_name = 'Six' THEN 6
WHEN list_name = 'Seven' THEN 7
WHEN list_name = 'Eight' THEN 8
WHEN list_name = 'Nine' THEN 9
END; |
It returns the following values:
GROUPING_BY_NOT_NULL GROUP_BY_NULL_TOO COUNT(*) COUNT(LIST_VALUE)
-------------------- ----------------- ---------- -----------------
Value SET A Zero 1 1
Value SET A One 1 1
Value SET A Two 1 1
Value SET A Three 1 1
Value SET A Four 1 1
Value SET A Five 1 1
Value SET A Six 1 1
Value SET A Seven 1 1
Value SET A Eight 1 1
Value SET A Nine 1 1
Value SET A 1 0
A better way to sort this type of data would involve putting it into an inline view and then joining it against the base data set. This allows you to use a non-grouping column as the sort key in an ORDER BY clause. Here’s an example of that approach but note it does take system overhead and should only be done when SQL must perform the sort operation.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
| SELECT ilv.grouping_by_not_null
, ilv.group_by_null_too
, ilv.row_count
, ilv.column_count
FROM (SELECT list_set AS grouping_by_not_null
, list_name AS group_by_null_too
, COUNT(*) AS row_count
, COUNT(list_value) AS column_count
FROM ordinal
WHERE list_set = 'Value Set A'
GROUP BY list_set
, list_name) ilv LEFT JOIN ordinal o
ON ilv.grouping_by_not_null = o.list_set
AND ilv.group_by_null_too = o.list_name
ORDER BY o.list_value; |
That returns a perfectly ordered set because the LEFT JOIN lets us capture the null value in the inline view without matching the two columns in the join condition. It returns the following set:
GROUPING_BY_NOT_NULL GROUP_BY_NULL_TOO ROW_COUNT COLUMN_COUNT
-------------------- ----------------- ---------- ------------
Value Set A Zero 1 1
Value Set A One 1 1
Value Set A Two 1 1
Value Set A Three 1 1
Value Set A Four 1 1
Value Set A Five 1 1
Value Set A Six 1 1
Value Set A Seven 1 1
Value Set A Eight 1 1
Value Set A Nine 1 1
Value Set A 1 0
SUM, AVERAGE, MAX, and MIN functions
Math Operations by Groups: These math aggregations are done with the SUM, AVERAGE, MAX, and MIN aggregation functions. Like the previous single row return examples, they’re restricted to columns that contain numeric values. While they also support the default ALL or DISTINCT keywords, it didn’t seem necessary to demonstrate more of those behaviors here.
The following sums, averages, and gets both the minimum and maximum value by unique LIST_SET column values:
SELECT list_set AS grouping_by_not_null
, SUM(list_value) AS ordinal_sum
, AVG(list_value) AS ordinal_avg
, MIN(list_value) AS ordinal_min
, MAX(list_value) AS ordinal_max
FROM ordinal
GROUP BY list_set;
It returns the following two rows:
GROUPING_BY_NOT_NULL ORDINAL_SUM ORDINAL_AVG ORDINAL_MIN ORDINAL_MAX
-------------------- ----------- ----------- ----------- -----------
Value SET A 45 4.5 0 9
Value SET B 45 4.5 0 9
Expanding the GROUP BY criteria to include the LIST_NAME column, you see what happens when the SUM and AVG functions work with only null values. They always return null when adding only column values with a null value. This is handy as you’ll see in the selective aggregation discussion in a subsequent tutorial.
As done in the prior example, this filters out the second value set, to focus on what’s happening with the rows grouped on two columns.
1
2
3
4
5
6
7
8
9
10
| SELECT list_set AS grouping_by_not_null
, list_name AS group_by_null_too
, SUM(list_value) AS ordinal_sum
, AVG(list_value) AS ordinal_avg
, MIN(list_value) AS ordinal_min
, MAX(list_value) AS ordinal_max
FROM ordinal
WHERE NOT list_set = 'Value Set B'
GROUP BY list_set
, list_name; |
It returns the following set of records, which are clearly out of order based on the algorithms that optimize aggregation. It’s not hard to read because there aren’t too many rows returned.
GROUPING_BY_NOT_NULL GROUP_BY_NULL_TOO ORDINAL_SUM ORDINAL_AVG ORDINAL_MIN ORDINAL_MAX
-------------------- ----------------- ----------- ----------- ----------- -----------
Value SET A Zero 0 0 0 0
Value SET A Five 5 5 5 5
Value SET A Three 3 3 3 3
Value SET A Four 4 4 4 4
Value SET A One 1 1 1 1
Value SET A Two 2 2 2 2
Value SET A Eight 8 8 8 8
Value SET A Nine 9 9 9 9
Value SET A Seven 7 7 7 7
Value SET A Six 6 6 6 6
Value SET A
If SQL must sort these into numeric order, it can do so like the prior example with the result from the COUNT function. You must instrument a solution for it through an inline view. The following shows you how to do that.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
| SELECT ilv.grouping_by_not_null
, ilv.group_by_null_too
, ilv.ordinal_sum
, ilv.ordinal_avg
, ilv.ordinal_min
, ilv.ordinal_max
FROM (SELECT list_set AS grouping_by_not_null
, list_name AS group_by_null_too
, SUM(list_value) AS ordinal_sum
, AVG(list_value) AS ordinal_avg
, MIN(list_value) AS ordinal_min
, MAX(list_value) AS ordinal_max
FROM ordinal
WHERE list_set = 'Value Set B'
GROUP BY list_set
, list_name) ilv LEFT JOIN ordinal o
ON ilv.grouping_by_not_null = o.list_set
AND ilv.group_by_null_too = o.list_name
ORDER BY o.list_value; |
Now you should see an ordered set like this:
GROUPING_BY_NOT_NULL GROUP_BY_NULL_TOO ORDINAL_SUM ORDINAL_AVG ORDINAL_MIN ORDINAL_MAX
-------------------- ----------------- ----------- ----------- ----------- -----------
Value SET B Zero 0 0 0 0
Value SET B One 1 1 1 1
Value SET B Two 2 2 2 2
Value SET B Three 3 3 3 3
Value SET B Four 4 4 4 4
Value SET B Five 5 5 5 5
Value SET B Six 6 6 6 6
Value SET B Seven 7 7 7 7
Value SET B Eight 8 8 8 8
Value SET B Nine 9 9 9 9
Value SET B
Math Operations with the HAVING clause: The HAVING clause is very useful when you want to find rows that duplicate a column or set of column values. It’s also useful for sorting data sets.
The HAVING clause effects all aggregate rows and acts somewhat like a filter in the WHERE clause of a non-aggregated query. The difference is that the WHERE clause filters the base result set, while the HAVING clause filters the aggregated result set.
The following example captures the sum and average for only the odd LIST_VALUE column values. It does that by using modular (or modulo) mathematics through the MOD function in Oracle’s SQL implementation. Modulo math works by calculating the remainder of integer division. When you divide any number by two and the result is one, then you have an odd number.
Here’s the sample program that illustrates the HAVING clause in a CASE statement. The CASE statement only adds the number when it is an odd number. Any even number is zeroed out. The comparison operator ensures that only rows are returned when the SUM is greater than zero, which only happens for odd numbers.
1
2
3
4
5
6
7
8
9
10
11
| SELECT list_set AS grouping_by_not_null
, list_name AS group_by_null_too
, SUM(list_value) AS ordinal_sum
, AVG(list_value) AS ordinal_avg
FROM ordinal
WHERE list_set = 'Value Set A'
HAVING SUM(CASE
WHEN MOD(list_value,2) = 1 THEN list_value ELSE 0
END) > 0
GROUP BY list_set
, list_name; |
The data set, as you’ve seen with others isn’t sorted in ascending order because of how the algorithms work. Since there are only five rows returned, it is easy to see that the HAVING clause let us narrow the return set to odd numbers. What we might overlook is that the HAVING is always a filtering statement. This means we need to have a comparative operator tied to the HAVING return value.
GROUPING_BY_NOT_NULL GROUP_BY_NULL_TOO ORDINAL_SUM ORDINAL_AVG
-------------------- ----------------- ----------- -----------
Value Set A Five 5 5
Value Set A Three 3 3
Value Set A One 1 1
Value Set A Nine 9 9
Value Set A Seven 7 7
As you’ve seen in early examples, you can also put this inside an inline view and sort the aggregated results. Here’s the code to accomplish that.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
| SELECT ilv.grouping_by_not_null
, ilv.group_by_null_too
, ilv.ordinal_sum
, ilv.ordinal_avg
FROM (SELECT list_set AS grouping_by_not_null
, list_name AS group_by_null_too
, SUM(list_value) AS ordinal_sum
, AVG(list_value) AS ordinal_avg
FROM ordinal
WHERE list_set = 'Value Set A'
HAVING SUM(CASE
WHEN MOD(list_value,2) = 1 THEN list_value ELSE 0
END) > 0
GROUP BY list_set
, list_name) ilv LEFT JOIN ordinal o
ON ilv.grouping_by_not_null = o.list_set
AND ilv.group_by_null_too = o.list_name
ORDER BY o.list_value; |
It naturally returns the ordered set of aggregated results from only odd numbers in the base set of values.
GROUPING_BY_NOT_NULL GROUP_BY_NULL_TOO ORDINAL_SUM ORDINAL_AVG
-------------------- ----------------- ----------- -----------
Value Set A One 1 1
Value Set A Three 3 3
Value Set A Five 5 5
Value Set A Seven 7 7
Value Set A Nine 9 9