Aggregation functions

Aggregation in XQuery follows the formulation used in OQL. The "classic" aggregation functions fn:sum(), fn:avg(), fn:count(), fn:min() and fn:max() are provided; each expects a sequence as its argument and behaves as described below:

Calculation of the cardinality (fn:count())

The fn:count() function returns the number of items in the sequence passed as its argument. An empty sequence yields the value 0.
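A few calls illustrate this behaviour. This is only a sketch; the orders element and its children are made-up sample data, not part of the original text.

```xquery
xquery version "1.0";

(: Hypothetical sample data, constructed only for this illustration. :)
let $orders :=
  <orders>
    <order id="1"/>
    <order id="2"/>
    <order id="3"/>
  </orders>
return (
  fn:count(()),                 (: empty sequence: 0 :)
  fn:count((1, (2, 3), "a")),   (: nested sequences are flattened: 4 :)
  fn:count($orders/order)       (: three order elements: 3 :)
)
```

Since fn:count() accepts item()*, nodes are counted in the same way as atomic values.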

Calculation of the average value (fn:avg())

The fn:avg() function defined in the function library returns the empty sequence if an empty sequence is passed as its argument. If at least one item of the sequence is of type xs:float or xs:double and has the value NaN, the result is NaN. In general, all items must have the same data type.
If this is not the case, the values are converted to a common type; this applies in particular to items of type xdt:untypedAtomic.

If all items are of type xdt:untypedAtomic, they are implicitly converted to values of type xs:double. Durations must be of type xdt:yearMonthDuration or xdt:dayTimeDuration. The following examples illustrate how the average is calculated:

fn:avg((3,4,5))

returns the value 4 of type xs:decimal.

fn:avg((xdt:yearMonthDuration("P20Y"),
xdt:yearMonthDuration("P10M") ))

returns a duration of 125 months (P10Y5M) of type xdt:yearMonthDuration.
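The special cases described above can be tried out with a handful of calls. The following sketch assumes a processor that conforms to the final XQuery 1.0 Recommendation or later, where the duration types are addressed with the xs: prefix rather than the xdt: prefix used in the text; the v elements are made-up sample data.

```xquery
xquery version "1.0";

(: Hypothetical, schema-less sample data: atomizing these elements
   yields untyped atomic values. :)
let $values := (<v>3</v>, <v>4</v>, <v>5</v>)
return (
  fn:avg(()),                             (: empty sequence in, empty sequence out :)
  fn:avg($values),                        (: untyped values are converted to xs:double :)
  fn:avg((1, xs:double("NaN"), 3)),       (: a NaN operand makes the result NaN :)
  fn:avg((xs:yearMonthDuration("P20Y"),
          xs:yearMonthDuration("P10M")))  (: 125 months, i.e. P10Y5M :)
)
```

The first call contributes nothing to the result sequence, because the empty sequence disappears when sequences are concatenated.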

Determination of minimum and maximum values (fn:min() and fn:max())

The two functions return the smallest and the greatest item of the sequence passed, respectively, compared by value. In the case of a tie, which item is returned depends on the implementation. As with the average, all items are generally required to be of the same type; values of type xdt:untypedAtomic are converted to values of type xs:double, and the rules for NaN values apply analogously.

For date and time values, the minimum or maximum can also be determined for values of type xs:dateTime, xs:date and xs:time. Both functions can be emulated by a FLWOR expression, for example inside a user-defined function. fn:max($x), for instance, can be replaced by the following expression:

let $y := for $e in $x
          order by $e (: optional: collation for character strings :)
          return $e
return $y[fn:last()]

Accordingly, changing the return clause to

return $y[1]

yields the minimum value. The following examples further illustrate how XQuery aggregates minimum and maximum values:

fn:max((3,4,5))

returns the value 5 of type xs:integer.

fn:max((3,4,"Zero"))

raises an error because the numeric values and the string cannot be compared with one another. However, the expression

fn:max(("a", "b", "c"))

returns "c"; the result can be influenced by specifying a particular collation.
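The FLWOR emulation described above could be wrapped in user-defined functions roughly as follows. This is a minimal sketch: local:max and local:min are hypothetical helper names, and the code assumes a processor that follows the final XQuery 1.0 Recommendation or later, where xs:anyAtomicType takes the place of xdt:anyAtomicType. It makes no attempt to reproduce the special handling of untyped and NaN values by the built-in functions.

```xquery
xquery version "1.0";

(: Hypothetical helper functions in the predeclared local: namespace;
   they are not part of the standard function library. :)
declare function local:max($x as xs:anyAtomicType*) as xs:anyAtomicType?
{
  let $y := for $e in $x
            order by $e
            return $e
  return $y[fn:last()]
};

declare function local:min($x as xs:anyAtomicType*) as xs:anyAtomicType?
{
  let $y := for $e in $x
            order by $e
            return $e
  return $y[1]
};

let $dates := (xs:date("2004-03-01"),
               xs:date("2003-12-24"),
               xs:date("2004-01-15"))
return (
  local:max((3, 4, 5)),   (: 5, the same result as fn:max((3, 4, 5)) :)
  local:min($dates),      (: 2003-12-24 :)
  local:max($dates)       (: 2004-03-01 :)
)
```

For an empty input both helpers return the empty sequence, because the positional predicate selects nothing.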

Summation (fn:sum())

Summation requires that all items are of the same type; items of type xdt:untypedAtomic are converted to values of type xs:double. An empty sequence as argument yields the value 0.0E0 of type xs:double. For temporal values, summation is defined only for durations, for example:

fn:sum((xdt:yearMonthDuration("P20Y"),
xdt:yearMonthDuration("P10M")))

yields a duration of 250 months of type xdt:yearMonthDuration. Numerical summation works as expected:

fn:sum((4,5,6))

returns 15.

fn:sum((1,(2 to 9)[.<5], 10))

returns 20.

The second expression shows that any expression returning a sequence is permitted as the argument of the function call. Here, the result is 1+2+3+4+10=20.
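fn:sum() also accepts an optional second argument, $zero in the signature table, which determines the result for an empty sequence. The sketch below shows its effect; it assumes a processor following the final XQuery 1.0 Recommendation or later (xs: prefix for the duration types), and the variable name $none is chosen only for this example.

```xquery
xquery version "1.0";

(: $none stands in for any expression that selects nothing. :)
let $none := ()
return (
  fn:sum($none, 0),                            (: explicit zero value: 0 :)
  fn:sum($none, xs:yearMonthDuration("P0M")),  (: a zero-length duration: P0M :)
  fn:sum($none, ()),                           (: empty sequence as zero value: () :)
  fn:sum((4, 5, 6), 0)                         (: non-empty input, zero value unused: 15 :)
)
```

Supplying a zero value of the expected type is useful when durations are summed, since the default result for an empty sequence is numeric.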

| Signature | Description |
| --- | --- |
| fn:count($arg as item()*) as xs:integer | returns the number of items in the sequence passed |
| fn:avg($arg as xdt:anyAtomicType*) as xdt:anyAtomicType? | returns the average of all items in the sequence passed: sum($arg) div count($arg) |
| fn:max($arg as xdt:anyAtomicType*[, $collation as xs:string]) as xdt:anyAtomicType? | returns the maximum value, optionally with regard to a collation |
| fn:min($arg as xdt:anyAtomicType*[, $collation as xs:string]) as xdt:anyAtomicType? | returns the minimum value, optionally with regard to a collation |
| fn:sum($arg as xdt:anyAtomicType*[, $zero as xdt:anyAtomicType?]) as xdt:anyAtomicType? | returns the sum of all item values contained in the sequence; if the second parameter is omitted, the value 0.0E0 is returned for an empty sequence, otherwise the value of the second parameter |

Table: Aggregation functions

Note that the aggregation functions expect a sequence as their argument. The expression fn:sum(4,5,6), for example, produces an error because it passes three separate arguments rather than a single sequence; the correct call is fn:sum((4,5,6)).
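The difference between passing one sequence and passing several arguments can be seen in the following sketch; the variable name $prices is made up for this illustration.

```xquery
xquery version "1.0";

let $prices := (4, 5, 6)
return (
  fn:sum((4, 5, 6)),   (: inner parentheses build the sequence: 15 :)
  fn:sum($prices),     (: a variable bound to a sequence needs no extra parentheses: 15 :)
  fn:sum(4, 5)         (: two arguments: 5 is taken as the zero value, so the result is 4 :)
  (: fn:sum(4, 5, 6) would be rejected, since no three-argument fn:sum() exists :)
)
```

The two-argument call deserves particular care: it does not raise an error, but it silently computes something other than the sum of both values.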

 

Source: "XQuery – Grundlagen und fortgeschrittene Methoden", dpunkt-Verlag, Heidelberg (2004)
