• +49-(0)721-402485-12
Your experts for XML, XQuery and XML Databases

Symmetric join

In XQuery, a join is specified as a "nested for loop", whereby in the for clause(s) the control variables are bound, in the where clause the join predicate is evaluated and in the return clause the document parts identified by the variables are used in order to generate a fragment of the connected overall document.

In order to illustrate an explicit join, again reference is made to the example of the assignment of managing personnel to the respective ward of the hospital, but without tracing the references per fn:id():

let $h := fn:doc("Hochwaldklinik.xml")
for $w in $h//Ward,
$n in $h//Nurse
where $w/@Manager = $n/@ID
return
<Ward>
{$w/Name}
<Manager>
{$n//FirstName}
{$n//LastName}
</Manager>
</Ward>

In this example, firstly all wards in the outer loop are run through and for each ward the list of all nurses is conceptually searched for items with a corresponding number.

It should be noted here that without using ID/IDREF constructs at schema level and without using this join query, the 1:N relation (i.e. a ward is actually managed by only one nurse) is no longer ensured. Alternatively, this property could be enforced analogously by using key/keyref constructs at schema level.

Since in this simple case the join predicate is a test for equality, alternatively the where clause can be omitted and the check of the join predicate can be included in the evaluation of the path expression in the inner loop. For the example above, the following join query can be generated:

let $h := fn:doc("Hochwaldklinik.xml")
for $w in $h//Ward,
$n in $h//Nurse[@ID = $w/@Manager]
return
<Ward>
{$w/Name}
<Manager>
{$n//FirstName}
{$n//LastName}
</Manager>
</Ward>

Since in this example the semantics of the inner join is implicitly applied, because it can be expected that each ward has an acutally existing nurse as managing personnel, the order of the for clause can be arbitrarily changed and the corresponding path expressions adapted. As a consequence, the scenario above can be written as follows:

let $h := fn:doc("Hochwaldklinik.xml")
for $n in $h//Nurse,
$w in $h//Ward[@Manager = $n/@ID]
return
<Ward>
{$w/Name}
<Manager>
{$n//FirstName}
{$n//LastName}
</Manager>
</Ward>

However, this symmetry can only be applied if at schema level an existence of the referenced entities is ensured via primary/foreign key concepts. In case left/right outer joins or complete outer joins must be taken into account, the concept of the nesting of FLWOR expressions is to be used. Outer joins even return a result item if one or both join partners do not find an appropriate item and, therefore, would not be taken over in the result set.
From the application' s point of view, outer joins are always required if a data stock shall be enriched by further – possibly optionally existing – information. A left/right outer join between a personnel master file and a telephone list returns to each personnel item the appropriate telephone number; contrary to the semantics of the inner join, an item is even taken over into the result if no telephone number is assigned to a person.

As a further remark it should be stated that the projection of the "double columns" being common in the relational join is resolved in a constructive way for a natural join in the context of XQuery by formulating the result by any nearly free structure of the result expression and as a consequence, the elimination of equal values falls to the return clause. Furthermore, it should be noted here that the optional order of the for clauses and the "involving" of join predicates from the FLWOR expression in the path expressions is part of the optimisation of a XQuery processor. It rests on the user to specify the final result in a descriptive way.

 

Source: "XQuery – Grundlagen und fortgeschrittene Methoden", dpunkt-Verlag, Heidelberg (2004)

<< backnext >>