
The structure of an XML document

The element is the basic building block of an XML document. It consists of the following components:

  • Start tag
    A start tag begins with a left angle bracket (<), followed by the name of the element, optionally one or more attributes, and finally a right angle bracket (>).
  • Content
    The content of an element may consist of text, nested elements, or both.
  • End tag
    An end tag begins with a left angle bracket and a slash (</), followed by the name of the element and a right angle bracket (>). The name must match the name in the start tag exactly, including upper and lower case.

An element that contains both text and nested elements is said to have mixed content. In the following example, a formatting instruction is embedded in the text:

<Finding>Feebleness, fever. Suspected <em>severe flu</em></Finding>
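
As a minimal sketch of how a parser exposes mixed content, the following Python snippet reads the example with the standard library module xml.etree.ElementTree. The variable names are chosen for illustration only; other XML libraries model mixed content differently (for example, as separate text nodes).

```python
import xml.etree.ElementTree as ET

finding = ET.fromstring(
    "<Finding>Feebleness, fever. Suspected <em>severe flu</em></Finding>"
)

# Text that precedes the first nested element is stored on the parent.
leading_text = finding.text

# The nested element is the first (and only) child of <Finding>.
em = finding[0]

# Text that follows a nested element is stored in that element's "tail".
# Nothing follows </em> here, so the tail is None.
trailing_text = em.tail

# itertext() walks text and nested elements in document order.
full_text = "".join(finding.itertext())

print(repr(leading_text))
print(em.tag, repr(em.text))
print(repr(trailing_text))
print(full_text)
```

Because the text and the nested element are interleaved, their order matters: moving the em element would change the meaning of the finding.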

Empty elements can be written in a short notation. The following two notations are equivalent:

<Bed ID="Bed_reha_25_001" RoomNumber="025"></Bed>
<Bed ID="Bed_reha_25_001" RoomNumber="025"/>
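
The equivalence can be checked with a parser. The sketch below, which assumes Python's standard xml.etree.ElementTree module, parses both notations and compares the results; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

long_form = ET.fromstring('<Bed ID="Bed_reha_25_001" RoomNumber="025"></Bed>')
short_form = ET.fromstring('<Bed ID="Bed_reha_25_001" RoomNumber="025"/>')

# Both notations yield an element with the same name and attributes ...
same_tag = long_form.tag == short_form.tag
same_attributes = long_form.attrib == short_form.attrib

# ... and neither has text content or nested elements.
both_empty = (
    len(long_form) == 0
    and len(short_form) == 0
    and not long_form.text
    and not short_form.text
)

# Serializing both elements again produces identical output.
same_serialization = ET.tostring(long_form) == ET.tostring(short_form)

print(same_tag, same_attributes, both_empty, same_serialization)
```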

An attribute consists of a name and a value, connected by an equals sign (in the example above, RoomNumber="025" is one such attribute). The attribute value must be enclosed in single or double quotation marks. An attribute name may occur only once per element. The order of the elements is significant: it carries meaning and cannot be changed without changing the XML document. The order of the attributes within an element, by contrast, is not significant and can be changed at will.
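
These attribute rules can be observed with a parser. The following sketch assumes Python's standard xml.etree.ElementTree module, which reports violations of the XML syntax as ParseError; the helper name parses is hypothetical.

```python
import xml.etree.ElementTree as ET


def parses(document: str) -> bool:
    """Return True if the parser accepts the document (hypothetical helper)."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True


# Attribute order and the kind of quotation marks do not matter.
a = ET.fromstring('<Bed ID="Bed_reha_25_001" RoomNumber="025"/>')
b = ET.fromstring("<Bed RoomNumber='025' ID='Bed_reha_25_001'/>")
order_irrelevant = a.attrib == b.attrib

# An attribute name may occur only once per element.
duplicate_accepted = parses('<Bed RoomNumber="025" RoomNumber="026"/>')

# An attribute value without quotation marks is not allowed.
unquoted_accepted = parses("<Bed RoomNumber=025/>")

print(order_irrelevant, duplicate_accepted, unquoted_accepted)
```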

No structure description (such as a DTD) is known for the examples shown above. XML expressly allows this: to be well-formed, an XML document only has to meet a few syntax requirements. The most important ones concern the following constructs:

  • XML declaration
    An XML document may start with an XML declaration, which may look as follows:

    <?xml version="1.0" encoding="utf-8" standalone="yes"?>

    In this declaration, the optional encoding indicates which character encoding the document uses. The likewise optional standalone declaration indicates whether the document is independent of external markup declarations (standalone="yes") or may depend on them (standalone="no").
  • DTD
    After the XML declaration, a document type declaration (<!DOCTYPE ...>) may follow. It contains or references the Document Type Definition (DTD), which defines the language used in the document.
  • Root element
    An XML document must contain exactly one element at the top level, which normally contains further elements. This element is called the root element.
  • Comments
    Comments may occur anywhere outside other markup, including before and after the root element. A comment starts with <!-- and ends with -->; the character sequence -- must not occur inside a comment.
  • Processing instructions
    A processing instruction tells an application how to handle an XML document. A well-known example is the processing instruction that binds a stylesheet to an XML document:

    <?xml-stylesheet type="text/xsl" href="stylesheets/print.xsl" ?>

    A processing instruction always starts with <? followed by the target (here xml-stylesheet), which an application uses to decide whether it can interpret the processing instruction. The actual content of the processing instruction follows, and then the closing character sequence ?>.
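
The constructs listed above can be combined into one small document and checked for well-formedness. The sketch below assumes Python's standard library (xml.etree.ElementTree and xml.dom.minidom). The helper is_well_formed and the root element Ward are hypothetical and serve only as an illustration, and the stylesheet instruction uses the standardized target name xml-stylesheet.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom


def is_well_formed(document: str) -> bool:
    """Return True if the parser accepts the document (hypothetical helper)."""
    try:
        ET.fromstring(document.encode("utf-8"))
    except ET.ParseError:
        return False
    return True


# <Ward> is a hypothetical root element added for this sketch.
complete_document = (
    '<?xml version="1.0" encoding="utf-8" standalone="yes"?>\n'
    '<?xml-stylesheet type="text/xsl" href="stylesheets/print.xsl"?>\n'
    "<!-- comment before the root element -->\n"
    "<Ward>\n"
    '  <Bed ID="Bed_reha_25_001" RoomNumber="025"/>\n'
    "</Ward>\n"
    "<!-- comment after the root element -->\n"
)

checks = {
    "complete document": is_well_formed(complete_document),
    # More than one top-level element violates the root element rule.
    "two root elements": is_well_formed("<Bed/><Bed/>"),
    # Every start tag needs a matching end tag.
    "missing end tag": is_well_formed("<Finding>Feebleness, fever."),
}

# Read the target and content of the processing instruction via the DOM.
dom = minidom.parseString(complete_document.encode("utf-8"))
pi = next(
    node for node in dom.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
)

print(checks)
print(pi.target, "|", pi.data)
```

A parser only checks the syntax rules here. Whether the elements and attributes make sense for a given application is a separate question that a DTD can answer.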

 

Source: "XQuery – Grundlagen und fortgeschrittene Methoden", dpunkt-Verlag, Heidelberg (2004)
