
Textual content of an XML element

As shown before, the content of an XML element consists of text, of child elements, or of a mixture of both. The text may, however, contain characters that normally have a special meaning in XML, such as the "<" character. To tell an XML processor that such text is character data rather than markup, it can be enclosed in a CDATA section. Within elements, CDATA sections are allowed wherever character data is permitted. They start with <![CDATA[ and end with ]]> and cannot be nested:

<Ward Manager="Nurse_01">
<Name><![CDATA[Intensive care & emergency medicine]]></Name>
</Ward>
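The effect of the CDATA section can be checked with any XML parser. The following is a minimal sketch in Python using the standard library module xml.etree.ElementTree; the helper name ward_name is hypothetical and only serves this illustration.

```python
import xml.etree.ElementTree as ET

# The Ward example from the text; the CDATA section protects the "&".
WARD_XML = """<Ward Manager="Nurse_01">
<Name><![CDATA[Intensive care & emergency medicine]]></Name>
</Ward>"""


def ward_name(xml_text: str) -> str:
    """Return the text of the Name child (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return root.find("Name").text


def is_well_formed(xml_text: str) -> bool:
    """Report whether the parser accepts the given text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    # The parser hands the CDATA content over as plain character data.
    print(ward_name(WARD_XML))
    # Without CDATA (or escaping), the bare "&" is rejected.
    print(is_well_formed("<Name>Intensive care & emergency medicine</Name>"))
```

The second call shows why the CDATA section is needed here: the same text with an unprotected "&" is not well-formed.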

In a DTD, so-called "parsed entities" can be defined, which are used much like macros in programming languages. If an entity is defined completely within the internal DTD of the document, it is called internal; otherwise it is called external.

<!ENTITY advice "<Advice>All information is subject to change</Advice>">
<!ENTITY external SYSTEM "http://www.xquery-book.de/myentity">

Parsed entities may contain markup, as the entity advice in the example above shows. However, the value of a parsed entity must be well-formed.

An entity is referenced by the "&" symbol, followed by the entity name and a terminating semicolon. The XML processor resolves such a reference, recursively where necessary, so that references inside the replacement text are resolved as well. The document fragment

<Course>&advice;20.0</Course>

becomes, with the entity definition above:

<Course><Advice>All information is subject to change</Advice>20.0</Course>
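The expansion can be reproduced with a small experiment. The sketch below assumes Python's xml.etree.ElementTree, whose underlying parser expands internal entities declared in the document's own DTD; the DOCTYPE wrapper around the fragment and the function name expand are additions made for this illustration.

```python
import xml.etree.ElementTree as ET

# The Course fragment from the text, wrapped in a document that declares
# the internal entity "advice" in its internal DTD subset.
DOC = """<!DOCTYPE Course [
<!ENTITY advice "<Advice>All information is subject to change</Advice>">
]>
<Course>&advice;20.0</Course>"""


def expand(xml_text: str) -> str:
    """Parse the document and serialise it again (hypothetical helper).

    The parser resolves the entity reference while reading, so the
    serialised result no longer contains "&advice;".
    """
    root = ET.fromstring(xml_text)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(expand(DOC))
```

After parsing, the markup from the entity value appears as a regular Advice child element, and the text 20.0 follows it directly.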

In XML, five entities are predefined so that characters with a special meaning in XML can be written as ordinary text. These entities are: lt (for the less-than sign/left angle bracket), gt (for the greater-than sign/right angle bracket), amp (for &), quot (for ") and apos (for '). The Ward fragment from the CDATA example can therefore also be written in the following way:

<Ward Manager="Nurse_01">
<Name>Intensive care &amp; emergency medicine</Name>
</Ward>
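That both notations carry the same text can be verified programmatically. The following is a minimal sketch in Python; the helper names name_text and escape_all are hypothetical, while escape is the function from the standard library module xml.sax.saxutils.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

CDATA_FORM = "<Name><![CDATA[Intensive care & emergency medicine]]></Name>"
ENTITY_FORM = "<Name>Intensive care &amp; emergency medicine</Name>"


def name_text(fragment: str) -> str:
    """Return the character data of the element (hypothetical helper)."""
    return ET.fromstring(fragment).text


def escape_all(text: str) -> str:
    """Replace all five special characters by the predefined entities.

    escape() handles &, < and > on its own; the quotes are passed
    as additional replacements.
    """
    return escape(text, {'"': "&quot;", "'": "&apos;"})


if __name__ == "__main__":
    # Both notations yield the same text after parsing.
    print(name_text(CDATA_FORM) == name_text(ENTITY_FORM))
    print(escape_all("Intensive care & emergency medicine"))
```

Escaping with the predefined entities is usually the more practical choice when text is generated by a program, since it works in attribute values as well, where CDATA sections are not allowed.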

The predefined entities are the only entities supported by XQuery!

Character references use a syntax similar to that of entity references: "&#" followed by the decimal code point, or "&#x" followed by the hexadecimal code point, and a terminating semicolon. The Unicode character U+00FF (the letter ÿ), for example, may be written in hexadecimal form as &#xFF; or in decimal form as &#255;. Character references are preferred for characters which are missing on the keyboard but can be represented in the encoding of the source system, or which may cause side effects when the document is transferred. Character references can also be used in XQuery.
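The two notations for U+00FF can be compared with a short sketch in Python; the element name c and the helper names decode, hex_ref and dec_ref are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

HEX_FORM = "<c>&#xFF;</c>"   # hexadecimal character reference
DEC_FORM = "<c>&#255;</c>"   # decimal character reference


def decode(fragment: str) -> str:
    """Return the character data after the parser has resolved references."""
    return ET.fromstring(fragment).text


def hex_ref(ch: str) -> str:
    """Build the hexadecimal character reference for a single character."""
    return f"&#x{ord(ch):X};"


def dec_ref(ch: str) -> str:
    """Build the decimal character reference for a single character."""
    return f"&#{ord(ch)};"


if __name__ == "__main__":
    # Both references resolve to the same character.
    print(decode(HEX_FORM) == decode(DEC_FORM))
    print(hex_ref("\u00ff"), dec_ref("\u00ff"))
```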

 

Source: "XQuery – Grundlagen und fortgeschrittene Methoden", dpunkt-Verlag, Heidelberg (2004)
