Publications

XML & RDF

Combining graph and tree: writing SHAX, obtaining SHACL, XSD and more

Abstract

The Shapes Constraint Language (SHACL) is a data modeling language for describing and validating RDF data. This paper introduces SHAX, which is an XML syntax for SHACL. SHAX documents are easy to write and understand. They can be translated not only into executable SHACL, but also into XSD describing XML data equivalent to the RDF data constrained by the SHACL model. Similarly, SHAX can be translated into JSON Schema describing a JSON representation of the data. SHAX may thus be viewed as an abstract data modeling language, which does not prescribe a concrete representation language (RDF, XML, JSON, …), but can be translated into concrete models validating concrete model instances.

Download: shax.pdf

XQuery

Java Integration of XQuery – an Information Unit-Oriented Approach

Abstract

An infrastructure for integrating XQuery into Java systems is described. The infrastructure comprises a new API (XQJPLUS, built on the standard API XQJ) and a tool for Java code generation. The basic idea of the approach is to deliver query results not in terms of query result items, but in terms of "information units", ready-to-use entities assembled from the result items. The assembly process is guided by control information embedded into the query result, so that the query controls exactly what will be delivered, and in which form. Information units can represent information in a great variety of forms, including many map types and custom objects. The information units produced by a query are collected into a special container ("info tray") which offers name-based, intuitive access to the units. The query-specific structure of an info tray may be formally defined by a tray schema from which an "info shape" can be generated, a Java class representing a specific kind of info tray and offering compiler checked data access. Info trays also support data integration, as their possibly very heterogeneous contents can be addressed in a uniform way, using path-like expressions.

Download: balisage2010-java-xquery.pdf

XQuery as a data integration language

Abstract

The appropriateness of the XQuery language for data integration is explored. The starting point is an assessment of integration capabilities in an XML-only environment. The next step is an evaluation of the degree to which one may extend these capabilities to heterogeneous environments with multiple media types and various data access protocols. This leads to the identification of a key challenge, which is the structured representation of non-XML data formats by items of the XQuery data model. The current support for such representation is reviewed, and a conceptual base is proposed for modeling the relationship between data model items and instances of non-XML formats. As special facets of data integration, the roles of REST and RDF in XQuery-based integration are discussed, and general limitations of XQuery as an integration language are acknowledged.

Download: data integration.pdf

XQuery topic tools – concept, user interface, development framework

Abstract

This paper defines the concept of topic tools, which are command line tools providing a single point of access to a range of functionality. Topic tools conform to a generic model of invocation syntax and basic tool behaviour, concerning user assistance, error diagnostics and invocation reuse. The paper proposes a comprehensive model of the user perspective (syntax and behaviour) and introduces a small development framework that makes the creation of XQuery topic tools simple and fast. The support offered by the framework includes code generation and the use of a message interface which cleanly isolates the application code from user input and gives it access to validated and augmented information, rather than the raw data of user input. Key properties of framework-based topic tools are early availability, extensibility, user convenience, behavioural consistency and reliability based on very thorough and fully automated input validation.

Download: balisage2014-topic-tools.pdf

FOXpath

FOXpath – an expression language for selecting files and folders

Abstract

A new expression language (FOXpath, short for folder XPath) enables XPath-like addressing of files and folders in a file system. The first version of the language is a modified copy of XPath 3.0, with node navigation removed and file system navigation added. The language is based on the data model XDM 3.0, without assuming any modifications of the model. In a second step, the language was merged back into XPath 3.0, resulting in FOXpath 3.0, which is a superset of XPath 3.0. The new expression language supports node navigation, file system navigation and a free combination of both functionalities within a single path expression. A reference implementation is described, and the possibility of extending the new functionality beyond file systems is discussed.

Download: balisage2016-foxpath.pdf

FOXpath navigation of physical, virtual and literal file systems

Abstract

The FOXpath language extends the XPath language by adding support for file system navigation. This paper explores how to extend file system navigation beyond physical file systems to include logical file systems such as jar files, SVN repositories or GitHub projects. The extension is based on a set of simple concepts related to URIs and their processing, and it is implemented as a FOXpath processor which supports the navigation of physical and various types of logical file systems.

XML Technologies – miscellaneous

The XML info space

Abstract

XML-related standards imply an architecture of distributed information which integrates all accessible XML resources into a coherent whole. Attempting to capture the key properties of this architecture, the concept of an info space is defined. The concept is used as a tool for deriving desirable extensions of the standards. The proposed extensions aim at a fuller realization of the potential offered by the architecture. Main aspects are better support for resource discovery and the integration of non-XML resources. If not adopted by standards, the extensions may also be emulated by application-level design patterns and product-specific features. Knowledge of them might therefore be of immediate interest to application developers and product designers.

Download: balisage2013-infospace.pdf

XDML – an extensible markup language and processor for XDM

Abstract

XDML is a set of rules for building XDM values that are more useful entities than ordinary XDM values. The key idea is to insert into the XDM values control information which guides the interpretation and processing of the data. In particular, it structures the XDM value into named parts and associates these parts with metadata. The control information is evaluated by an XDML processor, which reports and processes the data accordingly. The processing of a part is organized as the execution of operations which the control data bind to the part, but whose actual invocation depends on API calls of the XDML user. The bindings are represented by request messages which encode the actual input to operations selected from an extensible library of available "XDML operations". The operation bindings of a part can be regarded as a specific interface dynamically attached to the data of the part. The net result of this approach is to enable the creation of self-describing XDM values: they encode how they are presented to applications, as well as how they should or might be processed. This means that XDM producers (e.g. XQuery programs) can emit "rich" data whose downstream processing is significantly simplified.

Download: balisage2011-xdml.pdf

From XML to UDL: a unified document language, supporting multiple markup languages

Abstract

A proposal is made for extending the XML node model to make it compatible with JSON markup as well as XML markup. As XML processing technology (XPath, XQuery, XSLT, XProc) sees instances of the node model, but does not see syntax, it is thus enabled to handle JSON as well as XML. The extended node model is dubbed a Unified Document Language, as it defines the construction of documents from building blocks (nodes) which can be encoded in various markup languages (XML, JSON, HTML).

Download: balisage2012-udl.pdf

Node search preceding node construction – XQuery inviting non-XML technologies

Abstract

We propose an approach for complementing XPath navigation with a node search which does not require node construction. Node search is based on a set of external properties (a "p-face") which a node may assume in the context of a node collection. Being external, these properties can be retrieved without node construction, and being stored outside the nodes they can be maintained and queried by non-XML technologies, e.g. relational and NoSQL databases. A small set of concepts, carefully aligned with the XQuery data model, allows the seamless integration of various non-XML technologies driving node selection, without introducing any dependencies of XQuery code on any particular technology. A first implementation of the concepts is presented.

Download: node search.pdf

Location trees enable XSD based tool development

Abstract

Conventional use of XSD documents is mostly limited to validation, documentation and the generation of data bindings. The possibility of additional uses is little considered. This is probably due to the difficulty of processing XSD, caused by its arcane graph structure. An effective solution might be a generic transformation of XSD documents into a tree-structured representation, capturing the model contents in a transformation-friendly way. Such a tree-structured schema derivative is offered by location trees, a format defined in this paper and generated by an open source tool. Location trees are intended as an intermediate format to be transformed into interesting artifacts. To use an image from chemistry, location trees can play the role of a catalyst, dramatically lowering the activation energy required to transform XSD into valuable substances. Apart from this capability, location trees are composed of a novel kind of model component that invites the attachment of metadata. The resulting metadata trees enable innovative tools, including source code generators. A few examples illustrate the new possibilities, tentatively summarized as XSD based tool development.

Rethinking transformation – the potential of code generation

Abstract

A code generator for document to document transformation is introduced. It reduces the development effort to editing a set of metadata items attached to a tree model of the target documents. Metadata values are XQuery expressions which are typically so simple that they do not require genuine programming skills. Nevertheless, expressions are more difficult to provide than static values, and therefore possibilities of further simplifying the development task are explored, striving to enable subject matter experts to define the transformation without writing XQuery expressions. This can be achieved by generating the expressions from assertions about alignments between source and target nodes, although specific requirements will often necessitate additional information. As alignments can be represented graphically by connecting lines, the approach amounts to a solid conceptual foundation for graphical mapping tools. Finally, the underlying model of code generation driven by target document structure is generalized into a conceptual framework which is not restricted to XML data sources. Its usefulness is demonstrated by a simple code generator for transforming RDF data into XML documents.

Download: rethinking transformation.pdf

Project Management

Differences between the classic and the agile world

There are many texts dealing with the difference between agile and classical project management. Here is another one. I try to demonstrate the differences on two levels: the mindset level and the theory level.

Agile project management

Agility as an ability

In parsQube's organization development and project management work with several hardware and software organizations, we have seen many requests for support on their "agile journey".

Our very first question is: why agile? Most of the time, the answer is "we have a complex product". The good news is that they care about their product, but what about their project? How can their project support their complex product development? Does the product life cycle go beyond the project life cycle? Does the product life cycle contain several project life cycles? Ultimately, what does agile mean for them? From our perspective, rapid changes in products need to be reflected in projects as well. The question therefore is: is your organization, or are you as a C-suite executive or project manager, ready to welcome external and internal changes? Why does agility help organizations in today's uncertain environment?

The mindset of agility

In our previous post we talked about the benefits of being agile for organizations and leaders. As you already noticed, we keep talking about mindset and ability. But what is this ability and what is the idea and mindset behind this ability?

To answer this question, we must differentiate between the two mindsets humans can hold:

The Fixed mindset (a change can be a source of risks and should be avoided)
The Growing mindset / Agile mindset (a change can be a source of opportunities and is welcome)

Let's start with a question: Are you a good cook? If your answer to this question is a clear "yes" or a clear "no", this can be characterized as a fixed mindset. But what could be the answer of a growing mindset to this question? And what is the goal and character of an agile mindset?