Metadata and Metamodeling


XML: Language for Specifications

Extensibility, verbosity, and strict checking of syntax and structure make XML a good candidate for designing new markup languages, as discussed previously. An authority and a set of rules are needed to control and simplify the use of the vast number of resulting languages. The World Wide Web Consortium (W3C) serves as the governing body for standardizing XML languages as specifications.

XML specifications are markup languages with unique namespaces and language elements whose semantics and syntax are well defined. This precision provides the basis for worldwide compatibility and acceptance, which should allow XML data models to last for many years. Applications that claim to comply with a given XML specification are expected to process and produce XML documents according to that specification.
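The role of namespaces can be illustrated with a minimal sketch in Python, using only the standard library. The two namespace URIs and the element names are hypothetical, chosen only to show how two languages can each define a `title` element without conflict.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespaces and elements, for illustration only.
DOC = """<order xmlns="http://example.org/orders"
       xmlns:inv="http://example.org/inventory">
  <title>Order title</title>
  <inv:title>Inventory title</inv:title>
</order>"""

root = ET.fromstring(DOC)

# Prefixes used for lookup are local to this script; only the URIs matter.
ns = {
    "ord": "http://example.org/orders",
    "inv": "http://example.org/inventory",
}

order_title = root.find("ord:title", ns).text
inventory_title = root.find("inv:title", ns).text

# ElementTree exposes the qualified name as "{namespace-uri}local-name".
print(root.tag)
print(order_title, "|", inventory_title)
```

The two `title` elements share a local name but remain distinct because each is qualified by the namespace of its own language.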

XML syntax is defined by the XML specification [112]: how an XML document must start, how elements and attributes can be named, and what the document delimiters and special characters are. The structure of a document is left to developers.
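As a rough illustration of these syntax rules, the following sketch uses the parser in Python's standard library to test whether a string is well-formed XML. The helper name `is_well_formed` and the sample documents are assumptions made for this example. The check covers syntax only and says nothing about whether a document follows a particular language specification.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents, one per syntax rule.
samples = {
    "valid": '<?xml version="1.0"?><note lang="en">Fish &amp; chips</note>',
    "bad_nesting": "<note><to>Ann</note></to>",
    "bad_name": "<1note/>",
    "bare_ampersand": "<note>Fish & chips</note>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {ok}")
```

Only the first sample passes. The others violate the rules for element nesting, element naming, and special characters, respectively.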

The two most common approaches to defining XML language specifications are Document Type Definitions (DTD) [55] and schema-based definition languages, e.g., XML Schema [116], SOX [33], and RELAX NG [109]. DTDs are inherited from SGML. They have a very simple structure and are easy to create and modify. However, DTDs provide neither primitive data types, e.g., integers and floating-point numbers, nor extensibility for new types. DTDs are falling out of use in favor of schemas. Schemas not only solve some of the problems of DTDs but also go well beyond them.
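The difference in data typing can be seen in a small sketch that declares the same element both ways. The element name `quantity` is hypothetical. The sketch only inspects the declarations and does not validate any instance document, which the Python standard library cannot do.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# DTD declaration: the content is plain character data, with no numeric type.
DTD_DECL = "<!ELEMENT quantity (#PCDATA)>"

# XML Schema declaration of the same (hypothetical) element, typed as integer.
XSD_DECL = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="quantity" type="xs:integer"/>
</xs:schema>"""

# An XML Schema is itself an XML document, so an ordinary parser can read it.
schema = ET.fromstring(XSD_DECL)
element = schema.find(f"{{{XS}}}element")
declared_type = element.get("type")

# A DTD uses its own declaration syntax and is not an XML document.
try:
    ET.fromstring(DTD_DECL)
    dtd_is_xml = True
except ET.ParseError:
    dtd_is_xml = False

print(declared_type, dtd_is_xml)
```

Because the schema is ordinary XML, the same tooling that handles instance documents can read and process the language definition as well.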

The XML Schema specification is an example of a schema-based definition language. Although it is more complex than DTDs to learn and to develop languages in, it can model object-oriented design (OOD) patterns to a certain degree: features such as object classes and instances, inheritance, encapsulation, polymorphism, modularity, and class abstraction are supported. Because the specification was not intended to be a modeling language for OOD, there are efforts to map the Unified Modeling Language (UML), a modeling language commonly used in OOD, to XML Schema [99, 19, 66, 29] so that legacy designs can be reused with new technologies.
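Inheritance and abstraction in XML Schema can be sketched as follows. The type names `Person` and `Employee` and the helper `fields` are hypothetical and are not taken from the cited mappings. The code reads the schema as plain XML and follows `xs:extension` to collect inherited elements. It is a simplified reading of the schema, not a schema processor.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema: an abstract base type and a type derived by extension.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="Person" abstract="true">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="Employee">
    <xs:complexContent>
      <xs:extension base="Person">
        <xs:sequence>
          <xs:element name="salary" type="xs:decimal"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>"""

root = ET.fromstring(SCHEMA)
types = {ct.get("name"): ct for ct in root.findall(f"{XS}complexType")}


def fields(type_name):
    """List element names of a type, inherited ones first."""
    ct = types[type_name]
    ext = ct.find(f"{XS}complexContent/{XS}extension")
    inherited = fields(ext.get("base")) if ext is not None else []
    container = ext if ext is not None else ct
    own = [e.get("name") for e in container.findall(f"{XS}sequence/{XS}element")]
    return inherited + own


employee_fields = fields("Employee")
person_is_abstract = types["Person"].get("abstract") == "true"
print(employee_fields, person_is_abstract)
```

Here `Employee` plays the role of a subclass: it inherits `name` from the abstract `Person` type and adds `salary`, much as a derived class would in an object-oriented design.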