Metadata and Metamodeling

Attribute-Value Pairs and The Dublin Core

One of the early attempts to document metadata for Web content was to enrich HTML pages with <meta> tags in the document header. These tags are widely used by Internet search engines. A meta tag describes an HTML page with attribute-value pairs: both the attribute name and its value are written as quoted data, as depicted in figure 2.6.

In this sample HTML header, every line carries metadata: the title, content type, author and description of the page, and the keywords related to the content. However, three different methods of representation appear: a predefined tag, title; a predefined attribute, http-equiv; and general-purpose attributes, name and content. This illustrates a problem of metadata representation within HTML: the same type of metadata requires different methods of processing. The Dublin Core (DC) Metadata Element Set [34] was developed to solve this problem.
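The three representations described above can be made concrete with a short Python sketch. The header text, the author name, and the helper names (HeadMetadataParser, extract_head_metadata) are assumptions made up for illustration; they are not the content of figure 2.6. The point is only that one parser needs three separate code paths to collect what is conceptually the same kind of information.

```python
from html.parser import HTMLParser

# Hypothetical header written for this sketch; not the original figure.
SAMPLE_HEAD = """
<head>
  <title>Metadata and Metamodeling</title>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  <meta name="author" content="Jane Doe">
  <meta name="description" content="Notes on Web metadata">
  <meta name="keywords" content="metadata, Dublin Core, RDF">
</head>
"""


class HeadMetadataParser(HTMLParser):
    """Collects header metadata into one flat dictionary."""

    def __init__(self):
        super().__init__()
        self.metadata = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            # Path 1: a predefined tag whose value is element text.
            self._in_title = True
        elif tag == "meta" and "http-equiv" in attrs:
            # Path 2: a predefined attribute, http-equiv.
            key = attrs["http-equiv"].lower()
            self.metadata[key] = attrs.get("content", "")
        elif tag == "meta" and "name" in attrs:
            # Path 3: the general-purpose name/content pair.
            key = attrs["name"].lower()
            self.metadata[key] = attrs.get("content", "")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self.metadata["title"] = self.metadata.get("title", "") + data


def extract_head_metadata(html_text):
    parser = HeadMetadataParser()
    parser.feed(html_text)
    parser.close()
    return parser.metadata
```

Calling extract_head_metadata(SAMPLE_HEAD) returns a single dictionary keyed by title, content-type, author, description, and keywords, although the values were gathered through three different branches.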

The DC unifies metadata representation with a small set of defined elements. It introduces metadata elements for the most commonly used attributes of resources. The core set consists of title, creator, subject, description, publisher, contributor, date, type, format, identifier, source, language, relation, coverage, and rights [34, 93].

Figure 2.7 is an XHTML excerpt that uses several DC metadata elements. XHTML has not completely abandoned its own way of handling metadata, as the <title> tag shows. In the example, the <html> element first declares the DC XML namespace with the dc: prefix, and the DC elements then use that prefix.
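As a rough illustration of why the shared namespace helps, the following Python sketch reads DC elements from an XHTML document with the standard library. The sample document and the function name extract_dc_elements are assumptions for this sketch and do not reproduce figure 2.7. The namespace URI is the one published for the DC element set, version 1.1.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical XHTML document written for this sketch.
SAMPLE_XHTML = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:dc="http://purl.org/dc/elements/1.1/">
  <head>
    <title>Metadata and Metamodeling</title>
    <dc:creator>Jane Doe</dc:creator>
    <dc:subject>metadata</dc:subject>
    <dc:date>2005-01-15</dc:date>
  </head>
  <body/>
</html>"""


def extract_dc_elements(xhtml_text):
    """Return {element name: [values]} for every element in the DC namespace."""
    root = ET.fromstring(xhtml_text)
    prefix = "{" + DC_NS + "}"  # ElementTree writes qualified tags as {uri}local
    found = {}
    for element in root.iter():
        tag = element.tag
        if isinstance(tag, str) and tag.startswith(prefix):
            name = tag[len(prefix):]
            # DC elements may repeat, so keep a list per element name.
            found.setdefault(name, []).append((element.text or "").strip())
    return found
```

Unlike the plain HTML case, a single rule (match the namespace) finds every DC element, whichever of the fifteen names it carries. The <title> element is not in the DC namespace and is therefore skipped, which mirrors the remark that XHTML keeps its own handling for it.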

Document authors and IT managers do not have to adopt these elements into their models; however, any system that interacts with outside information management systems might then face compatibility issues.

Developing In-House Specifications or Adopting Open Standards

Several approaches are used to resolve the compatibility issues between open standard metadata models developed by working groups or consortiums, and specifications developed internally by individual organizations. One method is to adopt an open standard entirely. This approach resolves compatibility problems and allows available tools to be used for implementations. Open standards are also adopted by many organizations and tested under many different circumstances. On the other hand, an open standard model may not meet all the requirements of the systems an organization needs. Modifications may be required, which may cause ambiguities in interactions with outside systems. Developers also have no control over the future of the specification, and its weaknesses may never be removed.

Another method is to develop in-house specifications that can be mapped to open standard models. Developers have control over the life cycles of the specifications, and the specifications can contain all the details specific to the organization. However, this also requires that the models be designed and implemented by the same organization, and an additional step may be needed to map between in-house and outside models, which is prone to performance loss and errors.
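The mapping step mentioned above might look like the following minimal Python sketch. All in-house field names (doc_title, written_by, and so on) and the function name map_to_dublin_core are hypothetical. The sketch also returns the fields that have no DC counterpart, since silently dropping them is one of the ways such a mapping loses information.

```python
# Hypothetical in-house field names mapped to DC element names.
IN_HOUSE_TO_DC = {
    "doc_title": "title",
    "written_by": "creator",
    "summary": "description",
    "lang_code": "language",
}


def map_to_dublin_core(record, mapping=IN_HOUSE_TO_DC):
    """Split a record into (fields mapped to DC, fields with no DC counterpart)."""
    mapped = {}
    unmapped = {}
    for field, value in record.items():
        target = mapping.get(field)
        if target is None:
            # No equivalent in the open standard: report it instead of dropping it.
            unmapped[field] = value
        else:
            mapped[target] = value
    return mapped, unmapped
```

For example, a record with doc_title, written_by, and a purely internal retention_class field would yield title and creator in the first dictionary and retention_class in the second, making the information that cannot be exchanged with outside systems explicit.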

A third option is a hybrid solution in which developers either design in-house models that allow other metadata models to be integrated, or adopt an open standard that allows extensions and customizations.

The simplicity of the Dublin Core element set makes it attractive for organizations to integrate into their models. Although its core set of metadata elements defines the major properties of a document, as listed previously, many organizations choose to extend the set with additional elements. For example, there is only one date property, which causes ambiguity between the date a document was created and the date it was last modified. The Resource Description Framework (RDF) is another example: it introduces a larger scope of metadata modeling alternatives while integrating the DC elements into its framework.
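One way an organization might extend the core set to resolve the date ambiguity is sketched below in Python. The extension names created and modified, the table DATE_EXTENSIONS, and the function collapse_to_core are assumptions for illustration, not a prescribed mechanism. Each extension element records which core element it refines, so an extended record can still be reduced to the plain core set for systems that know nothing about the extension.

```python
DC_CORE = frozenset({
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
})

# Hypothetical extension: each extra element names the core element it refines.
DATE_EXTENSIONS = {"created": "date", "modified": "date"}


def collapse_to_core(record, extensions=DATE_EXTENSIONS):
    """Reduce an extended record to core DC elements, keeping every value."""
    core = {}
    for name, value in record.items():
        target = extensions.get(name, name)
        if target not in DC_CORE:
            raise KeyError("unknown element: " + name)
        # Several extension elements may fold into one core element.
        core.setdefault(target, []).append(value)
    return core
```

Inside the organization, created and modified stay distinct; an outside system that understands only the core set receives both values under date. The trade-off is visible in the output: the values survive, but the distinction between them does not.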