The Alignment API and the Ontology Mapping Language

Please note that this page is outdated, as it only covers HALE up to version 2.1.2.
We aimed to use an implementation-neutral and schema-language-independent mapping language from which the actual transformation code can be derived. In short, the criteria for this model-to-model mapping language are:
  • Expressive enough: it must support renaming of classes and attributes, restructuring, reclassification, and also a number of general (non-geo), geometric and topological functions to transform geographic data;
  • The actual mapping code for different platforms/implementations can be derived from it, for example XSLT or XQuery for XML/GML, or Java code, to do the actual data transformation;
  • Preferably it builds on existing standards or initiatives.

After literature research and testing of tools, we selected the Ontology Mapping Language (OML) proposed by Scharffe, Euzenat et al. This mapping language (also known as the 'alignment language') is the result of a number of European projects: DIP, SEKT and, most recently, Knowledge Web.

The following sections give an overview of the Alignment API that we use, providing examples encoded in the Ontology Mapping Language. For additional examples, please refer to Scharffe, F. (2008). Correspondence Patterns Representation. Faculty of Mathematics, Computer Science and Physics, University of Innsbruck.

h2. Alignment, Schema and Formalism

The Alignment represents all mappings defined between two conceptual schemas. It comprises references to information about those two schemas, plus some metadata on the alignment itself. This metadata includes an About object that can be used to identify the alignment, a list of mappings, a list of ValueClasses used to encode enumerations in OML, and a value called Level, which can be used to indicate the level of concreteness that this alignment represents. The following listing provides an example of such an Alignment element (with some omissions detailed later on).

<Alignment xmlns:omwg="http://www.omwg.org/TR/d7/ontology/alignment" 
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
    xmlns:goml="http://www.esdi-humboldt.eu/schemas/goml" 
    xmlns:align="http://knowledgeweb.semanticweb.org/heterogeneity/alignment" 
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
    xmlns:gml="http://www.opengis.net/gml/"> 
  <align:level></align:level>
  <align:onto1>
    <align:Ontology>
      <align:location>
        http://www.esdi-humboldt.org/waterVA    
      </align:location>
      <align:formalism>...</align:formalism> 
    </align:Ontology>   
  </align:onto1>
  <align:onto2>
    <align:Ontology>
      <align:location>
        urn:x-inspire:specification:gmlas-v31:Hydrography:2.0
      </align:location>
      <align:formalism>...</align:formalism> 
    </align:Ontology> 
  </align:onto2>   
  <align:map>...</align:map> 
</Alignment>
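To illustrate how the structure above can be consumed, the following Python sketch reads the two schema locations and counts the map elements. It uses only the standard library; the trimmed sample document and the function name `summarize_alignment` are assumptions made for this illustration and are not part of HALE or the Alignment API.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the Alignment listing; the prefix is local to this sketch.
NS = {"align": "http://knowledgeweb.semanticweb.org/heterogeneity/alignment"}

# Trimmed-down stand-in for an alignment document (illustrative only).
SAMPLE = """<Alignment xmlns:align="http://knowledgeweb.semanticweb.org/heterogeneity/alignment">
  <align:level></align:level>
  <align:onto1>
    <align:Ontology>
      <align:location>
        http://www.esdi-humboldt.org/waterVA
      </align:location>
    </align:Ontology>
  </align:onto1>
  <align:onto2>
    <align:Ontology>
      <align:location>
        urn:x-inspire:specification:gmlas-v31:Hydrography:2.0
      </align:location>
    </align:Ontology>
  </align:onto2>
  <align:map></align:map>
</Alignment>"""


def summarize_alignment(xml_text):
    """Return the source location, target location and number of map elements."""
    root = ET.fromstring(xml_text)

    def location(onto_tag):
        node = root.find(f"align:{onto_tag}/align:Ontology/align:location", NS)
        # The listings wrap the URI onto its own line, so strip the whitespace.
        return node.text.strip() if node is not None and node.text else None

    return {
        "source": location("onto1"),
        "target": location("onto2"),
        "maps": len(root.findall("align:map", NS)),
    }


summary = summarize_alignment(SAMPLE)
print(summary)
```

Stripping the element text matters here because the location values in the listing are surrounded by line breaks and indentation.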

For GML Application schemas, all HUMBOLDT schema mapping/translation applications expect the following formalism element.

<align:Formalism>
  <align:uri>http://www.opengis.net/gml/3.2.1/</align:uri>
  <align:name>GML 3.2.1 Application Schema</align:name>
</align:Formalism>

Declarations for GML 3.1 or GML 2.1.2 are also allowed.

h2. Cell

A Cell contains a mapping between two Entities, such as FeatureClasses or Property objects. It represents the basic unit of conceptual schema mapping. The following listing provides an example for a simple equivalence mapping of two FeatureClasses (in GML: FeatureTypes).

<align:Cell>
  <omwg:entity1>
    <omwg:Class rdf:about="http://www.esdi-humboldt.org/waterVA/Watercourses_VA">
      <omwg:transf rdf:resource="…RenameFeatureFunction"/>
    </omwg:Class>
  </omwg:entity1>
  <omwg:entity2>
    <omwg:Class rdf:about="urn:x-inspire:specification:gmlas-v31:Hydrography:2.0/Watercourse">
      <omwg:transf/>
    </omwg:Class>
  </omwg:entity2>
  <align:relation>Equivalence</align:relation>
</align:Cell>

Note that the relation element is optional in the case of an equivalence relation, but it should nevertheless always be given. The OML also contains an optional measure element, which is not used in HALE at this time. This element can be used to store the confidence that a human or an algorithm assigns to a given mapping.
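The optional relation can be handled when reading a Cell by falling back to Equivalence if the element is missing. The sketch below shows one way to do this in Python; the function name `read_cell` and the shortened sample Cell (which omits the transf elements) are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as declared on the Alignment element.
NS = {
    "align": "http://knowledgeweb.semanticweb.org/heterogeneity/alignment",
    "omwg": "http://www.omwg.org/TR/d7/ontology/alignment",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}
RDF_ABOUT = "{%s}about" % NS["rdf"]

# Shortened sample Cell without a relation element (illustrative only).
CELL = """<align:Cell
    xmlns:align="http://knowledgeweb.semanticweb.org/heterogeneity/alignment"
    xmlns:omwg="http://www.omwg.org/TR/d7/ontology/alignment"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <omwg:entity1>
    <omwg:Class rdf:about="http://www.esdi-humboldt.org/waterVA/Watercourses_VA"/>
  </omwg:entity1>
  <omwg:entity2>
    <omwg:Class rdf:about="urn:x-inspire:specification:gmlas-v31:Hydrography:2.0/Watercourse"/>
  </omwg:entity2>
</align:Cell>"""


def read_cell(xml_text):
    """Extract both entity identifiers and the relation of a single Cell."""
    cell = ET.fromstring(xml_text)

    def about(entity_tag):
        # The first child of entity1/entity2 is the mapped Class or Property.
        entity = cell.find(f"omwg:{entity_tag}", NS)
        # Strip whitespace in case the attribute value was wrapped across lines.
        return entity[0].get(RDF_ABOUT, "").strip()

    relation = cell.findtext("align:relation", default=None, namespaces=NS)
    return {
        "entity1": about("entity1"),
        "entity2": about("entity2"),
        # A missing (or empty) relation is treated as an equivalence.
        "relation": (relation or "Equivalence").strip(),
    }


info = read_cell(CELL)
print(info)
```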

The next listing provides an example of a simple Property mapping, where no relation element is needed. When a Property is mapped, a transf element must always be present to indicate any required transformations of the type. In this case, a simple NAME attribute is used to populate the geographicalName attribute required by the INSPIRE schema.

<align:Cell>
  <omwg:entity1>
    <omwg:Property rdf:about="http://www.esdi-humboldt.org/waterVA/Watercourses_VA/NAME">
      <omwg:transf rdf:resource="...CreateInspireGeoName">
      </omwg:transf>
    </omwg:Property>
  </omwg:entity1>
  <omwg:entity2>
    <omwg:Property rdf:about="urn:x-inspire:specification:gmlas-v31:Hydrography:2.0/SurfaceWater/geographicalName">
      <omwg:transf/>
    </omwg:Property>
  </omwg:entity2>
</align:Cell>

Cells can also be conditional. The following example provides a Cell that defines a mapping between two FeatureClasses that is only applicable when a certain Restriction on an attribute/property is met. This is also one of the places where we have defined an extension to the OML, namely the inclusion of filters based on OGC Common Query Language (CQL) expressions. These simplify the structure that the OML foresees for defining filters:

<align:Cell>
  <omwg:entity1>
    <omwg:Class rdf:about="http://www.esdi-humboldt.org/waterVA/Watercourses_VA">
      <omwg:transf rdf:resource="...RenameFeatureFunction"/>
      <omwg:attributeValueCondition>
        <omwg:Restriction>
          <goml:cqlStr>LEVEL > 200</goml:cqlStr>
        </omwg:Restriction>
      </omwg:attributeValueCondition>
    </omwg:Class>
  </omwg:entity1>
  <omwg:entity2>
    <omwg:Class rdf:about="urn:x-inspire:specification:gmlas-v31:Hydrography:2.0/Watercourse">
      <omwg:transf/>
    </omwg:Class>
  </omwg:entity2>
</align:Cell>
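To make the effect of such a restriction concrete, the following Python sketch applies the filter string `LEVEL > 200` to a few in-memory features. It is a deliberately tiny stand-in: it understands only a single numeric comparison, whereas real CQL is far richer and would normally be handled by a proper CQL parser. The function name `compile_filter` and the sample features are made up for this illustration.

```python
import operator
import re

# Only simple numeric comparisons are handled here; real CQL supports much more.
_OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "<>": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
    "=": operator.eq,
}
_PATTERN = re.compile(r"^\s*(\w+)\s*(>=|<=|<>|>|<|=)\s*(-?\d+(?:\.\d+)?)\s*$")


def compile_filter(cql):
    """Turn 'ATTRIBUTE OP NUMBER' into a predicate over feature dictionaries."""
    match = _PATTERN.match(cql)
    if match is None:
        raise ValueError(f"unsupported filter: {cql!r}")
    attribute, op, literal = match.groups()
    compare, threshold = _OPS[op], float(literal)
    # Features lacking the attribute never satisfy the restriction.
    return lambda feature: attribute in feature and compare(feature[attribute], threshold)


# Hypothetical source features; only LEVEL matters for the filter.
features = [
    {"NAME": "A", "LEVEL": 250},
    {"NAME": "B", "LEVEL": 120},
]
applies = compile_filter("LEVEL > 200")
selected = [f["NAME"] for f in features if applies(f)]
print(selected)
```

Only features for which the predicate holds would be passed on to the mapping defined by the conditional Cell.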

Furthermore, Cells can define instance split and merge conditions. In an instance split, multiple entities in the target schema are created from a single entity in the source schema. In an instance merge, the opposite is true: multiple source features are used to create a single target feature. The next listing provides an example of a split case:

<align:Cell>
  <omwg:entity1>
    <omwg:Class rdf:about="http://www.esdi-humboldt.org/waterVA/Watercourses_VA">
      <omwg:transf rdf:resource="…RenameFeatureFunction">
        <param>
          <name>InstanceSplitCondition</name>
          <value>foreach LineString in waterVA:the_geom</value> 
        </param>
      </omwg:transf>
    </omwg:Class>
  </omwg:entity1>
  …
</align:Cell>
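One plausible reading of the split condition above is that every LineString in the source geometry yields its own target feature. The Python sketch below illustrates that reading on plain dictionaries; it is an assumption about the semantics, not HALE's implementation. The function name `split_instances` and the representation of geometries as lists of coordinate tuples are hypothetical.

```python
def split_instances(feature, geometry_attr="the_geom"):
    """Yield one copy of the feature per geometry part (one per LineString)."""
    for part in feature[geometry_attr]:
        clone = dict(feature)  # all other attributes are carried over unchanged
        clone[geometry_attr] = [part]
        yield clone


# Hypothetical source feature holding two LineStrings.
source = {
    "NAME": "A",
    "the_geom": [
        [(0, 0), (1, 1)],
        [(1, 1), (2, 0)],
    ],
}
targets = list(split_instances(source))
print(len(targets))
```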

Example merge case:

<align:Cell>
  <omwg:entity1>
    <omwg:Class rdf:about="http://www.esdi-humboldt.org/waterVA/Watercourses_VA">
      <omwg:transf rdf:resource="…RenameFeatureFunction">
        <param>
          <name>InstanceMergeCondition</name>
          <value>groupBy waterVA:LEVEL</value> 
        </param>
      </omwg:transf>
    </omwg:Class>
  </omwg:entity1>
  …
</align:Cell>
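The merge condition can be read as grouping source features by the value of LEVEL and producing one target feature per group. The following Python sketch illustrates this interpretation. How attributes other than the grouping key and the geometry are combined is not specified here, so the sketch simply drops them; the function name `merge_instances` and the sample features are hypothetical.

```python
from collections import defaultdict


def merge_instances(features, key="LEVEL", geometry_attr="the_geom"):
    """Create one merged feature per distinct value of the grouping attribute."""
    groups = defaultdict(list)
    for feature in features:
        groups[feature[key]].append(feature)

    merged = []
    for value, members in groups.items():
        merged.append({
            key: value,
            # Collect the geometry parts of all members of the group.
            geometry_attr: [part for m in members for part in m[geometry_attr]],
        })
    return merged


# Hypothetical source features: two share LEVEL 1, one has LEVEL 2.
sources = [
    {"LEVEL": 1, "the_geom": [[(0, 0), (1, 1)]]},
    {"LEVEL": 2, "the_geom": [[(5, 5), (6, 6)]]},
    {"LEVEL": 1, "the_geom": [[(1, 1), (2, 0)]]},
]
merged = merge_instances(sources)
print(merged)
```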

Finally, there can also be so-called augmentation Cells, which define transformations on entity2 instead of entity1. These have the special characteristic that they do not need any of the information available in the source features, and they are executed after all other mappings have been applied. The next listing provides an example of such an augmentation function.


<align:Cell>
  <omwg:entity1>
    <Class rdf:about="null"></Class>
  </omwg:entity1>
  <omwg:entity2>
    <omwg:Class rdf:about="http://www.esdi-humboldt.org/waterVA/Watercourses_VA">
      <omwg:transf rdf:resource="…NilReasonFunction">
        <param>
          <omwg:name>NilReasonType</omwg:name>
          <omwg:value>unpopulated</omwg:value>
        </param>
      </omwg:transf>
    </omwg:Class>
  </omwg:entity2>
  <align:relation>Equivalence</align:relation>
</align:Cell>

As can be seen from the example, augmentation cells also have the special characteristic that they use a “null” entity for entity1.
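The two characteristics of augmentation cells, the “null” entity1 and their execution after all other mappings, can be sketched as a simple ordering step. In the Python sketch below, cells are represented as plain dictionaries. This representation, the function names and the shortened function labels are assumptions for illustration; the full function identifiers are elided in the listings above.

```python
def is_augmentation(cell):
    """Augmentation cells are recognised by the 'null' entity1."""
    return cell["entity1"] == "null"


def execution_order(cells):
    """Return the cells with all augmentation cells moved to the end."""
    regular = [c for c in cells if not is_augmentation(c)]
    augmentations = [c for c in cells if is_augmentation(c)]
    return regular + augmentations


# Hypothetical, simplified cells; 'function' holds a shortened label only.
cells = [
    {"entity1": "null", "entity2": "Watercourses_VA", "function": "NilReasonFunction"},
    {"entity1": "Watercourses_VA", "entity2": "Watercourse", "function": "RenameFeatureFunction"},
    {"entity1": "Watercourses_VA/NAME", "entity2": "SurfaceWater/geographicalName",
     "function": "CreateInspireGeoName"},
]
order = [c["function"] for c in execution_order(cells)]
print(order)
```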