MappingMaster Domain-Specific Language (MappingMaster DSL): A Language for Converting Thesauri to Ontologies (Part 2)

(Continued)

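5. Processing Cell Content

The default behavior is to use the contents of the referenced cell directly. However, this default can be overridden with an optional value specification clause.

This clause is usually indicated by the '=' character immediately after the encoding specification keyword, followed by a parenthesis-enclosed, comma-separated list of value specifications, which are appended to each other. A value specification can be a cell reference, a quoted value, a regular expression containing capturing groups, or an inbuilt text processing function.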


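5.1 Basic Cell Content Processing

For example, an expression that extends a reference to specify that the entity created from cell A5 uses rdfs:label name encoding, and that the name is the value of the cell preceded by the string "Sale:", can be written as follows: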

Class:@A5(rdfs:label=("Sale:",@A5))

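Value specification references are not restricted to the referenced cell itself and may indicate arbitrary cells. More than one encoding can also be specified for a particular reference, so separate identifier and label values can be generated for a particular entity from the contents of different cells.

For example, we can extend the example above to take the rdf:ID of the generated class from cell B5, as follows: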

Class:@A5(rdf:ID=@B5 rdfs:label=("Sale:",@A5))
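As a rough illustration of how a value specification list is evaluated, the following Python sketch appends the items of a list to each other and substitutes cell contents for references. The function name, the dictionary-based sheet, and the sample cell values are assumptions made for illustration; this is not MappingMaster's implementation.

```python
from typing import Dict, List


def evaluate_value_specs(specs: List[str], sheet: Dict[str, str]) -> str:
    """Append value specifications to each other, left to right.

    Items starting with '@' are treated as cell references and looked up in
    the (hypothetical) sheet; everything else is taken as a quoted value.
    """
    parts = []
    for spec in specs:
        if spec.startswith("@"):
            parts.append(sheet.get(spec[1:], ""))
        else:
            parts.append(spec)
    return "".join(parts)


# Hypothetical cell contents.
sheet = {"A5": "Gum", "B5": "P001"}

label = evaluate_value_specs(["Sale:", "@A5"], sheet)  # rdfs:label=("Sale:",@A5)
rdf_id = evaluate_value_specs(["@B5"], sheet)          # rdf:ID=@B5
print(label, rdf_id)
```

With these sample values the label is built from the literal prefix plus the contents of A5, while the identifier comes from B5 alone.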


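The language includes several inbuilt text processing methods that can be used in value specifications. The methods currently supported are mm:replace, mm:replaceAll, mm:replaceFirst, mm:prepend, mm:append, mm:toLowerCase, mm:toUpperCase, mm:trim, mm:reverse, mm:printf, and mm:decimalFormat. These methods take zero or more arguments and return a value. Supplied arguments may be any combination of quoted strings and references.

An expression that converts the contents of cell A5 to upper case before label assignment can be written: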

Class:@A5(mm:toUpperCase(@A5))
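To make the behavior of the simpler text functions concrete, here is a small Python sketch that maps a few of the mm: function names onto Python string operations. The dispatch table and the helper name apply_text_function are hypothetical, and Python's strip() is only an approximation of Java's trim(); the sketch shows the intended effect, not the actual processor.

```python
# Approximate Python counterparts of some mm: text functions (illustrative only).
TEXT_FUNCTIONS = {
    "mm:toUpperCase": lambda value: value.upper(),
    "mm:toLowerCase": lambda value: value.lower(),
    "mm:trim": lambda value: value.strip(),
    "mm:reverse": lambda value: value[::-1],
    "mm:prepend": lambda value, text: text + value,
    "mm:append": lambda value, text: value + text,
}


def apply_text_function(name: str, cell_value: str, *args: str) -> str:
    """Apply one named function to a cell value, passing any extra arguments."""
    return TEXT_FUNCTIONS[name](cell_value, *args)


print(apply_text_function("mm:toUpperCase", "gum"))        # GUM
print(apply_text_function("mm:prepend", "Gum", "Sale:"))   # Sale:Gum
print(apply_text_function("mm:append", "Fred", "_MM"))     # Fred_MM
```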


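Value processing functions can also be used outside of value specification clauses, but only if no such clause is used in the reference, and only a single function can be used.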


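5.2 decimalFormat and printf

mm:decimalFormat and mm:printf support formatting of textual and numerical content. Their behavior follows the standard Java specifications for the DecimalFormat class and the String.format method.

For example:

  Individual: Fred Facts: hasSalary @A1(mm:decimalFormat("###,###.00",@A1))
  Class: @A1(mm:printf("A_%s",@A1))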
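As a rough illustration of the results these two functions produce, the following Python sketch mimics the two expressions above. It calls neither MappingMaster nor Java: the Python format specification ",.2f" is used as a stand-in for the Java pattern "###,###.00" (the two differ in edge cases such as values below 1), and the % operator stands in for String.format. Both function names are hypothetical.

```python
def decimal_format_like(cell_value: str) -> str:
    """Approximate DecimalFormat("###,###.00"): grouping separators, two decimals."""
    return format(float(cell_value), ",.2f")


def printf_like(template: str, cell_value: str) -> str:
    """Approximate String.format for a template with a single %s placeholder."""
    return template % cell_value


print(decimal_format_like("23000.2"))  # 23,000.20
print(printf_like("A_%s", "Car"))      # A_Car
```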
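
5.3 Replacing Characters

The mm:replace and mm:replaceAll functions follow the corresponding methods of the standard Java String class.

For example, to strip all non-alphanumeric characters from a cell, the mm:replaceAll function can be used as follows: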

Individual:@A5

Facts:hasItems @B5(mm:replaceAll("[^a-zA-Z0-9]",""))
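The effect of this pattern can be checked outside MappingMaster. The sketch below uses Python's re.sub as an analogue of Java's String.replaceAll; the helper name and the sample cell value are made up for illustration. Note that replacement-string syntax differs between the two languages (Java uses $1 for groups, Python uses \1), which does not matter here because the replacement is empty.

```python
import re


def replace_all(cell_value: str, pattern: str, replacement: str) -> str:
    """Replace every regex match in the cell value, like String.replaceAll."""
    return re.sub(pattern, replacement, cell_value)


# Hypothetical contents of cell B5.
print(replace_all("Gum & Candy #12", "[^a-zA-Z0-9]", ""))  # GumCandy12
```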


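5.4 Prepending and Appending

The mm:prepend and mm:append methods add a string before or after the cell value:

  Class: @A5(rdfs:label=mm:prepend("Sale:"))
  Individual: @A2(mm:append("_MM"))

5.5 Literals

MappingMaster currently supports the following datatypes:

xsd:string, xsd:boolean, xsd:byte, xsd:short, xsd:int, xsd:long, xsd:float, xsd:double, xsd:integer, xsd:decimal, xsd:dateTime, xsd:date, xsd:time, xsd:Duration, rdf:PlainLiteral, rdf:XMLLiteral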


5.6 IRIs

MappingMaster has several directives to customize the IRI creation process:

mm:iri, mm:camelCaseEncode, mm:snakeCaseEncode, mm:uuidEncode, mm:hashEncode


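5.7 Missing Value Handling

To deal with missing cell values, default values can be specified in references. A default value clause is provided to assign these values. The clause is indicated by one of the keywords mm:DefaultLocationValue, mm:DefaultLiteral, mm:DefaultLabel, or mm:DefaultID, followed by an assignment to a string. For example, the following expression uses this clause to indicate that the value "Unknown" should be used as the label of the created class if cell A5 is empty:

Class:@A5(rdfs:label mm:DefaultLabel="Unknown")

Additional behaviors are also supported for missing cell values. The default behavior is to skip an entire expression if it contains any reference to an empty cell. Four keywords are supplied to modify this behavior: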

mm:ErrorIfEmptyLocation

mm:SkipIfEmptyLocation

mm:WarningIfEmptyLocation

mm:ProcessIfEmptyLocation

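The last keyword allows processing of spreadsheets that may contain a large number of missing values. It indicates that the language processor should, if possible, conservatively drop only the sub-expression containing the empty reference rather than the entire expression. For example, the following expression declares an individual from cell A5 of a spreadsheet and associates the property hasAge with it using the value in cell A6: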

Individual:@A5

Facts:hasAge @A6(mm:ProcessIfEmptyLocation)


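Here, with the default skip behavior, a missing value in cell A6 would cause the entire expression to be skipped. With the process directive, however, only the sub-expression containing the empty reference is dropped. So if cell A5 contains a value and cell A6 is empty, the resulting expression will still declare an individual.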
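The difference between skipping and processing can be sketched in a few lines of Python. This is only a model of the behavior described above, under the assumption that each fact is a separate sub-expression; the function name, its parameters, and the output format are hypothetical.

```python
from typing import Dict, Optional


def build_individual(name: Optional[str],
                     facts: Dict[str, Optional[str]],
                     process_if_empty: bool) -> Optional[str]:
    """Return a Manchester-style declaration, or None if the expression is skipped."""
    if not name:
        return None  # the individual itself has no name; nothing can be declared
    kept = {prop: value for prop, value in facts.items() if value}
    if len(kept) < len(facts) and not process_if_empty:
        return None  # default behavior: one empty reference skips everything
    lines = [f"Individual: {name}"]
    if kept:
        lines.append("Facts: " + ", ".join(f"{p} {v}" for p, v in kept.items()))
    return "\n".join(lines)


# A5 holds "Fred", A6 is empty.
print(build_individual("Fred", {"hasAge": None}, process_if_empty=False))  # None
print(build_individual("Fred", {"hasAge": None}, process_if_empty=True))   # Individual: Fred
```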


Using a similar approach, more fine-grained empty value handling is supported to specify different behaviors for mm:Literal, rdf:ID, and rdfs:label values. The label directives are mm:ErrorIfEmptyLabel, mm:SkipIfEmptyLabel, mm:WarningIfEmptyLabel, and mm:ProcessIfEmptyLabel; rdf:ID and mm:Literal have equivalent keywords:

mm:ErrorIfEmptyID, mm:SkipIfEmptyID, mm:WarningIfEmptyID, mm:ProcessIfEmptyID, mm:ErrorIfEmptyLiteral, mm:SkipIfEmptyLiteral, mm:WarningIfEmptyLiteral, mm:ProcessIfEmptyLiteral.


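5.8 Location Shifting

One additional option is provided for empty cell values. It targets the common case in many spreadsheets where a particular cell is supplied with a value and all empty cells below it are implied to have the same value. When these empty cells are processed, their location must be shifted to the cell above that contains the value. For example, the following expression uses the mm:ShiftUp keyword to indicate that if cell A5 does not contain a value for the name of the declared class, the row number must be shifted upwards until a value is found:

Class:@A5(mm:ShiftUp)

If no value is found, normal empty value handling is applied. Similar directives are mm:ShiftDown, mm:ShiftLeft, and mm:ShiftRight.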
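The upward search can be pictured with the short Python sketch below. It models a sheet as a dictionary keyed by (column, row), which is an assumption for illustration only, and the function name shift_up is hypothetical; the real processor works on the spreadsheet itself.

```python
from typing import Dict, Optional, Tuple

Sheet = Dict[Tuple[str, int], str]  # (column letter, row number) -> cell text


def shift_up(sheet: Sheet, column: str, row: int) -> Optional[str]:
    """Walk upwards from (column, row) until a non-empty cell is found."""
    while row >= 1:
        value = sheet.get((column, row), "")
        if value.strip():
            return value
        row -= 1
    return None  # nothing found: normal empty-value handling would apply


# Hypothetical column A: a value in row 3, implied for the empty rows 4 and 5.
sheet = {("A", 3): "Gum", ("A", 4): "", ("A", 5): ""}
print(shift_up(sheet, "A", 5))  # Gum
```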


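5.9 Iterating Over a Range of Cells in a Reference

Most mappings will not reference just individual cells but will instead iterate over a range of rows or columns in a spreadsheet. The wildcard character '*' can be used in a reference to refer to the current column and/or row in an iteration. MappingMaster provides a graphical interface to specify these ranges.

Example references using this wildcard notation include: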

@A3

@A*

@**

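For example, an expression that iterates over the grid D4 to G6 to create an individual of class Sale for each cell can be written:

Individual:@**

Types:Sale

This expression can be extended to assign property values to these individuals: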

Individual:@**

Types:Sale

Facts:hasAmount @**,

hasProduct @B*,

hasState @*2
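To see which cells the three references in this expression touch during the iteration, the following Python sketch expands the wildcards for every cell in the grid D4 to G6. The helper names, the restriction to single-letter columns, and the row-by-row iteration order are assumptions made for the sketch.

```python
import re
from typing import Iterator, Tuple

# '@' followed by a column (letters or '*') and a row (digits or '*').
_REFERENCE = re.compile(r"@(\*|[A-Z]+)(\*|\d+)")


def resolve(reference: str, column: str, row: int) -> str:
    """Substitute the current column and row for '*' in a reference."""
    match = _REFERENCE.fullmatch(reference)
    if match is None:
        raise ValueError(f"not a cell reference: {reference!r}")
    col_part, row_part = match.groups()
    resolved_col = column if col_part == "*" else col_part
    resolved_row = str(row) if row_part == "*" else row_part
    return resolved_col + resolved_row


def iterate(first_col: str, last_col: str,
            first_row: int, last_row: int) -> Iterator[Tuple[str, int]]:
    """Yield every (column, row) in the range; single-letter columns only."""
    for row in range(first_row, last_row + 1):
        for code in range(ord(first_col), ord(last_col) + 1):
            yield chr(code), row


for col, row in iterate("D", "G", 4, 6):
    print(resolve("@**", col, row), resolve("@B*", col, row), resolve("@*2", col, row))
```

For the first cell this prints D4 B4 D2: the amount comes from the current cell, the product from column B of the current row, and the state from row 2 of the current column.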




Appendix: English original


MappingMaster uses a domain specific language (DSL) to define mappings from spreadsheet content to OWL ontologies. This language is based on the Manchester OWL Syntax, which is itself a DSL for describing OWL ontologies.

An introduction to the Manchester Syntax is available in its own documentation, and a set of example Manchester Syntax expressions can be found in the Quick Reference section of that document.

The Manchester Syntax supports the declarative specification of OWL axioms.

For example, a Manchester Syntax declaration of an OWL named class Gum that is a subclass of a named class called Product can be written using a class declaration clause as:

  Class: Gum SubClassOf: Product 

The MappingMaster DSL extends the Manchester Syntax to support references to spreadsheet content in these declarations. MappingMaster introduces a new reference clause for referring to spreadsheet content. In this DSL, any clause in a Manchester Syntax expression that indicates an OWL named class, OWL property, OWL individual, data type, or a literal can be substituted with this reference clause. Any declarations containing such references are preprocessed and the relevant spreadsheet content specified by these references is imported. As each declaration is processed, the appropriate spreadsheet content is retrieved for each reference. This content can then be used in four main ways:

  • It can be used to directly name OWL entities that are created on demand.
  • It can be used to annotate OWL entities that are created on demand.
  • The content may reference existing OWL entities, either directly as a URI or through an annotation property.
  • Finally, the content may be used as a literal.

Using one of these approaches, each reference within an expression is thus resolved during preprocessing to a named OWL entity, a data type, or a literal. The resulting expression can then be executed by a standard Manchester Syntax processor.

References

References in the MappingMaster DSL are prefixed by the character @. These are generally followed by an Excel-style cell reference. In the standard Excel cell notation, cells extend from A1 in the top left corner of a sheet within a spreadsheet to successively higher columns and rows, with alpha characters referring to columns and numerical values referring to rows.

Basic References Use

@A5

The above cell specification indicates that the reference is relative, meaning that if a formula containing the reference is copied to another cell then the row and column components of the reference are updated appropriately.

Sheets can also be specified by enclosing their name in single quotes and using the "!" character separator between the sheet name and the cell specification:

  @'A sheet'!A3 
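The two reference forms shown so far can be recognized with a single regular expression. The Python sketch below is a minimal parser written for illustration; the CellRef type and parse_reference function are hypothetical names, and the pattern covers only the plain and sheet-qualified forms described above, not the full DSL grammar.

```python
import re
from typing import NamedTuple, Optional


class CellRef(NamedTuple):
    sheet: Optional[str]  # None when no sheet name is given
    column: str
    row: int


# '@', an optional quoted sheet name followed by '!', column letters, row digits.
_CELL_REFERENCE = re.compile(r"@(?:'([^']+)'!)?([A-Z]+)(\d+)")


def parse_reference(text: str) -> CellRef:
    """Split a reference such as @A5 or @'A sheet'!A3 into its parts."""
    match = _CELL_REFERENCE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a cell reference: {text!r}")
    sheet, column, row = match.groups()
    return CellRef(sheet, column, int(row))


print(parse_reference("@A5"))
print(parse_reference("@'A sheet'!A3"))
```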

These references can then be used in MappingMaster's DSL to define OWL constructs using spreadsheet content.

For example, an expression to declare that a class FlavouredGum is a subclass of the class named by the contents of cell B4 can be written:

  Class: FlavouredGum SubClassOf: @B4 

When processed, this expression will create an OWL named class using the contents of cell B4 ("Gum") as the class name and declare FlavouredGum to be its subclass. If the class Gum already exists, the subclass relationship will simply be established.

That is, references can be used both to define new OWL entities and to refer to existing ones.

A similar expression to declare that the class SalesItem is equivalent to the class named by the contents of cell B4 can be written:

  Class: SalesItem 
  EquivalentTo: @B4 

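The Manchester Syntax also supports an individual declaration clause for declaring individuals; property values can be associated with the declared individuals using a facts subclause, which contains a list of property value declarations.

For example, an expression to declare an individual named by the contents of cell D2 and to give it the value "California" for the data property hasStateName can be written: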

  Individual: @D2 
  Facts: hasStateName "California" 

Here, an individual CA will be created if necessary and associated with the data property hasStateName, which will be given the string value "California".

Using the standard Manchester Syntax, annotation properties can also be associated with declared entities.

For example, an annotation property hasSource can be used to associate the above declared California individual with its source document as follows:

  Individual: @D2 
  Facts: hasStateName "California" 
  Annotations: hasSource "DMV Spreadsheet 12/12/2010" 

Classes or properties can be annotated in the same way. For example, a class can be annotated with the hasSource annotation property as follows:

  Class: @D2 
  Annotations: hasSource "DMV Spreadsheet 12/12/2010" 

The Manchester Syntax also supports the use of OWL class expressions. In general, a class expression may occur anywhere a named class can occur.

For example, an expression declaring that the class Sale uses the contents of cell D4 as the filler of an owl:hasValue restriction on the property hasAmount can be written:

  Class: Sale 
  SubClassOf: (hasAmount value @D4) 

In general, OWL entities named explicitly in a MappingMaster expression (as opposed to resolved through a reference) must already exist in the target ontology. In these examples, the classes Sale, SalesItem and FlavouredGum, and the properties hasAmount, hasStateName and hasSource must already exist.

Specifying the Type of a Reference

In the expression

   Class: @A5 
   SubClassOf: Drug 

the reference @A5 clearly refers to an OWL class. However, the reference type cannot always be inferred unambiguously. For example, in the expression

  Class: Sale SubClassOf: (@A3 value @D4)

the reference @A3 could refer to an object, data, or annotation property, and the reference @D4 could be either an OWL individual or a literal.

To deal with this situation, MappingMaster supports explicit entity type specification. Specifically, a reference may optionally be followed by a parenthesis-enclosed entity type specification to explicitly declare the type of the referenced entity. This specification can indicate that the entity is a named OWL class, an OWL object, data or annotation property, an OWL named individual, or a data type. The MappingMaster keywords to specify the types are the standard Manchester Syntax keywords Class, ObjectProperty, DataProperty, AnnotationProperty and Individual, plus any XSD type name (e.g., xsd:int).

Using this specification, the previous drug declaration, for example, can be written:

  Class: @A5(Class) 
  SubClassOf: Drug 

A declaration of an individual from cell B5 with an associated property value from cell C5 that is of type float can be specified as follows:

  Individual: @B5 
  Facts: hasSalary @C5(xsd:float) 

If the hasSalary data property is already declared to be of type xsd:float then the explicit type qualification is not needed. A global default type can also be specified for literals in the case where the type of the associated data property is either unknown or unspecified or if no explicit type is provided in the reference.

References to OWL properties and individuals can be qualified in the same way.

Reference Resolution

References may specify OWL entities (i.e., classes, properties, individuals, or datatypes) or literals. When a reference specifies an OWL entity, the reference value may resolve to an existing OWL entity or may be used to name an OWL entity that is created on demand.

Basic Reference Resolution

A variety of name resolution strategies are supported when creating or referencing OWL entities. The three primary strategies are to:

  • Use rdf:ID values to create or resolve OWL entities.
  • Use rdfs:label annotations to create or resolve OWL entities.
  • Create OWL entities based on the location of a cell, ignoring the resolved reference value.

With rdf:ID encoding, an OWL entity generated from a reference is assigned its rdf:ID directly from the resolved reference value. Obviously, this content must represent a valid identifier (spaces are not allowed in rdf:IDs, for example).

Using rdfs:label encoding, an OWL entity resolved from a reference is given an automatically generated URI and its rdfs:label annotation value is set to the resolved reference value.

With location encoding, an OWL entity generated from a reference is also given an automatically generated URI, but in this case the resolved reference value is unused.

The default naming encoding uses the rdfs:label annotation property. The default may also be changed globally.
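The three strategies differ in where the identifier and the label come from, which the Python sketch below tries to make explicit. Everything concrete in it is an assumption: the function name, the example namespace, the dictionary it returns, and in particular the "MM_" plus UUID scheme for generated URIs, which merely stands in for whatever MappingMaster generates.

```python
import uuid
from typing import Dict, Optional


def create_entity(value: str, encoding: str,
                  namespace: str = "http://example.org/onto#") -> Dict[str, Optional[str]]:
    """Model how each name encoding turns a resolved cell value into an entity."""
    if encoding == "rdf:ID":
        # The cell value itself becomes the identifier, so it must be a valid one.
        if any(ch.isspace() for ch in value):
            raise ValueError(f"not a valid identifier: {value!r}")
        return {"iri": namespace + value, "label": None}
    generated_iri = namespace + "MM_" + uuid.uuid4().hex  # assumed naming scheme
    if encoding == "rdfs:label":
        return {"iri": generated_iri, "label": value}
    if encoding == "mm:Location":
        return {"iri": generated_iri, "label": None}  # cell value is ignored
    raise ValueError(f"unknown encoding: {encoding}")


print(create_entity("Gum", "rdf:ID"))
print(create_entity("Flavoured Gum", "rdfs:label"))
```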

A name encoding clause is provided to explicitly specify a desired encoding for a particular reference. As with entity type specifications, this clause is enclosed by parentheses after the cell reference. The keywords to specify the three types of encoding are mm:Location, rdf:ID, and rdfs:label.

Using this clause, a specification of rdf:ID encoding for the previous drug example can be written:

  Class: @B4(rdf:ID) 
  SubClassOf: Drug 

As mentioned, MappingMaster also supports entity creation where cell values are ignored. In this case, the keyword mm:Location can be used in parentheses following a reference.

Individual: @D4(mm:Location)

By default, OWL entity names are resolved or generated using the namespace of the currently active ontology. The language includes mm:prefix and mm:namespace clauses to override this default behavior.

For example, an expression to indicate that an individual created or resolved from the contents of cell A2 (with rdfs:label resolution) should use the namespace identified by the prefix "clinical" can be written:

  Individual: @A2(mm:prefix="clinical")

Similarly, an expression to indicate that it must use the namespace "http://clinical.stanford.edu/Clinical.owl#" can be written:

  Individual: @A2(mm:namespace="http://clinical.stanford.edu/Clinical.owl#") 

Explicit namespace or prefix qualification in a reference allows disambiguation of duplicate labels in an ontology.

Reference Resolution Using Annotation Values

To support direct references to annotation values in expressions, MappingMaster's DSL adopts the Manchester Syntax mechanism of enclosing these references in single quotes.

For example, if an existing OWL class Product has an rdfs:label annotation value 'A sellable product', it can be referred to as follows:

  Class: @B4 
  SubClassOf: 'A sellable product' 

Here, 'A sellable product' will be resolved through its annotation value to the class Product when this expression is processed.

Reference Resolution Configuration Options

Document the following options:

mm:defaultPrefix, mm:defaultNamespace, mm:defaultLanguage, mm:ResolveIfOWLEntityExists, mm:SkipIfOWLEntityExists, mm:WarningIfOWLEntityExists, mm:ErrorIfOWLEntityExists, mm:CreateIfOWLEntityDoesNotExist, mm:SkipIfOWLEntityDoesNotExist, mm:WarningIfOWLEntityDoesNotExist, mm:ErrorIfOWLEntityDoesNotExist, mm:ProcessIfEmptyLabel, mm:ErrorIfEmptyLabel, mm:WarningIfEmptyLabel, mm:SkipIfEmptyLabel

Processing Cell Content

The default behavior is to directly use the contents of the referenced cell. However, this default can be overridden using an optional value specification clause.

This clause is usually indicated by the '=' character immediately after the encoding specification keyword and is followed by a parenthesis-enclosed, comma-separated list of value specifications, which are appended to each other. These value specifications can be cell references, quoted values, regular expressions containing capturing groups, or inbuilt text processing functions.

Basic Cell Content Processing

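For example, an expression that extends a reference to specify that the entity created from cell A5 is to use rdfs:label name encoding and that the name is to be the value of the cell preceded by the string "Sale:" can be written as follows: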

  Class: @A5(rdfs:label=("Sale:",@A5)) 

Value specification references are not restricted to the referenced cell itself and may indicate arbitrary cells. More than one encoding can also be specified for a particular reference, so separate identifier and label annotation values can be generated for a particular entity using the contents of different cells.

For example, we can extend the above example to assign the rdf:ID of generated classes from cell B5 as follows:

  Class: @A5(rdf:ID=@B5 rdfs:label=("Sale:",@A5))

If the assignment list includes only a single value, the opening and closing parentheses can be omitted, as with rdf:ID=@B5 above.

The language includes several inbuilt text processing methods that can be used in value specifications. At present, the supported methods are mm:replace, mm:replaceAll, mm:replaceFirst, mm:prepend, mm:append, mm:toLowerCase, mm:toUpperCase, mm:trim, mm:reverse, mm:printf, and mm:decimalFormat. These methods take zero or more arguments and return a value. Supplied arguments may be any combination of quoted strings or references.

An expression to convert the contents of cell A5 to upper case before label assignment can be written:

  Class: @A5(mm:toUpperCase(@A5)) 

A method can also have an explicit first argument omitted if the argument refers to the current location value. The previous expression can thus also be written:

  Class: @A5(mm:toUpperCase) 

Value processing functions can also be used outside of value specification clauses, but only if no such clause is used in the reference, and only a single function can be used.

decimalFormat and printf

decimalFormat and printf support formatting of textual and numerical content. Their behavior follows the standard Java specifications for the DecimalFormat class and the String.format method.

For example, mm:decimalFormat can be used as follows:

   Individual: Fred Facts: hasSalary @A1(mm:decimalFormat("###,###.00",@A1))

When the value of cell A1 is "23000.2" this will render:

   Individual: Fred Facts: hasSalary "23,000.20"

Here is an example of mm:printf:

   Class: @A1(mm:printf("A_%s",@A1))

When the value of cell A1 is "Car" this will render:

   Class: A_Car

Any parameter can be replaced with a reference clause. These functions will work with explicit rdf:ID and rdfs:label assignment too.

Note that if only one parameter is supplied the second is assumed to be the enclosing reference location.

So

   Individual: Fred Facts: hasSalary @A1(mm:decimalFormat("###,###.00"))

is equivalent to:

   Individual: Fred Facts: hasSalary @A1(mm:decimalFormat("###,###.00",@A1))

And

   Class: @A1(mm:printf("A_%s"))

is equivalent to:

   Class: @A1(mm:printf("A_%s",@A1))

which is also equivalent to:

   Class: @A1(rdf:ID=mm:printf("A_%s",@A1))

Replacing Characters

The mm:replace and mm:replaceAll functions follow from the associated methods in the standard Java String class.

For example, to strip all non-alphanumeric characters from a cell, the mm:replaceAll function can be used as follows:

  Individual: @A5 
  Facts: hasItems @B5(mm:replaceAll("[^a-zA-Z0-9]","")) 

Similarly, the mm:replace method can be used to replace commas with periods when processing literals:

  Individual: @A2 
  Facts: hasSalary @A3(xsd:float mm:replace(",",".")) 
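This kind of substitution is typically needed when a spreadsheet uses a decimal comma. A minimal Python sketch of the same idea follows; the function name and sample value are hypothetical, and Python's str.replace is used as the analogue of the literal (non-regex) Java String.replace.

```python
def to_float_literal(cell_value: str) -> float:
    """Replace decimal commas with periods, then parse the result as a float."""
    return float(cell_value.replace(",", "."))


# Hypothetical contents of cell A3, written with a decimal comma.
print(to_float_literal("1234,5"))  # 1234.5
```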

Prepending and Appending

The mm:prepend method can be used as follows to simplify the above example:

  Class: @A5(rdfs:label=mm:prepend("Sale:")) 

The expression can be further simplified by omitting the explicit rdfs:label qualification if it is the default:

  Class: @A5(mm:prepend("Sale:")) 

The mm:append method works similarly.

Here is an example of the mm:append function:

  Individual: @A2(mm:append("_MM")) 

Extracting Values Using Regular Expressions

A similar approach can be used to selectively extract values from referenced cells. A regular expression groups clause is provided and can be used in any position in a value specification clause. This clause is contained in a quoted string enclosed by square brackets. For example, if cell A5 in a spreadsheet contains the string "Pfizer:Zyvox" but only the text following the ':' character is to be used in the label encoding, an appropriate capture expression could be written as:

  Class: @A5(rdfs:label=[":(\S+)"]) 

Note that parentheses around the sub-expressions in a regular expression clause specify capture groups and indicate that the matched strings are to be extracted. In some cases, more than one group may be matched for a cell value, in which case the matched strings are extracted in the order that they are matched and are appended to each other.

Capturing groups can also be used to generate literals. For example, if cell A2 in a spreadsheet has a person's forename, middle initial, and surname separated by a single space, three capturing expressions can be used to selectively extract each name portion and separately assign them to different properties as follows:

  Individual: @A2 
  Types: Person 
  Facts: hasForename @A2(["(\S+)"]),hasInitial @A2(["\S+\s(\S+)"]),hasSurname @A2(["\S+\s\S+\s(\S+)"]) 
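The patterns in these examples can be tried out with any regex engine. The Python sketch below applies them with re.search and concatenates the captured groups in order, as the text describes; the helper name capture and the sample name are assumptions, and treating the pattern as a search rather than a full match is also an assumption about how the processor applies it.

```python
import re


def capture(cell_value: str, pattern: str) -> str:
    """Concatenate all capture groups of the first match, in order."""
    match = re.search(pattern, cell_value)
    if match is None:
        return ""
    return "".join(group for group in match.groups() if group)


name = "John Q Public"  # hypothetical contents of cell A2
print(capture(name, r"(\S+)"))              # John
print(capture(name, r"\S+\s(\S+)"))         # Q
print(capture(name, r"\S+\s\S+\s(\S+)"))    # Public
print(capture("Pfizer:Zyvox", r":(\S+)"))   # Zyvox
```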

A similar example to separately extract two space-separated integers from a cell can be written as:

  Individual: @A2 
  Types: Person 
  Facts: hasMin @A2(xsd:int ["(\d+)\s+"]),hasMax @A2(xsd:int ["\s+(\d+)"]) 

If the hasMin and hasMax properties are of type xsd:int then the explicit qualification is not required here.

Capturing expressions can also be invoked via the mm:capturing function:

  Individual: @A2 
  Types: Person 
  Facts: hasForename @A2(mm:capturing("(\S+)"))

The syntax of capturing expressions follows that supported by the Java Pattern class.

Literals

Mapping Master currently supports the following datatypes:

xsd:string, xsd:boolean, xsd:byte, xsd:short, xsd:int, xsd:long, xsd:float, xsd:double, xsd:integer, xsd:decimal, xsd:dateTime, xsd:date, xsd:time, xsd:Duration, rdf:PlainLiteral, rdf:XMLLiteral

IRIs

Mapping Master has several directives to customize the IRI creation process.

  • mm:iri: Use the resolved reference value to generate an IRI. An error will be thrown if the generated value does not represent a valid IRI.
  • mm:camelCaseEncode
  • mm:snakeCaseEncode
  • mm:uuidEncode
  • mm:hashEncode

Missing Value Handling

To deal with missing cell values, default values can also be specified in references. A default value clause is provided to assign these values. This clause is indicated by the keywords mm:DefaultLocationValue, mm:DefaultLiteral, mm:DefaultLabel, and mm:DefaultID followed by an assignment to a string. For example, the following expression uses this clause to indicate that the value "Unknown" should be used as the created class label if cell A5 is empty:

  Class: @A5(rdfs:label mm:DefaultLabel="Unknown") 

Additional behaviors are also supported to deal with missing cell values. The default behavior is to skip an entire expression if it contains any references with empty cells. Four keywords are supplied to modify this behavior. These keywords indicate that:

  • An error should be thrown if a cell value is missing and the mapping process should be stopped (mm:ErrorIfEmptyLocation)
  • Expressions containing references with empty cells should be skipped (mm:SkipIfEmptyLocation)
  • Expressions containing references with empty cells should generate a warning in addition to being skipped (mm:WarningIfEmptyLocation)
  • Expressions containing such empty cells should be processed (mm:ProcessIfEmptyLocation).

The last option allows processing of spreadsheets that may contain a large number of missing values. The option indicates that the language processor should, if possible, conservatively drop the sub-expression containing the empty reference rather than dropping the entire expression.

Consider the following expression declaring an individual from cell A5 of a spreadsheet and associating a property hasAge with it using the value in cell A6:

  Individual: @A5 
  Facts: hasAge @A6(mm:ProcessIfEmptyLocation) 

Using a similar approach, more fine-grained empty value handling is also supported to specify different empty value handling behaviors for mm:Literal, rdf:ID and rdfs:label values. Here, the label directives are mm:ErrorIfEmptyLabel, mm:SkipIfEmptyLabel, mm:WarningIfEmptyLabel, and mm:ProcessIfEmptyLabel, with equivalent keywords for RDF identifier and literal handling. These are mm:ErrorIfEmptyID, mm:SkipIfEmptyID, mm:WarningIfEmptyID, mm:ProcessIfEmptyID and mm:ErrorIfEmptyLiteral, mm:SkipIfEmptyLiteral, mm:WarningIfEmptyLiteral, mm:ProcessIfEmptyLiteral.

Location Shifting

One additional option is provided to deal with empty cell values. This option is targeted at the common case in many spreadsheets where a particular cell is supplied with a value and all empty cells below it are implied to have the same value. In this case, when these empty cells are being processed, their location must be shifted to the location above them that contains a value. For example, the following expression uses the mm:ShiftUp keyword to indicate that if cell A5 does not contain a value for the name of the declared class then the row number must be shifted upwards until a value is found:

  Class: @A5(mm:ShiftUp) 

If no value is found, normal empty value handling is applied. Similar directives provide for shifting down (mm:ShiftDown), to the left (mm:ShiftLeft), and to the right (mm:ShiftRight).

Iterating Over a Range of Cells in a Reference

Obviously, most mappings will not just reference individual cells but will instead iterate over a range of columns or rows in a spreadsheet. The wildcard character '*' can then be used in references to refer to the current column and/or row in an iteration. MappingMaster provides a graphical interface to specify these ranges. (They will soon be supported in the DSL.)

Example references using this wildcard notation include:

  • @A3
  • @A*
  • @**

For example, an expression that iterates over the grid D4 to G6 to create an individual of class Sale for each cell can be written:

  Individual: @** 
  Types: Sale

This expression can be extended to assign property values to these individuals:

  Individual: @** 
  Types: Sale 
  Facts: hasAmount @**,hasProduct @B*,hasState @*2 

Manchester Syntax Coverage

The DSL does not support the entire Manchester Syntax. The following clauses are not currently supported:

  • OWL object property declarations
  • OWL data property declarations
  • OWL annotation property declarations
  • OWL datatype declarations
  • OWL literal type qualification
  • OWL disjoint classes
  • OWL equivalent and disjoint properties
  • OWL negative property assertions
  • OWL has key

Configuration Options

A set of global defaults can be specified for reference directives. The language has a number of clauses to specify these defaults.

The following examples illustrate the use of these clauses together with the current defaults.

  • mm:DefaultReferenceType: Current default is Class. Other possible values include NamedIndividual, AnnotationProperty, and any XSD datatype.
  • mm:DefaultPropertyType: Current default is ObjectProperty. Other possible values are DataProperty and AnnotationProperty.
  • mm:DefaultPropertyValueType: Current default is xsd:string. If we are expecting a (data or annotation) property value, use xsd:string.
  • mm:DefaultDataPropertyValueType: Current default is xsd:string. Other possible values include any XSD datatype.
  • mm:DefaultValueEncoding: Current default is rdf:ID. Other possible values are rdfs:label, mm:Literal and mm:Location.
  • mm:DefaultIRIEncoding: Current default is mm:CamelCaseEncoding. Other possible values are mm:NoEncode, mm:NoSnakeCaseEncode, mm:UUIDEncode and mm:HashEncode.
  • mm:DefaultShiftSetting: Current default is mm:NoShift. Other possible values are mm:ShiftUp, mm:ShiftDown, mm:ShiftLeft, mm:ShiftRight.
  • mm:DefaultEmptyLocationSetting: Current default is mm:WarningIfEmptyLocation.
  • mm:DefaultEmptyLiteralSetting: Current default is mm:WarningIfEmptyLiteral.
  • mm:DefaultEmptyRDFIDSetting: Current default is mm:WarningIfEmptyRDFID.
  • mm:DefaultEmptyRDFSLabelSetting: Current default is mm:WarningIfEmptyRDFSLabel.
  • mm:DefaultIfOWLEntityExistsSetting: Current default is mm:ResolveIfOWLEntityExists.
  • mm:DefaultIfOWLEntityDoesNotExistSetting: Current default is mm:CreateIfOWLEntityDoesNotExist.
  • mm:DefaultLocationValue: Current default is "".
  • mm:DefaultLiteralValue
  • mm:DefaultRDFID
  • mm:DefaultRDFSLabel
  • mm:DefaultLanguage
  • mm:DefaultPrefix
  • mm:DefaultNamespace: Current default is "".

Summary

The MappingMaster DSL allows OWL axioms and entities to be created from spreadsheet content. The use of the Manchester Syntax allows these OWL entities to be related to each other in complex ways.

Declaratively specifying mappings in this way has several advantages. Writing these mappings does not require any programming or scripting expertise. The mappings can be shared easily using the MappingMaster GUI, which can save and load them. They can also easily be executed repeatedly on different spreadsheets with the same structure.
