RELAX NG derived Schema for OpenMath

The schema

Currently we maintain two different schemas for OpenMath, a DTD (omobj.dtd) and a W3C XML Schema (omobj.xsd). Having two schemas causes maintenance problems, and some features of the language are not captured by either schema, notably that an OMF must have exactly one attribute, either hex or dec.

This is an experiment to describe OpenMath 1 using Relax NG, which is able to express more constraints than either XSD or DTD, and then to derive the XSD and DTD files automatically from the Relax NG schema (using James Clark's trang convertor).
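
As an illustration of a constraint that Relax NG can state directly, the OMF rule mentioned above (exactly one of hex or dec) might be sketched as follows. The pattern names and the datatype choices here are assumptions made for illustration; they are not copied from openmath1.rnc, which may express the rule differently.

```rnc
# Sketch only: pattern names and datatypes are illustrative assumptions.
default namespace = "http://www.openmath.org/OpenMath"

start = OMF

OMF = element OMF { attlist.OMF, empty }

# A choice between two attribute patterns: an OMF with neither attribute,
# or with both, does not match.
attlist.OMF =
  attribute dec { xsd:double }
  | attribute hex { xsd:string { pattern = "[0-9A-F]+" } }
```

A DTD can only declare each attribute as optional or required on its own, which is why a choice of this kind is lost when the schema is converted to DTD.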

More information on Relax NG, and links to implementations of validators and convertors for this schema format, are available from the Relax NG home page.

Example Documents

The following table links to some example OpenMath XML files, and indicates whether they are valid or invalid according to these schemas (as reported by Jing, rxp or xsv).

File         Note                                                           RNG (jing)  DTD (rxp)  XSD (xsv)
ombad1.xml   Attribute and element content not matching regexp constraints  invalid     valid      invalid
ombad2.xml   Attributed "variable" that is an OMI                           invalid     valid      invalid
ombad3.xml   OMF with no attribute, or with both dec and hex                invalid     valid      valid
ombad4.xml   No namespace                                                   invalid     valid      invalid
omgood1.xml  Small valid example                                            valid       valid      valid
omgood2.xml  Valid attributed bound variable                                valid       valid      valid

Conversion Scripts

Converting the RNC format to RNG and XSD is simply a matter of running:

trang openmath1.rnc openmath1.rng
trang openmath1.rnc openmath1.xsd

To produce a DTD, the RNC needs to be simplified slightly before using trang, because of documented restrictions in trang's DTD support. The script currently just uses sed for this and is available here.

CD specific schema

Relax NG allows one to state constraints such as:

If the cd attribute of OMS is arith1 then the name attribute must be one of lcm, gcd, plus, ...

Thus it is easy to use a stylesheet, cd2rnc.xsl, to generate, for any given CD, a Relax NG schema expressing the constraint that an OMS naming that CD may only use symbols defined in that dictionary.

The modularisation mechanisms of Relax NG then allow one to include these schemas for all the CDs that you want to allow, and to replace the regexp-based validation of the OMS attributes by explicit lists of allowed CD names and, for each CD name, a list of allowed symbol names.

A typical CD-specific Relax NG schema would be arith1.rnc, which looks like:

cd.attlist.OMS |= 
  attribute cd {string "arith1" },
  attribute name {
    string "lcm" |
    string "gcd" |
    string "plus" |
    string "unary_minus" |
    string "minus" |
    string "times" |
    string "divide" |
    string "power" |
    string "abs" |
    string "root" |
    string "sum" |
    string "product" }

To build a schema that allows this CD we just need to include the openmath1 schema described above, override the attribute declarations for OMS, and then include the schema for each CD that we wish to allow. omextra.rnc does this for all the CDs in the "extra" collection available from this site (that is, all the CDs other than contributed CDs).

Its format is as shown below, and may be trivially edited to allow a different list of CDs for specific purposes.

include "openmath1.rnc" {
 attlist.OMS = cd.attlist.OMS}

include "alg1.rnc"
include "altenc.rnc"
include "arith1.rnc"
include "arith2.rnc"
include "bigfloat1.rnc"
include "calculus1.rnc"
include "cc.rnc"
include "coercions.rnc"
include "combinat1.rnc"
include "complex1.rnc"
include "dimensions1.rnc"
include "ecc.rnc"
include "error.rnc"
include "fns1.rnc"
include "fns2.rnc"
include "group1.rnc"
include "icc.rnc"
include "indnat.rnc"
include "integer1.rnc"
include "interval1.rnc"
include "lc.rnc"
include "limit1.rnc"
include "linalg1.rnc"
include "linalg2.rnc"
include "linalg3.rnc"
include "linalg4.rnc"
include "linalg5.rnc"
include "list1.rnc"
include "list2.rnc"
include "logic1.rnc"
include "mathmltypes.rnc"
include "meta.rnc"
include "metagrp.rnc"
include "metasig.rnc"
include "minmax1.rnc"
include "moreerrors.rnc"
include "multiset1.rnc"
include "nums1.rnc"
include "omtypes.rnc"
include "opnode.rnc"
include "permgrp.rnc"
include "permut1.rnc"
include "physical_consts1.rnc"
include "piece1.rnc"
include "poly.rnc"
include "polyd.rnc"
include "polyr.rnc"
include "polyslp.rnc"
include "polysts.rnc"
include "polyu.rnc"
include "quant1.rnc"
include "relation0.rnc"
include "relation1.rnc"
include "rounding1.rnc"
include "s_data1.rnc"
include "s_dist1.rnc"
include "semigroup.rnc"
include "set1.rnc"
include "set2.rnc"
include "setname1.rnc"
include "setname2.rnc"
include "setoid.rnc"
include "sigma.rnc"
include "sts.rnc"
include "transc1.rnc"
include "transc2.rnc"
include "transc3.rnc"
include "typesorts.rnc"
include "units_imperial1.rnc"
include "units_metric1.rnc"
include "veccalc1.rnc"
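
For instance, a schema restricted to just two of the CDs listed above could be written as follows. This is a minimal sketch: the file name omsmall.rnc is hypothetical, and it assumes that arith1.rnc and transc1.rnc have been generated with cd2rnc.xsl as described earlier.

```rnc
# omsmall.rnc (hypothetical name): allow only arith1 and transc1 symbols.
include "openmath1.rnc" {
  # Replace the regexp-based OMS attributes with the enumerated lists.
  attlist.OMS = cd.attlist.OMS
}

# Each included file adds one alternative to cd.attlist.OMS via "|=".
include "arith1.rnc"
include "transc1.rnc"
```

Because every CD file contributes to cd.attlist.OMS with the |= combine operator, adding or removing a CD only requires adding or removing its include line.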


More CD specific schema

Initial comments on this page have been generally positive but people seem to want more constraints still...

cdsts2rnc.xsl is a modification of the above stylesheet that also inspects the associated STS file for a CD. It builds two enumerated lists in the schema, one for general symbols and one for symbols that should only be used as keys in attributions.

For example the "type" symbol in the sts CD is such a symbol, and sts-sts.rnc has the form:

cd.attlist.OMS |= 
  attribute cd {string "sts" },
  attribute name {
    string "mapsto" |
    string "nary" |
    string "nassoc" |
    string "error" |
    string "structure" |
    string "binder" |
    string "attribution" |
    string "Object" |
    string "NumericalValue" |
    string "SetNumericalValue" }

cd.attlist.OMS.attrib |= 
  attribute cd {string "sts" },
  attribute name {
    string "type" }

The schema omextra-sts.rnc is shown below. It redefines OMATP so that the only OMS elements allowed as keys are those whose attributes match the cd.attlist.OMS.attrib pattern, and then includes all the CD-specific schemas as before.


default namespace = "http://www.openmath.org/OpenMath"

include  "openmath1.rnc" {
 attlist.OMS = cd.attlist.OMS

OMATP = element OMATP { (element OMS {cd.attlist.OMS.attrib}, omel)+ }
}

include "alg1-sts.rnc"
include "altenc-sts.rnc"
include "arith1-sts.rnc"
include "arith2-sts.rnc"
include "bigfloat1-sts.rnc"
include "calculus1-sts.rnc"
include "cc-sts.rnc"
include "coercions-sts.rnc"
include "combinat1-sts.rnc"
include "complex1-sts.rnc"
include "dimensions1-sts.rnc"
include "ecc-sts.rnc"
include "error-sts.rnc"
include "fns1-sts.rnc"
include "fns2-sts.rnc"
include "group1-sts.rnc"
include "icc-sts.rnc"
include "indnat-sts.rnc"
include "integer1-sts.rnc"
include "interval1-sts.rnc"
include "lc-sts.rnc"
include "limit1-sts.rnc"
include "linalg1-sts.rnc"
include "linalg2-sts.rnc"
include "linalg3-sts.rnc"
include "linalg4-sts.rnc"
include "linalg5-sts.rnc"
include "list1-sts.rnc"
include "list2-sts.rnc"
include "logic1-sts.rnc"
include "mathmltypes-sts.rnc"
include "meta-sts.rnc"
include "metagrp-sts.rnc"
include "metasig-sts.rnc"
include "minmax1-sts.rnc"
include "moreerrors-sts.rnc"
include "multiset1-sts.rnc"
include "nums1-sts.rnc"
include "omtypes-sts.rnc"
include "opnode-sts.rnc"
include "permgrp-sts.rnc"
include "permut1-sts.rnc"
include "physical_consts1-sts.rnc"
include "piece1-sts.rnc"
include "poly-sts.rnc"
include "polyd-sts.rnc"
include "polyr-sts.rnc"
include "polyslp-sts.rnc"
include "polysts-sts.rnc"
include "polyu-sts.rnc"
include "quant1-sts.rnc"
include "relation0-sts.rnc"
include "relation1-sts.rnc"
include "rounding1-sts.rnc"
include "s_data1-sts.rnc"
include "s_dist1-sts.rnc"
include "semigroup-sts.rnc"
include "set1-sts.rnc"
include "set2-sts.rnc"
include "setname1-sts.rnc"
include "setname2-sts.rnc"
include "setoid-sts.rnc"
include "sigma-sts.rnc"
include "sts-sts.rnc"
include "transc1-sts.rnc"
include "transc2-sts.rnc"
include "transc3-sts.rnc"
include "typesorts-sts.rnc"
include "units_imperial1-sts.rnc"
include "units_metric1-sts.rnc"
include "veccalc1-sts.rnc"

The example ombad6.xml is declared valid by the openmath1.rnc and omextra.rnc schemas but invalid by omextra-sts.rnc, as <OMS name="plus" cd="arith1"/> is not allowed as an odd-numbered child of OMATP, and conversely <OMS name="type" cd="sts"/> is only allowed in such a position.