XML Schematron Editor
Schematron is a rule-based validation language for making assertions about the presence or absence of patterns in XML trees. It is a structural schema language expressed in XML using a small number of elements and XPath.
Schematron can express constraints that other XML schema languages, such as XML Schema and DTD, cannot. For example, it can require that the content of an element be controlled by one of its siblings. It can also request or require that the root element, whatever element that is, have specific attributes. Schematron can also specify required relationships between multiple XML files.
Constraints and content rules may be associated with “plain-English” validation error messages, allowing translation of numeric Schematron error codes into meaningful user error messages.
XML Schematron Editor in Workgroup works as follows:
The Schematron schema language differs from most other XML schema languages in that it is a rule-based language that uses path expressions instead of grammars. This means that instead of creating a grammar for an XML document, a Schematron schema makes assertions that are applied to a specific context within the document. If the assertion fails, a diagnostic message that is supplied by the author of the schema can be displayed.
One advantage of a rule-based approach is that in many cases the Schematron rules can be created simply by modifying the desired constraint written in plain English. For example, a simple content model can be written like this: “The Person element should in the XML instance document have an attribute Title and contain the elements Name and Gender in that order. If the value of the Title attribute is ‘Mr’ the value of the Gender element must be ‘Male’.”
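To make the content model concrete, here is a minimal sketch of an instance document that would satisfy the constraint described above. The element values are hypothetical and chosen only for illustration.

```xml
<!-- Hypothetical instance document; the values are illustrative only -->
<Person Title="Mr">
  <Name>Eddie</Name>
  <Gender>Male</Gender>
</Person>
```

Changing the Gender value to Female, or swapping the order of Name and Gender, would violate the constraint.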
In this sentence the context in which the assertions should be applied is clearly stated as the Person element, while there are four different assertions:
- The context element (Person) should have an attribute Title
- The context element should contain two child elements, Name and Gender
- The child element Name should appear before the child element Gender
- If the attribute Title has the value ‘Mr’, the element Gender must have the value ‘Male’

To implement the path expressions used in the rules, Schematron uses XPath with various extensions provided by XSLT.
As already mentioned, Schematron makes assertions based on a specific context in a document. The assertions and the context make up two of the four layers in Schematron’s fixed four-layer hierarchy: phases, patterns, rules (which define the context), and assertions.
The bottom layer in the hierarchy is the assertions, which specify the constraints that should be checked within a specific context of the XML instance document. In a Schematron schema, the typical element used to define assertions is assert. The assert element has a test attribute, whose value is an XPath expression that is evaluated as true or false. In the preceding example, four assertions were made on the document in order to specify the content model, namely:
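- The context element (Person) should have an attribute Title
- The context element should contain two child elements, Name and Gender
- The child element Name should appear before the child element Gender
- If the attribute Title has the value ‘Mr’, the element Gender must have the value ‘Male’

Written using Schematron assertions, this would be expressed as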
Type | Test | Text |
---|---|---|
Assert | @Title | The element Person must have a Title attribute. |
Assert | count(*) = 2 and count(Name) = 1 and count(Gender) = 1 | The element Person should have the child elements Name and Gender. |
Assert | *[1] = Name | The element Name must appear before element Gender. |
Assert | (@Title = 'Mr' and Gender = 'Male') or @Title != 'Mr' | If the Title is “Mr” then the gender of the person must be “Male”. |
If you are familiar with XPath, these assertions are easy to understand, and even with limited XPath experience they are fairly straightforward. The first assertion simply tests for the occurrence of an attribute Title. The second assertion tests that the total number of children is equal to 2 and that there is one Name element and one Gender element. The third assertion tests that the first child element is Name. Strictly speaking, it compares the string values of the two node sets, so a stricter form of the same test would be *[1][self::Name]. The last assertion tests that if the person’s title is ‘Mr’, the gender of the person must be ‘Male’.
If the condition in the test attribute is not fulfilled, the content of the assertion element is displayed to the user.
Each of these assertions has a condition that is evaluated, but the assertion does not define where in the XML instance document this condition should be checked. For example, the first assertion tests for the occurrence of the attribute Title, but it does not specify which element in the XML instance document the assertion applies to. The next layer in the hierarchy, the rules, specifies the location of the contexts of assertions.
The Assert type is used to tag positive assertions about a document: its message is reported when the test fails.
The Report type is used to tag negative assertions about a document: its message is reported when the test succeeds.
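The difference between the two types can be sketched with a pair of elements that check the same condition from opposite directions. This is an illustrative fragment only: the report message is hypothetical, and both elements would normally appear as children of a rule element that supplies the context.

```xml
<!-- assert: the message is output when the test evaluates to false -->
<assert test="@Title">The element Person must have a Title attribute.</assert>

<!-- report: the message is output when the test evaluates to true -->
<!-- (hypothetical message, not taken from the original example) -->
<report test="not(@Title)">The element Person has no Title attribute.</report>
```

Both elements flag the same documents here. Which one reads better usually depends on whether the condition is easier to state as what must hold or as what must not occur.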
The rules in Schematron are declared by using the rule element, which has a context attribute. The value of the context attribute must be an XPath expression that is used to select one or more nodes in the document. As the name suggests, the context attribute specifies the context in the XML instance document where the assertions should be applied. In the previous example the context was specified to be the Person element, and a Schematron rule with the Person element as context would simply be
Id | Abstract | Context |
---|---|---|
(none) | False | Person |
Since rules group together all assertions that share the same context, the assertions are declared as children of the rule element. For the previous example, this means that the complete Schematron rule would be
The element Person must have a Title attribute.
The element Person should have the child elements Name and Gender.
The element Name must appear before element Gender.
If the Title is "Mr" then the gender of the person must be "Male".
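In raw Schematron markup, the complete rule might look like the following sketch. It assumes the Schematron 1.5 namespace, which matches the name attribute on pattern described in this article; ISO Schematron uses a different namespace. The test expressions and messages are taken from the assertion table earlier in this article.

```xml
<!-- Sketch only: assumes the Schematron 1.5 namespace -->
<rule context="Person" xmlns="http://www.ascc.net/xml/schematron">
  <assert test="@Title">The element Person must have a Title attribute.</assert>
  <assert test="count(*) = 2 and count(Name) = 1 and count(Gender) = 1">The element Person should have the child elements Name and Gender.</assert>
  <assert test="*[1] = Name">The element Name must appear before element Gender.</assert>
  <assert test="(@Title = 'Mr' and Gender = 'Male') or @Title != 'Mr'">If the Title is "Mr" then the gender of the person must be "Male".</assert>
</rule>
```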
This means that all the assertions in the rule will be tested on every Person element in the XML instance document. If the context should not be all the Person elements, it is easy to change the XPath location path to define a more restricted context. The value Database/Person, for example, sets the context to be all the Person elements that have the element Database as their parent.
The third layer in the Schematron hierarchy is the pattern, declared using the pattern element, which is used to group together different rules. The pattern element also has a name attribute that will be displayed in the output when the pattern is checked. For the preceding assertions, you could have two patterns: one for checking the structure and another for checking the co-occurrence constraint. Since patterns group different rules together, Schematron is designed so that rules are declared as children of the pattern element. This means that the previous example, using the two patterns, would look like
The element Person must have a Title attribute.
The element Person should have the child elements Name and Gender.
The element Name must appear before element Gender.
If the Title is "Mr" then the gender of the person must be "Male".
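As raw markup, the two patterns could be sketched as below. The pattern names are the ones that appear in the diagnostic output shown in this article. The enclosing schema element and the Schematron 1.5 namespace are assumptions added to make the fragment a complete, well-formed document.

```xml
<!-- Sketch only: schema wrapper and Schematron 1.5 namespace are assumed -->
<schema xmlns="http://www.ascc.net/xml/schematron">
  <!-- First pattern: structural checks on Person -->
  <pattern name="Check structure">
    <rule context="Person">
      <assert test="@Title">The element Person must have a Title attribute.</assert>
      <assert test="count(*) = 2 and count(Name) = 1 and count(Gender) = 1">The element Person should have the child elements Name and Gender.</assert>
      <assert test="*[1] = Name">The element Name must appear before element Gender.</assert>
    </rule>
  </pattern>
  <!-- Second pattern: the Title/Gender co-occurrence constraint -->
  <pattern name="Check co-occurrence constraints">
    <rule context="Person">
      <assert test="(@Title = 'Mr' and Gender = 'Male') or @Title != 'Mr'">If the Title is "Mr" then the gender of the person must be "Male".</assert>
    </rule>
  </pattern>
</schema>
```

Splitting the checks this way keeps the structural failures and the co-occurrence failures under separate headings in the output.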
The name of the pattern will always be displayed in the output, regardless of whether the assertions fail or succeed. If the assertion fails, the output will also contain the content of the assertion element. Additional information is displayed together with the assertion text to help you locate the source of the failed assertion. For example, if the co-occurrence constraint above was violated by having Title=‘Mr’ and Gender=‘Female’, then the following diagnostic would be generated by Schematron:
From pattern "Check structure":
From pattern "Check co-occurrence constraints":
Assertion fails: "If the Title is "Mr" then the gender of the person must be "Male"."
at /Person[1]
<Person Title="Mr">...</>
The pattern names are always displayed, while the assertion text is only displayed when the assertion fails. The additional information starts with an XPath expression that shows the location of the context element in the instance document (in this case the first Person element), and then on a new line the start tag of the context element is displayed.
The assertion to test the co-occurrence constraint is not trivial, and in fact this rule could be written in a simpler way by using an XPath predicate when selecting the context. Instead of setting the context to all Person elements, the rule can restrict the context to only the Person elements that have the attribute Title=‘Mr’. If the rule were specified using this technique, the co-occurrence constraint could be described like this
If the Title is "Mr" then the gender of the person must be "Male".
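A minimal sketch of the simplified rule follows, again assuming the Schematron 1.5 namespace. The predicate in the context attribute selects only Person elements whose Title is 'Mr', so the assertion itself only needs to check Gender.

```xml
<!-- Sketch only: assumes the Schematron 1.5 namespace -->
<rule context="Person[@Title='Mr']" xmlns="http://www.ascc.net/xml/schematron">
  <assert test="Gender = 'Male'">If the Title is "Mr" then the gender of the person must be "Male".</assert>
</rule>
```

Person elements with any other Title, or with no Title at all, never match this context, so the assertion is simply not evaluated for them.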
By moving some of the logic from the assertion to the specification of the context, the complexity of the rule has been decreased. This technique is often very useful when writing Schematron schemas.
*[Reference: www.xml.com/pub/a/2003/11/12/schematron.html]