http://www.w3schools.com/schema/schema_howto.asp
What is an XML Schema?
The purpose of an XML Schema is to define the legal building blocks of an XML
document, just like a DTD.
An XML Schema:
- defines elements that can appear in a document
- defines attributes that can appear in a document
- defines which elements are child elements
- defines the order of child elements
- defines the number of child elements
- defines whether an element is empty or can include text
- defines data types for elements and attributes
- defines default and fixed values for elements and attributes
Well-Formed is not Enough
A well-formed XML document is a document that conforms to the XML syntax
rules, like:
- it should begin with the XML declaration (optional in XML 1.0, but recommended)
- it must have one unique root element
- start-tags must have matching end-tags
- element names are case-sensitive
- all elements must be closed
- all elements must be properly nested
- all attribute values must be quoted
- entity references must be used for special characters such as < and &
The <schema> element may contain some attributes. A schema declaration
often looks something like this:
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.w3schools.com"
xmlns="http://www.w3schools.com"
elementFormDefault="qualified">
...
...
</xs:schema>
The following fragment:
xmlns:xs="http://www.w3.org/2001/XMLSchema"
indicates that the elements and data types used in the schema come from the
"http://www.w3.org/2001/XMLSchema" namespace. It also specifies that names
from this namespace are written with the xs: prefix.
This fragment:
targetNamespace="http://www.w3schools.com"
indicates that the elements defined by this schema (note, to, from, heading,
body) come from the "http://www.w3schools.com" namespace.
This fragment:
xmlns="http://www.w3schools.com"
indicates that the default namespace is "http://www.w3schools.com".
This fragment:
elementFormDefault="qualified"
indicates that any elements used by the XML instance document which were
declared in this schema must be namespace qualified.
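As a rough sketch of how an instance document might point at such a schema,
the example below assumes the schema is saved as "note.xsd" (a hypothetical
file name) and declares the note, to, from, heading, and body elements
mentioned above. The element values are made up for illustration:

```xml
<?xml version="1.0"?>
<!-- Sketch only: "note.xsd" and the element values are assumptions -->
<note xmlns="http://www.w3schools.com"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3schools.com note.xsd">
  <!-- Every element is in the default namespace, as elementFormDefault="qualified" requires -->
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
```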
What is a Simple Element?
A simple element is an XML element that can contain only text. It cannot
contain any other elements or attributes.
Defining a Simple Element
The syntax for defining a simple element is:
<xs:element name="xxx" type="yyy"/>
XML Schema has a lot of built-in data types. The most common types are:
- xs:string
- xs:decimal
- xs:integer
- xs:boolean
- xs:date
- xs:time
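A minimal sketch of how some of these types might be used. The element names
lastname, age, and dateborn and their values are hypothetical:

```xml
<!-- Schema declarations (hypothetical element names) -->
<xs:element name="lastname" type="xs:string"/>
<xs:element name="age" type="xs:integer"/>
<xs:element name="dateborn" type="xs:date"/>

<!-- Matching elements in an instance document -->
<lastname>Refsnes</lastname>
<age>36</age>
<dateborn>1970-03-27</dateborn>
```

Note that xs:date values are written as YYYY-MM-DD.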
Default and Fixed Values for Simple Elements
Simple elements may have a default value OR a fixed value specified.
A default value is automatically assigned to the element when no other value
is specified.
In the following example the default value is "red":
<xs:element name="color" type="xs:string" default="red"/>
A fixed value is also automatically assigned to the element, and you cannot
specify another value.
In the following example the fixed value is "red":
<xs:element name="color" type="xs:string" fixed="red"/>
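To make the difference concrete, the instance fragments below sketch how a
validating processor would be expected to treat the "color" element under each
declaration. The comments describe the intended outcome rather than the output
of any particular tool:

```xml
<!-- With default="red" -->
<color/>               <!-- empty, so the value is taken to be "red" -->
<color>blue</color>    <!-- allowed, the value is "blue" -->

<!-- With fixed="red" -->
<color/>               <!-- empty, so the value is taken to be "red" -->
<color>red</color>     <!-- allowed, matches the fixed value -->
<color>blue</color>    <!-- not valid, differs from the fixed value -->
```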
How to Define an Attribute?
The syntax for defining an attribute is:
<xs:attribute name="xxx" type="yyy"/>
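As an illustration, an attribute declaration normally sits inside a complex
type. The sketch below assumes a hypothetical "lastname" element that carries
text plus a "lang" attribute. Attributes are optional by default, and
use="required" makes one mandatory:

```xml
<xs:element name="lastname">
  <xs:complexType>
    <xs:simpleContent>
      <!-- text content of type xs:string, extended with one attribute -->
      <xs:extension base="xs:string">
        <xs:attribute name="lang" type="xs:string" use="required"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>

<!-- A matching instance element -->
<lastname lang="EN">Smith</lastname>
```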
Restrictions are used to define acceptable values for XML
elements or attributes. Restrictions on XML elements are called facets.
Restrictions on Values
The following example defines an element called "age" with a restriction. The
value of age cannot be lower than 0 or greater than 120:
<xs:element name="age">
<xs:simpleType>
<xs:restriction base="xs:integer">
<xs:minInclusive value="0"/>
<xs:maxInclusive value="120"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
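A few instance values, sketched against the "age" declaration above, show
where the boundaries fall:

```xml
<age>0</age>      <!-- valid: the lower bound is inclusive -->
<age>35</age>     <!-- valid -->
<age>120</age>    <!-- valid: the upper bound is inclusive -->
<age>121</age>    <!-- not valid: greater than 120 -->
<age>-1</age>     <!-- not valid: lower than 0 -->
<age>35.5</age>   <!-- not valid: not an xs:integer -->
```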
Restrictions on a Set of Values
To limit the content of an XML element to a set of acceptable values, we
would use the enumeration constraint.
The example below defines an element called "car" with a restriction. The
only acceptable values are: Audi, Golf, BMW:
<xs:element name="car">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:enumeration value="Audi"/>
<xs:enumeration value="Golf"/>
<xs:enumeration value="BMW"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
The example above could also have been written like this:
<xs:element name="car" type="carType"/>
<xs:simpleType name="carType">
<xs:restriction base="xs:string">
<xs:enumeration value="Audi"/>
<xs:enumeration value="Golf"/>
<xs:enumeration value="BMW"/>
</xs:restriction>
</xs:simpleType>
Note: In this case the type "carType" can be used by other elements
because it is not a part of the "car" element.
Restrictions on a Series of Values
To restrict the content of an XML element to a particular series of numbers
or letters, we would use the pattern constraint.
The example below defines an element called "letter" with a restriction. The
only acceptable value is ONE of the LOWERCASE letters from a to z:
<xs:element name="letter">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[a-z]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
The next example defines an element called "initials" with a restriction. The
only acceptable value is THREE of the UPPERCASE letters from A to Z:
<xs:element name="initials">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[A-Z][A-Z][A-Z]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
The next example also defines an element called "initials" with a
restriction. The only acceptable value is THREE of the LOWERCASE OR UPPERCASE
letters from a to z:
<xs:element name="initials">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[a-zA-Z][a-zA-Z][a-zA-Z]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
The next example defines an element called "choice" with a restriction. The
only acceptable value is ONE of the following letters: x, y, OR z:
<xs:element name="choice">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[xyz]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
The next example defines an element called "prodid" with a restriction. The
only acceptable value is FIVE digits in a sequence, and each digit must be in a
range from 0 to 9:
<xs:element name="prodid">
<xs:simpleType>
<xs:restriction base="xs:integer">
<xs:pattern value="[0-9][0-9][0-9][0-9][0-9]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
Other Restrictions on a Series of Values
The example below defines an element called "letter" with a restriction. The
acceptable value is zero or more occurrences of lowercase letters from a to
z:
<xs:element name="letter">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="([a-z])*"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
The next example also defines an element called "letter" with a restriction.
The acceptable value is one or more pairs of letters, each pair consisting of a
lower case letter followed by an upper case letter. For example, "sToP" will be
validated by this pattern, but not "Stop" or "STOP" or "stop":
<xs:element name="letter">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="([a-z][A-Z])+"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
The next example defines an element called "gender" with a restriction. The
only acceptable value is male OR female:
<xs:element name="gender">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="male|female"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
The next example defines an element called "password" with a restriction.
There must be exactly eight characters in a row and those characters must be
lowercase or uppercase letters from a to z, or a number from 0 to 9:
<xs:element name="password">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[a-zA-Z0-9]{8}"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
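The repeated character classes in the earlier examples can also be written
with a quantifier, as the password example does. A sketch of the "initials"
and "prodid" patterns rewritten this way, assuming the same element names:

```xml
<xs:element name="initials">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <!-- same as [A-Z][A-Z][A-Z] -->
      <xs:pattern value="[A-Z]{3}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
<xs:element name="prodid">
  <xs:simpleType>
    <xs:restriction base="xs:integer">
      <!-- same as [0-9][0-9][0-9][0-9][0-9] -->
      <xs:pattern value="[0-9]{5}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
```

XML Schema patterns always match the whole value, so no ^ or $ anchors are
needed.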
Restrictions on Whitespace Characters
To specify how whitespace characters should be handled, we would use the
whiteSpace constraint.
This example defines an element called "address" with a restriction. The
whiteSpace constraint is set to "preserve", which means that the XML processor
WILL NOT remove any white space characters:
<xs:element name="address">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:whiteSpace value="preserve"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
This example also defines an element called "address" with a restriction. The
whiteSpace constraint is set to "replace", which means that the XML processor
WILL REPLACE all white space characters (line feeds, tabs, spaces, and carriage
returns) with spaces:
<xs:element name="address">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:whiteSpace value="replace"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
This example also defines an element called "address" with a restriction. The
whiteSpace constraint is set to "collapse", which means that the XML processor
WILL COLLAPSE white space: line feeds, tabs, and carriage returns are replaced
with spaces, leading and trailing spaces are removed, and multiple spaces are
reduced to a single space:
<xs:element name="address">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:whiteSpace value="collapse"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
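A sketch of how the same "address" content would be expected to come out under
each of the three settings. The address text itself is made up:

```xml
<!-- Instance content: two leading spaces, a line break in the middle, two trailing spaces -->
<address>  Main Street
Springfield  </address>

<!-- preserve: "  Main Street(line break)Springfield  " (unchanged) -->
<!-- replace:  "  Main Street Springfield  " (the line break becomes a space) -->
<!-- collapse: "Main Street Springfield" (trimmed, runs of spaces reduced to one) -->
```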
Restrictions on Length
To limit the length of a value in an element, we would use the length,
maxLength, and minLength constraints.
This example defines an element called "password" with a restriction. The
value must be exactly eight characters:
<xs:element name="password">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:length value="8"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
This example defines another element called "password" with a restriction.
The value must be at least five and at most eight characters long:
<xs:element name="password">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:minLength value="5"/>
<xs:maxLength value="8"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
Restrictions for Datatypes
Constraint | Description
enumeration | Defines a list of acceptable values
fractionDigits | Specifies the maximum number of decimal places allowed. Must be equal to or greater than zero
length | Specifies the exact number of characters or list items allowed. Must be equal to or greater than zero
maxExclusive | Specifies the upper bound for numeric values (the value must be less than this value)
maxInclusive | Specifies the upper bound for numeric values (the value must be less than or equal to this value)
maxLength | Specifies the maximum number of characters or list items allowed. Must be equal to or greater than zero
minExclusive | Specifies the lower bound for numeric values (the value must be greater than this value)
minInclusive | Specifies the lower bound for numeric values (the value must be greater than or equal to this value)
minLength | Specifies the minimum number of characters or list items allowed. Must be equal to or greater than zero
pattern | Defines the exact sequence of characters that are acceptable
totalDigits | Specifies the maximum number of digits allowed. Must be greater than zero
whiteSpace | Specifies how white space (line feeds, tabs, spaces, and carriage returns) is handled
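As a sketch of two facets from the table that the earlier examples do not use,
the hypothetical "price" element below limits a decimal to at most six digits
in total, of which at most two may follow the decimal point:

```xml
<xs:element name="price">
  <xs:simpleType>
    <xs:restriction base="xs:decimal">
      <xs:totalDigits value="6"/>
      <xs:fractionDigits value="2"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>

<!-- <price>1234.56</price>   valid: six digits, two decimal places -->
<!-- <price>1234.567</price>  not valid: seven digits, three decimal places -->
```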
What is a Complex Element?
A complex element is an XML element that contains other elements and/or
attributes.
There are four kinds of complex elements:
- empty elements
- elements that contain only other elements
- elements that contain only text
- elements that contain both other elements and text
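The four kinds can be sketched with instance fragments like the following. The
element and attribute names are hypothetical:

```xml
<!-- 1. An empty element (it may still carry attributes) -->
<product pid="1345"/>

<!-- 2. An element that contains only other elements -->
<employee>
  <firstname>John</firstname>
  <lastname>Smith</lastname>
</employee>

<!-- 3. An element that contains only text (complex because of its attribute) -->
<food type="dessert">Ice cream</food>

<!-- 4. An element that contains both other elements and text -->
<description>It happened on <date>1999-03-03</date>.</description>
```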
How to Define a Complex Element
Look at this complex XML element, "employee", which contains only other
elements:
<employee>
<firstname>John</firstname>
<lastname>Smith</lastname>
</employee>
We can define a complex element in an XML Schema in two different ways:
1. The "employee" element can be declared directly by naming the element,
like this:
<xs:element name="employee">
<xs:complexType>
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
If you use the method described above, only the "employee" element can use
the specified complex type. Note that the child elements, "firstname" and
"lastname", are surrounded by the <sequence> indicator. This means that
the child elements must appear in the same order as they are declared. You will
learn more about indicators in the XSD Indicators chapter.
2. The "employee" element can have a type attribute that refers to the name
of the complex type to use:
<xs:element name="employee" type="personinfo"/>
<xs:complexType name="personinfo">
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
You can also base a complex element on an existing complex element and add
some elements, like this:
<xs:element name="employee" type="fullpersoninfo"/>
<xs:complexType name="personinfo">
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
<xs:complexType name="fullpersoninfo">
<xs:complexContent>
<xs:extension base="personinfo">
<xs:sequence>
<xs:element name="address" type="xs:string"/>
<xs:element name="city" type="xs:string"/>
<xs:element name="country" type="xs:string"/>
</xs:sequence>
</xs:extension>
</xs:complexContent>
</xs:complexType>
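Assuming the declarations above, an "employee" instance would be expected to
list the inherited elements first, followed by the added ones. The values are
made up:

```xml
<employee>
  <!-- inherited from personinfo -->
  <firstname>John</firstname>
  <lastname>Smith</lastname>
  <!-- added by fullpersoninfo -->
  <address>12 Main Street</address>
  <city>Springfield</city>
  <country>USA</country>
</employee>
```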
Complex Types with Mixed Content
An XML element, "letter", that contains both text and other elements:
<letter>
Dear Mr.<name>John Smith</name>.
Your order <orderid>1032</orderid>
will be shipped on <shipdate>2001-07-13</shipdate>.
</letter>
The following schema declares the "letter" element:
<xs:element name="letter">
<xs:complexType mixed="true">
<xs:sequence>
<xs:element name="name" type="xs:string"/>
<xs:element name="orderid" type="xs:positiveInteger"/>
<xs:element name="shipdate" type="xs:date"/>
</xs:sequence>
</xs:complexType>
</xs:element>
Note: To enable character data to appear between the child-elements of
"letter", the mixed attribute must be set to "true". The <xs:sequence> tag
means that the elements defined (name, orderid and shipdate) must appear in that
order inside a "letter" element.
We can control HOW elements are to be used in documents with
indicators.
Indicators
There are seven indicators:
Order indicators:
- All
- Choice
- Sequence
Occurrence indicators:
- maxOccurs
- minOccurs
Group indicators:
- Group name
- attributeGroup name
All Indicator
The <all> indicator specifies that the child elements can appear in any
order, and that each child element must occur only once:
<xs:element name="person">
<xs:complexType>
<xs:all>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:all>
</xs:complexType>
</xs:element>
Note: When using the <all> indicator, minOccurs can be set to 0 or 1 and
maxOccurs can only be set to 1.
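For instance, both of the following "person" fragments would be expected to
validate against the declaration above. The values are made up:

```xml
<!-- Both orders are acceptable under <xs:all> -->
<person>
  <firstname>John</firstname>
  <lastname>Smith</lastname>
</person>
<person>
  <lastname>Smith</lastname>
  <firstname>John</firstname>
</person>
```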
Choice Indicator
The <choice> indicator specifies that either one child element or
another can occur:
<xs:element name="person">
<xs:complexType>
<xs:choice>
<xs:element name="employee" type="employee"/>
<xs:element name="member" type="member"/>
</xs:choice>
</xs:complexType>
</xs:element>
Sequence Indicator
The <sequence> indicator specifies that the child elements must appear
in a specific order:
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
Occurrence Indicators
Occurrence indicators are used to define how often an element can occur.
Note: For all "Order" and "Group" indicators (any, all, choice,
sequence, group name, and group reference) the default value for maxOccurs and
minOccurs is 1.
maxOccurs Indicator
The maxOccurs indicator specifies the maximum number of times an
element can occur:
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="full_name" type="xs:string"/>
<xs:element name="child_name" type="xs:string" maxOccurs="10"/>
</xs:sequence>
</xs:complexType>
</xs:element>
The example above indicates that the "child_name" element can occur a minimum
of one time (the default value for minOccurs is 1) and a maximum of ten times in
the "person" element.
minOccurs Indicator
The minOccurs indicator specifies the minimum number of times an
element can occur:
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="full_name" type="xs:string"/>
<xs:element name="child_name" type="xs:string"
maxOccurs="10" minOccurs="0"/>
</xs:sequence>
</xs:complexType>
</xs:element>
The example above indicates that the "child_name" element can occur a minimum
of zero times and a maximum of ten times in the "person" element.
Tip: To allow an element to appear an unlimited number of times, set
maxOccurs="unbounded".
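Under the last "person" declaration (minOccurs="0", maxOccurs="10"), instances
like these would be expected to validate. The names are made up:

```xml
<!-- No child_name at all: allowed because minOccurs="0" -->
<person>
  <full_name>Hege Refsnes</full_name>
</person>

<!-- Two child_name elements: allowed, up to a maximum of ten -->
<person>
  <full_name>Tove Refsnes</full_name>
  <child_name>Hege</child_name>
  <child_name>Stale</child_name>
</person>
```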
Group Indicators
Group indicators are used to define related sets of elements.
Element Groups
Element groups are defined with the group declaration, like this:
<xs:group name="groupname">
...
</xs:group>
You must define an all, choice, or sequence element inside the group
declaration. The following example defines a group named "persongroup" whose
elements must occur in an exact sequence:
<xs:group name="persongroup">
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
<xs:element name="birthday" type="xs:date"/>
</xs:sequence>
</xs:group>
After you have defined a group, you can reference it in another definition,
like this:
<xs:group name="persongroup">
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
<xs:element name="birthday" type="xs:date"/>
</xs:sequence>
</xs:group>
<xs:element name="person" type="personinfo"/>
<xs:complexType name="personinfo">
<xs:sequence>
<xs:group ref="persongroup"/>
<xs:element name="country" type="xs:string"/>
</xs:sequence>
</xs:complexType>
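With these declarations, a "person" instance would be expected to look roughly
like this. The values are made up:

```xml
<person>
  <!-- the three elements contributed by persongroup, in sequence -->
  <firstname>John</firstname>
  <lastname>Smith</lastname>
  <birthday>1970-03-27</birthday>
  <!-- declared directly in personinfo -->
  <country>Norway</country>
</person>
```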
Attribute Groups
Attribute groups are defined with the attributeGroup declaration, like
this:
<xs:attributeGroup name="groupname">
...
</xs:attributeGroup>
The following example defines an attribute group named "personattrgroup":
<xs:attributeGroup name="personattrgroup">
<xs:attribute name="firstname" type="xs:string"/>
<xs:attribute name="lastname" type="xs:string"/>
<xs:attribute name="birthday" type="xs:date"/>
</xs:attributeGroup>
After you have defined an attribute group, you can reference it in another
definition, like this:
<xs:attributeGroup name="personattrgroup">
<xs:attribute name="firstname" type="xs:string"/>
<xs:attribute name="lastname" type="xs:string"/>
<xs:attribute name="birthday" type="xs:date"/>
</xs:attributeGroup>
<xs:element name="person">
<xs:complexType>
<xs:attributeGroup ref="personattrgroup"/>
</xs:complexType>
</xs:element>
The <any> Element
The <any> element enables us to extend the XML document with elements
not specified by the schema.
The following example is a fragment from an XML schema called "family.xsd".
It shows a declaration for the "person" element. By using the <any>
element we can extend (after <lastname>) the content of "person" with any
element:
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
<xs:any minOccurs="0"/>
</xs:sequence>
</xs:complexType>
</xs:element>
Now we want to extend the "person" element with a "children" element. In this
case we can do so, even if the author of the schema above never declared any
"children" element.
Look at this schema file, called "children.xsd":
<?xml version="1.0" encoding="ISO-8859-1"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.w3schools.com"
xmlns="http://www.w3schools.com"
elementFormDefault="qualified">
<xs:element name="children">
<xs:complexType>
<xs:sequence>
<xs:element name="childname" type="xs:string"
maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
The XML file below (called "Myfamily.xml") uses components from two
different schemas, "family.xsd" and "children.xsd":
<?xml version="1.0" encoding="ISO-8859-1"?>
<persons xmlns="http://www.microsoft.com"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.microsoft.com family.xsd
http://www.w3schools.com children.xsd">
<person>
<firstname>Hege</firstname>
<lastname>Refsnes</lastname>
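<!-- "children" is declared in children.xsd, so it is placed in that schema's namespace -->
<children xmlns="http://www.w3schools.com">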
<childname>Cecilie</childname>
</children>
</person>
<person>
<firstname>Stale</firstname>
<lastname>Refsnes</lastname>
</person>
</persons>
The XML file above is valid because the schema "family.xsd" allows us to
extend the "person" element with an optional element after the "lastname"
element.
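The <any> element also accepts the namespace and processContents attributes,
which control where the extra element may come from and how strictly it is
validated. Below is a sketch of the "person" declaration with both set
explicitly. The particular values chosen here are an assumption, not part of
the original "family.xsd":

```xml
<xs:element name="person">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="firstname" type="xs:string"/>
      <xs:element name="lastname" type="xs:string"/>
      <!-- ##other: only elements from a namespace other than the target namespace;
           lax: validate the element if a declaration can be found, otherwise accept it -->
      <xs:any namespace="##other" processContents="lax" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```

When the attributes are omitted, the defaults are namespace="##any" and
processContents="strict".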
The <any> and <anyAttribute> elements are used to make EXTENSIBLE
documents! They allow documents to contain additional elements that are not
declared in the main XML schema.
The <anyAttribute> Element
The <anyAttribute> element enables us to extend the XML document with
attributes not specified by the schema.
The following example is a fragment from an XML schema called "family.xsd".
It shows a declaration for the "person" element. By using the
<anyAttribute> element we can add any number of attributes to the "person"
element:
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
<xs:anyAttribute/>
</xs:complexType>
</xs:element>
Now we want to extend the "person" element with a "gender" attribute. In this
case we can do so, even if the author of the schema above never declared any
"gender" attribute.
Look at this schema file, called "attribute.xsd":
<?xml version="1.0" encoding="ISO-8859-1"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.w3schools.com"
xmlns="http://www.w3schools.com"
elementFormDefault="qualified">
<xs:attribute name="gender">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="male|female"/>
</xs:restriction>
</xs:simpleType>
</xs:attribute>
</xs:schema>
The XML file below (called "Myfamily.xml") uses components from two
different schemas, "family.xsd" and "attribute.xsd":
<?xml version="1.0" encoding="ISO-8859-1"?>
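<!-- "w3" is an arbitrary prefix: a globally declared attribute belongs to its schema's target namespace, so it must be qualified -->
<persons xmlns="http://www.microsoft.com"
xmlns:w3="http://www.w3schools.com"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.microsoft.com family.xsd
http://www.w3schools.com attribute.xsd">
<person w3:gender="female">
<firstname>Hege</firstname>
<lastname>Refsnes</lastname>
</person>
<person w3:gender="male">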
<firstname>Stale</firstname>
<lastname>Refsnes</lastname>
</person>
</persons>
The XML file above is valid because the schema "family.xsd" allows us to add
an attribute to the "person" element.