An XML Document Type Definition (DTD) is a way to describe an XML language precisely. A DTD defines the structure of an XML document. It is used to check that the structure and vocabulary of XML documents are valid against the grammatical rules of the appropriate XML language.
A DTD can contain declarations that define elements, attributes, notations, and entities for any XML files that reference the DTD file. It also establishes constraints for how each element, attribute, notation, and entity can be used within any of the XML files that reference the DTD file. To be considered a valid XML file, the document must be accompanied by a DTD (or an XML schema), and conform to all of the declarations in the DTD (or XML schema).
<!DOCTYPE element DTD identifier
[
declaration1
declaration2
........
]>
<!DOCTYPE root-element [element-declarations]>
<!DOCTYPE root-element SYSTEM "file-name">
Generally, DTDs consist of three basic parts: element declarations, attribute declarations, and entity declarations.
When using a DTD to define the content of an XML document, you must declare each element that appears within the document.
<!ELEMENT element-name category>
or
<!ELEMENT element-name (element-content)>
The ELEMENT declaration begins with an exclamation mark. Following the ELEMENT keyword is the name of the element that you are defining. The element name must appear exactly as it will within the XML document, including any namespace prefix. The content model of the element appears after the element name. An element may contain element children, text, a combination of children and text, or the element may be empty.
<!ELEMENT element-name EMPTY>
<!ELEMENT element-name (#PCDATA)>
<!ELEMENT element-name ANY>
<!ELEMENT element-name (child-name1,child-name2,... )>
<!ELEMENT element-name (child-name+)>
<!ELEMENT element-name (child-name*)>
<!ELEMENT element-name (child-name?)>
<!ELEMENT element-name (child-name1|child-name2)>
<!ELEMENT element-name (#PCDATA|child-name1|child-name2)*>
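The content models listed above can be combined in a single DTD. The following is a minimal sketch, assuming a hypothetical `library` vocabulary (the element names `library`, `book`, `title`, `author`, `note`, and `emph` are invented for illustration). It pairs an internal DTD with a document that should be valid against it.

```xml
<?xml version="1.0"?>
<!DOCTYPE library [
  <!-- one or more book children -->
  <!ELEMENT library (book+)>
  <!-- a title, then zero or more authors, then an optional note -->
  <!ELEMENT book (title, author*, note?)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT author (#PCDATA)>
  <!-- mixed content: text interleaved with emph children -->
  <!ELEMENT note (#PCDATA | emph)*>
  <!ELEMENT emph (#PCDATA)>
]>
<library>
  <book>
    <title>XML Basics</title>
    <author>A. Writer</author>
    <note>Read the <emph>second</emph> chapter first.</note>
  </book>
  <book>
    <!-- author and note are both omitted, which the model allows -->
    <title>Untitled Draft</title>
  </book>
</library>
```

The second `book` contains only a `title`. That is allowed because `author*` permits zero occurrences and `note?` is optional.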
Attribute declarations are similar to element declarations in many ways. Instead of declaring allowable content models for elements, you declare a list of allowable attributes for each element. These lists are called ATTLIST declarations:
<!ATTLIST element-name attribute-name attribute-type attribute-value>
The attribute-type can be one of the following:
Type | Description |
---|---|
CDATA | The value is character data |
(en1\|en2\|..) | The value must be one from an enumerated list |
ID | The value is a unique id |
IDREF | The value is the id of another element |
IDREFS | The value is a list of other ids |
NMTOKEN | The value is a valid XML name token |
NMTOKENS | The value is a list of valid XML name tokens |
ENTITY | The value is an entity |
ENTITIES | The value is a list of entities |
NOTATION | The value is a name of a notation |
The attribute-value can be one of the following:
Value | Description |
---|---|
value | The default value of the attribute |
#REQUIRED | The attribute is required |
#IMPLIED | The attribute is optional |
#FIXED value | The attribute value is fixed |
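To make the attribute types and default values above concrete, here is a small sketch. It assumes a hypothetical `catalog` vocabulary (the element and attribute names are invented for this example) and declares one attribute for each of several common combinations.

```xml
<?xml version="1.0"?>
<!DOCTYPE catalog [
  <!ELEMENT catalog (item+)>
  <!ELEMENT item (#PCDATA)>
  <!-- id:       unique identifier, must be present
       related:  optional reference to the id of another item
       status:   enumerated list with a default of "new"
       currency: fixed value; if written, it must be "USD" -->
  <!ATTLIST item
    id       ID          #REQUIRED
    related  IDREF       #IMPLIED
    status   (new|used)  "new"
    currency CDATA       #FIXED "USD">
]>
<catalog>
  <item id="i1">Keyboard</item>
  <item id="i2" related="i1" status="used">Keycap set</item>
</catalog>
```

A validating parser would be expected to report `status="new"` and `currency="USD"` for the first `item` even though neither attribute is written in the document, because both have declared defaults.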
Entities are used to define shortcuts to special characters or to reusable text within XML documents. Entities are primarily of four types: built-in entities, character entities, general entities, and parameter entities.
All XML parsers must support the built-in entities. In general, you can use these entity references anywhere normal text is allowed within an XML document, such as in element content and attribute values.
There are five built-in entities that can be used within an XML document by default: &lt; (<), &gt; (>), &amp; (&), &quot; (") and &apos; (').
Character entities, much like the five built-in entities, are not declared within the DTD. Instead, they can be used in the document within element and attribute content without any declaration. For example, you can write &#169; to represent the copyright character. The parser replaces that reference with the character ©.
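As a brief illustration of the built-in entities and character references described above, the following sketch uses them in a document without a DTD. The element names `notice`, `rule`, and `legal` are assumptions made for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<notice>
  <!-- built-in entities escape characters that would otherwise be read as markup -->
  <rule>Use &lt;b&gt; for bold &amp; &quot;i&quot; for italics.</rule>
  <!-- decimal and hexadecimal references to the same character, U+00A9 -->
  <legal>&#169; Example Corp. (&#xA9; written in hexadecimal)</legal>
</notice>
```

After parsing, the text of `rule` should read `Use <b> for bold & "i" for italics.`, and both references in `legal` should resolve to ©.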
Instead of representing only a single character, general entities can represent characters, paragraphs, and even entire documents.
Syntax:
<!ENTITY entity_name "text">
<!DOCTYPE department [
<!ENTITY CSE "Computer Science and Engineering">
]>
<department> &CSE; </department>
Note: In an instance document, when the XML parser finds &CSE;, it replaces it with Computer Science and Engineering.
<!DOCTYPE department [
<!ELEMENT department (student)>
<!ELEMENT student (name,rollno)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT rollno (#PCDATA)>
]>
<department>
<student>
<name>.....</name>
<rollno>....</rollno>
</student>
</department>
<!DOCTYPE department SYSTEM "student.dtd">
<department>
<student rollno="501">
<name>Ramu</name>
<branch>&CSE;</branch>
</student>
<student rollno="401">
<name>Arun</name>
<branch>&ECE;</branch>
</student>
</department>
<!ELEMENT department (student+)>
<!ELEMENT student (name,branch)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT branch (#PCDATA)>
<!ATTLIST student rollno CDATA #REQUIRED>
<!ENTITY CSE "Computer Science and Engineering">
<!ENTITY ECE "Electronics and Communications Engineering">