XML Well-Formed

 

Well-Formedness

 

Tag Creation

In the previous lesson, we mentioned that XML works through markups. A simple markup is made of a tag created between the left angle bracket "<" and the right angle bracket ">". Creating a markup alone is not particularly significant: you must give it meaning. To do this, you can type a number, a date, or a string on the right side of the > symbol. The text on the right side of the right angle bracket ">" is referred to as the item's text. In the .NET Framework, it is called a value.

After specifying the value of the markup, you must close it. This rule is not enforced in HTML, but it must be respected in XML for the document to be "well-formed".

To close a tag, create another tag the same way, with the left angle bracket "<", the tag name, and the right angle bracket ">", except that you must type a forward slash between the < symbol and the tag name. The formula to use is:

<tag>some value</tag>

The item on the left side of the "some value" string, in this case <tag>, is called the opening tag or start-tag. The item on the right side of the "some value" string, in this case </tag>, is called the closing tag or end-tag. Just as <tag> is a markup, </tag> is also called a markup.
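As a quick way to see the start-tag, the value, and the end-tag working together, here is a minimal sketch. It uses Python's standard xml.etree.ElementTree module instead of the .NET classes used in this lesson (that choice is an assumption made for illustration only), and the variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Parse the one-item document from the formula above.
element = ET.fromstring("<tag>some value</tag>")

tag_name = element.tag    # the name typed between < and >
tag_value = element.text  # the text between the start-tag and the end-tag

print(tag_name, "=", tag_value)
```

The parser reports the name and the value separately, which is why the value must sit between a start-tag and a matching end-tag.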

As mentioned previously, with XML, you create your own tags with custom names. This means that a typical XML file is made of various items. Here is an example:

<title>The Distinguished Gentleman</title><director>Jonathan Lynn</director><length>112 Minutes</length>

Practical Learning: Creating XML

  1. Change the XmlDocument.Load() call as follows:
     
    Imports System
    Imports System.Xml
    
    Module Exercise
        
        Public Sub Main()
            Dim docXML As XmlDocument = New XmlDocument
    
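            ' Note: a declaration alone has no root tag, so LoadXml() raises an
            ' XmlException at run time until a root is added (see The Root)
            docXML.LoadXml("<?xml version=""1.0"" encoding=""utf-8""?>")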
        End Sub
    
    End Module
  2. Save the file

Tag Names

When creating your tags, there are various rules you must observe with regard to their names. Unlike HTML, XML is very restrictive with its rules. For example, unlike HTML but like C/C++/C#, XML is case-sensitive. This means that CASE, Case, and case are three different names. Therefore, from now on, you must pay close attention to what you write between the < and the > delimiters.
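To illustrate case sensitivity, the following sketch checks two small documents. It relies on Python's standard xml.etree.ElementTree module rather than the .NET classes of this lesson (an assumption made for illustration), and is_well_formed is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if xml_text can be parsed as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# The end-tag must use exactly the same case as the start-tag.
same_case = is_well_formed("<Case>some value</Case>")
mixed_case = is_well_formed("<Case>some value</case>")

print(same_case, mixed_case)
```

In the second document, </case> does not close <Case>, so the parser rejects it.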

Besides case sensitivity, there are some rules you must observe when naming the tags of your markups:

  • The name of a tag must be in one word, no space in the name
  • The name must start with an alphabetic letter or an underscore - Examples are <Country> or <_salary>
  • The first letter or underscore that starts a name can be followed by:
    • Letters - Example: <OperatingSystem>
    • Digits - Example: <L153>
    • Hyphens - Example: <TV-Rating>
    • Underscores - Example: <Chief_Accountant>
  • The name of a tag should not start with xml, XML, or any other combination of X, M, and L (each in uppercase or lowercase), because such names are reserved by the XML specification
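The naming rules listed above can be sketched as a small check. This is only an approximation written in Python (not the lesson's language): it is limited to unaccented letters and to the rules in this list, while the full XML specification accepts more characters. The names NAME_PATTERN and follows_lesson_rules are hypothetical.

```python
import re

# A letter or an underscore first, then letters, digits, hyphens, or underscores.
# Deliberately limited to ASCII and to the rules listed in this lesson.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_-]*")

def follows_lesson_rules(name):
    """Return True if name respects the simplified naming rules above."""
    if NAME_PATTERN.fullmatch(name) is None:
        return False
    # Names that start with "xml" in any mix of cases are set aside.
    return not name.lower().startswith("xml")

for candidate in ["Country", "_salary", "TV-Rating", "hourly salary", "2Fast", "XmlData"]:
    print(candidate, follows_lesson_rules(candidate))
```

A check like this is useful before generating tags from user input, but the XML parser remains the final judge of whether a name is acceptable.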

In our lessons, here are the rules we will apply:

  • A name will start in uppercase (most of the time) or lowercase
  • When a name is a combination of words, such as [hourly salary], we will start each part in uppercase. Examples will be HourlySalary or DateOfBirth

In future sections, we will learn that, with some markups, you can include non-readable characters between the angle brackets. In fact, you will need to pay close attention to the symbols you type in a markup. We will also see how some characters have special meaning.

The Root

Every XML document must have one particular tag that either is the only tag in the file or acts as the parent of all the other tags of the same document. This tag is called the root. Here is an example of a file that has only one tag:

<rectangle>A rectangle is a shape with 4 sides and 4 straight angles</rectangle>

This would produce:

XML in a Browser

If there is more than one tag in the XML file, one of them must serve as the parent or root. Otherwise, you would receive an error. Based on this rule, the following XML code is not well-formed:

<rectangle>A rectangle is a shape with 4 sides and 4 straight angles</rectangle>
<square>A square is a rectangle whose 4 sides are equal</square>

This would produce:

An ill-formed XML file in a Browser

To correct this type of error, you can change one of the existing tags to act as the root. Here is an example:

<rectangle>A rectangle is a shape with 4 sides and 4 straight angles
<square>A square is a rectangle whose 4 sides are equal</square></rectangle>

This would produce:

Good Nested Tags

Alternatively, you can create a tag that acts as the parent for the other tags. Here is an example:

<geometry><rectangle>A rectangle is a shape with 4 sides and 4 straight angles
</rectangle><square>A square is a rectangle whose 4 sides are equal</square></geometry>

A good XML file should start with an XML declaration:

<?xml version="1.0" encoding="utf-8"?><geometry><rectangle>A rectangle 
is a shape with 4 sides and 4 straight angles</rectangle><square>A 
square is a rectangle whose 4 sides are equal</square></geometry>

To provide access to the root of an XML file, the XmlDocument class is equipped with the DocumentElement property.
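The root rule can be tried out with a short sketch. It is written in Python with the standard xml.dom.minidom module, whose documentElement attribute plays a role comparable to the DocumentElement property mentioned above; using Python here is an assumption made for illustration, and the variable names are hypothetical.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

two_roots = ("<rectangle>A rectangle is a shape with 4 sides</rectangle>"
             "<square>A square is a rectangle whose 4 sides are equal</square>")
one_root = "<geometry>" + two_roots + "</geometry>"

# Two tags at the top level: the parser refuses the document.
try:
    minidom.parseString(two_roots)
    two_roots_accepted = True
except ExpatError:
    two_roots_accepted = False

# The same tags under a single parent: the parser accepts the document.
document = minidom.parseString(one_root)
root_name = document.documentElement.tagName
child_names = [node.tagName
               for node in document.documentElement.childNodes
               if node.nodeType == node.ELEMENT_NODE]

print(two_roots_accepted, root_name, child_names)
```

Wrapping the existing tags in one parent is usually the least intrusive fix, because none of the original tags has to change its meaning.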

Practical Learning: Creating the Root Tag

  1. Don't close Notepad but start another instance of Notepad and type the following in it:
     
    <?xml version="1.0" encoding="utf-8"?>
    <Parts></Parts>
  2. Save the file as Parts.xml in your IntroXML folder created in the previous lesson
     

Empty Tags

We mentioned that, unlike HTML, every XML tag must be closed. We also saw that the value of a tag is specified on the right side of the right angle bracket of the start tag. In some cases, you will create a tag that doesn't have a value or, maybe for some reason, you don't provide a value to it. Here is an example:

<dinner></dinner>

This type of tag is called an empty tag. Since there is no value in it, you don't need to provide a separate end tag, but the tag must still be closed. Although the above writing is allowed, an alternative is to close the start tag itself. To do this, between the tag name and the right angle bracket, type a forward slash (an empty space before the slash is optional). Based on this, the above line can be written as follows:

<dinner />

Both produce the same result or accomplish the same role.
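To check that the two writings are equivalent, here is a minimal sketch. It uses Python's standard xml.etree.ElementTree module rather than the .NET classes of this lesson (an assumption made for illustration), and the variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

long_form = ET.fromstring("<dinner></dinner>")
short_form = ET.fromstring("<dinner />")

# Neither tag holds a value, and both are written back the same way.
both_empty = long_form.text is None and short_form.text is None
same_serialization = ET.tostring(long_form) == ET.tostring(short_form)

print(both_empty, same_serialization)
```

Once parsed, nothing distinguishes the two forms, so the choice between them is a matter of readability.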

White Spaces

In the above example, we typed various items on the same line. If you are creating a long XML document, although creating various items on the same line is acceptable, this technique can make it (very) difficult to read. One way you can solve this problem is to separate tags with empty spaces. Here is an example:

<title>The Distinguished Gentleman</title> <director>Jonathan Lynn</director> <length>112 Minutes</length>

Yet a better solution consists of typing each element on its own line. This would make the document easier to read. Here is an example:

<title>The Distinguished Gentleman</title>
<director>Jonathan Lynn</director>
<length>112 Minutes</length>

All these are possible and acceptable because white space placed between tags is not part of the values of those tags, and most applications that read XML treat it as insignificant. Therefore, to make your code easier to read, you can use empty spaces, carriage-return-line-feed combinations, or tabs inserted in various sections. All these are referred to as white spaces.
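The effect of white spaces can be observed with the following sketch, written in Python with the standard xml.etree.ElementTree module (not the lesson's .NET classes). The video root tag is a hypothetical parent added so that each sample has a single root, and tags_and_values is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

one_line = ("<video><title>The Distinguished Gentleman</title>"
            "<director>Jonathan Lynn</director>"
            "<length>112 Minutes</length></video>")

many_lines = """<video>
    <title>The Distinguished Gentleman</title>
    <director>Jonathan Lynn</director>
    <length>112 Minutes</length>
</video>"""

def tags_and_values(xml_text):
    """Return the (name, value) pairs of the tags nested in the root."""
    root = ET.fromstring(xml_text)
    return [(child.tag, child.text) for child in root]

same_content = tags_and_values(one_line) == tags_and_values(many_lines)

# The line breaks and indentation are still reported, but only as blank text
# between the tags, which the program is free to ignore.
indentation = ET.fromstring(many_lines).text

print(same_content, repr(indentation))
```

Both layouts carry the same names and values; only the blank text between the tags differs.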

Nesting Tags

Most XML files contain more than one tag. We saw that a tag must have a starting point and a tag must be closed as seen in the above example. One tag can be included in another tag: this is referred to as nesting. A tag that is created inside of another tag is said to be nested. A tag that contains another tag is said to be nesting. Consider the following example:

<Smile>Please smile to the camera</Smile>
<English>Welcome to our XML Class</English>
<French>Bienvenue à notre Classe XML</French>

In this example, you may want the English tag to be nested in the Smile tag. To nest one tag inside of another, you must type the nested tag before the end-tag of the nesting tag. For example, if you want to nest the English tag in the Smile tag, you must type the whole English tag before the </Smile> end tag. Here is an example:

<Smile>Please smile to the camera<English>Welcome to our XML Class</English></Smile>

To make this code easier to read, you can use white spaces as follows:

<Smile>Please smile to the camera
<English>Welcome to our XML Class</English>
</Smile>

When a tag is nested, it must also be closed before its nesting tag is closed. Based on this rule, the following code is ill-formed:

<Smile>Please smile to the camera
<English>Welcome to our XML Class
</Smile>
</English>

The rule broken here is that the English tag, which is nested in the Smile tag, is not closed inside the Smile tag but outside of it.
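The nesting rule can be verified with a short sketch. It uses Python's standard xml.etree.ElementTree module instead of the .NET classes of this lesson (an assumption made for illustration), and is_well_formed is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if xml_text can be parsed as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

properly_nested = """<Smile>Please smile to the camera
<English>Welcome to our XML Class</English>
</Smile>"""

# The English tag is opened inside Smile but closed after it.
overlapping = """<Smile>Please smile to the camera
<English>Welcome to our XML Class
</Smile>
</English>"""

nested_ok = is_well_formed(properly_nested)
overlap_ok = is_well_formed(overlapping)

print(nested_ok, overlap_ok)
```

The parser stops at </Smile> in the second sample because the English tag is still open at that point.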

 

Practical Learning: Creating XML

  1. To apply the concept of nesting XML tags, change the Parts.xml file as follows:
     
    <?xml version="1.0" encoding="utf-8" ?>
    <Parts>
    	<Part>
    		<CarYear>2005</CarYear>
    		<Make>Acura</Make>
    		<Model>MDX 3.5 4WD</Model>
    		<PartNumber>293749</PartNumber>
    		<PartName>Air Filter</PartName>
    		<UnitPrice>16.85</UnitPrice>
    	</Part>
    	<Part>
    		<CarYear>2002</CarYear>
    		<Make>Audi</Make>
    		<Model>A4 Quattro</Model>
    		<PartNumber>283759</PartNumber>
    		<PartName>Clutch Release Bearing</PartName>
    		<UnitPrice>55.50</UnitPrice>
    	</Part>
    	<Part>
    		<CarYear>1998</CarYear>
    		<Make>Dodge</Make>
    		<Model>Neon</Model>
    		<PartNumber>491759</PartNumber>
    		<PartName>Crankshaft Position Sensor</PartName>
    		<UnitPrice>22.85</UnitPrice>
    	</Part>
    	<Part>
    		<CarYear>2000</CarYear>
    		<Make>Chevrolet</Make>
    		<Model>Camaro</Model>
    		<PartNumber>844509</PartNumber>
    		<PartName>Control Module Connector</PartName>
    		<UnitPrice>25.65</UnitPrice>
    	</Part>
    </Parts>
  2. Save the file
  3. Access the exercise.vb file and change the call to the XmlDocument.LoadXml() method as follows:
     
    Imports System
    Imports System.Xml
    
    Module Exercise
        
        Public Sub Main()
            Dim docXML As XmlDocument = New XmlDocument
    
            docXML.LoadXml("<?xml version=""1.0"" encoding=""utf-8""?>" & _
        		       "<Employees><Employee><EmplNumber>48-705</EmplNumber>" & _
        		       "<FirstName>John</FirstName><LastName>Cranston</LastName>" & _
        		       "<HourlySalary>16.48</HourlySalary></Employee><Employee>" & _
        		       "<EmplNumber>22-688</EmplNumber><FirstName>Annie</FirstName>" & _
        		       "<LastName>Loskar</LastName><HourlySalary>12.50</HourlySalary>" & _
        		       "</Employee><Employee><EmplNumber>85-246</EmplNumber>" & _
        		       "<FirstName>Bernie</FirstName><LastName>Christo</LastName>" & _
        		       "<HourlySalary>22.52</HourlySalary></Employee><Employee>" & _
        		       "<EmplNumber>70-155</EmplNumber><FirstName>Ernestine</FirstName>" & _
        		       "<LastName>Borrison</LastName><HourlySalary>20.14</HourlySalary>" & _
        		       "</Employee></Employees>")
    
            docXML.Save("Employees.xml")
        End Sub
    
    End Module
  4. Close this instance of Notepad
  5. When asked whether you want to save it, click Yes
  6. Switch to the Command Prompt
  7. To compile the exercise, type vbc exercise.vb and press Enter
  8. To execute the application, type exercise and press Enter
  9. To close the Command Prompt, type exit and press Enter
  10. Open Windows Explorer and display the contents of your IntroXML folder.
    Notice the presence of the Parts.xml and the Employees.xml files
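Once tags are nested under a root, a program can walk from the root to each nested tag. Here is a minimal sketch of that idea, written in Python with the standard xml.etree.ElementTree module rather than the .NET classes of this lesson (an assumption made for illustration). It uses a shortened copy of the Parts document with only two of its parts and without the XML declaration, and the variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the Parts document created in this exercise.
parts_xml = """<Parts>
    <Part>
        <CarYear>2005</CarYear>
        <Make>Acura</Make>
        <PartName>Air Filter</PartName>
        <UnitPrice>16.85</UnitPrice>
    </Part>
    <Part>
        <CarYear>2002</CarYear>
        <Make>Audi</Make>
        <PartName>Clutch Release Bearing</PartName>
        <UnitPrice>55.50</UnitPrice>
    </Part>
</Parts>"""

root = ET.fromstring(parts_xml)

# Each Part tag is nested in the root; each value is nested in a Part tag.
listing = [(part.findtext("PartName"), part.findtext("UnitPrice"))
           for part in root.findall("Part")]

for name, price in listing:
    print(name, price)
```

The values come back as text, so a program that needs numbers has to convert them itself.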
 

Copyright © 2005-2016, FunctionX