Validating XML against XSD

davearia

This has been driving me mad all day, thank goodness it's home time.

Basically I have a routine that validates an XML document, here is the code:

Code:
Private Function ValidateXML(ByRef xmlDoc As XmlDocument, ByVal streamReader As StreamReader) As Boolean
        Dim buffer() As Byte = System.Text.ASCIIEncoding.ASCII.GetBytes(xmlDoc.InnerXml)
        Dim memstrm As New System.IO.MemoryStream(buffer)
        Dim xmlReader As New XmlTextReader(memstrm)
        memstrm.Close()
        Dim xmlVal As New XmlValidatingReader(xmlReader)
        xmlVal.Schemas.Add("urn:pleaseWork", Server.MapPath("XSD/AddEmployee.xsd"))
        xmlVal.ValidationType = ValidationType.Schema
        AddHandler xmlVal.ValidationEventHandler, AddressOf SchemaValidationEventHandler
        Try
            Dim isValid As Boolean = True
            While xmlVal.Read
                Dim strr As String = xmlVal.Value
            End While
            Return True
        Catch ex As Exception
            Return False
        End Try

    End Function

    Private Sub SchemaValidationEventHandler(ByVal sender As Object, ByVal e As System.Xml.Schema.ValidationEventArgs)
        If e.Severity = XmlSeverityType.Error Then
            Dim poo As String = String.Empty
            _errorCount += 1
        ElseIf e.Severity = XmlSeverityType.Warning Then
            Dim notAsPoo As String = String.Empty
            _errorCount += 1
        End If
    End Sub

Here is the XSD (AddEmployee.XSD):

Code:
<?xml version="1.0" encoding="utf-8"?>
<xs:schema id="AddEmployee" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="urn:pleaseWork" elementFormDefault="qualified" targetNamespace="urn:pleaseWork">
    <xs:element name="Employees">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="Employee">
            <xs:complexType>
              <xs:sequence>
                <xs:element name="ID" type="xs:int" minOccurs="1"/>
                <xs:element name="Name" type="xs:string" minOccurs="1"/>
                <xs:element name="ActivityID" type="xs:int" minOccurs="1"/>
              </xs:sequence>
            </xs:complexType>
          </xs:element>
        </xs:sequence>
      </xs:complexType>
    </xs:element>
</xs:schema>

Finally, here is my XML document:

Code:
<?xml version="1.0" standalone="yes"?>
<Employees>
  <Employee>
    <Name>Dave Jones</Name>
    <ActivityID>9</ActivityID>
  </Employee>
  <Employee>
    <ID>111</ID>
    <Name>Wayne Wallice</Name>
    <ActivityID>1</ActivityID>
  </Employee>
  <Employee>
    <ID>1111</ID>
    <Name>Justin Davies</Name>
    <ActivityID>2</ActivityID>
  </Employee>
  <Employee>
    <ID>11111</ID>
    <Name>Matthew Jones</Name>
    <ActivityID>0</ActivityID>
  </Employee>
  <Employee>
    <ID>111111</ID>
    <Name>Steve Pear</Name>
    <ActivityID>2</ActivityID>
  </Employee>
</Employees>

If you look at the XSD file you'll see that the ID element is supposed to be mandatory, in that I have set minOccurs to 1. However, when I run the code the missing ID in the first Employee doesn't seem to be flagged as an error. I also get 20 errors like "Could not find schema information for the element Employees", etc.

Can someone show me what stupid mistake I have made as I am totally stumped!

Thanks, Dave.
 
You have to associate the Xml document with the schema.

The code you use to add a schema to a validating reader is saying "If you find this namespace, use this contract to verify it"...but the validator never found the namespace.

Add the namespace to the root node:

<Employees xmlns="urn:pleaseWork">
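
To see why the validator never matched anything, here is a minimal C# sketch (the class and method names are made up for illustration) that prints the namespace of the root element with and without the xmlns attribute. The schema registered under "urn:pleaseWork" only applies to elements that are actually in that namespace:

```csharp
using System;
using System.Xml;

public static class NamespaceDemo
{
    // Hypothetical helper: returns the namespace URI of the root element,
    // or an empty string when the root is not in any namespace.
    public static string RootNamespace(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.DocumentElement.NamespaceURI;
    }

    public static void Main()
    {
        // No xmlns: the root is in no namespace, so a schema keyed on
        // "urn:pleaseWork" has nothing to match.
        Console.WriteLine("[" + RootNamespace("<Employees><Employee /></Employees>") + "]");

        // With xmlns: the root (and its unprefixed children) are in the
        // schema's target namespace.
        Console.WriteLine("[" + RootNamespace(
            "<Employees xmlns=\"urn:pleaseWork\"><Employee /></Employees>") + "]");
    }
}
```

The first line printed is `[]` and the second is `[urn:pleaseWork]`.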

FYI, you may just want to change the namespace to pleaseWork or something along the lines of http://*/pleaseWork.

Also, you should take out the standalone attribute in the XML declaration. The standalone attribute is designed specifically to tell the XML reader that nothing outside of this document will affect how it is processed, i.e. there is no DTD to validate against. Although most readers will validate anyway, they shouldn't.


Also, the XmlValidatingReader instance is redundant, as XmlDocument has a Validate method that performs exactly the same procedure.
 
I have taken your advice and come up with this solution which works:

Code:
Private Function ValidXML(ByVal xmlDoc As XmlDocument, ByVal requestID As Int32) As Boolean
        Dim buffer() As Byte = System.Text.ASCIIEncoding.ASCII.GetBytes(xmlDoc.InnerXml)
        Dim memstrm As New System.IO.MemoryStream(buffer)
        Dim xmlReader As New XmlTextReader(memstrm)
        Dim xmlVal As New XmlValidatingReader(xmlReader)
        Try
            Select Case requestID
                Case requestType.insert
                    xmlVal.Schemas.Add(Nothing, Server.MapPath("XSD/AddEmployee.xsd"))
                Case requestType.update
                    xmlVal.Schemas.Add(Nothing, Server.MapPath("XSD/AddEmployee.xsd"))
                Case requestType.delete
                    xmlVal.Schemas.Add(Nothing, Server.MapPath("XSD/AddEmployee.xsd"))
            End Select
            xmlVal.ValidationType = ValidationType.Schema
            AddHandler xmlVal.ValidationEventHandler, AddressOf SchemaValidationEventHandler
            While xmlVal.Read
                ' Any invalid XML nonsense will cause SchemaValidationEventHandler to be called.
            End While
            If _isValid Then
                Return True
            Else
                Return False
            End If
        Finally
            memstrm.Close()
        End Try
    End Function

    Private Sub SchemaValidationEventHandler(ByVal sender As Object, ByVal e As System.Xml.Schema.ValidationEventArgs)
        If e.Severity = XmlSeverityType.Error Then
            _isValid = False
        End If
    End Sub

I added xmlns="http://www.me.com" to the XML document and it validates fine. I would like to use the newer approach you mentioned but I am not sure how to code it. If you have an example please share; failing that, I promise to read up on this and replace this function soon.

Thanks again, Dave.
 
This code should work, when converted:

Code:
private bool ValidateXML(XmlDocument xmldoc)
{
    // The namespace here must match the xmlns on the document's root node exactly.
    xmldoc.Schemas.Add("http://www.me.com", "XSD/AddEmployee.xsd");
    xmldoc.Validate(new System.Xml.Schema.ValidationEventHandler(xmlVal_ValidationEventHandler));

    return _isValid;
}

// Called once for every validation problem found; only errors mark the document invalid.
void xmlVal_ValidationEventHandler(object sender, System.Xml.Schema.ValidationEventArgs e)
{
    if (e.Severity == System.Xml.Schema.XmlSeverityType.Error)
        _isValid = false;
}


And now that I'm older and more experienced, I would add these comments...

The XML namespace and the location of the XSD should be stored in a separate resource file and not as literals...
And you could use an anonymous function for the ValidationEventHandler to move the processing logic closer to the source call.
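
For what it's worth, the anonymous function version might look something like the sketch below. Treat it as a rough illustration rather than drop-in code: the class name, the IsValid helper and the inline schema/XML strings are all made up so the sample runs on its own (in the real code the schema would come from the XSD file). Only XmlDocument.Schemas.Add and XmlDocument.Validate come from the code above.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class XmlChecks
{
    // Hypothetical helper: validates xml against the schema text in xsd.
    public static bool IsValid(string xml, string xsd, string targetNamespace)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        using (var schemaReader = XmlReader.Create(new StringReader(xsd)))
        {
            doc.Schemas.Add(targetNamespace, schemaReader);
        }

        // A local flag captured by the lambda, so no shared field needs resetting.
        bool isValid = true;
        doc.Validate((sender, e) =>
        {
            if (e.Severity == XmlSeverityType.Error)
                isValid = false;
        });
        return isValid;
    }

    public static void Main()
    {
        // Cut-down stand-in for AddEmployee.xsd (assumption, not the real file).
        string xsd =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema' " +
            "targetNamespace='urn:pleaseWork' xmlns='urn:pleaseWork' " +
            "elementFormDefault='qualified'>" +
            "<xs:element name='Employee'><xs:complexType><xs:sequence>" +
            "<xs:element name='ID' type='xs:int'/>" +
            "<xs:element name='Name' type='xs:string'/>" +
            "</xs:sequence></xs:complexType></xs:element></xs:schema>";

        string good = "<Employee xmlns='urn:pleaseWork'><ID>1</ID><Name>A</Name></Employee>";
        string missingId = "<Employee xmlns='urn:pleaseWork'><Name>A</Name></Employee>";

        Console.WriteLine(IsValid(good, xsd, "urn:pleaseWork"));
        Console.WriteLine(IsValid(missingId, xsd, "urn:pleaseWork"));
    }
}
```

The main gain is that the valid/invalid flag lives inside the method instead of in a class-level _isValid field, so one call can't leave stale state behind for the next.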
 
Job's a good'un!

Took your advice Diesel.

Here is the code:

Code:
    ''' <summary>
    ''' Checks the schema of the XML document against the agreed schema.
    ''' </summary>
    ''' <param name="xmlDoc"></param>
    ''' <returns></returns>
    ''' <remarks></remarks>
    Private Function ValidXML(ByVal xmlDoc As XmlDocument, ByVal requestID As Int32) As Boolean
        ' Firstly check to see if the namespace exists on the root node.
        ' DocumentElement is used so that a leading XML declaration is skipped.
        If xmlDoc.DocumentElement.Attributes("xmlns") Is Nothing Then
            ' It doesn't exist.
            Return False
        Else
            If xmlDoc.DocumentElement.Attributes("xmlns").Value <> Resources.Resource.NameSpace Then
                ' It does exist but it does not match the desired namespace.
                Return False
            End If
        End If
        ' Get the appropriate schema.
        Select Case requestID
            Case requestType.insert
                xmlDoc.Schemas.Add(Resources.Resource.NameSpace, Server.MapPath(Resources.Resource.AddEmployeePath))
            Case requestType.update
                xmlDoc.Schemas.Add(Resources.Resource.NameSpace, Server.MapPath(Resources.Resource.UpdateEmployeePath))
            Case requestType.delete
                xmlDoc.Schemas.Add(Resources.Resource.NameSpace, Server.MapPath(Resources.Resource.DeleteEmployeePath))
        End Select
        ' Validate the XML document, starting from a clean flag.
        _isValid = True
        xmlDoc.Validate(New System.Xml.Schema.ValidationEventHandler(AddressOf SchemaValidationEventHandler))
        Return _isValid
    End Function

    ''' <summary>
    ''' Delegated method to handle invalid documents.
    ''' </summary>
    ''' <param name="sender"></param>
    ''' <param name="e"></param>
    ''' <remarks></remarks>
    Private Sub SchemaValidationEventHandler(ByVal sender As Object, ByVal e As System.Xml.Schema.ValidationEventArgs)
        If e.Severity = XmlSeverityType.Error Then
            _isValid = False
        End If
    End Sub

So the namespace and paths etc are all in the resource file :D.

I even do a check at the start of the process to make sure that the namespace the code expects has been included in the root node of the XML.

On your second point about the anonymous call: the bit of reading I just did on Google suggests that VB.NET might not be capable of this. I am sure if I am wrong about this I will be told, so apologies in advance. I am starting to code in C#.NET at home. Got myself some good books, and everything I write at home will be in C#.NET. It will be tempting to revert back to VB.NET but I want to make the transition to being a C#.NET developer.

Anyway thanks once again Diesel. :D

Have a good weekend, Dave.
 
Oh, you're right, VB doesn't have anonymous functions. Oh well, it would have only been a minor improvement in the code.

Lookin good.
 