Bali Win32 Validatelet

$Id: win32.html,v 1.2 2002/09/23 18:08:27 Bear Exp $
By Kohsuke KAWAGUCHI

Bali can produce C++ source code targeted at the Win32/Visual C++ environment. The generated code can validate an MSXML DOM tree or MSXML SAX events.

Generating a validatelet

To compile myschema.rng into a validatelet in the foo namespace and write it to the src folder, run the following command:

$ java -jar bali.jar myschema.rng -ow src foo bar

The "-ow" option takes three parameters: the output folder (src), the C++ namespace (foo), and the stem of the factory method names (bar).

This generates bar.h and bar.cpp in the src folder, along with the other source and header files needed to run a validatelet.

Writing code to use validatelet

Project set-up

First, you need to modify stdafx.h and have it include a couple of header files necessary for a validatelet. **********************************

For DOM

The generated bar.h defines methods that create a validatelet. Your application should include this header file and call these methods to obtain an instance of the validatelet.

To create a validatelet for MSXML DOM, call the createBarDOMValidatelet() method as follows:

bali::IValidatelet* pValidatelet = foo::createBarDOMValidatelet();

The IValidatelet interface defines one method that takes an MSXML DOM document or element and validates it. If you pass a document, this method validates the whole document. If you pass an element, it validates the sub-tree rooted at that element.

Upon completion, this method returns true if the validation was successful, and false otherwise. You can also receive a pointer to the node that caused the failure (see the header file for details).

Thus the code will look like:

// load a document
IXMLDOMDocumentPtr pDOMDocument(__uuidof(DOMDocument));
pDOMDocument->async = false;
pDOMDocument->load(myXml);

if( pValidatelet->validate( pDOMDocument ) )
{
    // this document is OK. The show will go on
}
else
{
    // abort. there was an error.
}

The same validatelet object can be reused many times, as long as it is used by only one thread at any given time.

Once you have finished using a validatelet, it is your responsibility to delete it.

delete pValidatelet;
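If your compiler supports C++11, the manual delete can be handed over to std::unique_ptr so that the validatelet is also freed on early returns. The following is only a sketch: the IValidatelet declared here is a stand-in whose validate() takes a string instead of an MSXML node, and StubValidatelet and liveValidatelets are hypothetical names made up so that the example compiles without the generated headers.

```cpp
#include <memory>
#include <string>

namespace bali {
// Stand-in for the generated interface. The real validate() takes an MSXML
// DOM node; a string is used here only so the sketch compiles anywhere.
struct IValidatelet {
    virtual ~IValidatelet() {}
    virtual bool validate(const std::string& doc) = 0;
};
}

namespace foo {
int liveValidatelets = 0;  // bookkeeping for this sketch only

// Hypothetical stub: treats any non-empty input as valid.
struct StubValidatelet : bali::IValidatelet {
    StubValidatelet() { ++liveValidatelets; }
    ~StubValidatelet() { --liveValidatelets; }
    bool validate(const std::string& doc) { return !doc.empty(); }
};

// Same name as the generated factory, but returning the stub above.
bali::IValidatelet* createBarDOMValidatelet() { return new StubValidatelet; }
}

// Validates two inputs with one validatelet. The unique_ptr deletes the
// validatelet on every return path.
bool validateBoth(const std::string& first, const std::string& second) {
    std::unique_ptr<bali::IValidatelet> pValidatelet(foo::createBarDOMValidatelet());
    if (!pValidatelet->validate(first)) {
        return false;  // early exit, the validatelet is still deleted
    }
    return pValidatelet->validate(second);
}
```

With the real generated code, only the body of validateBoth would carry over, and it relies on plain delete being the right way to destroy the object, as described above.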

For SAX

Another method defined in the header file is the createBarSAXValidatelet method. This method returns an ISAXContentHandler interface that implements a validatelet. This interface allows you to validate SAX events.

MSXML2::ISAXContentHandler* pValidatelet = foo::createBarSAXValidatelet();

You can then use this handler to validate SAX events. Every time the startDocument callback is called, a validatelet re-initializes itself. Thus you can reuse one SAX validatelet as many times as you want.

Since a SAX validatelet is a COM object, you need to release it explicitly.

pValidatelet->Release();
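To avoid forgetting the Release() call on an error path, the pointer can be wrapped in a small guard object. The sketch below assumes that the factory hands you exactly one reference, which is what the single Release() call above suggests. StubHandler, ReleaseGuard and parseWithHandler are hypothetical names used only for illustration; StubHandler stands in for the real COM object so that the example compiles without MSXML.

```cpp
// Hypothetical stand-in for the COM object returned by the factory.
// It starts with one reference and destroys itself when the count hits zero.
struct StubHandler {
    unsigned long refs;
    bool* destroyedFlag;
    explicit StubHandler(bool* flag) : refs(1), destroyedFlag(flag) {}
    unsigned long AddRef() { return ++refs; }
    unsigned long Release() {
        unsigned long remaining = --refs;
        if (remaining == 0) {
            *destroyedFlag = true;
            delete this;
        }
        return remaining;
    }
};

// Takes over the caller's reference (no AddRef) and releases it on scope exit.
template <class T>
class ReleaseGuard {
public:
    explicit ReleaseGuard(T* p) : p_(p) {}
    ~ReleaseGuard() { if (p_) p_->Release(); }
    T* get() const { return p_; }
private:
    ReleaseGuard(const ReleaseGuard&);             // not copyable
    ReleaseGuard& operator=(const ReleaseGuard&);  // not assignable
    T* p_;
};

// Returns false on the simulated early failure; the handler is released
// on both paths.
bool parseWithHandler(bool failEarly, bool* destroyed) {
    ReleaseGuard<StubHandler> pValidatelet(new StubHandler(destroyed));
    if (failEarly) {
        return false;  // the guard still calls Release()
    }
    // ... the handler would be passed to a SAX reader here ...
    return pValidatelet.get() != 0;
}
```

On Windows, the compiler's _com_ptr_t or ATL's CComPtr play the same role. With those, take care to attach the returned pointer rather than assign it, so that no extra AddRef() is made.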

Request for comments

Although it was my favorite platform until a year or two ago, I no longer work on the Win32 platform. Therefore, it is very likely that the interface I provided here is somewhat different from what it should be.

I welcome any comments on how a validatelet should be exposed to the client application.

Limitations

  1. Right now, the only supported datatype library is the RELAX NG built-in datatypes. XML Schema datatypes are not supported simply because I couldn't find an implementation for this platform. Let me know if you have one.
  2. Because of the validation algorithm, the generated validatelet is not necessarily a fail-fast validator. That is, when it reads a start element, it does not validate all the attributes immediately. Thus if your code accesses attributes as soon as it sees the start element, that code could be exposed to invalid documents.
  3. A validatelet doesn't support recovery from an error. After reporting the first error, it simply stops validation.
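One possible way to live with the second limitation is to postpone any work that depends on attribute values until the validatelet has reported success. The following is a minimal sketch of that idea, not part of Bali: DeferredActions and collectIds are hypothetical names, and the documentIsValid flag stands in for the result you would get from the validatelet.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Queues work while a document is being read and runs it only if the
// document turns out to be valid.
class DeferredActions {
public:
    void defer(std::function<void()> action) {
        pending_.push_back(std::move(action));
    }
    // Runs the queued actions in order if the document is valid,
    // otherwise drops them. The queue is empty afterwards either way.
    void finish(bool documentIsValid) {
        if (documentIsValid) {
            for (const std::function<void()>& action : pending_) {
                action();
            }
        }
        pending_.clear();
    }
private:
    std::vector<std::function<void()> > pending_;
};

// Example: attribute values seen at start elements are recorded only
// once validation has succeeded.
std::vector<std::string> collectIds(const std::vector<std::string>& ids,
                                    bool documentIsValid) {
    std::vector<std::string> accepted;
    DeferredActions actions;
    for (const std::string& id : ids) {
        actions.defer([&accepted, id]() { accepted.push_back(id); });
    }
    actions.finish(documentIsValid);
    return accepted;
}
```

The cost is that nothing happens until the end of the document, so this fits applications that can afford to wait for the final verdict.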