shilling for my feature request [1926]

Fri Jul 30 23:48:37 EDT 2004

On Friday, July 30, 2004, at 05:59 PM, Troy Rollins wrote:

> For those of us who aren't familiar with the term "pull-parser", I'll 
> ask some indulgence. Like most of us, I parse a lot of string content, 
> and I'm familiar with a number of models for parsing XML content, for 
> instance. But, what specifically is a pull-parser and how does #1926 
> make it possible / better?
> --
> Troy

OK, school time.

From here:
http://otn.oracle.com/tech/xml/xdk/staxpreview.html

"Oracle StAX Pull Parser Preview

A new XML document parsing technology is being developed as part of the 
Java Community Process to supplement DOM and SAX. Called Streaming API 
for XML or StAX, this technology is being developed under JSR-173 and 
is in its final draft stage. StAX parsing has significant advantages 
over DOM and SAX which are discussed in the Sep/Oct. issue of Oracle 
Magazine in the article - Parsing XML Efficiently.

StAX gives parsing control to you through either a simple 
iterator-based API and an underlying stream of events or a cursor style 
object API. Methods such as next() and hasNext() allow you to pull the 
event by asking for next one rather than handling it in a callback. 
This gives you precise control over XML document processing. As 
distinct from other event-based approaches StAX allows you to stop 
processing the document, skip ahead to sections of the document, or get 
subsections of the document."

More on pull-parsers here:
http://www.xmlpull.org/
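
To make the "pull" idea concrete, here is a minimal sketch in Python
using the standard library's xml.dom.pulldom (not StAX itself, just the
same style of API). The document, the element names, and the helper
first_order_texts() are made up for illustration. The caller asks for
events one at a time, expands only the subsection it cares about, and
stops early without reading the rest:

```python
from xml.dom import pulldom

# Made-up sample document, for illustration only.
DOC = (
    "<orders>"
    "<order id='1'>apples</order>"
    "<order id='2'>pears</order>"
    "<order id='3'>plums</order>"
    "</orders>"
)

def first_order_texts(xml_text, limit):
    """Pull events one by one and stop once `limit` orders are collected."""
    events = pulldom.parseString(xml_text)
    found = []
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == "order":
            # Ask the stream to build just this subsection of the document.
            events.expandNode(node)
            text = "".join(
                child.data
                for child in node.childNodes
                if child.nodeType == child.TEXT_NODE
            )
            found.append((node.getAttribute("id"), text))
            if len(found) == limit:
                break  # we decide when to stop, not a callback
    return found

print(first_order_texts(DOC, 2))  # [('1', 'apples'), ('2', 'pears')]
```

The third order is never expanded, which is the control the quote above
is talking about.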

So...
These new split functions would let us set our own rules for next(), 
nextTag(), and nextText() while streaming fragments out of full XML 
documents. With high-speed functions for pulling data out of large 
documents, we would no longer need to rely on the streaming method, and 
that would leave the current pull-parser implementations further 
behind.
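
A rough sketch of what I mean, in Python rather than Transcript, and
assuming nothing about how #1926 would actually be implemented:
pull_fragments() is a hypothetical helper that uses plain string
searches on delimiters we choose, and hands back one fragment each time
the caller asks for the next one.

```python
def pull_fragments(text, open_tag, close_tag):
    """Lazily yield the text between each open_tag/close_tag pair.

    Hypothetical helper: the "rules" are just the two delimiter strings.
    """
    pos = 0
    while True:
        start = text.find(open_tag, pos)
        if start == -1:
            return  # no more fragments
        start += len(open_tag)
        end = text.find(close_tag, start)
        if end == -1:
            return  # unterminated fragment, stop quietly
        yield text[start:end]
        pos = end + len(close_tag)

# Made-up sample data.
doc = "<log><item>a</item><item>b</item><item>c</item></log>"

puller = pull_fragments(doc, "<item>", "</item>")
first = next(puller)   # "a"
second = next(puller)  # "b"
print(first, second)
```

Nothing past the second item is scanned until the caller asks again, so
the same pull style works without a full XML parser behind it.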

MTML breaks the rules in a way that XML was never meant to. An MTML 
tag set can begin inside another tag set and end outside it. That would 
break most XML parsers, and even some of the new streaming designs 
built as pull-parser implementations. All this adds up to the designer 
of the data structure being able to run simple, modified data 
transfers. "This is a good thing" (Martha Stewart). It's better to dust 
off your competitors if you can offer the option. Development time 
within RunRev, combined with this kind of data structuring, can be a 
winning combination for you when it comes to offering services.
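
Here is a small Python sketch of the overlap problem. The tag names
and the between() helper are invented for illustration and are not real
MTML syntax. A conforming XML parser rejects the overlapping tag sets,
while a plain delimiter search still pulls out each one:

```python
import xml.etree.ElementTree as ET

# <mark> opens inside <note> but closes after </note>: not well-formed XML.
overlapping = "<doc><note>first <mark>shared</note> second</mark></doc>"

def between(text, name):
    """Return the raw text between <name> and </name>, or None if absent."""
    open_tag, close_tag = f"<{name}>", f"</{name}>"
    start = text.find(open_tag)
    if start == -1:
        return None
    start += len(open_tag)
    end = text.find(close_tag, start)
    if end == -1:
        return None
    return text[start:end]

try:
    ET.fromstring(overlapping)
    xml_ok = True
except ET.ParseError:
    xml_ok = False  # the XML parser refuses the overlapping tags

note = between(overlapping, "note")  # "first <mark>shared"
mark = between(overlapping, "mark")  # "shared</note> second"
print(xml_ok, repr(note), repr(mark))
```

The fragments still contain the other tag set's markers, so what to do
with them is up to whoever designed the data structure.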

HTH,

Mark