RE: Full Thrust Format for Email games

From: Alex Williams <thantos@d...>
Date: Fri, 30 May 1997 15:51:08 -0400
Subject: RE: Full Thrust Format for Email games

On Fri, 30 May 1997, Joachim Heck - SunSoft wrote:

>   That's funny - that's just the problem I find with the system you
> propose here.  Braces, or any delimiter are nice because they make a
> recursive parse easy to keep track of.

I've never really found that to be the case for my idiom (and I wrote
LISP/Scheme code for a long time; more delimiters than you can shake a
stick at).  In either case, some flag retains information about
whether or not you've completed the current section; at some point
you've got to read the next data-section into memory (or operate on
the next 'chunk' if you just slurped the whole thing up as I'm prone
to do).

>   The only problem I have with this is that you need to use look-ahead
> logic to parse it.  Example: I have read in the data in your example.
> Now imagine there's another DSIIVehicle: line in the file.  In order
> to correctly read in the information, I need to recognize that that
> line signifies not only the beginning of a new vehicle description,
> but the end of the current description.  So I have to finalize my
> parsing of the current vehicle and begin parsing the next one.  This
> happens, unfortunately, after I've already read in the first line for

Actually, you don't have to do look-ahead processing at all; the way
to avoid it is to keep the 'chunks' (in this case, recursively smaller
pieces of the original message-data) in memory.  You then extract the
currently active 'chunk', via whatever means, and examine its first
few bytes.  If it matches a break for current processing, you finalize
the current object, store/save it and initialize a new object/item.
When you start processing the new object/item, you begin reading the
'chunk' from the beginning again.  That removes the necessity for true
look-ahead processing: you just draw up the next 'chunk' and dispatch
it to the right handler for that chunk, depending on its
identification field.
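
A rough Python sketch of the loop described above, to make it concrete.
It assumes each 'chunk' is one "Key: value" line and that the break
keys are the ones named in this thread; the names BREAKS and parse, the
dict layout and the sample values are made up for illustration, not
part of any agreed format.

```python
# Hypothetical sketch: the "Key: value" layout and the BREAKS set are
# assumptions made for illustration only.
BREAKS = ("DSIIVehicle", "FTShip", "SGSquad")


def parse(text):
    """Turn message text into a list of finished objects (plain dicts)."""
    # Slurp everything into memory as a list of chunks (non-blank lines).
    chunks = [ln.strip() for ln in text.splitlines() if ln.strip()]
    objects = []
    current = None
    for chunk in chunks:
        # Examine the identification field at the front of the chunk.
        key, _, value = chunk.partition(":")
        key, value = key.strip(), value.strip()
        if key in BREAKS:
            # A break chunk: finalize the current object, start a new one.
            if current is not None:
                objects.append(current)
            current = {"type": key, "name": value, "fields": []}
        elif current is None:
            raise ValueError(f"chunk outside any object: {chunk!r}")
        else:
            current["fields"].append((key, value))
    if current is not None:
        objects.append(current)  # finalize the last object
    return objects
```

Only one object is ever 'open' at a time, and the chunk that ends one
object is the same chunk that starts the next, so nothing has to be
pushed back onto the input.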

> the new vehicle.  Ideally (as far as I'm concerned) the program would
> never have to deal with two objects simultaneously - putting braces
> around things allows this since a closing brace can always be matched
> with its opening brace to figure out whether the description is
> finished or not.

In my suggested processing algorithm, you don't.  You always finalize
the current object being processed before initializing the next, at
whatever level you're working at.  We assume that each block is
'finished' when it's written out to file; if it's missing a 'vital
field,' information that processing can't proceed without (such as a
DSIIVehicle field or a Size field in the same chunk), you signal an
exception and abort processing.  This simplifies the processing and
error checking immeasurably.  Braces actually just act
as 'syntactic sugar' in the evaluation since they don't add
information to the data you don't already have.
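
The 'vital field' check at finalization might be sketched like this in
Python.  The VITAL_FIELDS table, the MissingFieldError name and the
finalize function are assumptions for the sake of the example; which
fields are really vital depends on whatever format gets agreed on.

```python
class MissingFieldError(Exception):
    """Raised when a block lacks a field processing can't proceed without."""


# Hypothetical table of vital fields per block type.
VITAL_FIELDS = {"DSIIVehicle": ("DSIIVehicle", "Size")}


def finalize(block_type, fields):
    """Check the vital fields, then return the finished block as a dict."""
    missing = [f for f in VITAL_FIELDS.get(block_type, ()) if f not in fields]
    if missing:
        # Signal an exception and abort processing of this block.
        raise MissingFieldError(
            f"{block_type} block is missing: {', '.join(missing)}"
        )
    return dict(fields, type=block_type)
```

Because the check runs once, when the block is closed off, no closing
delimiter is needed to tell a complete block from a truncated one.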

>   I should point out that A) I don't do this for a living, and B) I
> have something of an object-oriented perspective on these matters,
> since the only FT parser I've written was OO.  So the idea there was
> that I could have objects read themselves out of a file.  This scheme
> gets tricky when the object has to prime the pump for the next object
> (about which it should know nothing).

Instead of structuring it as Objects reading themselves out of a file,
think of it as a master application Object which draws up the next
chunk, examines it, then passes it as initialization data to a
specific type of Object that knows how to interpret that chunk to
initialize its data attributes.  Once that chunk is done and an
initial instance is created, the master ap Obj looks at the next
chunk, and if it's another chunk it knows how to handle, it
dispatches.  If not, it passes it to the last Object it instantiated,
which may or may not know how to handle the chunk, and proceeds
likewise.  If the chunk falls off the end of the instance chain, it's a
bad chunk and you raise an exception; if it gets handled by an Object,
you end up with a new instance of an Object.

In this particular discussion, you might have an ApObj that recognizes
chunks that begin with 'DSIIVehicle', 'FTShip' or 'SGSquad'.  Upon
reading a chunk that begins appropriately, it passes the chunk to the
named Class and creates an instance of the result.  Subsequent
unrecognized chunks, beginning with 'Weapon', for example, would be
passed to DSIIVehicle, let's say, who would then create instances of
the Weapon class its module was aware of.  An FTShip instance might use
a field of the same name, 'Weapon', but require different fields and
accept somewhat different formats.

Thus you have a fully OOP hierarchy with dynamic instance creation that
dispatches input to the appropriate handling subclass.
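
Here is a minimal Python sketch of that dispatch scheme, assuming one
"Key: value" chunk per line and only a single level of delegation (the
ApObj hands unrecognized chunks to the last instance it created, and
no further).  The class bodies, the handle/feed method names and the
two Weapon formats are invented for illustration.

```python
class Weapon:
    def __init__(self, value):
        self.name = value


class DSIIVehicle:
    def __init__(self, value):
        self.name = value
        self.weapons = []

    def handle(self, key, value):
        """Return True if this instance knows how to handle the chunk."""
        if key == "Weapon":
            self.weapons.append(Weapon(value))
            return True
        return False


class FTShip:
    def __init__(self, value):
        self.name = value
        self.weapons = []

    def handle(self, key, value):
        if key == "Weapon":
            # Same field name, different (made-up) format: "class arc".
            self.weapons.append(tuple(value.split()))
            return True
        return False


class ApObj:
    # Chunk identifiers the master object recognizes on its own.
    handlers = {"DSIIVehicle": DSIIVehicle, "FTShip": FTShip}

    def __init__(self):
        self.instances = []

    def feed(self, chunk):
        key, _, value = chunk.partition(":")
        key, value = key.strip(), value.strip()
        cls = self.handlers.get(key)
        if cls is not None:
            self.instances.append(cls(value))  # new top-level instance
        elif not (self.instances and self.instances[-1].handle(key, value)):
            # Nobody claimed it: the chunk fell off the end of the chain.
            raise ValueError(f"bad chunk: {chunk!r}")
```

Feeding it "DSIIVehicle: Tank" followed by "Weapon: HKP/3" leaves one
vehicle holding one Weapon; a 'Weapon' chunk that follows an FTShip
chunk goes through FTShip's own handler instead.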

>   That might be a more complicated way of going about things than is
> necessary.  Actually, other than the delimiters marking the beginning
> and ending of objects, our formats are quite similar.

I'm not a big fan of delimiters anymore; after getting into Python
(where blocks are defined by indentation and every opportunity is made
to simplify syntax), it's very difficult to see the point of using
delimiters when doing so encodes no additional info into the data.

-- 
[  Alexander Williams {thantos@alf.dec.com/zander@photobooks.com}  ]
[ Alexandrvs Vrai,  Prefect 8,000,000th Experimental Strike Legion ]
[	     BELLATORES INQVIETI --- Restless Warriors		   ]
====================================================================
		      "There are no innocents."