Parsers


Glass
06-03-2005, 09:00 PM
I've been talking to some guys, and they say that they seldom use anything BUT parsers for text hacking and stuff.

How do you make a parser, and what exactly does it do?

uneek
06-04-2005, 04:17 AM
A parser is just a fancy word for something that analyzes the grammatical structure of its input. In other words, if you have the expression 'The dog runs', you can break it down according to the grammatical structure of English, i.e. 'article' + 'noun' + 'verb'.

Perhaps most commonly, parsers (in conjunction with other software) are used to produce programs from source code.
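To make the 'The dog runs' example concrete, here is a minimal C++ sketch. The six-word lexicon and the names `classify` and `isSentence` are made up for illustration; a real natural-language parser needs a far bigger dictionary and grammar.

```cpp
#include <cctype>
#include <map>
#include <sstream>
#include <string>
#include <vector>

enum class PartOfSpeech { Article, Noun, Verb, Unknown };

// Look a word up in a tiny, hypothetical lexicon (case-insensitive).
PartOfSpeech classify(const std::string& word)
{
    static const std::map<std::string, PartOfSpeech> lexicon = {
        {"the", PartOfSpeech::Article}, {"a", PartOfSpeech::Article},
        {"dog", PartOfSpeech::Noun},    {"cat", PartOfSpeech::Noun},
        {"runs", PartOfSpeech::Verb},   {"sleeps", PartOfSpeech::Verb},
    };
    std::string lower;
    for (char c : word)
        lower += static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    const auto it = lexicon.find(lower);
    return it == lexicon.end() ? PartOfSpeech::Unknown : it->second;
}

// Sentence ::= Article Noun Verb
bool isSentence(const std::string& text)
{
    std::istringstream in(text);
    std::vector<PartOfSpeech> tags;
    std::string word;
    while (in >> word)
        tags.push_back(classify(word));  // tag each whitespace-separated word
    return tags.size() == 3
        && tags[0] == PartOfSpeech::Article
        && tags[1] == PartOfSpeech::Noun
        && tags[2] == PartOfSpeech::Verb;
}
```

With this, `isSentence("The dog runs")` matches the one rule, while "dog the runs" does not, because the parts of speech arrive in the wrong order.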

breadcrust
06-04-2005, 12:33 PM
Yes, there are _lots_ of types of parsers: for markup, programming languages, spoken languages, special formats like CSS, binary formats (like ROMs), etc.

How you make a parser depends a lot on what type of parser you're making.

Glass
06-05-2005, 02:02 AM
> How you make a parser depends a lot on what type of parser
> you're making.
>
Well, I want to do script dumps of games. How about that?

uneek
06-05-2005, 02:41 AM
If it were me (and I haven't got a ton of knowledge in this area), I'd start by writing a formal grammar for the script. The next step is to document what the various bytes mean, and then write a parser to analyze the expressions in the script dump. For example, say there is a "show window" event, and the byte for that event is 0xCA. Say the 0xCA is followed by a pointer to the text for the window, then a y coord and an x coord, and then the width and height of the window. My grammar for this expression might be something like:

Event ::= ShowWndEvent
ShowWndEvent ::= SWByte TextP COORD DIM
SWByte ::= PRIMITIVE_WORD
TextP ::= PRIMITIVE_CHARP
COORD ::= PRIMITIVE_WORD PRIMITIVE_WORD
DIM ::= COORD

The parser can then, for example, fetch a token from the stream, see that it's 0xCA, and match it to the show window event. It then checks whether the following tokens match the rest of the event; if so, it determines there is a show window event. If the rest of the tokens don't match, the parser looks for some other possible match or, if none is found, throws an exception.
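A rough C++ sketch of that fetch-and-match step might look like the following. Big assumptions here: the event byte is a single byte, and the text pointer and every coordinate are 16-bit little-endian words. A real game may lay things out differently, and the names `ShowWndEvent`, `readWord` and `parseEvent` are just for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Decoded form of: ShowWndEvent ::= SWByte TextP COORD DIM
struct ShowWndEvent {
    std::uint16_t textPtr;        // pointer to the window text
    std::uint16_t y, x;           // COORD: y first, then x
    std::uint16_t width, height;  // DIM
};

// ASSUMPTION: every value after the event byte is a 16-bit little-endian word.
std::uint16_t readWord(const std::vector<std::uint8_t>& data, std::size_t& pos)
{
    if (pos + 2 > data.size())
        throw std::runtime_error("unexpected end of script");
    const auto value =
        static_cast<std::uint16_t>(data[pos] | (data[pos + 1] << 8));
    pos += 2;
    return value;
}

// Fetch the next token, match it to an event, then check what follows it.
ShowWndEvent parseEvent(const std::vector<std::uint8_t>& data, std::size_t& pos)
{
    if (pos >= data.size())
        throw std::runtime_error("unexpected end of script");
    switch (data[pos]) {
    case 0xCA: {
        std::size_t cursor = pos + 1;  // work on a copy of the position
        ShowWndEvent ev{};
        ev.textPtr = readWord(data, cursor);
        ev.y = readWord(data, cursor);
        ev.x = readWord(data, cursor);
        ev.width = readWord(data, cursor);
        ev.height = readWord(data, cursor);
        pos = cursor;  // commit only once the whole event matched
        return ev;
    }
    // Other event bytes would get their own cases here.
    default:
        throw std::runtime_error("no event matches this byte");
    }
}
```

The caller's position only moves forward after a complete match, so a failed attempt leaves the stream where it was and another rule can still be tried from the same byte.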

This is pretty simplified, but it should give you some idea of what you're getting into...

As far as the Event, COORD, etc. stuff goes, that is just notation. You can easily produce this form in C++ or another programming language, something like:

// Event ::= ShowWndEvent
void Event()
{
    if (ShowWindowEvent()) {
        AddExpr(ShowWindowEventExpr);
    }
}

// True if the next tokens form a show window event.
bool ShowWindowEvent()
{
    return NextTok() == SWByte() && NextTok() == TextP() ...;
}
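For a version that actually compiles, here is one possible sketch with one member function per grammar rule. The `ScriptParser` class, the two-byte size for words and pointers, and the string pushed in place of `AddExpr` are all assumptions made for illustration, not anything taken from a real game.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical parser over raw script bytes; one function per grammar rule.
class ScriptParser {
public:
    explicit ScriptParser(std::vector<std::uint8_t> bytes)
        : data_(std::move(bytes)) {}

    // Event ::= ShowWndEvent
    bool Event()
    {
        if (ShowWndEvent()) {
            exprs_.push_back("ShowWndEvent");  // stands in for AddExpr(...)
            return true;
        }
        return false;
    }

    const std::vector<std::string>& exprs() const { return exprs_; }
    std::size_t position() const { return pos_; }

private:
    // ShowWndEvent ::= SWByte TextP COORD DIM
    bool ShowWndEvent()
    {
        const std::size_t start = pos_;
        if (SWByte() && TextP() && Coord() && Dim())
            return true;
        pos_ = start;  // rewind so another rule can try from the same spot
        return false;
    }

    // The event byte from the example, 0xCA.
    bool SWByte()
    {
        if (pos_ < data_.size() && data_[pos_] == 0xCA) {
            ++pos_;
            return true;
        }
        return false;
    }

    // ASSUMPTION: words and text pointers are both two bytes wide.
    bool Word()
    {
        if (pos_ + 2 > data_.size())
            return false;
        pos_ += 2;
        return true;
    }

    bool TextP() { return Word(); }
    bool Coord() { return Word() && Word(); }  // COORD ::= PRIMITIVE_WORD PRIMITIVE_WORD
    bool Dim() { return Coord(); }             // DIM ::= COORD

    std::vector<std::uint8_t> data_;
    std::size_t pos_ = 0;
    std::vector<std::string> exprs_;
};
```

Each rule returns true only if it consumed what it expected, and `&&` stops at the first rule that fails. The rewind in `ShowWndEvent` is what lets `Event` fall through to other alternatives once more event types are added.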

Once upon a time I was doing some modding for a game and wrote an assembler for a scripting language that the game uses. It wasn't particularly hard, but it did take quite a bit of time and got pretty tedious. If you're interested, I can link to the source code. It's pretty nasty and poorly documented, but it's not overly complicated imo =/

Edited by uneek on 06/04/05 09:44 PM.

Glass
06-05-2005, 04:02 PM
> Once upon a time I was doing some modding for a game and
> wrote an assembler for a scripting language that the game
> uses. It wasn't particularly hard, but it did take quite a bit
> of time and got pretty tedious. If you're interested, I can
> link to the source code. It's pretty nasty and poorly
> documented, but it's not overly complicated imo =/

Looks like I'll have to finish learning C++ first ^ _ ^

orcfan32
06-16-2005, 01:30 AM
>
> Looks like I'll have to finish learning C++ first ^ _ ^
>

That would mean that we're in the same boat!