The Open Source Swiss Army Knife

Permalink: parsers.txt
Title: add

parsing definitions

recursive descent
backtracking
left recursive productions
LL(1)

context free language: a BNF grammar can be used to express context-free
languages. most constructs in modern programming languages can be represented
in BNF.
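
To make the terms above concrete, here is a minimal sketch of a recursive descent parser in Python. The grammar is made up for illustration and does not come from these notes. Each nonterminal becomes one method. The grammar has no left-recursive productions, and one token of lookahead is enough to pick a production, so it is LL(1) and needs no backtracking.

```python
import re

# Hypothetical toy grammar, written in BNF:
#   <expr> ::= <term> <rest>
#   <rest> ::= "+" <term> <rest> | ""
#   <term> ::= NUMBER | "(" <expr> ")"

def tokenize(text):
    # Numbers, "+", "(" and ")" become tokens; anything else is silently skipped.
    return re.findall(r"\d+|[+()]", text)

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        # The single token of lookahead that makes this LL(1).
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError("unexpected token: %r" % (tok,))
        self.pos += 1
        return tok

    def expr(self):
        # <expr> ::= <term> <rest>, with <rest> unrolled into a loop.
        value = self.term()
        while self.peek() == "+":
            self.take("+")
            value += self.term()
        return value

    def term(self):
        # <term> ::= NUMBER | "(" <expr> ")"
        if self.peek() == "(":
            self.take("(")
            value = self.expr()
            self.take(")")
            return value
        tok = self.take()
        if not tok.isdigit():
            raise SyntaxError("expected a number, got %r" % (tok,))
        return int(tok)

def evaluate(text):
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError("trailing input")
    return value
```

For example, evaluate("1+(2+3)+4") walks the three productions and returns the sum of the numbers.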

myowndictionary and parsing

parsing is taking a string and decomposing it.

you need to do two things to parse something
1. decompose it
2. keep the symbols (separators) that sat between the tokens generated in (1),
   so they can be put back between the tokens later

by doing both of those things, you can

... make some rules ............

and recombine the original string somehow.

i have done that on myowndictionary.com

tokens: the text is split on spaces, periods and other punctuation. in fact,
the following pattern is used:

if ($from_lang=="english") $split_pattern = "/((?:[\.!\s\?,:-]|\\\")+)/";

which splits on whitespace, literal periods, exclamation points, literal
question marks, commas, colons, hyphens, double quotes. (semicolons are not in
the pattern as written.)

(one or more of the above)

it goes over a paragraph ....
it then generates the data structure of splits and words (i.e. in computer
terms, "tokens")
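
The splitting step can be sketched in Python as follows. This is a translation of the PHP pattern above, not the site's actual code; the name tokenize is made up here. Because the pattern sits inside a capture group, re.split keeps the separators in the result. PHP's preg_split with PREG_SPLIT_DELIM_CAPTURE behaves in a similar way, though the notes do not say which call is used.

```python
import re

# Python translation of the PHP $split_pattern for English:
# one or more of: period, "!", whitespace, "?", comma, colon, hyphen, double quote.
SPLIT_PATTERN = re.compile(r'((?:[.!\s?,:-]|")+)')

def tokenize(text):
    # The capture group makes re.split return the separators too, so the
    # result alternates: word, separator, word, separator, ...
    return SPLIT_PATTERN.split(text)

tokens = tokenize("Hello, world!")
# Even positions hold words, odd positions hold the "splits".
words = tokens[0::2]
splits = tokens[1::2]
```

Since nothing is thrown away, joining the list back together with "".join(tokens) reproduces the original string exactly. That is what makes the later recombination step possible.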


then, there are rules to recombine the original string somehow.

i make a new string, which is the original one, plus the HREF tags which
provide definitions of the words when the mouse pointer is put over a word.

most of the "application logic" comes from the database. the database provides
the rules for rewriting the tokens.
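
A minimal Python sketch of the recombination step, under stated assumptions: a plain dict stands in for the database, and the function name annotate and the href="#" / title markup are made up for illustration. They are not the site's real schema or output.

```python
import html
import re

# Same separator pattern as the PHP $split_pattern, translated to Python.
SPLIT_PATTERN = re.compile(r'((?:[.!\s?,:-]|")+)')

def annotate(text, definitions):
    """Rebuild text, wrapping every word found in definitions in a link.

    definitions is a dict standing in for the database lookup (assumption).
    """
    pieces = SPLIT_PATTERN.split(text)
    out = []
    for i, piece in enumerate(pieces):
        is_word = (i % 2 == 0)  # even positions are words, odd are separators
        meaning = definitions.get(piece.lower()) if is_word else None
        if meaning is None:
            # Separators and unknown words are copied through unchanged.
            out.append(html.escape(piece))
        else:
            out.append('<a href="#" title="%s">%s</a>'
                       % (html.escape(meaning, quote=True), html.escape(piece)))
    return "".join(out)
```

With definitions = {"cat": "a small feline"}, the call annotate("The cat sat.", definitions) leaves "The", "sat" and the punctuation alone and wraps only "cat" in a link whose title carries the definition.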

lex

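lex does the above: it splits the input into tokens by matching regular
expressions.

you can attach C code to each pattern; lex generates a C scanner which runs
that code for each match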
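
The pattern-plus-action idea can be sketched in Python. This is not lex itself and not its generated C code, only an imitation of the rule table; the rule names and token types are made up. As in lex, the longest match wins, and the earlier rule wins a tie.

```python
import re

# Each rule pairs a regular expression with an action, like a lex rule.
# An action of None means "match and discard" (used here for whitespace).
RULES = [
    (re.compile(r"\d+"), lambda s: ("NUMBER", int(s))),
    (re.compile(r"[A-Za-z_]\w*"), lambda s: ("WORD", s)),
    (re.compile(r"[+\-*/()]"), lambda s: ("OP", s)),
    (re.compile(r"\s+"), None),
]

def scan(text):
    pos = 0
    tokens = []
    while pos < len(text):
        best = None
        for pattern, action in RULES:
            m = pattern.match(text, pos)
            # Strict ">" keeps the earlier rule when two matches tie in length.
            if m and (best is None or m.end() > best[0].end()):
                best = (m, action)
        if best is None:
            raise ValueError("no rule matches at position %d" % pos)
        m, action = best
        if action is not None:
            tokens.append(action(m.group()))  # run the action for this match
        pos = m.end()
    return tokens
```

Calling scan("x1 + 42") runs the WORD action on "x1", the OP action on "+" and the NUMBER action on "42", and skips the spaces.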

yacc

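yacc grammars are written in a notation based on Backus-Naur form (BNF).

that means, as the parser proceeds, the tokens which it leaves behind .... if
two or more of them match the right-hand side of a grammar rule, they will
be combined (reduced) into a single symbol and then the parser will proceed.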

in that case, the string is reduced to nothing but a single token at the end
of the processing.
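
The combine-and-proceed behaviour can be sketched as a tiny shift-reduce loop in Python. This is a hand-written illustration for a made-up grammar of sums and differences. Real yacc generates table-driven LALR parsers, which this does not attempt to reproduce.

```python
# Hypothetical grammar:
#   expr : NUMBER
#        | expr '+' expr
#        | expr '-' expr
# Tokens are (type, value) pairs, e.g. ("NUMBER", 3) or ("+", None).

def reduce_stack(stack):
    # Keep combining the symbols on top of the stack while some rule matches.
    while True:
        if stack and stack[-1][0] == "NUMBER":
            # expr : NUMBER
            stack[-1] = ("expr", stack[-1][1])
        elif (len(stack) >= 3 and stack[-3][0] == "expr"
              and stack[-2][0] in ("+", "-") and stack[-1][0] == "expr"):
            # expr : expr '+' expr  |  expr '-' expr
            right = stack.pop()[1]
            op = stack.pop()[0]
            left = stack.pop()[1]
            stack.append(("expr", left + right if op == "+" else left - right))
        else:
            return

def parse(tokens):
    stack = []
    for tok in tokens:
        stack.append(tok)    # shift: push the next token
        reduce_stack(stack)  # reduce: combine what was left behind
    if len(stack) != 1 or stack[0][0] != "expr":
        raise SyntaxError("input did not reduce to a single expr")
    return stack[0][1]
```

For the token sequence of 1 + 2 - 4 the stack shrinks back to a single expr after every complete operation. At the end only one symbol remains, carrying the computed value.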

(sounds like making a calculator application, not processing natural language
in any way)

i need a bitwise parser which compares the value to the database values.

the regular expression matches of the language used against the database
will take care of gobbling up the parsing sequence.
if there is nothing else,
also if there is a match for anything in the "notes" fields,
there will have to be a regular expression match on the bit-wise type
to check for any of those fields.

