next up previous contents
Next: Building Parse Trees Up: Grammars Previous: Mixing Tabled and Prolog

So What Kind of Parser is it?

A pure DCG, one without extra arguments (and without look-ahead symbols which we haven't discussed at all), represents a context-free grammar, and the Prolog and XSB engines provide recognizers for it. A context-free recognizer is a program that, given a context-free grammar and an input string, responds ``yes'' or ``no'' according to whether the input string is in or is not in the language of the given grammar.

We noted that the recognizer that ``you get for free'' with Prolog is a recursive descent recognizer. The recognizer ``you get for free'' with XSB and tabling is a variant of Earley's algorithm, or an active chart recognition algorithm (ref Peter Norvig and B. Sheil.)

The worst-case complexity of the recognizer under XSB (with all recursive nonterminals tabled) is O(nk+1) where n is the length of the input string and k is the maximum number of nonterminals and terminals on the right-hand-side of any rule. A little thought shows that this is consistent with the discussion of the complexity of datalog programs under XSB in Chapter ??. This is an example of a situation in which tabling turns an otherwise exponential algorithm (recursive descent) into a polynomial one (active chart recognition.)

Any grammar can be changed to another grammar that represents the same language but has two (or fewer) nonterminal symbols on the right-hand-side of every rule. This is the so-called Chomsky normal form. So if we transform a grammar into this form, then its worst-case complexity will be O(n3).

In fact, the folding and tabling that is done automatically by the XSB compiler when the :- suppl_table. and :- edb word/2. directives are given results in exactly the transformation necessary to transform a grammar to Chomsky normal form. So giving those directives guarantee the best worst-case complexity.

For unambiguous grammars, the complexity is actually O(n2). I find this an intriguing situation, which is particularly pleasant, since it is undecidable whether a context-free grammar is or is not ambiguous. So it will be undecidable to determine what the actual complexity of the algorithm is, given a grammar. But, no problem; the algorithm will automatically tune itself to the data (grammar in this case) to be more efficient in the unambiguous case.

These complexity results are very good and make the parser that ``you get for free'' with XSB quite a desirable parser.


next up previous contents
Next: Building Parse Trees Up: Grammars Previous: Mixing Tabled and Prolog
David S. Warren
1999-07-31