next up previous contents
Next: On the Complexity of Up: Tabling and Datalog Programming Previous: Genome Examples

Inferring When to Table

Up to now whenever we wanted calls to a predicate to be tabled, we explicitly coded a table directive to indicate the specific predicate to table. There is a facility in XSB for the programmer to direct the system to choose what predicates to table, in which case the system will generate table directives automatically. There are two directives that control this process: auto_table/0 and suppl_table/1. When such a directive is placed in a source file, it applies to all predicates in that file when it is compiled.

auto_table/0 causes the compiler to table enough predicates to avoid infinite loops due to redundant procedure calls. The current implementation of auto_table uses the call graph of the program. There is a node in the call graph of a program for each predicate, P/N, that appears in the program. There is an edge from node for predicate P/N to the node for predicate Q/M if there is a rule in the program with an atom with predicate P/N in the head and a literal with predicate Q/M in the body. The algorithm constructs the call graph and then chooses enough predicates to table to ensure that all loops in the call graph are broken. The algorithm, as currently implemented in XSB, finds a minimal set of nodes that breaks all cycles. The algorithm can be exponential in the number of predicates in the worst case3.1. If the program is a Datalog program, i.e., it has no recursive data structures, then auto_table is guaranteed to make all queries to it terminate finitely. Termination of general programs is, of course, undecidable, and auto_table may or may not improve their termination characteristics.

The goal of auto_table is to guarantee termination of Datalog programs, but there are other uses tabling. Tabling can have a great effect on the efficiency of terminating programs. Example 3.5.1 illustrates how a multiway join predicate can use tabling to eliminate redundant subcomputations.

Example 3.5.1    Consider the following set of relations describing a student database for a college:
1.
student(StdId,StdName,Yr): Student with ID StdId and name StdName is in year Yr, where year is 1 for freshman, 2 for sophomores, etc.
2.
enroll(StdId,CrsId): Student with ID StdId is enrolled in the course with number CrsId.
3.
course(CrsId,CrsName): Course with number CrsId has name CrsName.
We define a predicate yrCourse/2, which, given a year, finds all the courses taken by some student who is in that year:
yrCourse(Yr,CrsName) :- 
    student(StdId,_,Yr), enroll(StdId,CrsId), course(CrsId,CrsName).
Note that it will most likely be the case that if one student of a given year takes a course then many students in the same year will take that same course. Evaluated directly, this definition will result in the course name being looked up for every student that takes a given course, not just for every course taken by some student. By introducing an intermediate predicate, and tabling it, we can elminate this redundancy:
yrCourse(Yr,CrsName) :- 
    yrCrsId(Yr,CrsId), course(CrsId,CrsName).

:- table yrCrsId/2.
yrCrsId(Yr,CrsId) :-
    student(StdId,_,Yr), enroll(StdId,CrsId).
The intermediate predicate yrCrsId is tabled and so will eliminate duplicates. Thus course will only be accessed once for each course, instead of once for each student. This can make a very large difference in evaluation time.

In this example a table has been used to eliminate duplicates that arise from the database operations of a join and a projection. Tables may also be used to eliminate duplicates arising from unions.

The suppl_table/1 directive is a means by which the programmer can ask the XSB system to perform such factoring automatically. The program:

:- edb student/3, enroll/2, course/2.
:- suppl_table(2).
yrCourse(Yr,CrsName) :- 
    student(StdId,_,Yr), enroll(StdId,CrsId), course(CrsId,CrsName).
will automatically generate a program equivalent to the one above with the new intermediate predicate and the table declaration.

To understand precisely how suppl_table works, we need to understand some distinctions and definitions of deductive databases. Predicates that are defined by sets of ground facts can be designated as extensional predicates. The extensional predicates make up the extensional database (EDB). The remaining predicates are called intensional predicates, which make up the intensional database (IDB), and they usually have definitions that depend on the extensional predicates. In XSB the declaration:

           :- edb student/3, enroll/2, course/2.
declares three predicates to be extensional predicates. (Their definitions will have to be given elsewhere.) We define the data dependency count of an IDB clause to be the number of tabled IDB predicate it depends on plus the number of EDB predicates it depends on (not through a tabled IDB predicate.) The command:
       :- suppl_table(2).
instructs XSB to factor any clause which has a data dependency count of greater than two. In Example 3.5.1 the data dependency count of the original form of join/2 is three, while after undergoing supplementary tabling, its count is two. Choosing a higher number for suppl_table results in less factoring and fewer implied table declarations.

The next subsection describes somewhat more formally how these transformations affect the worst-case complexity of query evaluation.



 
next up previous contents
Next: On the Complexity of Up: Tabling and Datalog Programming Previous: Genome Examples
David S. Warren
1999-07-31