Fixed some stuff for both versions -- now will start cutting.
git-svn-id: https://yap.svn.sf.net/svnroot/yap/trunk@1896 b08c6af1-5177-4d33-ba66-4b1c6b8b522a
parent a941b4d38e
commit 613a9ac5cf
@ -252,7 +252,7 @@ access big datasets and on the other they are unfit for static
|
|||||||
analysis since queries are often ad hoc and generated only during
|
analysis since queries are often ad hoc and generated only during
|
||||||
runtime as new hypotheses are formed or refined.
|
runtime as new hypotheses are formed or refined.
|
||||||
%
|
%
|
||||||
Our thesis is that the Prolog abstract machine should be able to adapt
|
Our thesis is that the abstract machine should be able to adapt
|
||||||
automatically to the runtime requirements of such or, even better, of
|
automatically to the runtime requirements of such or, even better, of
|
||||||
all applications by employing increasingly aggressive forms of dynamic
|
all applications by employing increasingly aggressive forms of dynamic
|
||||||
compilation. As a concrete example of what this means in practice, in
|
compilation. As a concrete example of what this means in practice, in
|
||||||
@ -284,10 +284,10 @@ clauses exceeds a threshold. For this reason the \switchONconstant and
|
|||||||
\instr{T} and the number of clauses \instr{N} the table contains (or
|
\instr{T} and the number of clauses \instr{N} the table contains (or
|
||||||
equivalently, \instr{N} is the size of the hash table). In each bucket
|
equivalently, \instr{N} is the size of the hash table). In each bucket
|
||||||
of this hash table and also in the bucket for the variable case of
|
of this hash table and also in the bucket for the variable case of
|
||||||
\switchONterm the code performs a sequential backtracking search of
|
\switchONterm the code sequentially backtracks through the clauses
|
||||||
the clauses using a \TryRetryTrust chain of instructions. The \try
|
using a \TryRetryTrust chain of instructions. The \try instruction
|
||||||
instruction sets up a choice point, the \retry instructions (if~any)
|
sets up a choice point, the \retry instructions (if~any) update
|
||||||
update certain fields of this choice point, and the \trust instruction
|
certain fields of this choice point, and the \trust instruction
|
||||||
removes it.
|
removes it.
|
||||||
|
|
||||||
The WAM has additional indexing instructions (\instr{try\_me\_else}
|
The WAM has additional indexing instructions (\instr{try\_me\_else}
|
||||||
@ -307,7 +307,7 @@ Fig.~\ref{fig:carc:index}. This code is typically placed before the
|
|||||||
code for the clauses and the \switchONconstant instruction is the
|
code for the clauses and the \switchONconstant instruction is the
|
||||||
entry point of predicate. Note that compared with vanilla WAM this
|
entry point of predicate. Note that compared with vanilla WAM this
|
||||||
instruction has an extra argument: the register on the value of which
|
instruction has an extra argument: the register on the value of which
|
||||||
we will index ($r_1$). This extra argument will allow us to go beyond
|
we index ($r_1$). This extra argument will allow us to go beyond
|
||||||
first argument indexing. Another departure from the WAM is that if
|
first argument indexing. Another departure from the WAM is that if
|
||||||
this argument register contains an unbound variable instead of a
|
this argument register contains an unbound variable instead of a
|
||||||
constant then execution will continue with the next instruction; in
|
constant then execution will continue with the next instruction; in
|
||||||
@ -452,18 +452,18 @@ instruction works as follows:
|
|||||||
\begin{itemize}
|
\begin{itemize}
|
||||||
\item if the argument register $r_i$ is a free variable, then
|
\item if the argument register $r_i$ is a free variable, then
|
||||||
execution continues with the next instruction;
|
execution continues with the next instruction;
|
||||||
\item otherwise, \JITI kicks in as follows. The abstract machine will
|
\item otherwise, \JITI kicks in as follows. The abstract machine
|
||||||
scan the WAM code of the clauses and create an index table for the
|
scans the WAM code of the clauses and creates an index table for the
|
||||||
values of the corresponding argument. It can do so because the
|
values of the corresponding argument. It can do so because the
|
||||||
instruction takes as arguments the number of clauses \instr{N} to
|
instruction takes as arguments the number of clauses \instr{N} to
|
||||||
index and the arity \instr{A} of the predicate. (In our example, the
|
index and the arity \instr{A} of the predicate. (In our example, the
|
||||||
numbers 5 and 3.) For Datalog facts, this information is sufficient.
|
numbers 5 and 3.) For Datalog facts, this information is sufficient.
|
||||||
Also, because the WAM byte code for the clauses has a very regular
|
Because the WAM byte code for the clauses has a very regular
|
||||||
structure, the index table can be created very quickly. Upon its
|
structure, the index table can be created very quickly. Upon its
|
||||||
creation, the \jitiONconstant instruction will get transformed to a
|
creation, the \jitiONconstant instruction gets transformed to a
|
||||||
\switchONconstant. Again this is straightforward because of the two
|
\switchONconstant. Again this is straightforward because of the two
|
||||||
instructions have similar layouts in memory. Execution of the
|
instructions have similar layouts in memory. Execution of the
|
||||||
abstract machine will continue with the \switchONconstant
|
abstract machine then continues with the \switchONconstant
|
||||||
instruction.
|
instruction.
|
||||||
\end{itemize}
|
\end{itemize}
|
||||||
Figure~\ref{fig:carg:jiti_single:after} shows the index table $T_2$
|
Figure~\ref{fig:carg:jiti_single:after} shows the index table $T_2$
|
||||||
@ -523,7 +523,7 @@ argument has been created.
|
|||||||
The main advantage of this scheme is its simplicity. The compiled code
|
The main advantage of this scheme is its simplicity. The compiled code
|
||||||
(Fig.~\ref{fig:carc:jiti_single:before}) is not significantly bigger
|
(Fig.~\ref{fig:carc:jiti_single:before}) is not significantly bigger
|
||||||
than the code which a WAM-based compiler would generate
|
than the code which a WAM-based compiler would generate
|
||||||
(Fig.~\ref{fig:carc:index}) and, even if \JITI turns out unnecessary
|
(Fig.~\ref{fig:carc:index}) and, if \JITI turns out unnecessary
|
||||||
during runtime (e.g. execution encounters only open calls or with only
|
during runtime (e.g. execution encounters only open calls or with only
|
||||||
the first argument bound), the extra overhead is minimal: the
|
the first argument bound), the extra overhead is minimal: the
|
||||||
execution of some \jitiONconstant instructions for the open call only.
|
execution of some \jitiONconstant instructions for the open call only.
|
||||||
@ -729,7 +729,7 @@ We describe the process of demand-driven index construction.
|
|||||||
Let $p/k$ be a predicate with $n$ clauses.
|
Let $p/k$ be a predicate with $n$ clauses.
|
||||||
%
|
%
|
||||||
At a high level, its indices form a tree whose root is the entry point
|
At a high level, its indices form a tree whose root is the entry point
|
||||||
of the predicate. For simplicity, we assume that the root node of the
|
of the predicate. For simplicity, assume that the root node of the
|
||||||
tree and the interior nodes corresponding to the index table for the
|
tree and the interior nodes corresponding to the index table for the
|
||||||
first argument have been constructed at compile time. Leaves of this
|
first argument have been constructed at compile time. Leaves of this
|
||||||
tree are the nodes containing the code for the clauses of the
|
tree are the nodes containing the code for the clauses of the
|
||||||
@ -746,12 +746,12 @@ instruction and the $T$ instructions are either a sequence of
|
|||||||
\TryRetryTrust instructions (if $l > 1$) or a \jump instruction (if
|
\TryRetryTrust instructions (if $l > 1$) or a \jump instruction (if
|
||||||
\mbox{$l = 1$}). Step~2.2 dynamically constructs an index table $\cal
|
\mbox{$l = 1$}). Step~2.2 dynamically constructs an index table $\cal
|
||||||
T$ whose buckets are the newly created interior nodes in the tree.
|
T$ whose buckets are the newly created interior nodes in the tree.
|
||||||
Each bucket associated with a single clause contains a \jump
|
Each bucket associated with a single clause contains a \jump to the
|
||||||
instruction to the label of that clause. Each bucket associated with
|
label of that clause. Each bucket associated with many clauses starts
|
||||||
many clauses starts with the $I$ instructions which are yet to be
|
with the $I$ instructions which are yet to be visited and continues
|
||||||
visited and continues with a \TryRetryTrust chain pointing to the
|
with a \TryRetryTrust chain pointing to the clauses. When the index
|
||||||
clauses. When the index construction is done, the instruction mutates
|
construction is done, the instruction mutates to a \switchSTAR WAM
|
||||||
to a \switchSTAR WAM instruction.
|
instruction.
|
||||||
%-------------------------------------------------------------------------
|
%-------------------------------------------------------------------------
|
||||||
\begin{Algorithm}[t]
|
\begin{Algorithm}[t]
|
||||||
\caption{Actions of the abstract machine with \JITI}
|
\caption{Actions of the abstract machine with \JITI}
|
||||||
@ -862,19 +862,20 @@ exist in the body of the clause (e.g., type tests such as
|
|||||||
Y}, numeric constraints such as \code{X > 0}, etc).
|
Y}, numeric constraints such as \code{X > 0}, etc).
|
||||||
|
|
||||||
A reasonable concern for \JITI is increased memory consumption during
|
A reasonable concern for \JITI is increased memory consumption during
|
||||||
runtime due to the index tables. In our experience, this does not seem
|
runtime due to the creation of index tables. In our experience, this
|
||||||
to be a problem in practice since most applications do not have demand
|
does not seem to be a problem in practice since most applications do
|
||||||
for indexing on many argument combinations. In applications where it
|
not have demand for indexing on many argument combinations. In
|
||||||
does become a problem or when running in an environment with limited
|
applications where it does become a problem or when running in an
|
||||||
memory, we can easily put a bound on the size of index tables, either
|
environment with limited memory, we can easily put a bound on the size
|
||||||
globally or for each predicate separately. For example, the \jitiSTAR
|
of index tables, either globally or for each predicate separately. For
|
||||||
instructions can either become inactive when this limit is reached, or
|
example, the \jitiSTAR instructions can either become inactive when
|
||||||
better yet we can recover the space of some tables. To do so, we can
|
this limit is reached, or better yet we can recover the space of some
|
||||||
employ any standard recycling algorithm (e.g., least recently used)
|
tables. To do so, we can employ any standard recycling algorithm
|
||||||
and reclaim the of index tables that are no longer in use. This is
|
(e.g., least recently used) and reclaim the memory of index tables
|
||||||
easy to do by reverting the corresponding \switchSTAR instructions
|
that are no longer in use. This is easy to do by reverting the
|
||||||
back to \jitiSTAR instructions. If the indices are demanded again at a
|
corresponding \switchSTAR instructions back to \jitiSTAR instructions.
|
||||||
time when memory is available, they can simply be regenerated.
|
If the indices are demanded again at a time when memory is available,
|
||||||
|
they can simply be regenerated.
|
||||||
|
|
||||||
|
|
||||||
\section{Demand-Driven Indexing of Dynamic Predicates} \label{sec:dynamic}
|
\section{Demand-Driven Indexing of Dynamic Predicates} \label{sec:dynamic}
|
||||||
@ -916,14 +917,14 @@ If several calls are alive in the stack, several snapshots will be
|
|||||||
alive at the same time. The standard solution to this problem is to
|
alive at the same time. The standard solution to this problem is to
|
||||||
use time stamps to tell which clauses are \emph{live} for which calls.
|
use time stamps to tell which clauses are \emph{live} for which calls.
|
||||||
%
|
%
|
||||||
This solution complicates freeing index tables because (1) an index
|
This solution complicates freeing index tables because: (1) an index
|
||||||
table holds references to clauses, and (2) the table may be in use,
|
table holds references to clauses, and (2) the table may be in use,
|
||||||
that is, it may be accessible from the execution stacks. An index
|
that is, it may be accessible from the execution stacks. An index
|
||||||
table thus is killed in several steps:
|
table thus is killed in several steps:
|
||||||
\begin{enumerate}
|
\begin{enumerate}
|
||||||
\item Detach the index table from the indexing tree.
|
\item Detach the index table from the indexing tree.
|
||||||
\item Recursively \emph{kill} every child of the current table:
|
\item Recursively \emph{kill} every child of the current table:
|
||||||
if the current table is killed, so will be its children.
|
if the current table is killed, so are its children.
|
||||||
\item Wait until the table is not in use, that is, it is not pointed
|
\item Wait until the table is not in use, that is, it is not pointed
|
||||||
to by someone.
|
to by someone.
|
||||||
\item Walk the table and release any references it may hold.
|
\item Walk the table and release any references it may hold.
|
||||||
@ -954,7 +955,7 @@ inside compound terms. The user can then use the appropriate compiler
|
|||||||
directive for these predicates.} For dynamic predicates, \JITI is
|
directive for these predicates.} For dynamic predicates, \JITI is
|
||||||
employed only if they consist of Datalog facts; if a clause which is
|
employed only if they consist of Datalog facts; if a clause which is
|
||||||
not a Datalog fact is asserted, all dynamically created index tables
|
not a Datalog fact is asserted, all dynamically created index tables
|
||||||
for the predicate are simply killed and the \jitiONconstant
|
for the predicate are simply removed and the \jitiONconstant
|
||||||
instruction becomes a \instr{noop}. All this is done automatically,
|
instruction becomes a \instr{noop}. All this is done automatically,
|
||||||
but the user can disable \JITI in compiled code using an appropriate
|
but the user can disable \JITI in compiled code using an appropriate
|
||||||
compiler option.
|
compiler option.
|
||||||
@ -971,7 +972,7 @@ relations: in such cases YAP will maintain a list of matching clauses
|
|||||||
at each \jitiSTAR node. Indexing dynamic predicates in YAP follows
|
at each \jitiSTAR node. Indexing dynamic predicates in YAP follows
|
||||||
very much the same algorithm as static indexing: the key idea is that
|
very much the same algorithm as static indexing: the key idea is that
|
||||||
most nodes in the index tree must be allocated separately so that they
|
most nodes in the index tree must be allocated separately so that they
|
||||||
can grow or contract independently. YAP can index arguments where some
|
can grow or shrink independently. YAP can index arguments where some
|
||||||
clauses have unconstrained variables, but only for static predicates,
|
clauses have unconstrained variables, but only for static predicates,
|
||||||
as in dynamic code this would complicate support for logical update
|
as in dynamic code this would complicate support for logical update
|
||||||
semantics.
|
semantics.
|
||||||
@ -982,7 +983,7 @@ convenient abbreviation.
|
|||||||
|
|
||||||
\section{Performance Evaluation} \label{sec:perf}
|
\section{Performance Evaluation} \label{sec:perf}
|
||||||
%================================================
|
%================================================
|
||||||
We evaluate \JITI on a set of benchmarks and LP applications.
|
We evaluate \JITI on a set of benchmarks and applications.
|
||||||
Throughout, we compare performance of JITI with first argument
|
Throughout, we compare performance of JITI with first argument
|
||||||
indexing. For the benchmarks of Sect.~\ref{sec:perf:ineffective}
|
indexing. For the benchmarks of Sect.~\ref{sec:perf:ineffective}
|
||||||
and~\ref{sec:perf:effective} which involve both systems, we used a
|
and~\ref{sec:perf:effective} which involve both systems, we used a
|
||||||
@ -1005,9 +1006,9 @@ As both systems support tabling, we decided to use tabling benchmarks
|
|||||||
because they are small and easy to understand, and because they are a
|
because they are small and easy to understand, and because they are a
|
||||||
bad case for JITI in the following sense: tabling avoids generating
|
bad case for JITI in the following sense: tabling avoids generating
|
||||||
repetitive queries and the benchmarks operate over extensional
|
repetitive queries and the benchmarks operate over extensional
|
||||||
database (EDB) predicates of size approximately equal the size of the
|
database (EDB) predicates of size approximately equal to the size of
|
||||||
program. We used \compress, a tabled program that solves a puzzle from
|
the program. We used \compress, a tabled program that solves a puzzle
|
||||||
an ICLP Prolog programming competition. The other benchmarks are
|
from an ICLP Prolog programming competition. The other benchmarks are
|
||||||
different variants of tabled left, right and doubly recursive
|
different variants of tabled left, right and doubly recursive
|
||||||
transitive closure over an EDB predicate forming a chain of size shown
|
transitive closure over an EDB predicate forming a chain of size shown
|
||||||
in Table~\ref{tab:ineffective} in parentheses. For each variant of
|
in Table~\ref{tab:ineffective} in parentheses. For each variant of
|
||||||
@ -1253,7 +1254,7 @@ memory usage should be at or close to the maximum. These applications
|
|||||||
use a mixture of static and dynamic predicates and we show their
|
use a mixture of static and dynamic predicates and we show their
|
||||||
memory usage separately. On static predicates, memory usage varies
|
memory usage separately. On static predicates, memory usage varies
|
||||||
widely, from only 10\% to the worst case, \Carcino, where the index
|
widely, from only 10\% to the worst case, \Carcino, where the index
|
||||||
tree takes more space than the original program. Hash tables dominate
|
tables take more space than the original program. Hash tables dominate
|
||||||
usage in \IEProtein and \Susi, whereas \TryRetryTrust chains dominate
|
usage in \IEProtein and \Susi, whereas \TryRetryTrust chains dominate
|
||||||
in \BreastCancer. In most other cases no single component dominates
|
in \BreastCancer. In most other cases no single component dominates
|
||||||
memory usage. Memory usage for dynamic data is shown in the last two
|
memory usage. Memory usage for dynamic data is shown in the last two
|
||||||
@ -1289,8 +1290,8 @@ As presented, \JITI is a hybrid technique: index generation occurs
|
|||||||
during runtime but is partly guided by the compiler, because we want
|
during runtime but is partly guided by the compiler, because we want
|
||||||
to combine it with compile-time WAM-style indexing. More flexible
|
to combine it with compile-time WAM-style indexing. More flexible
|
||||||
schemes are of course possible. For example, index generation can be
|
schemes are of course possible. For example, index generation can be
|
||||||
fully dynamic (as in YAP), combined with user declarations, or use
|
fully dynamic (as in YAP), combined with user declarations, or driven
|
||||||
static analysis to be even more selective or go beyond fixed-order
|
by static analysis to be even more selective or go beyond fixed-order
|
||||||
indexing.
|
indexing.
|
||||||
%
|
%
|
||||||
Last, observe that \JITI fully respects Prolog semantics. Better
|
Last, observe that \JITI fully respects Prolog semantics. Better
|
||||||
|
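As a reading aid for the hunks above, here is a minimal Python sketch of the demand-driven idea the changed text describes: an open call builds no index, the first call with a bound argument builds an index table for that argument, and a bound on the number of tables is enforced by recycling the least recently used one, which is simply regenerated if it is demanded again. This is not YAP's implementation and does not model WAM instructions. The names `Predicate`, `call`, `index_tables`, `max_tables` and `builds` are hypothetical, facts are modeled as plain tuples, and indexing on the first bound argument is a simplifying assumption.

```python
from collections import OrderedDict, defaultdict


class Predicate:
    """Toy model of a predicate made of Datalog facts (tuples of constants)."""

    def __init__(self, facts, max_tables=None):
        self.facts = list(facts)
        self.max_tables = max_tables        # hypothetical bound on live index tables
        self.index_tables = OrderedDict()   # argument position -> hash table, LRU order
        self.builds = 0                     # number of times a table was (re)generated

    def _table_for(self, pos):
        """Return the index table for argument `pos`, building it on first demand."""
        if pos in self.index_tables:
            self.index_tables.move_to_end(pos)     # mark as most recently used
            return self.index_tables[pos]
        table = defaultdict(list)
        for clause_no, fact in enumerate(self.facts):
            table[fact[pos]].append(clause_no)     # bucket: clauses with this constant
        self.index_tables[pos] = table
        self.builds += 1
        if self.max_tables is not None and len(self.index_tables) > self.max_tables:
            self.index_tables.popitem(last=False)  # recycle the least recently used table
        return table

    def call(self, *args):
        """Match a call pattern; None stands for an unbound (free) argument."""
        bound = [i for i, a in enumerate(args) if a is not None]
        if not bound:
            candidates = range(len(self.facts))    # open call: no index is built
        else:
            pos = bound[0]                         # assumption: index on first bound argument
            candidates = self._table_for(pos).get(args[pos], [])
        # Clause order is preserved, as a sequential try/retry/trust chain would.
        return [self.facts[c] for c in candidates
                if all(self.facts[c][i] == args[i] for i in bound)]
```

With `edge = Predicate([(1, 2), (2, 3), (3, 4), (2, 5)], max_tables=1)`, the call `edge.call(None, None)` builds nothing, `edge.call(2, None)` builds the table for the first argument, and `edge.call(None, 4)` builds the table for the second argument and recycles the first one.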