Added introduction.
git-svn-id: https://yap.svn.sf.net/svnroot/yap/trunk@1814 b08c6af1-5177-4d33-ba66-4b1c6b8b522a
parent
9f4dc198ba
commit
52c4cfb18f
@ -95,12 +95,55 @@
|
|||||||
|
|
||||||
\section{Introduction}
|
\section{Introduction}
|
||||||
%=====================
|
%=====================
|
||||||
The WAM~\cite{Warren83}
|
The WAM~\cite{Warren83} has been both a blessing and a curse for
|
||||||
|
Prolog systems. Its ingenious design has allowed implementors to get
|
||||||
|
byte code compilers with decent performance --- it is not a fluke that
|
||||||
|
most Prolog systems are still based on the WAM. On the other hand,
|
||||||
|
\emph{because} the WAM gives good performance in many cases,
|
||||||
|
implementors have felt reluctant to explore alternatives that
|
||||||
|
drastically depart from its basic philosophy.
|
||||||
|
%
|
||||||
|
For example, first argument indexing makes sense for many Prolog
|
||||||
|
applications. For applications accessing large databases, though, it is
|
||||||
|
clearly sub-optimal; for a long time now, the database community has
|
||||||
|
recognized that good indexing mechanisms are the basis for fast query
|
||||||
|
processing.
|
||||||
|
|
||||||
%% The slogan ``first argument indexing is all you need'' makes sense for
|
As logic programming applications grow in size, Prolog systems need to
|
||||||
%% many Prolog applications. For applications accessing large databases
|
efficiently access larger and larger data sets and the need for any-
|
||||||
%% though is clearly false; for long time now, the database community has
|
and multi-argument indexing becomes more and more profound. Static
|
||||||
%% realized that indexing mechanisms are essential for fast query processing.
|
generation of multi-argument indexing is one alternative. However,
|
||||||
|
this alternative is often unattractive because it may drastically
|
||||||
|
increase the size of the generated byte code unnecessarily. Static
|
||||||
|
analysis techniques can partly address this concern, but in
|
||||||
|
applications that rely on features which are inherently dynamic (e.g.,
|
||||||
|
generating hypotheses for inductive logic programming data sets during
|
||||||
|
runtime) they are inapplicable or grossly inaccurate. Another
|
||||||
|
alternative, which has not been investigated so far, is to do flexible
|
||||||
|
indexing on demand during program execution.
|
||||||
|
|
||||||
|
This is precisely what we advocate in this paper. More specifically,
|
||||||
|
we present a minimal extension to the WAM that allows for flexible
|
||||||
|
indexing of Prolog clauses during runtime based on actual demand. For
|
||||||
|
static predicates, the scheme we propose is partly guided by the
|
||||||
|
compiler; for dynamic code, besides being demand-driven by queries,
|
||||||
|
the method needs to cater for code updates during runtime. In our
|
||||||
|
experience these schemes pay off. We have implemented \JITI in two
|
||||||
|
different Prolog systems (Yap and XXX) and have obtained non-trivial
|
||||||
|
speedups, ranging from a few percent to orders of magnitude, across a
|
||||||
|
wide range of applications. Given these results, we see very little
|
||||||
|
reason for Prolog systems not to incorporate some form of indexing
|
||||||
|
based on actual demand from queries. In fact, we see \JITI as only the
|
||||||
|
first step towards effective runtime optimization of Prolog programs.
|
||||||
|
|
||||||
|
This paper is structured as follows. After commenting on the state of
|
||||||
|
the art and related work concerning indexing in Prolog systems
|
||||||
|
(Sect.~\ref{sec:related}), we briefly review indexing in the WAM
|
||||||
|
(Sect.~\ref{sec:prelims}). We then present \JITI schemes for static
|
||||||
|
(Sect.~\ref{sec:static}) and dynamic (Sect.~\ref{sec:dynamic})
|
||||||
|
predicates, and discuss their implementation in two Prolog systems and
|
||||||
|
the performance benefits they bring (Sect.~\ref{sec:perf}). The paper
|
||||||
|
ends with some concluding remarks.
|
||||||
|
|
||||||
|
|
||||||
\section{State of the Art and Related Work} \label{sec:related}
|
\section{State of the Art and Related Work} \label{sec:related}
|
||||||
@ -180,7 +223,7 @@ runtime as new hypotheses are formed or refined.
|
|||||||
%
|
%
|
||||||
Our thesis is that the Prolog abstract machine should be able to adapt
|
Our thesis is that the Prolog abstract machine should be able to adapt
|
||||||
automatically to the runtime requirements of such or, even better, of
|
automatically to the runtime requirements of such or, even better, of
|
||||||
all applications by employing increasingly agressive forms of dynamic
|
all applications by employing increasingly aggressive forms of dynamic
|
||||||
compilation. As a concrete example of what this means in practice, in
|
compilation. As a concrete example of what this means in practice, in
|
||||||
this paper we will attack the problem of providing effective indexing
|
this paper we will attack the problem of providing effective indexing
|
||||||
during runtime. Naturally, we will base our technique on the existing
|
during runtime. Naturally, we will base our technique on the existing
|
||||||
@ -206,12 +249,12 @@ for being atomic, one for (non-empty) list, and one for structure. In
|
|||||||
any case, control goes to a (possibly empty) bucket of clauses. In the
|
any case, control goes to a (possibly empty) bucket of clauses. In the
|
||||||
buckets for constants and structures the second level of dispatching
|
buckets for constants and structures the second level of dispatching
|
||||||
involves the value of the register. The \switchONconstant and
|
involves the value of the register. The \switchONconstant and
|
||||||
\switchONstructure instructions implement this dispatching, typically
|
\switchONstructure instructions implement this dispatching: typically
|
||||||
with a \fail instruction when the bucket is empty, with a \jump
|
with a \fail instruction when the bucket is empty, with a \jump
|
||||||
instruction for only one clause, with a sequential scan when the
|
instruction for only one clause, with a sequential scan when the
|
||||||
number of clauses is small, and with a hash lookup when the number of
|
number of clauses is small, and with a hash lookup when the number of
|
||||||
clauses exceeds a threshold. For this reason the \switchONconstant and
|
clauses exceeds a threshold. For this reason the \switchONconstant and
|
||||||
\switchONstructure instructions take as arguments a hash table
|
\switchONstructure instructions take as arguments the hash table
|
||||||
\instr{T} and the number of clauses \instr{N} the table contains (or
|
\instr{T} and the number of clauses \instr{N} the table contains (or
|
||||||
equivalently, \instr{N} is the size of the hash table). In each bucket
|
equivalently, \instr{N} is the size of the hash table). In each bucket
|
||||||
of this hash table and also in the bucket for the variable case of
|
of this hash table and also in the bucket for the variable case of
|
||||||
@ -222,7 +265,7 @@ update certain fields of this choice point, and the \trust instruction
|
|||||||
removes it.
|
removes it.
|
||||||
|
|
||||||
The WAM has additional indexing instructions (\instr{try\_me\_else}
|
The WAM has additional indexing instructions (\instr{try\_me\_else}
|
||||||
and friends) that allow indexing to be intersperced with the code of
|
and friends) that allow indexing to be interspersed with the code of
|
||||||
clauses. For simplicity we will not consider them here. This is not a
|
clauses. For simplicity we will not consider them here. This is not a
|
||||||
problem since the above scheme handles all cases. Also, we will feel
|
problem since the above scheme handles all cases. Also, we will feel
|
||||||
free to do some minor modifications and optimizations when this
|
free to do some minor modifications and optimizations when this
|
||||||
@ -232,8 +275,8 @@ We present an example. Consider the Prolog code shown in
|
|||||||
Fig.~\ref{fig:carc:facts}. It is a fragment of the well-known machine
|
Fig.~\ref{fig:carc:facts}. It is a fragment of the well-known machine
|
||||||
learning dataset \textit{Carcinogenesis}~\cite{Carcinogenesis@ILP-97}.
|
learning dataset \textit{Carcinogenesis}~\cite{Carcinogenesis@ILP-97}.
|
||||||
The five clauses get compiled to the WAM code shown in
|
The five clauses get compiled to the WAM code shown in
|
||||||
Fig.~\ref{fig:carc:clauses}. With only first argument indexing, the
|
Fig.~\ref{fig:carc:clauses}. The first argument indexing code
|
||||||
indexing code that a Prolog compiler generates is shown in
|
that a Prolog compiler generates is shown in
|
||||||
Fig.~\ref{fig:carc:index}. This code is typically placed before the
|
Fig.~\ref{fig:carc:index}. This code is typically placed before the
|
||||||
code for the clauses and the \switchONconstant instruction is the
|
code for the clauses and the \switchONconstant instruction is the
|
||||||
entry point of the predicate. Note that compared with vanilla WAM this
|
entry point of the predicate. Note that compared with vanilla WAM this
|
||||||
@ -772,12 +815,12 @@ $O(1)$ where $n$ is the number of clauses.
|
|||||||
The observant reader has no doubt noticed that
|
The observant reader has no doubt noticed that
|
||||||
Algorithm~\ref{alg:construction} provides multi-argument indexing but
|
Algorithm~\ref{alg:construction} provides multi-argument indexing but
|
||||||
only for the main functor symbol of arguments. For clauses with
|
only for the main functor symbol of arguments. For clauses with
|
||||||
compound terms that require indexing in their subterms we can either
|
compound terms that require indexing in their sub-terms we can either
|
||||||
employ a program transformation like \emph{unification
|
employ a program transformation like \emph{unification
|
||||||
factoring}~\cite{UnifFact@POPL-95} at compile time or modify the
|
factoring}~\cite{UnifFact@POPL-95} at compile time or modify the
|
||||||
algorithm to consider index positions inside compound terms. This is
|
algorithm to consider index positions inside compound terms. This is
|
||||||
relatively easy to do but requires support from the register allocator
|
relatively easy to do but requires support from the register allocator
|
||||||
(passing the subterms of compound terms in appropriate argument
|
(passing the sub-terms of compound terms in appropriate argument
|
||||||
registers) and/or a new set of instructions. Due to space limitations
|
registers) and/or a new set of instructions. Due to space limitations
|
||||||
we omit further details.
|
we omit further details.
|
||||||
|
|
||||||
|