Revised up to Section 7.

git-svn-id: https://yap.svn.sf.net/svnroot/yap/trunk@1831 b08c6af1-5177-4d33-ba66-4b1c6b8b522a
2007-03-11 19:28:35 +00:00 · 2007-03-11 19:28:35 +00:00 · 9ed8306415
commit 9ed8306415
parent 7afc0fdd07
1 changed files with 129 additions and 113 deletions
--- a/docs/index/iclp07.tex
+++ b/docs/index/iclp07.tex
@@ -48,6 +48,19 @@
 \newcommand{\pta}{\bench{pta}\xspace}
 \newcommand{\tea}{\bench{tea}\xspace}
 %------------------------------------------------------------------------------
+\newcommand{\BreastCancer}{\bench{BreastCancer}\xspace}
+\newcommand{\Carcinogenesis}{\bench{Carcinogenesis}\xspace}
+\newcommand{\Choline}{\bench{Choline}\xspace}
+\newcommand{\GeneExpression}{\bench{GeneExpression}\xspace}
+\newcommand{\IEProtein}{\bench{IE-Protein\_Extraction}\xspace}
+\newcommand{\Krki}{\bench{Krki}\xspace}
+\newcommand{\KrkiII}{\bench{Krki~II}\xspace}
+\newcommand{\Mesh}{\bench{Mesh}\xspace}
+\newcommand{\Mutagenesis}{\bench{Mutagenesis}\xspace}
+\newcommand{\Pyrimidines}{\bench{Pyrimidines}\xspace}
+\newcommand{\Susi}{\bench{Susi}\xspace}
+\newcommand{\Thermolysin}{\bench{Thermolysin}\xspace}
+%------------------------------------------------------------------------------
 \newenvironment{SmallProg}{\begin{tt}\begin{small}\begin{tabular}[b]{l}}{\end{tabular}\end{small}\end{tt}}
 \newenvironment{ScriptProg}{\begin{tt}\begin{scriptsize}\begin{tabular}[b]{l}}{\end{tabular}\end{scriptsize}\end{tt}}
 \newenvironment{FootProg}{\begin{tt}\begin{footnotesize}\begin{tabular}[c]{l}}{\end{tabular}\end{footnotesize}\end{tt}}
@@ -120,7 +133,7 @@ For example, first argument indexing is sufficient for many Prolog
 applications. However, it is clearly sub-optimal for applications
 accessing large databases; for a long time now, the database community
 has recognized that good indexing is the basis for fast query
-processing~\cite{}.
+processing.

 As logic programming applications grow in size, Prolog systems need to
 efficiently access larger and larger data sets and the need for any-
@@ -144,7 +157,7 @@ the method needs to cater for code updates during runtime. Where our
 schemes radically depart from current practice is that they generate
 new byte code during runtime, in effect doing a form of just-in-time
 compilation. In our experience these schemes pay off. We have
-implemented \JITI in two different Prolog systems (Yap and XXX) and
+implemented \JITI in two different Prolog systems (YAP and XXX) and
 have obtained non-trivial speedups, ranging from a few percent to
 orders of magnitude, across a wide range of applications. Given these
 results, we see very little reason for Prolog systems not to
@@ -226,14 +239,14 @@ systems currently do not provide the type of indexing that
 applications require. Even in systems like Ciao~\cite{Ciao@SCP-05},
 which do come with built-in static analysis and more or less force
 such a discipline on the programmer, mode information is not used for
-multi-argument indexing!
+multi-argument indexing.

 % The grand finale:
 The situation is actually worse for certain types of Prolog
 applications. For example, consider applications in the area of
 inductive logic programming. These applications on the one hand have
-big demands for effective indexing since they need to efficiently
-access big datasets and on the other they are very unfit for static
+high demands for effective indexing since they need to efficiently
+access big datasets and on the other they are unfit for static
 analysis since queries are often ad hoc and generated only during
 runtime as new hypotheses are formed or refined.
 %
@@ -241,11 +254,11 @@ Our thesis is that the Prolog abstract machine should be able to adapt
 automatically to the runtime requirements of such or, even better, of
 all applications by employing increasingly aggressive forms of dynamic
 compilation. As a concrete example of what this means in practice, in
-this paper we will attack the problem of providing effective indexing
-during runtime. Naturally, we will base our technique on the existing
-support for indexing that the WAM provides, but we will extend this
-support with the technique of \JITI that we describe in the next
-sections.
+this paper we will attack the problem of satisfying the indexing needs
+of applications during runtime. Naturally, we will base our technique
+on the existing support for indexing that the WAM provides, but we
+will extend this support with the technique of \JITI that we describe
+in the next sections.


 \section{Indexing in the WAM} \label{sec:prelims}
@@ -271,7 +284,7 @@ equivalently, \instr{N} is the size of the hash table). In each bucket
 of this hash table and also in the bucket for the variable case of
 \switchONterm the code performs a sequential backtracking search of
 the clauses using a \TryRetryTrust chain of instructions. The \try
-instruction sets up a choice point, the \retry instructions (if any)
+instruction sets up a choice point, the \retry instructions (if~any)
 update certain fields of this choice point, and the \trust instruction
 removes it.

@@ -529,13 +542,14 @@ heuristically decide that some arguments are more likely than others
 to be used in the \code{in} mode. Then we can simply place the
 \jitiONconstant instructions for these arguments \emph{before} the
 instructions for other arguments. This is possible since all indexing
-instructions take the argument register number as an argument.
+instructions take the argument register number as an argument; their
+order does not matter.

 \subsection{From any argument indexing to multi-argument indexing}
 %-----------------------------------------------------------------
 The scheme of the previous section gives us only single argument
 indexing. However, all the infrastructure we need is already in place.
-We can use it to obtain (fixed-order) multi-argument \JITI in a
+We can use it to obtain any fixed-order multi-argument \JITI in a
 straightforward way.

 Note that the compiler knows exactly the set of clauses that need to
@@ -650,7 +664,7 @@ requires the following extensions:
  indexing will be based. Writing such a code walking procedure is not
  hard.\footnote{In many Prolog systems, a procedure with similar
  functionality often exists for the disassembler, the debugger, etc.}
-\item Indexing on an argument that contains unconstrained variables
+\item Indexing on a position that contains unconstrained variables
  for some clauses is tricky. The WAM needs to group clauses in this
  case and without special treatment creates two choice points for
  this argument (one for the variables and one per each group of
@@ -658,7 +672,7 @@ requires the following extensions:
  by now. Possible solutions to it are described in a 1987 paper by
  Carlsson~\cite{FreezeIndexing@ICLP-87} and can be readily adapted to
  \JITI. Alternatively, in a simple implementation, we can skip \JITI
-  for arguments with variables in some clauses.
+  for positions with variables in some clauses.
 \end{enumerate}
 Before describing \JITI more formally, we remark on the following
 design decisions whose rationale may not be immediately obvious:
@@ -800,26 +814,25 @@ to a \switchSTAR WAM instruction.
 %-------------------------------------------------------------------------

 \paragraph*{Complexity properties.}
-Complexity-wise, dynamic index construction does not add any overhead
-to program execution. First, note that each demanded index table will
-be constructed at most once. Also, a \jitiSTAR instruction will be
+Index construction during runtime does not change the complexity of
+query execution. First, note that each demanded index table will be
+constructed at most once. Also, a \jitiSTAR instruction will be
 encountered only in cases where execution would examine all clauses in
 the \TryRetryTrust chain.\footnote{This statement is possibly not
 valid in the presence of Prolog cuts.} The construction visits these
 clauses \emph{once} and then creates the index table in time linear in
-the number of clauses. One pass over the list of $\langle c, L
+the number of clauses as one pass over the list of $\langle c, L
 \rangle$ pairs suffices. After index construction, execution will
-visit only a subset of these clauses as the index table will be
-consulted.
+visit a subset of these clauses as the index table will be consulted.
 %% Finally, note that the maximum number of \jitiSTAR instructions
 %% that will be visited for each query is bounded by the maximum
 %% number of index positions (symbols) in the clause heads of the
 %% predicate.
 Thus, in cases where \JITI is not effective, the execution time of a query will
 at most double due to dynamic index construction. In fact, this worst
-case is extremely unlikely in practice. On the other hand, \JITI can
-change the complexity of evaluating a predicate call from $O(n)$ to
-$O(1)$ where $n$ is the number of clauses.
+case is pessimistic and extremely unlikely in practice. On the other
+hand, \JITI can change the complexity of query evaluation from $O(n)$
+to $O(1)$ where $n$ is the number of clauses.

 \subsection{More implementation choices}
 %---------------------------------------
@@ -857,9 +870,9 @@ instructions can either become inactive when this limit is reached, or
 better yet we can recover the space of some tables. To do so, we can
 employ any standard recycling algorithm (e.g., least recently used)
 and reclaim the space of index tables that are no longer in use. This is
-easy to do by reverting the corresponding \jitiSTAR instructions back
-to \switchSTAR instructions. If the indices are needed again, they can
-simply be regenerated.
+easy to do by reverting the corresponding \switchSTAR instructions
+back to \jitiSTAR instructions. If the indices are demanded again at a
+time when memory is available, they can simply be regenerated.


 \section{Demand-Driven Indexing of Dynamic Predicates} \label{sec:dynamic}
@@ -893,9 +906,9 @@ arguments. As optimizations, we can avoid indexing for predicates with
 only one clause (these are often used to simulate global variables)
 and we can exclude arguments where some clause has a variable.

-Under logical update semantics calls to a dynamic goal execute in a
+Under logical update semantics, calls to dynamic predicates execute in a
 ``snapshot'' of the corresponding predicate. In other words, each call
-sees the clauses that existed at the time the call was made, even if
+sees the clauses that existed at the time when the call was made, even if
 some of the clauses were later deleted or new clauses were asserted.
 If several calls are alive in the stack, several snapshots will be
 alive at the same time. The standard solution to this problem is to
@@ -903,8 +916,8 @@ use time stamps to tell which clauses are \emph{live} for which calls.
 %
 This solution complicates freeing index tables because (1) an index
 table holds references to clauses, and (2) the table may be in use,
-that is, it may be accesible from the execution stacks. A table thus
-is killed in several steps:
+that is, it may be accessible from the execution stacks. An index
+table thus is killed in several steps:
 \begin{enumerate}
 \item Detach the index table from the indexing tree.
 \item Recursively \emph{kill} every child of the current table:
@@ -920,6 +933,7 @@ is killed in several steps:
 %% the \emph{itemset-node}, so the emulator reads all the instruction's
 %% arguments before executing the instruction.

+
 \section{Implementation in XXX and in YAP} \label{sec:impl}
 %==========================================================
 The implementation of \JITI in XXX follows a variant of the scheme
@@ -927,7 +941,7 @@ presented in Sect.~\ref{sec:static}. The compiler uses heuristics to
 determine the best argument to index on (i.e., this argument is not
 necessarily the first) and employs \switchSTAR instructions for this
 task. It also statically generates \jitiONconstant instructions for
-other argument positions that are good candidates for \JITI.
+other arguments that are good candidates for \JITI.
 Currently, an argument is considered a good candidate if it has only
 constants or only structure symbols in all clauses. Thus, XXX uses
 only \jitiONconstant and \jitiONstructure instructions, never a
@@ -935,11 +949,11 @@ only \jitiONconstant and \jitiONstructure instructions, never a
 symbols.\footnote{Instead, it prompts its user to request unification
 factoring for predicates that look likely to benefit from indexing
 inside compound terms. The user can then use the appropriate compiler
-directive for these predicates.} For dynamic predicates \JITI is
+directive for these predicates.} For dynamic predicates, \JITI is
 employed only if they consist of Datalog facts; if a clause which is
 not a Datalog fact is asserted, all dynamically created index tables
-for the predicate are simply dropped and the \jitiONconstant
-instruction becomes a \instr{noop}. All these are done automatically,
+for the predicate are simply killed and the \jitiONconstant
+instruction becomes a \instr{noop}. All this is done automatically,
 but the user can disable \JITI in compiled code using an appropriate
 compiler option.

@@ -957,7 +971,8 @@ very much the same algorithm as static indexing: the key idea is that
 most nodes in the index tree must be allocated separately so that they
 can grow or contract independently. YAP can index arguments where some
 clauses have unconstrained variables, but only for static predicates,
-as it would complicate updates.
+as in dynamic code this would complicate support for logical update
+semantics.

 YAP uses the term JITI (Just-In-Time Indexing) to refer to \JITI. In
 the next section we will take the liberty to use this term as a
@@ -1099,40 +1114,39 @@ this benchmark.
  \end{verbatim}
 \end{small}

-% Our experience with the indexing algorithm described here shows a
-% significant performance improvement over the previous indexing code in
-% our system. Quite often, this has allowed us to tackle applications
-% which previously would not have been feasible. We next present some
-% results that show how useful the algorithms can be.
+%% Our experience with the indexing algorithm described here shows a
+%% significant performance improvement over the previous indexing code in
+%% our system. Quite often, this has allowed us to tackle applications
+%% which previously would not have been feasible.

 \subsection{Performance of \JITI on ILP applications} \label{sec:perf:ILP}
 %-------------------------------------------------------------------------
 The need for \JITI was originally motivated by ILP applications.
 Table~\ref{tab:ilp:time} shows JITI performance on some learning tasks
-using the ALEPH system~\cite{ALEPH}. The dataset \bench{Krki} tries to
+using the ALEPH system~\cite{ALEPH}. The dataset \Krki tries to
 learn rules from a small database of chess end-games;
-\bench{GeneExpression} learns rules for yeast gene activity given a
+\GeneExpression learns rules for yeast gene activity given a
 database of genes, their interactions, and micro-array gene expression
-data; \bench{BreastCancer} processes real-life patient reports towards
+data; \BreastCancer processes real-life patient reports towards
 predicting whether an abnormality may be malignant;
-\bench{IE-Protein\_Extraction} processes information extraction from
-paper abstracts to search proteins; \bench{Susi} learns from shopping
-patterns; and \bench{Mesh} learns rules for finite-methods mesh
-design. The datasets \bench{Carcinogenesis}, \bench{Choline},
-\bench{Mutagenesis}, \bench{Pyrimidines}, and \bench{Thermolysin} are
-about predicting chemical properties of compounds. The first three
+\IEProtein processes information extraction from
+paper abstracts to search proteins; \Susi learns from shopping
+patterns; and \Mesh learns rules for finite-methods mesh
+design. The datasets \Carcinogenesis, \Choline,
+\Mutagenesis, \Pyrimidines, and \Thermolysin try to
+predict chemical properties of compounds. The first three
 datasets store properties of interest as tables, but
-\bench{Thermolysin} learns from the 3D-structure of a molecule's
-conformations. Several of these datasets are standard across Machine
-Learning literature. \bench{GeneExpression}~\cite{} and
-\bench{BreastCancer}~\cite{} were partly developed by some of the
+\Thermolysin learns from the 3D-structure of a molecule's
+conformations. Several of these datasets are standard across the Machine
+Learning literature. \GeneExpression~\cite{} and
+\BreastCancer~\cite{} were partly developed by some of the
 paper's authors. Most datasets perform simple queries in an
-extensional database. The exception is \bench{Mutagenesis} where
+extensional database. The exception is \Mutagenesis where
 several predicates are defined intensionally, requiring extensive
 computation.

 %------------------------------------------------------------------------------
-\begin{table}[ht]
+\begin{table}[t]
  \centering
  \caption{Machine Learning (ILP) Datasets: Times are given in Seconds,
    we give time for standard indexing with no indexing on dynamic
@@ -1144,18 +1158,18 @@ computation.
    \cline{2-4}
    Benchmark       &    1st    &   JITI  &{\bf ratio} \\
    \hline
-    \bench{BreastCancer}           &      1450 &      88 &  16    \\
-    \bench{Carcinogenesis}         &    17,705 &     192 &  92    \\
-    \bench{Choline}                &    14,766 &   1,397 &  11    \\
-    \bench{GeneExpression}         &   193,283 &   7,483 &  26    \\
-    \bench{IE-Protein\_Extraction} & 1,677,146 &   2,909 & 577    \\
+    \BreastCancer   &     1,450 &      88 &  16    \\
+    \Carcinogenesis &    17,705 &     192 &  92    \\
+    \Choline        &    14,766 &   1,397 &  11    \\
+    \GeneExpression &   193,283 &   7,483 &  26    \\
+    \IEProtein      & 1,677,146 &   2,909 & 577    \\
    \bench{Krki}                   &       0.3 &     0.3 &   1    \\
    \bench{Krki II}                &       1.3 &     1.3 &   1    \\
-    \bench{Mesh}                   &         4 &       3 &   1.3  \\
+    \Mesh           &         4 &       3 &   1.3  \\
    \bench{Mutagenesis}            &    51,775 &  27,746 &   1.9  \\
-    \bench{Pyrimidines}            &   487,545 & 253,235 &   1.9  \\
-    \bench{Susi}                   &   105,091 &     307 & 342    \\
-    \bench{Thermolysin}            &    50,279 &   5,213 &  10    \\
+    \Pyrimidines    &   487,545 & 253,235 &   1.9  \\
+    \Susi           &   105,091 &     307 & 342    \\
+    \Thermolysin    &    50,279 &   5,213 &  10    \\
    \hline
 \end{tabular}
 \end{table}
@@ -1163,30 +1177,30 @@ computation.

 We compare times for 10 runs of the saturation/refinement cycle of the
 ILP system. Table~\ref{tab:ilp:time} shows time results. The
-\bench{Krki} datasets have small search spaces and small databases, so
+\Krki datasets have small search spaces and small databases, so
 they achieve the same performance under both versions:
-there is no slowdown. The \bench{Mesh}, \bench{Mutagenesis}, and
-\bench{Pyrimides} applications do not benefit much from indexing in
+there is no slowdown. The \Mesh, \Mutagenesis, and
+\Pyrimidines applications do not benefit much from indexing in
 the database, but they do benefit from indexing in the dynamic
 representation of the search space, as their running times halve.

-The \bench{BreastCancer} and \bench{GeneExpression} applications use
-1NF data (that is, unstructured data). The benefit here is mostly from
-multiple-argument indexing.  \bench{BreastCancer} is particularly
+The \BreastCancer and \GeneExpression applications use data in 
+1NF (that is, unstructured data). The benefit here is mostly from
+multiple-argument indexing. \BreastCancer is particularly
 interesting. It consists of 40 binary relations with 65k elements
-each, where the first argument is the key, like in
-\bench{sg\_cyl}. We know that most calls have the first argument
-bound, hence indexing was not expected to matter very much. Instead,
-the results show \JITI running time to improve by an order of
-magnitude. Like in \bench{sg\_cyl}, this suggests that even a small
-percentage of badly indexed calls can come to dominate running time.
+each, where the first argument is the key, like in \sgCyl. We know
+that most calls have the first argument bound, hence indexing was not
+expected to matter very much. Instead, the results show that \JITI improves
+running time by an order of magnitude. As with \sgCyl, this
+suggests that even a small percentage of badly indexed calls can end
+up dominating runtime.

-\bench{IE-Protein\_Extraction} and \bench{Thermolysin} are example
+\IEProtein and \Thermolysin are example
 applications that manipulate structured data.
-\bench{IE-Protein\_Extraction} is the largest dataset we consider,
-and indexing is simply critical: it is not possible to run the
-application in reasonable time with one argument
-indexing. \bench{Thermolysin} is smaller and performs some
+\IEProtein is the largest dataset we consider,
+and indexing is absolutely critical: it is not possible to run the
+application in reasonable time with first argument
+indexing. \Thermolysin is smaller and performs some
 computation per query: even so, indexing improves performance by an
 order of magnitude.

@@ -1201,34 +1215,37 @@ order of magnitude.
    Benchmark   &  \textbf{Clause} & {\bf Index}  & \textbf{Clause} & {\bf Index} \\
 %    \textbf{Benchmarks} &   & Total & T & W & S &  & Total & T & C & W & S  \\
    \hline
-    \bench{BreastCancer}
-    & 60940 & 46887 
+    \BreastCancer
+    & 60,940 & 46,887 
    % & 46242 & 3126  & 125
    & 630  & 14
    % &42 & 18& 57 &6
    \\

-    \bench{Carcinogenesis} 
+    \Carcinogenesis
    & 1801 & 2678
    % &1225 & 587 & 865
-    & 13512 & 942
+    & 13,512 & 942
    %& 291 & 91 & 457 & 102
    \\

-    \bench{Choline}  & 666 & 174
+    \Choline  & 666 & 174
    % &67 & 48 & 58
    & 3172 & 174
    % & 76 & 4 & 48 & 45
    \\
-    \bench{GeneExpression}    &  46726 & 22629
+
+    \GeneExpression
+    &  46,726 & 22,629
    % &6780 & 6473 & 9375
-    & 116463 & 9015
+    & 116,463 & 9015
    %& 2703 & 932 & 3910 & 1469
    \\

-    \bench{IE-Protein\_Extraction}    &146033 & 129333
+    \bench{IE-Protein\_Extraction}
+    & 146,033 & 129,333
    %&39279 & 24322 & 65732
-    & 53423 & 1531
+    & 53,423 & 1531
    %& 467 & 108 & 868 & 86
    \\

@@ -1258,7 +1275,7 @@ order of magnitude.
    
    \bench{Pyrimidines}       & 774 & 218
    %&76 & 63 & 77
-    & 25840 & 12291
+    & 25,840 & 12,291
    %& 4847 & 43 & 3510 & 3888
    \\

@@ -1270,10 +1287,9 @@ order of magnitude.

    \bench{Thermolysin}       & 2317 & 929
    %&429 & 184 & 315
-    & 116129 & 7064
+    & 116,129 & 7064
    %& 3295 & 1438 & 2160 & 170
    \\
-
    \hline
 \end{tabular}
 \end{table*}
@@ -1287,12 +1303,12 @@ usage on \emph{static} predicates. Static data-base sizes range from
 146MB (\bench{IE-Protein\_Extraction}) to less than a MB
 (\bench{Choline}, \bench{Krki}, \bench{Mesh}). Indexing code can be
 more than the original code, as in \bench{Mutagenesis}, or almost as
-much, eg, \bench{IE-Protein\_Extraction}. In most cases the YAP \JITI
+much, e.g., \bench{IE-Protein\_Extraction}. In most cases the YAP \JITI
 adds at least a third and often a half to the original data-base. A
 more detailed analysis shows the source of overhead to be very
 different from dataset to dataset. In \bench{IE-Protein\_Extraction}
 the problem is that hash tables are very large. Hash tables are also
-where most space is spent in \bench{Susi}. In \bench{BreastCancer}
+where most space is spent in \bench{Susi}. In \BreastCancer
 hash tables are actually small, so most space is spent in
 \TryRetryTrust chains. \bench{Mutagenesis} is similar: even though YAP
 spends a large effort in indexing it still generates long