Merged the two tables of 7.3
git-svn-id: https://yap.svn.sf.net/svnroot/yap/trunk@1836 b08c6af1-5177-4d33-ba66-4b1c6b8b522a
parent 075c9a5bf3
commit 9ec9b7fb70
@ -49,14 +49,13 @@
|
|||||||
\newcommand{\tea}{\bench{tea}\xspace}
|
\newcommand{\tea}{\bench{tea}\xspace}
|
||||||
%------------------------------------------------------------------------------
|
%------------------------------------------------------------------------------
|
||||||
\newcommand{\BreastCancer}{\bench{BreastCancer}\xspace}
|
\newcommand{\BreastCancer}{\bench{BreastCancer}\xspace}
|
||||||
\newcommand{\Carcinogenesis}{\bench{Carcinogenesis}\xspace}
|
\newcommand{\Carcino}{\bench{Carcinogenesis}\xspace}
|
||||||
\newcommand{\Choline}{\bench{Choline}\xspace}
|
\newcommand{\Choline}{\bench{Choline}\xspace}
|
||||||
\newcommand{\GeneExpression}{\bench{GeneExpression}\xspace}
|
\newcommand{\GeneExpr}{\bench{GeneExpression}\xspace}
|
||||||
\newcommand{\IEProtein}{\bench{IE-Protein\_Extraction}\xspace}
|
\newcommand{\IEProtein}{\bench{IE-Protein\_Extraction}\xspace}
|
||||||
\newcommand{\Krki}{\bench{Krki}\xspace}
|
\newcommand{\Krki}{\bench{Krki}\xspace}
|
||||||
\newcommand{\KrkiII}{\bench{Krki~II}\xspace}
|
\newcommand{\KrkiII}{\bench{Krki~II}\xspace}
|
||||||
\newcommand{\Mesh}{\bench{Mesh}\xspace}
|
\newcommand{\Mesh}{\bench{Mesh}\xspace}
|
||||||
\newcommand{\Mutagenesis}{\bench{Mutagenesis}\xspace}
|
|
||||||
\newcommand{\Pyrimidines}{\bench{Pyrimidines}\xspace}
|
\newcommand{\Pyrimidines}{\bench{Pyrimidines}\xspace}
|
||||||
\newcommand{\Susi}{\bench{Susi}\xspace}
|
\newcommand{\Susi}{\bench{Susi}\xspace}
|
||||||
\newcommand{\Thermolysin}{\bench{Thermolysin}\xspace}
|
\newcommand{\Thermolysin}{\bench{Thermolysin}\xspace}
|
||||||
@ -1013,8 +1012,8 @@ in parentheses. For each variant of transitive closure, we issue two
|
|||||||
queries: one with mode \code{(in,out)} and one with mode
|
queries: one with mode \code{(in,out)} and one with mode
|
||||||
\code{(out,out)}.
|
\code{(out,out)}.
|
||||||
%
|
%
|
||||||
For YAP, indices on the first argument and \TryRetryTrust are built on
|
For YAP, indices on the first argument and \TryRetryTrust chains are
|
||||||
all benchmarks under \JITI.
|
built on all benchmarks under \JITI.
|
||||||
%
|
%
|
||||||
For XXX, \JITI triggers on no benchmark but the \jitiONconstant
|
For XXX, \JITI triggers on no benchmark but the \jitiONconstant
|
||||||
instructions are executed for the three \bench{tc\_?\_oo} benchmarks.
|
instructions are executed for the three \bench{tc\_?\_oo} benchmarks.
|
||||||
@ -1069,8 +1068,9 @@ columns separately.
|
|||||||
%--------------------------------------------------------------------------
|
%--------------------------------------------------------------------------
|
||||||
On the other hand, when \JITI is effective, it can significantly
|
On the other hand, when \JITI is effective, it can significantly
|
||||||
improve time performance. We use the following programs and
|
improve time performance. We use the following programs and
|
||||||
applications:\TODO{If time permits, we should also add FSA benchmarks
|
applications:
|
||||||
(\bench{k963}, \bench{dg5} and \bench{tl3})}
|
%% \TODO{For the journal version we should also add FSA benchmarks
|
||||||
|
%% (\bench{k963}, \bench{dg5} and \bench{tl3})}
|
||||||
\begin{description}
|
\begin{description}
|
||||||
\item[\sgCyl] The same generation DB benchmark on a $24 \times 24
|
\item[\sgCyl] The same generation DB benchmark on a $24 \times 24
|
||||||
\times 2$ cylinder. We issue the open query.
|
\times 2$ cylinder. We issue the open query.
|
||||||
@ -1122,52 +1122,80 @@ difference in this benchmark.
|
|||||||
\subsection{Performance of \JITI on ILP applications} \label{sec:perf:ILP}
|
\subsection{Performance of \JITI on ILP applications} \label{sec:perf:ILP}
|
||||||
%-------------------------------------------------------------------------
|
%-------------------------------------------------------------------------
|
||||||
The need for \JITI was originally noticed in inductive logic
|
The need for \JITI was originally noticed in inductive logic
|
||||||
programming applications. Table~\ref{tab:ilp:time} shows \JITI
|
programming applications, which tend to issue ad hoc queries during
|
||||||
performance on some learning tasks using the ALEPH
|
runtime and their indexing requirements cannot be determined at
|
||||||
system~\cite{ALEPH}. The dataset \Krki tries to learn rules from a
|
compile time. On the other hand, these applications operate on lots of
|
||||||
small database of chess end-games; \GeneExpression learns rules for
|
data, so memory consumption is a reasonable concern. We evaluate
|
||||||
|
JITI's time and space performance on some learning tasks using the
|
||||||
|
ALEPH system~\cite{ALEPH}. We use the following datasets:
|
||||||
|
%
|
||||||
|
% Table~\ref{tab:ilp:time} shows JITI performance.
|
||||||
|
The dataset \Krki tries to learn rules from a
|
||||||
|
small database of chess end-games; \GeneExpr learns rules for
|
||||||
yeast gene activity given a database of genes, their interactions, and
|
yeast gene activity given a database of genes, their interactions, and
|
||||||
micro-array gene expression data; \BreastCancer processes real-life
|
micro-array gene expression data; \BreastCancer processes real-life
|
||||||
patient reports towards predicting whether an abnormality may be
|
patient reports towards predicting whether an abnormality may be
|
||||||
malignant; \IEProtein processes information extraction from paper
|
malignant; \IEProtein processes information extraction from paper
|
||||||
abstracts to search for proteins; \Susi learns from shopping patterns; and
|
abstracts to search for proteins; \Susi learns from shopping patterns; and
|
||||||
\Mesh learns rules for finite element mesh design. The datasets
|
\Mesh learns rules for finite element mesh design. The datasets
|
||||||
\Carcinogenesis, \Choline, \Pyrimidines, and
|
\Carcino, \Choline, \Pyrimidines, and
|
||||||
\Thermolysin try to predict chemical properties of compounds. The
|
\Thermolysin try to predict chemical properties of compounds. The
|
||||||
first three datasets store properties of interest as tables, but
|
first three datasets store properties of interest as tables, but
|
||||||
\Thermolysin learns from the 3D-structure of a molecule's
|
\Thermolysin learns from the 3D-structure of a molecule's
|
||||||
conformations. Several of these datasets are standard across the
|
conformations. Several of these datasets are standard across the
|
||||||
Machine Learning literature. \GeneExpression~\cite{ilp-regulatory06}
|
Machine Learning literature. \GeneExpr~\cite{ilp-regulatory06}
|
||||||
and \BreastCancer~\cite{DBLP:conf/ijcai/DavisBDPRCS05} were partly
|
and \BreastCancer~\cite{DBLP:conf/ijcai/DavisBDPRCS05} were partly
|
||||||
developed by some of the paper's authors. Most datasets perform simple
|
developed by an author of this paper. Most datasets perform simple
|
||||||
queries in an extensional database.
|
queries in an extensional database.
|
||||||
|
|
||||||
%------------------------------------------------------------------------------
|
%------------------------------------------------------------------------------
|
||||||
\begin{table}[t]
|
\begin{table}[t]
|
||||||
\centering
|
\centering
|
||||||
\caption{Machine Learning (ILP) Datasets: Times are given in Seconds,
|
\caption{Time and space performance on Machine Learning (ILP) Datasets}
|
||||||
we give time for standard indexing with no indexing on dynamic
|
\label{tab:ilp}
|
||||||
predicates versus the \JITI implementation}
|
|
||||||
\label{tab:ilp:time}
|
|
||||||
\setlength{\tabcolsep}{3pt}
|
\setlength{\tabcolsep}{3pt}
|
||||||
\begin{tabular}{|l||r|r|r|} \hline %\cline{1-3}
|
\subfigure[Time (in seconds)]{\label{tab:ilp:time}
|
||||||
& \multicolumn{3}{|c|}{Time (in secs)} \\
|
\begin{tabular}{|l||r|r|r||} \hline
|
||||||
|
& \multicolumn{3}{|c||}{Time (in secs)} \\
|
||||||
\cline{2-4}
|
\cline{2-4}
|
||||||
Benchmark & 1st & JITI &{\bf ratio} \\
|
Benchmark & 1st & JITI &{\bf ratio} \\
|
||||||
\hline
|
\hline
|
||||||
\BreastCancer & 1450 & 88 & 16 \\
|
\BreastCancer & 1,450 & 88 & 16 \\
|
||||||
\Carcinogenesis & 17,705 & 192 & 92 \\
|
\Carcino & 17,705 & 192 & 92 \\
|
||||||
\Choline & 14,766 & 1,397 & 11 \\
|
\Choline & 14,766 & 1,397 & 11 \\
|
||||||
\GeneExpression & 193,283 & 7,483 & 26 \\
|
\GeneExpr & 193,283 & 7,483 & 26 \\
|
||||||
\IEProtein & 1,677,146 & 2,909 & 577 \\
|
\IEProtein & 1,677,146 & 2,909 & 577 \\
|
||||||
\bench{Krki} & 0.3 & 0.3 & 1 \\
|
\Krki & 0.3 & 0.3 & 1 \\
|
||||||
\bench{Krki II} & 1.3 & 1.3 & 1 \\
|
\KrkiII & 1.3 & 1.3 & 1 \\
|
||||||
\Mesh & 4 & 3 & 1.3 \\
|
\Mesh & 4 & 3 & 1.3 \\
|
||||||
\Pyrimidines & 487,545 & 253,235 & 1.9 \\
|
\Pyrimidines & 487,545 & 253,235 & 1.9 \\
|
||||||
\Susi & 105,091 & 307 & 342 \\
|
\Susi & 105,091 & 307 & 342 \\
|
||||||
\Thermolysin & 50,279 & 5,213 & 10 \\
|
\Thermolysin & 50,279 & 5,213 & 10 \\
|
||||||
\hline
|
\hline
|
||||||
\end{tabular}
|
\end{tabular}
|
||||||
|
}
|
||||||
|
\subfigure[Memory usage (in KB)]{\label{tab:ilp:memory}
|
||||||
|
\begin{tabular}{||r|r|r|r||} \hline
|
||||||
|
\multicolumn{2}{||c|}{Static code}
|
||||||
|
& \multicolumn{2}{|c||}{Dynamic code} \\
|
||||||
|
\hline
|
||||||
|
\multicolumn{1}{||c|}{Clauses} & \multicolumn{1}{c}{Index}
|
||||||
|
& \multicolumn{1}{|c|}{Clauses} & \multicolumn{1}{c||}{Index}\\
|
||||||
|
\hline
|
||||||
|
60,940 & 46,887 & 630 & 14 \\
|
||||||
|
1,801 & 2,678 & 13,512 & 942 \\
|
||||||
|
666 & 174 & 3,172 & 174 \\
|
||||||
|
46,726 & 22,629 & 116,463 & 9,015 \\
|
||||||
|
146,033 & 129,333 & 53,423 & 1,531 \\
|
||||||
|
678 & 117 & 2,047 & 24 \\
|
||||||
|
1,866 & 715 & 2,055 & 26 \\
|
||||||
|
802 & 161 & 2,149 & 109 \\
|
||||||
|
774 & 218 & 25,840 & 12,291 \\
|
||||||
|
5,007 & 2,509 & 4,497 & 759 \\
|
||||||
|
2,317 & 929 & 116,129 & 7,064 \\
|
||||||
|
\hline
|
||||||
|
\end{tabular}
|
||||||
|
}
|
||||||
\end{table}
|
\end{table}
|
||||||
%------------------------------------------------------------------------------
|
%------------------------------------------------------------------------------
|
||||||
|
|
||||||
@ -1179,7 +1207,7 @@ the same performance under both versions: there is no slowdown. The
|
|||||||
in the database, but they do benefit from indexing in the dynamic
|
in the database, but they do benefit from indexing in the dynamic
|
||||||
representation of the search space, as their running times halve.
|
representation of the search space, as their running times halve.
|
||||||
|
|
||||||
The \BreastCancer and \GeneExpression applications use data in
|
The \BreastCancer and \GeneExpr applications use data in
|
||||||
1NF (that is, unstructured data). The benefit here is mostly from
|
1NF (that is, unstructured data). The benefit here is mostly from
|
||||||
multiple-argument indexing. \BreastCancer is particularly
|
multiple-argument indexing. \BreastCancer is particularly
|
||||||
interesting. It consists of 40 binary relations with 65k elements
|
interesting. It consists of 40 binary relations with 65k elements
|
||||||
@ -1199,90 +1227,6 @@ indexing. \Thermolysin is smaller and performs some
|
|||||||
computation per query: even so, indexing improves performance by an
|
computation per query: even so, indexing improves performance by an
|
||||||
order of magnitude.
|
order of magnitude.
|
||||||
|
|
||||||
\begin{table*}[ht]
|
|
||||||
\centering
|
|
||||||
\caption{Memory Performance on Machine Learning (ILP) Datasets: memory
|
|
||||||
usage is given in KB}
|
|
||||||
\label{tab:ilp:memory}
|
|
||||||
\setlength{\tabcolsep}{3pt}
|
|
||||||
\begin {tabular}{|l|r|r||r|r|} \hline %\cline{1-3}
|
|
||||||
& \multicolumn{2}{|c||}{\bf Static Code} & \multicolumn{2}{|c|}{\bf Dynamic Code} \\
|
|
||||||
Benchmark & \textbf{Clause} & {\bf Index} & \textbf{Clause} & {\bf Index} \\
|
|
||||||
% \textbf{Benchmarks} & & Total & T & W & S & & Total & T & C & W & S \\
|
|
||||||
\hline
|
|
||||||
\BreastCancer
|
|
||||||
& 60,940 & 46,887
|
|
||||||
% & 46242 & 3126 & 125
|
|
||||||
& 630 & 14
|
|
||||||
% &42 & 18& 57 &6
|
|
||||||
\\
|
|
||||||
|
|
||||||
\Carcinogenesis
|
|
||||||
& 1801 & 2678
|
|
||||||
% &1225 & 587 & 865
|
|
||||||
& 13,512 & 942
|
|
||||||
%& 291 & 91 & 457 & 102
|
|
||||||
\\
|
|
||||||
|
|
||||||
\Choline & 666 & 174
|
|
||||||
% &67 & 48 & 58
|
|
||||||
& 3172 & 174
|
|
||||||
% & 76 & 4 & 48 & 45
|
|
||||||
\\
|
|
||||||
|
|
||||||
\GeneExpression
|
|
||||||
& 46,726 & 22,629
|
|
||||||
% &6780 & 6473 & 9375
|
|
||||||
& 116,463 & 9015
|
|
||||||
%& 2703 & 932 & 3910 & 1469
|
|
||||||
\\
|
|
||||||
|
|
||||||
\bench{IE-Protein\_Extraction}
|
|
||||||
& 146,033 & 129,333
|
|
||||||
%&39279 & 24322 & 65732
|
|
||||||
& 53,423 & 1531
|
|
||||||
%& 467 & 108 & 868 & 86
|
|
||||||
\\
|
|
||||||
|
|
||||||
\bench{Krki} & 678 & 117
|
|
||||||
%&52 & 24 & 40
|
|
||||||
& 2047 & 24
|
|
||||||
%& 10 & 2 & 10 & 1
|
|
||||||
\\
|
|
||||||
|
|
||||||
\bench{Krki II} & 1866 & 715
|
|
||||||
%&180 & 233 & 301
|
|
||||||
& 2055 & 26
|
|
||||||
%& 11 & 2 & 11 & 1
|
|
||||||
\\
|
|
||||||
|
|
||||||
\bench{Mesh} & 802 & 161
|
|
||||||
%&49 & 18 & 93
|
|
||||||
& 2149 & 109
|
|
||||||
%& 46 & 4 & 35 & 22
|
|
||||||
\\
|
|
||||||
|
|
||||||
\bench{Pyrimidines} & 774 & 218
|
|
||||||
%&76 & 63 & 77
|
|
||||||
& 25,840 & 12,291
|
|
||||||
%& 4847 & 43 & 3510 & 3888
|
|
||||||
\\
|
|
||||||
|
|
||||||
\bench{Susi} & 5007 & 2509
|
|
||||||
%&855 & 578 & 1076
|
|
||||||
& 4497 & 759
|
|
||||||
%& 324 & 58 & 256 & 120
|
|
||||||
\\
|
|
||||||
|
|
||||||
\bench{Thermolysin} & 2317 & 929
|
|
||||||
%&429 & 184 & 315
|
|
||||||
& 116,129 & 7064
|
|
||||||
%& 3295 & 1438 & 2160 & 170
|
|
||||||
\\
|
|
||||||
\hline
|
|
||||||
\end{tabular}
|
|
||||||
\end{table*}
|
|
||||||
|
|
||||||
|
|
||||||
Table~\ref{tab:ilp:memory} shows the memory cost paid for \JITI. The
|
Table~\ref{tab:ilp:memory} shows the memory cost paid for \JITI. The
|
||||||
table presents data obtained at a point near the end of execution.
|
table presents data obtained at a point near the end of execution.
|
||||||
@ -1291,7 +1235,7 @@ memory usage should be at a maximum. The first two numbers show data
|
|||||||
usage on \emph{static} predicates. Static data-base sizes range from
|
usage on \emph{static} predicates. Static data-base sizes range from
|
||||||
146MB (\bench{IE-Protein\_Extraction}) to less than a MB
|
146MB (\bench{IE-Protein\_Extraction}) to less than a MB
|
||||||
(\bench{Choline}, \bench{Krki}, \bench{Mesh}). Indexing code can grow
|
(\bench{Choline}, \bench{Krki}, \bench{Mesh}). Indexing code can grow
|
||||||
to be larger than the original code, as in \Carcinogenesis, or
|
to be larger than the original code, as in \Carcino, or
|
||||||
almost as much, e.g., \bench{IE-Protein\_Extraction}. In most cases
|
almost as much, e.g., \bench{IE-Protein\_Extraction}. In most cases
|
||||||
the YAP \JITI adds at least a third and often a half to the original
|
the YAP \JITI adds at least a third and often a half to the original
|
||||||
data-base. A more detailed analysis shows the source of overhead to be
|
data-base. A more detailed analysis shows the source of overhead to be
|
||||||
@ -1306,16 +1250,15 @@ usage, but is never dominant.
|
|||||||
This version of ALEPH uses the internal data-base to store the IDB.
|
This version of ALEPH uses the internal data-base to store the IDB.
|
||||||
The size of the IDB reflects the search space, and is to some extent
|
The size of the IDB reflects the search space, and is to some extent
|
||||||
independent of the program's static data, although small applications
|
independent of the program's static data, although small applications
|
||||||
such as \bench{Krki} do tend to have a small search space. ALEPH's
|
such as \bench{Krki} tend to have a small search space. ALEPH's
|
||||||
author very carefully designed the system to work around overheads in
|
author very carefully designed the system to work around overheads in
|
||||||
accessing the data-base, so indexing should not be as critical. The
|
accessing the database, so indexing should not be as critical. The
|
||||||
low overheads suggest that the \JITI is working well, as confirmed in
|
low overheads suggest that \JITI is working well, as confirmed in
|
||||||
a more detailed analysis: most space is spent on hashes tables and on
|
a more detailed analysis: most space is spent on hash tables and on
|
||||||
internal nodes of the tree, and relatively little space is spent on
|
internal nodes of the tree, and relatively little space is spent on
|
||||||
\TryRetryTrust chains.
|
\TryRetryTrust chains.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
\section{Concluding Remarks}
|
\section{Concluding Remarks}
|
||||||
%===========================
|
%===========================
|
||||||
\begin{itemize}
|
\begin{itemize}
|
||||||
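The ratio column of the time table in this diff appears to be the baseline time divided by the \JITI time, rounded for display. The following is a minimal Python sketch that recomputes it from the printed times. It assumes that the column labelled 1st is the baseline run without \JITI; the names `TIMES` and `speedup` are hypothetical helpers introduced only for this check and are not part of the paper or of YAP.

```python
# Times in seconds, copied from the "Time (in secs)" table in the diff.
# Each entry maps a benchmark to (baseline "1st", JITI, ratio as printed).
TIMES = {
    "BreastCancer": (1450, 88, 16),
    "Carcinogenesis": (17705, 192, 92),
    "Choline": (14766, 1397, 11),
    "GeneExpression": (193283, 7483, 26),
    "IE-Protein_Extraction": (1677146, 2909, 577),
    "Krki": (0.3, 0.3, 1),
    "Krki II": (1.3, 1.3, 1),
    "Mesh": (4, 3, 1.3),
    "Pyrimidines": (487545, 253235, 1.9),
    "Susi": (105091, 307, 342),
    "Thermolysin": (50279, 5213, 10),
}


def speedup(baseline: float, jiti: float) -> float:
    """Return how many times faster the JITI run is than the baseline run."""
    return baseline / jiti


if __name__ == "__main__":
    # Print the recomputed ratio next to the one shown in the table.
    for name, (baseline, jiti, shown) in TIMES.items():
        print(f"{name:24s} computed {speedup(baseline, jiti):8.2f}  table {shown}")
```

Under this reading, every recomputed value lies within rounding distance of the printed ratio, so the merged table is internally consistent.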
|