remove muta from ILP benchmarks
git-svn-id: https://yap.svn.sf.net/svnroot/yap/trunk@1834 b08c6af1-5177-4d33-ba66-4b1c6b8b522a
parent f71e9d87c3
commit a478f7cb04
@@ -1122,7 +1122,7 @@ difference in this benchmark.
 \subsection{Performance of \JITI on ILP applications} \label{sec:perf:ILP}
 %-------------------------------------------------------------------------
 The need for \JITI was originally noticed in inductive logic
-programming applications. Table~\ref{tab:ilp:time} shows JITI
+programming applications. Table~\ref{tab:ilp:time} shows \JITI
 performance on some learning tasks using the ALEPH
 system~\cite{ALEPH}. The dataset \Krki tries to learn rules from a
 small database of chess end-games; \GeneExpression learns rules for
@@ -1132,7 +1132,7 @@ patient reports towards predicting whether an abnormality may be
 malignant; \IEProtein processes information extraction from paper
 abstracts to search proteins; \Susi learns from shopping patterns; and
 \Mesh learns rules for finite-methods mesh design. The datasets
-\Carcinogenesis, \Choline, \Mutagenesis, \Pyrimidines, and
+\Carcinogenesis, \Choline, \Pyrimidines, and
 \Thermolysin try to predict chemical properties of compounds. The
 first three datasets store properties of interest as tables, but
 \Thermolysin learns from the 3D-structure of a molecule's
@@ -1140,9 +1140,7 @@ conformations. Several of these datasets are standard across the
 Machine Learning literature. \GeneExpression~\cite{ilp-regulatory06}
 and \BreastCancer~\cite{DBLP:conf/ijcai/DavisBDPRCS05} were partly
 developed by some of the paper's authors. Most datasets perform simple
-queries in an extensional database. The exception is \Mutagenesis
-where several predicates are defined intensionally, requiring
-extensive computation.
+queries in an extensional database.
 
 %------------------------------------------------------------------------------
 \begin{table}[t]
@@ -1165,7 +1163,6 @@ extensive computation.
 \bench{Krki} & 0.3 & 0.3 & 1 \\
 \bench{Krki II} & 1.3 & 1.3 & 1 \\
 \Mesh & 4 & 3 & 1.3 \\
-\bench{Mutagenesis} & 51,775 & 27,746 & 1.9 \\
 \Pyrimidines & 487,545 & 253,235 & 1.9 \\
 \Susi & 105,091 & 307 & 342 \\
 \Thermolysin & 50,279 & 5,213 & 10 \\
@@ -1175,12 +1172,11 @@ extensive computation.
 %------------------------------------------------------------------------------
 
 We compare times for 10 runs of the saturation/refinement cycle of the
-ILP system. Table~\ref{tab:ilp:time} shows time results. The
-\Krki datasets have small search spaces and small databases, so
-they achieve the same performance under both versions:
-there is no slowdown. The \Mesh, \Mutagenesis, and
-\Pyrimidines applications do not benefit much from indexing in
-the database, but they do benefit from indexing in the dynamic
+ILP system. Table~\ref{tab:ilp:time} shows time results. The \Krki
+datasets have small search spaces and small databases, so they achieve
+the same performance under both versions: there is no slowdown. The
+\Mesh and \Pyrimidines applications do not benefit much from indexing
+in the database, but they do benefit from indexing in the dynamic
 representation of the search space, as their running times halve.
 
 The \BreastCancer and \GeneExpression applications use data in
@@ -1266,12 +1262,6 @@ order of magnitude.
 %& 46 & 4 & 35 & 22
 \\
 
-\bench{Mutagenesis} & 1412 & 1848
-%&1045 & 291 & 510
-& 4302 & 595
-%& 156 & 114 & 264 & 61
-\\
-
 \bench{Pyrimidines} & 774 & 218
 %&76 & 63 & 77
 & 25,840 & 12,291
@@ -1300,20 +1290,18 @@ Because dynamic memory expands and contracts, we chose a point where
 memory usage should be at a maximum. The first two numbers show data
 usage on \emph{static} predicates. Static data-base sizes range from
 146MB (\bench{IE-Protein\_Extraction} to less than a MB
-(\bench{Choline}, \bench{Krki}, \bench{Mesh}). Indexing code can be
-more than the original code, as in \bench{Mutagenesis}, or almost as
-much, e.g., \bench{IE-Protein\_Extraction}. In most cases the YAP \JITI
-adds at least a third and often a half to the original data-base. A
-more detailed analysis shows the source of overhead to be very
-different from dataset to dataset. In \bench{IE-Protein\_Extraction}
-the problem is that hash tables are very large. Hash tables are also
-where most space is spent in \bench{Susi}. In \BreastCancer
-hash tables are actually small, so most space is spent in
-\TryRetryTrust chains. \bench{Mutagenesis} is similar: even though YAP
-spends a large effort in indexing it still generates long
-\TryRetryTrust chains. Storing sets of matching clauses at \jitiSTAR
-nodes takes usually over 10\% of total memory usage, but is never
-dominant.
+(\bench{Choline}, \bench{Krki}, \bench{Mesh}). Indexing code can grow
+to be as large as than the original code, as in \Carcinogenesis, or
+almost as much, e.g., \bench{IE-Protein\_Extraction}. In most cases
+the YAP \JITI adds at least a third and often a half to the original
+data-base. A more detailed analysis shows the source of overhead to be
+very different from dataset to dataset. In
+\bench{IE-Protein\_Extraction} the problem is that hash tables are
+very large. Hash tables are also where most space is spent in
+\bench{Susi}. In \BreastCancer hash tables are actually small, so most
+space is spent in \TryRetryTrust chains. Storing sets of matching
+clauses at \jitiSTAR nodes takes usually over 10\% of total memory
+usage, but is never dominant.
 
 This version of ALEPH uses the internal data-base to store the IDB.
 The size of reflects the search space, and is to some extent