initial version of LEMUR
parent
b25c9e5b61
commit
ce12c424f3
@ -441,6 +441,9 @@ The files \texttt{*.uni} that are present for some of the examples are used by
|
||||
\item EMBLEM (EM over Bdds for probabilistic Logic programs Efficient Mining): an implementation of EM for learning parameters that computes expectations directly on BDDs \cite{BelRig11-IDA,BelRig11-CILC11-NC,BelRig11-TR}
|
||||
\item SLIPCASE (Structure LearnIng of ProbabilistiC logic progrAmS with Em over bdds): an algorithm for learning the structure of programs by directly searching the theory space \cite{BelRig11-ILP11-IC}
|
||||
\item SLIPCOVER (Structure LearnIng of Probabilistic logic programs by searChing OVER the clause space): an algorithm for learning the structure of programs by searching the clause space and the theory space separately \cite{BelRig13-TPLP-IJ}
|
||||
\item LEMUR (LEarning with a Monte carlo Upgrade of tRee search): an algorithm
|
||||
for learning the structure of programs by searching the clause space using
|
||||
Monte-Carlo tree search.
|
||||
\end{itemize}
|
||||
|
||||
\subsection{Input}
|
||||
@ -449,7 +452,7 @@ To execute the learning algorithms, prepare four files in the same folder:
|
||||
\item \texttt{<stem>.kb}: contains the example interpretations
|
||||
\item \texttt{<stem>.bg}: contains the background knowledge, i.e., knowledge valid for all interpretations
|
||||
\item \texttt{<stem>.l}: contains language bias information
|
||||
\item \texttt{<stem>.cpl}: contains the LPAD for which you want to learn the parameters, or the initial LPAD for SLIPCASE. For SLIPCOVER, this file should be absent
|
||||
\item \texttt{<stem>.cpl}: contains the LPAD for which you want to learn the parameters, or the initial LPAD for SLIPCASE and LEMUR. For SLIPCOVER, this file should be absent
|
||||
\end{itemize}
|
||||
where \texttt{<stem>} is your dataset name. Examples of these files can be found in the dataset pages.
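As an illustration, assuming a dataset named \texttt{mydata} (a hypothetical stem, not one of the distributed datasets), the folder for a SLIPCASE or LEMUR run would contain
\begin{verbatim}
mydata.kb
mydata.bg
mydata.l
mydata.cpl
\end{verbatim}
For SLIPCOVER, \texttt{mydata.cpl} would be left out.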
|
||||
|
||||
@ -504,7 +507,7 @@ For RIB, if there are unseen predicates, i.e., predicates that are present in th
|
||||
unseen(<predicate>/<arity>).
|
||||
\end{verbatim}
|
||||
|
||||
For SLIPCASE and SLIPCOVER, you have to specify the language bias by means of mode declarations in the style of
|
||||
For SLIPCASE, SLIPCOVER and LEMUR, you have to specify the language bias by means of mode declarations in the style of
|
||||
\href{http://www.doc.ic.ac.uk/\string ~shm/progol.html}{Progol}.
|
||||
\begin{verbatim}
|
||||
modeh(<recall>,<predicate>(<arg1>,...)).
|
||||
@ -558,7 +561,7 @@ modeb(*,samecourse(+course, -course)).
|
||||
modeb(*,samecourse(-course, +course)).
|
||||
....
|
||||
\end{verbatim}
|
||||
SLIPCOVER also requires facts for the \verb|determination/2| predicate that indicate which predicates can appear in the body of clauses.
|
||||
SLIPCOVER and LEMUR also require facts for the \verb|determination/2| predicate that indicate which predicates can appear in the body of clauses.
|
||||
For example
|
||||
\begin{verbatim}
|
||||
determination(professor/1,student/1).
|
||||
@ -592,17 +595,21 @@ In order to set the algorithms' parameters, you have to insert in \texttt{<stem>
|
||||
The available parameters are:
|
||||
\begin{itemize}
|
||||
\item \verb|depth| (values: integer or \verb|inf|, default value: 3): depth of derivations if \verb|depth_bound| is set to \verb|true|
|
||||
\item \verb|single_var| (values: \verb|{true,false}|, default value: \verb|false|, valid for CEM, EMBLEM, SLIPCASE and SLIPCOVER): if set to \verb|true|, there is a random variable for each clause, instead of a separate random variable for each grounding of a clause
|
||||
\item \verb|single_var| (values: \verb|{true,false}|, default value: \verb|false|, valid for CEM, EMBLEM, SLIPCASE, SLIPCOVER and LEMUR): if set to \verb|true|, there is a random variable for each clause, instead of a separate random variable for each grounding of a clause
|
||||
\item \verb|sample_size| (values: integer, default value: 1000): total number of examples in case in which the models in the \verb|.kb| file contain a \verb|prob(P).| fact. In that case, one model corresponds to \verb|sample_size*P| examples
|
||||
\item \verb|epsilon_em| (values: real, default value: 0.1, valid for CEM, EMBLEM, SLIPCASE and SLIPCOVER): if the difference in the log likelihood in two successive EM iterations is smaller
|
||||
\item \verb|epsilon_em| (values: real, default value: 0.1, valid for CEM,
|
||||
EMBLEM, SLIPCASE, SLIPCOVER and LEMUR): if the difference in the log likelihood in two successive EM iterations is smaller
|
||||
than \verb|epsilon_em|, then EM stops
|
||||
\item \verb|epsilon_em_fraction| (values: real, default value: 0.01, valid for CEM, EMBLEM, SLIPCASE and SLIPCOVER): if the difference in the log likelihood in two successive EM iterations is smaller
|
||||
\item \verb|epsilon_em_fraction| (values: real, default value: 0.01, valid for
|
||||
CEM, EMBLEM, SLIPCASE, SLIPCOVER and LEMUR): if the difference in the log likelihood in two successive EM iterations is smaller
|
||||
than \verb|epsilon_em_fraction|*(-current log likelihood), then EM stops
|
||||
\item \verb|iter| (values: integer, default value: 1, valid for EMBLEM, SLIPCASE and SLIPCOVER): maximum number of iterations of EM parameter learning. If set to -1, no maximum number of iterations is imposed
|
||||
\item \verb|iterREF| (values: integer, default value: 1, valid for SLIPCASE and SLIPCOVER):
|
||||
\item \verb|iter| (values: integer, default value: 1, valid for EMBLEM,
|
||||
SLIPCASE, SLIPCOVER and LEMUR): maximum number of iterations of EM parameter learning. If set to -1, no maximum number of iterations is imposed
|
||||
\item \verb|iterREF| (values: integer, default value: 1, valid for SLIPCASE,
|
||||
SLIPCOVER and LEMUR):
|
||||
maximum number of iterations of EM parameter learning for refinements. If set to -1, no maximum number of iterations is imposed.
|
||||
\item \verb|random_restarts_number| (values: integer, default value: 1, valid for CEM, EMBLEM, SLIPCASE and SLIPCOVER): number of random restarts of EM learning
|
||||
\item \verb|random_restarts_REFnumber| (values: integer, default value: 1, valid for SLIPCASE and SLIPCOVER): number of random restarts of EM learning for refinements
|
||||
\item \verb|random_restarts_number| (values: integer, default value: 1, valid for CEM, EMBLEM, SLIPCASE, SLIPCOVER and LEMUR): number of random restarts of EM learning
|
||||
\item \verb|random_restarts_REFnumber| (values: integer, default value: 1, valid for SLIPCASE, SLIPCOVER and LEMUR): number of random restarts of EM learning for refinements
|
||||
\item \verb|setrand| (values: rand(integer,integer,integer)): seed for the random functions, see Yap manual for allowed values
|
||||
\item \verb|minimal_step| (values: [0,1], default value: 0.005, valid for RIB): minimal increment of $\gamma$
|
||||
\item \verb|maximal_step| (values: [0,1], default value: 0.1, valid for RIB): maximal increment of $\gamma$
|
||||
@ -610,14 +617,19 @@ than \verb|epsilon_em_fraction|*(-current log likelihood), then EM stops
|
||||
\item \verb|delta| (values: negative integer, default value -10, valid for RIB): value assigned to $\log 0$
|
||||
\item \verb|epsilon_fraction| (values: integer, default value 100, valid for RIB): in the computation of the step, the value of $\epsilon$ of \cite{DBLP:journals/jmlr/ElidanF05} is obtained as $\log |CH,T|\times$\verb|epsilon_fraction|
|
||||
\item \verb|max_rules| (values: integer, default value: 6000, valid for RIB and SLIPCASE): maximum number of ground rules. Used to set the size of arrays for storing internal statistics. Can be increased as much as memory allows.
|
||||
\item \verb|logzero| (values: negative real, default value $\log(0.000001)$, valid for SLIPCASE and SLIPCOVER): value assigned to $\log 0$
|
||||
\item \verb|logzero| (values: negative real, default value $\log(0.000001)$, valid for SLIPCASE, SLIPCOVER and LEMUR): value assigned to $\log 0$
|
||||
\item \verb|examples| (values: \verb|atoms|, \verb|interpretations|, default value \verb|atoms|, valid for SLIPCASE): determines how BDDs are built: if set to \verb|interpretations|, a BDD for the conjunction of all the atoms for the target predicates in each interpretation is built.
|
||||
If set to \verb|atoms|, a BDD is built for the conjunction of a group of atoms for the target predicates in each interpretation. The number of atoms in each group is determined by the parameter \verb|group|
|
||||
\item \verb|group| (values: integer, default value: 1, valid for SLIPCASE): number of target atoms in the groups that are used to build BDDs
|
||||
\item \verb|max_iter| (values: integer, default value: 10, valid for SLIPCASE and SLIPCOVER): number of iterations of beam search
|
||||
\item \verb|max_var| (values: integer, default value: 1, valid for SLIPCASE and SLIPCOVER): maximum number of distinct variables in a clause
|
||||
\item \verb|max_var| (values: integer, default value: 1, valid for SLIPCASE,
|
||||
SLIPCOVER and LEMUR): maximum number of distinct variables in a clause
|
||||
\item \verb|verbosity| (values: integer in [1,3], default value: 1): level of verbosity of the algorithms
|
||||
\item \verb|beamsize| (values: integer, default value: 20, valid for SLIPCASE and SLIPCOVER): size of the beam
|
||||
\item \verb|mcts_beamsize| (values: integer, default value: 3, valid for LEMUR): size of the MCTS beam
|
||||
|
||||
\item \verb|mcts_visits| (values: integer, default value: +inf, valid for LEMUR): maximum number of visits (Nicola, please check)
|
||||
|
||||
\item \verb|megaex_bottom| (values: integer, default value: 1, valid for SLIPCOVER): number of mega-examples on which to build the bottom clauses
|
||||
\item \verb|initial_clauses_per_megaex| (values: integer, default value: 1, valid for SLIPCOVER):
|
||||
number of bottom clauses to build for each mega-example
|
||||
@ -627,7 +639,7 @@ If set to \verb|atoms|, a BDD is built for the conjunction of a group of atoms f
|
||||
maximum number of theory search iterations
|
||||
\item \verb|background_clauses| (values: integer, default value: 50, valid for SLIPCOVER):
|
||||
maximum number of background clauses
|
||||
\item \verb|maxdepth_var| (values: integer, default value: 2, valid for SLIPCOVER): maximum depth of
|
||||
\item \verb|maxdepth_var| (values: integer, default value: 2, valid for SLIPCOVER and LEMUR): maximum depth of
|
||||
variables in clauses (as defined in \cite{DBLP:journals/ai/Cohen95}).
|
||||
\item \verb|score| (values: \verb|ll|, \verb|aucpr|, default value \verb|ll|, valid for SLIPCOVER): determines the score function for refinement: if set to \verb|ll|, log likelihood is used, if set to \verb|aucpr|, the area under the
|
||||
Precision-Recall curve is used.
|
||||
@ -673,7 +685,19 @@ and call
|
||||
\begin{verbatim}
|
||||
?:- sl(stem).
|
||||
\end{verbatim}
|
||||
|
||||
To execute LEMUR, load \texttt{lemur.pl} with
|
||||
\begin{verbatim}
|
||||
?:- use_module(library('cplint/lemur')).
|
||||
\end{verbatim}
|
||||
and call
|
||||
\begin{verbatim}
|
||||
?:- mcts(stem,depth,c,iter,rules,covering).
|
||||
\end{verbatim}
|
||||
where \verb|depth| (integer) is the maximum number
|
||||
of random specialization steps in the default policy, \verb|c| (real) is the value of the MCTS $C$ constant, \verb|iter| (integer) is the number of UCT rounds, \verb|rules| (integer) is
|
||||
the maximum number of clauses to be
|
||||
learned and \verb|covering| (Boolean) denotes whether the search is performed in
|
||||
the space of clauses (true) or theories (false) (Nicola, please check).
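
As a sketch of how these arguments fit together, assume a hypothetical dataset stem \texttt{mydata} and illustrative, untuned values: 3 random specialization steps, $C=0.7$, 200 UCT rounds, at most 10 clauses, and search in the space of clauses. A call might then look like
\begin{verbatim}
?:- mcts(mydata,3,0.7,200,10,true).
\end{verbatim}
These values are placeholders chosen only to show the argument order and types; they are not recommended settings.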
|
||||
|
||||
\subsection{Testing}
|
||||
To test the theories learned, load \texttt{test.pl} with
|
||||
|