initial version of LEMUR

This commit is contained in:
Fabrizio Riguzzi 2014-10-15 15:56:49 +02:00
parent b25c9e5b61
commit ce12c424f3


@ -441,6 +441,9 @@ The files \texttt{*.uni} that are present for some of the examples are used by
\item EMBLEM (EM over Bdds for probabilistic Logic programs Efficient Mining): an implementation of EM for learning parameters that computes expectations directly on BDDs \cite{BelRig11-IDA,BelRig11-CILC11-NC,BelRig11-TR}
\item SLIPCASE (Structure LearnIng of ProbabilistiC logic progrAmS with Em over bdds): an algorithm for learning the structure of programs by searching the theory space directly \cite{BelRig11-ILP11-IC}
\item SLIPCOVER (Structure LearnIng of Probabilistic logic programs by searChing OVER the clause space): an algorithm for learning the structure of programs by searching the clause space and the theory space separately \cite{BelRig13-TPLP-IJ}
\item LEMUR (LEarning with a Monte carlo Upgrade of tRee search): an algorithm
for learning the structure of programs by searching the clause space using
Monte Carlo tree search.
\end{itemize}
\subsection{Input}
@ -449,7 +452,7 @@ To execute the learning algorithms, prepare four files in the same folder:
\item \texttt{<stem>.kb}: contains the example interpretations
\item \texttt{<stem>.bg}: contains the background knowledge, i.e., knowledge valid for all interpretations
\item \texttt{<stem>.l}: contains language bias information
\item \texttt{<stem>.cpl}: contains the LPAD for which you want to learn the parameters, or the initial LPAD for SLIPCASE and LEMUR. For SLIPCOVER, this file should be absent
\end{itemize}
where \texttt{<stem>} is your dataset name. Examples of these files can be found in the dataset pages.
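For instance, for a hypothetical dataset with stem \texttt{registration}, the folder would contain:
\begin{verbatim}
registration.kb    example interpretations
registration.bg    background knowledge
registration.l     language bias
registration.cpl   LPAD for parameter learning, or initial LPAD
                   for SLIPCASE and LEMUR (absent for SLIPCOVER)
\end{verbatim}
The stem \texttt{registration} is purely illustrative; any dataset name can be used.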
@ -504,7 +507,7 @@ For RIB, if there are unseen predicates, i.e., predicates that are present in th
unseen(<predicate>/<arity>).
\end{verbatim}
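For example, if the hypothetical predicate \verb|professor/1| occurred in the initial LPAD but in none of the interpretations, one would add:
\begin{verbatim}
unseen(professor/1).
\end{verbatim}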
For SLIPCASE, SLIPCOVER and LEMUR, you have to specify the language bias by means of mode declarations in the style of
\href{http://www.doc.ic.ac.uk/\string ~shm/progol.html}{Progol}.
\begin{verbatim}
modeh(<recall>,<predicate>(<arg1>,...)).
@ -558,7 +561,7 @@ modeb(*,samecourse(+course, -course)).
modeb(*,samecourse(-course, +course)).
....
\end{verbatim}
SLIPCOVER and LEMUR also require facts for the \verb|determination/2| predicate that indicate which predicates can appear in the body of clauses.
For example
\begin{verbatim}
determination(professor/1,student/1).
@ -592,17 +595,21 @@ In order to set the algorithms' parameters, you have to insert in \texttt{<stem>
The available parameters are:
\begin{itemize}
\item \verb|depth| (values: integer or \verb|inf|, default value: 3): depth of derivations if \verb|depth_bound| is set to \verb|true|
\item \verb|single_var| (values: \verb|{true,false}|, default value: \verb|false|, valid for CEM, EMBLEM, SLIPCASE, SLIPCOVER and LEMUR): if set to \verb|true|, there is a single random variable for each clause, instead of a separate random variable for each grounding of a clause
\item \verb|sample_size| (values: integer, default value: 1000): total number of examples when the models in the \verb|.kb| file contain a \verb|prob(P).| fact. In that case, one model corresponds to \verb|sample_size*P| examples
\item \verb|epsilon_em| (values: real, default value: 0.1, valid for CEM,
EMBLEM, SLIPCASE, SLIPCOVER and LEMUR): if the difference in the log likelihood in two successive EM iterations is smaller
than \verb|epsilon_em|, then EM stops
\item \verb|epsilon_em_fraction| (values: real, default value: 0.01, valid for
CEM, EMBLEM, SLIPCASE, SLIPCOVER and LEMUR): if the difference in the log likelihood in two successive EM iterations is smaller
than \verb|epsilon_em_fraction|*(-current log likelihood), then EM stops
\item \verb|iter| (values: integer, default value: 1, valid for EMBLEM,
SLIPCASE, SLIPCOVER and LEMUR): maximum number of iterations of EM parameter learning. If set to -1, no maximum number of iterations is imposed
\item \verb|iterREF| (values: integer, default value: 1, valid for SLIPCASE,
SLIPCOVER and LEMUR):
maximum number of iterations of EM parameter learning for refinements. If set to -1, no maximum number of iterations is imposed.
\item \verb|random_restarts_number| (values: integer, default value: 1, valid for CEM, EMBLEM, SLIPCASE, SLIPCOVER and LEMUR): number of random restarts of EM learning
\item \verb|random_restarts_REFnumber| (values: integer, default value: 1, valid for SLIPCASE, SLIPCOVER and LEMUR): number of random restarts of EM learning for refinements
\item \verb|setrand| (values: rand(integer,integer,integer)): seed for the random functions, see Yap manual for allowed values
\item \verb|minimal_step| (values: [0,1], default value: 0.005, valid for RIB): minimal increment of $\gamma$
\item \verb|maximal_step| (values: [0,1], default value: 0.1, valid for RIB): maximal increment of $\gamma$
@ -610,14 +617,19 @@ than \verb|epsilon_em_fraction|*(-current log likelihood), then EM stops
\item \verb|delta| (values: negative integer, default value -10, valid for RIB): value assigned to $\log 0$
\item \verb|epsilon_fraction| (values: integer, default value 100, valid for RIB): in the computation of the step, the value of $\epsilon$ of \cite{DBLP:journals/jmlr/ElidanF05} is obtained as $\log |CH,T|\times$\verb|epsilon_fraction|
\item \verb|max_rules| (values: integer, default value: 6000, valid for RIB and SLIPCASE): maximum number of ground rules. Used to set the size of arrays for storing internal statistics. Can be increased as much as memory allows.
\item \verb|logzero| (values: negative real, default value $\log(0.000001)$, valid for SLIPCASE, SLIPCOVER and LEMUR): value assigned to $\log 0$
\item \verb|examples| (values: \verb|atoms|,\verb|interpretations|, default value \verb|atoms|, valid for SLIPCASE): determines how BDDs are built: if set to \verb|interpretations|, a BDD for the conjunction of all the atoms for the target predicates in each interpretation is built.
If set to \verb|atoms|, a BDD is built for the conjunction of a group of atoms for the target predicates in each interpretation. The number of atoms in each group is determined by the parameter \verb|group|
\item \verb|group| (values: integer, default value: 1, valid for SLIPCASE): number of target atoms in the groups that are used to build BDDs
\item \verb|max_iter| (values: integer, default value: 10, valid for SLIPCASE and SLIPCOVER): number of iterations of beam search
\item \verb|max_var| (values: integer, default value: 1, valid for SLIPCASE,
SLIPCOVER and LEMUR): maximum number of distinct variables in a clause
\item \verb|verbosity| (values: integer in [1,3], default value: 1): level of verbosity of the algorithms
\item \verb|beamsize| (values: integer, default value: 20, valid for SLIPCASE and SLIPCOVER): size of the beam
\item \verb|mcts_beamsize| (values: integer, default value: 3, valid for LEMUR): size of the MCTS beam
\item \verb|mcts_visits| (values: integer, default value: +inf, valid for LEMUR): maximum number of visits (Nicola: please check)
\item \verb|megaex_bottom| (values: integer, default value: 1, valid for SLIPCOVER): number of mega-examples on which to build the bottom clauses
\item \verb|initial_clauses_per_megaex| (values: integer, default value: 1, valid for SLIPCOVER):
number of bottom clauses to build for each mega-example
@ -627,7 +639,7 @@ If set to \verb|atoms|, a BDD is built for the conjunction of a group of atoms f
maximum number of theory search iterations
\item \verb|background_clauses| (values: integer, default value: 50, valid for SLIPCOVER):
maximum numbers of background clauses
\item \verb|maxdepth_var| (values: integer, default value: 2, valid for SLIPCOVER and LEMUR): maximum depth of
variables in clauses (as defined in \cite{DBLP:journals/ai/Cohen95}).
\item \verb|score| (values: \verb|ll|, \verb|aucpr|, default value \verb|ll|, valid for SLIPCOVER): determines the score function for refinement: if set to \verb|ll|, log likelihood is used, if set to \verb|aucpr|, the area under the
Precision-Recall curve is used.
@ -673,7 +685,19 @@ and call
\begin{verbatim}
?:- sl(stem).
\end{verbatim}
To execute LEMUR, load \texttt{lemur.pl} with
\begin{verbatim}
?:- use_module(library('cplint/lemur')).
\end{verbatim}
and call
\begin{verbatim}
?:- mcts(stem,depth,c,iter,rules,covering).
\end{verbatim}
where \verb|depth| (integer) is the maximum number
of random specialization steps in the default policy, \verb|c| (real) is the value of the MCTS $C$ constant, \verb|iter| (integer) is the number of UCT rounds, \verb|rules| (integer) is
the maximum number of clauses to be
learned and \verb|covering| (Boolean) denotes whether the search is performed in
the space of clauses (true) or theories (false) (Nicola: please check).
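For example, a call for a hypothetical dataset with stem \texttt{registration}, using purely illustrative parameter values, could be:
\begin{verbatim}
?:- mcts(registration,3,0.7,100,10,true).
\end{verbatim}
Here at most 3 random specialization steps are taken in the default policy, the MCTS $C$ constant is 0.7, 100 UCT rounds are run, at most 10 clauses are learned, and the search is performed in the space of clauses.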
\subsection{Testing}
To test the theories learned, load \texttt{test.pl} with