\texttt{cplint} is a suite of programs for reasoning with ICL \cite{DBLP:journals/ai/Poole97}, LPADs \cite{VenVer03-TR,VenVer04-ICLP04-IC} and CP-logic programs \cite{VenDenBru-JELIA06,DBLP:journals/tplp/VennekensDB09}. It contains programs both for inference and learning.
\texttt{cplint} is distributed in source code in the CVS version of Yap. It includes Prolog and C files. Download it by following the instruction in \url{http://www.ncc.up.pt/~vsc/Yap/downloads.html}.
After having performed \texttt{make install} you can do \texttt{make installcheck} that will execute a suite of tests of the various programs. If no error is reported you have a working installation of \texttt{cplint}.
Disjunction in the head is represented with a semicolon and atoms in the head are separated from probabilities by a colon. For the rest, the usual syntax of Prolog is used.
No parentheses are necessary. The \texttt{pi} are numeric expressions. It is up to the user to ensure that the numeric expressions are legal, i.e. that they sum up to less than one.
If the clause has an empty body, it can be represented like this
\begin{verbatim}
h1:p1 ; ... ;hn:pn.
\end{verbatim}
If the clause has a single head with probability 1, the annotation can be omitted and the clause takes the form of a normal prolog clause, i.e.
\begin{verbatim}
h1:- b1,...,bm,\+ c1,...,\+ cl.
\end{verbatim}
stands for
\begin{verbatim}
h1:1 :- b1,...,bm,\+ c1,...,\+ cl.
\end{verbatim}
The coin example of \cite{VenVer04-ICLP04-IC} is represented as (see file \texttt{coin.cpl})
\begin{verbatim}
heads(Coin):1/2 ; tails(Coin):1/2:-
toss(Coin),\+biased(Coin).
heads(Coin):0.6 ; tails(Coin):0.4:-
toss(Coin),biased(Coin).
fair(Coin):0.9 ; biased(Coin):0.1.
toss(coin).
\end{verbatim}
The first clause states that if we toss a coin that is not biased it has equal probability of landing heads and tails. The second states that if the coin is biased it has a slightly higher probability of landing heads. The third states that the coin is fair with probability 0.9 and biased with probability 0.1 and the last clause states that we toss a coin with certainty.
These modules answer queries using using goal-oriented procedures:
\begin{itemize}
\item\texttt{lpadsld.pl}: uses the top-down procedure described in
in \cite{Rig-AIIA07-IC} and \cite{Rig-RCRA07-IC}. It is based on SLDNF resolution and is an adaptation of the interpreter for ProbLog \cite{DBLP:conf/ijcai/RaedtKT07}.
It was proved correct \cite{Rig-RCRA07-IC} with respect to the semantics of LPADs for range restricted acyclic programs \cite{DBLP:journals/ngc/AptB91} without function symbols.
It is also able to deal with extensions of LPADs and CP-logic: the clause bodies can contain \texttt{setof} and \texttt{bagof}, the probabilities in the head may be depend on variables in the body and it is possible to specify a uniform distribution in the head with reference to a \texttt{setof} or \texttt{bagof} operator. These extended features have been introduced in order to represent CLP(BN) \cite{SanPagQaz03-UAI-IC} programs and PRM models \cite{Getoor+al:JMLR02}:
\texttt{setof} and \texttt{bagof} allow to express dependency of an attribute from an aggregate function of another attribute, as in CLP(BN) and PRM, while the possibility of specifying a uniform distribution allows the use of the reference uncertainty feature of PRM.
\item\texttt{picl.pl}: performs inference on ICL programs \cite{Rig09-LJIGPL-IJ}
\item\texttt{lpad.pl}: uses a top-down procedure based on SLG resolution \cite{DBLP:journals/jacm/ChenW96}. As a consequence, it works for any sound LPADs, i.e., any LPAD such that each of its instances has a two valued well founded model.
\item\texttt{cpl.pl}: uses a top-down procedure based on SLG resolution and moreover checks that the CP-logic program is valid, i.e., that it has at least an execution model.
\item\texttt{bestfirst.pl} performs best first \cite{BraRig10-ILP10-IC}
\item\texttt{montecarlo.pl} performs Monte Carlo \cite{BraRig10-ILP10-IC}
\item\texttt{mcintyre.pl}: implements the algorithm MCINTYRE (Monte Carlo INference wiTh Yap REcord) \cite{Rig11-CILC11-NC}
\end{itemize}
\item\texttt{approx/exact.pl} as \texttt{lpadsld.pl} but uses SimplecuddLPADs, a modification of the \href{www.cs.kuleuven.be/~theo/tools/simplecudd.html}{Simplecudd} instead of the \texttt{cplint} library for building BDDs and computing the probability.
\end{itemize}
These modules answer queries using the definition of the semantics of LPADs and CP-logic:
\begin{itemize}
\item\texttt{semlpadsld.pl}: given an LPAD $P$, it generates all the instances of $P$. The probability of a query $Q$ is computed by identifying all the instances where $Q$ is derivable by SLDNF resolution.
\item\texttt{semlpad.pl}: given an LPAD $P$, it generates all the instances of $P$. The probability of a query $Q$ is computed by identifying all the instances where $Q$ is derivable by SLG resolution.
\item\texttt{semlcpl.pl}: given an LPAD $P$, it builds an execution model of $P$, i.e., a probabilistic process that satisfy the principles of universal causation, sufficient causation, independent causation, no deus ex machina events and temporal precedence. It uses the definition of the semantics given in \cite{DBLP:journals/tplp/VennekensDB09}.
\end{itemize}
%For program with function symbols, the semantics of LPADs and CP-logic are not defined. However, the interpreter accepts programs with function symbols and, if it does not go into a loop, it returns an answer. What is the meaning of this answer is subject of current study.
\subsection{Commands}
%All six modules accept the same commands for reading in files and answering queries.
The LPAD or CP-logic program must be stored in a text file with extension \texttt{.cpl}. Suppose you have stored the example above in file \texttt{coin.cpl}.
In order to answer queries from this program, you have to run Yap,
load one of the modules (such as for example \texttt{lpad.pl}) by issuing the command
\begin{verbatim}
use_module(library(lpad)).
\end{verbatim}
at the command prompt.
Then you must parse the source file \texttt{coin.cpl} with the command
\begin{verbatim}
p(coin).
\end{verbatim}
if \texttt{coin.cpl} is in the current directory, or
\begin{verbatim}
p('path_to_coin/coin').
\end{verbatim}
if \texttt{coin.cpl} is in a different directory.
At this point you can pose query to the program by using the predicate \texttt{s/2} (for solve) that takes as its first argument a conjunction of goals in the form of a list and returns the computed probability as its second argument. For example, the probability of the conjunction \texttt{head(coin),biased(coin)} can be asked with the query
\begin{verbatim}
s([head(coin),biased(coin)],P).
\end{verbatim}
For computing the probability of a conjunction given another conjunction you can use the predicate \texttt{sc/3} (for solve conditional) that take takes as input the query conjunction as its first argument, the evidence conjunction as its second argument and returns the probability in its third argument.
For example, the probability of the query \texttt{heads(coin)} given the evidence \texttt{biased(coin)} can be asked with the query
\begin{verbatim}
sc([heads(coin)],[biased(coin)],P).
\end{verbatim}
After having parsed a program, in order to read in a new program you must restart Yap when using
\texttt{semlpadsld.pl} and \texttt{semlpad.pl}. With the other modules, you can directly parse a new program.
When using \texttt{lpad.pl}, the system can print the message ``Uunsound program'' in the case in which an instance with a three valued well founded model is found. Moreover, it can print the message ``It requires the choice of a head atom from a non ground head'': in this case, in order to answer the query, all the groundings of the culprit clause must be generated, which may be impossible for programs with function symbols.
When using \texttt{semcpl.pl}, you can print the execution process by using the command \texttt{print.}
after \texttt{p(file).} Moreover, you can build an execution process given a context by issuing the command \texttt{parse(file)}. and then
\texttt{build(context).} where \texttt{context} is a list of atoms that are true in the context.
\texttt{semcpl.pl} can print ``Invalid program'' in the case in which no execution process exists.
When using \texttt{cpl.pl} you can print a partial execution model including all the clauses involved in the query issued with \texttt{print.}\texttt{cpl.pl} can print the messages ``Uunsound program'', ``It requires the choice of a head atom from a non ground head'' and ``Invalid program''.
takes as input a list of goals \texttt{GoalsList} and returns a lower bound on the probability \texttt{ProbLow}, an upper bound on the probability \texttt{ProbUp}, the CPU time spent on performing resolution \texttt{ResTime} and the CPU time spent on handling BDDs \texttt{BddTime}.
takes as input a list of goals \texttt{GoalsList} and returns a lower bound on the probability \texttt{ProbLow}, the CPU time spent on performing resolution \texttt{ResTime} and the CPU time spent on handling BDDs \texttt{BddTime}.
takes as input a list of goals \texttt{GoalsList} and returns a lower bound on the probability \texttt{ProbLow}, an upper bound on the probability \texttt{ProbUp}, the number of BDDs generated by the algorithm \texttt{Count}, the CPU time spent on performing resolution \texttt{ResTime} and the CPU time spent on handling BDDs \texttt{BddTime}.
For \texttt{approx/montecarlo.pl}
the command
\begin{verbatim}
solve(GoalsList, Samples, Time, Low, Prob, Up)
\end{verbatim}
takes as input a list of goals \texttt{GoalsList} and returns the number of samples taken \texttt{Samples},
the time required to solve the problem \texttt{Time}, the
lower end of the confidence interval \texttt{Lower}, the estimated probability \texttt{Prob} and the upper end of the confidence interval \texttt{Up}.
takes as input a conjunction of goals \texttt{Goals} and returns the number of samples taken \texttt{Samples},
the CPU time required to solve the problem \texttt{CPUTime}, the wall time required to solve the problem \texttt{CPUTime}, the
lower end of the confidence interval \texttt{Lower}, the estimated probability \texttt{Prob} and the upper end of the confidence interval \texttt{Up}.
For \texttt{approx/exact.pl}
the command
\begin{verbatim}
solve(GoalsList, Prob, ResTime, BddTime)
\end{verbatim}
takes as input a conjunction of goals \texttt{Goals} and returns the probability \texttt{Prob}, the CPU time spent on performing resolution \texttt{ResTime} and the CPU time spent on handling BDDs \texttt{BddTime}.
then \texttt{cplint} adds the null events to the head. Default value 0.00001
\item\verb|save_dot| (valid for all goal-oriented modules): if \texttt{true} a graph representing the BDD is saved in the file \texttt{cpl.dot} in the current directory in dot format.
The variables names are of the form \verb|Xn_m| where \texttt{n} is the number of the multivalued
variable and \texttt{m} is the number of the binary variable. The correspondence between variables and
clauses can be evinced from the message printed on the screen, such as
\begin{verbatim}
Variables: [(2,[X=2,X1=1]),(2,[X=1,X1=0]),(1,[])]
\end{verbatim}
where the first element of each couple is the clause number of the input file (starting from 1).
In the example above variable \texttt{X0} corresponds to clause \texttt{2} with the substitutions \texttt{X=2,X1=1},
variable \texttt{X1} corresponds to clause \texttt{2} with the substitutions \texttt{X=1,X1=0} and
variable \texttt{X2} corresponds to clause \texttt{1} with the empty substitution.
\item\verb|ground_body|: (valid for \texttt{lpadsld.pl} and all semantic modules) determines how non ground clauses are treated: if \texttt{true}, ground clauses are obtained from a non ground clause by replacing each variable with a constant, if \texttt{false}, ground clauses are obtained by replacing only variables in the head with a constant. In the case where the body contains variables not in the head, setting it to false means that the body represents an existential event.
\item\verb|min_error|: (valid for \texttt{approx/deepit.pl}, \texttt{approx/deepdyn.pl}, \texttt{approx/bestk.pl}, \texttt{approx/bestfirst.pl}, \texttt{approx/montecarlo.pl} and \texttt{mcintyre.pl}) is the threshold under which the difference between upper and lower bounds on probability must fall for the algorithm to stop.
\item\verb|k|: maximum number of explanations for \texttt{approx/bestk.pl} and \texttt{approx/bestfirst.pl} and number of samples to take at each iteration for \texttt{approx/montecarlo.pl} and \texttt{mcintyre.pl}
\item\verb|prob_bound|: (valid for \texttt{approx/deepit.pl}, \texttt{approx/deepdyn.pl}, \texttt{approx/bestk.pl} and \texttt{approx/bestfirst.pl}) is the initial bound on the probability of explanations when iteratively building explanations
\item\verb|prob_step|: (valid for \texttt{approx/deepit.pl}, \texttt{approx/deepdyn.pl}, \texttt{approx/bestk.pl} and \texttt{approx/bestfirst.pl}) is the increment on the bound on the probability of explanations when iteratively building explanations
\item\verb|timeout|: (valid for \texttt{approx/deepit.pl}, \texttt{approx/deepdyn.pl}, \texttt{approx/bestk.pl}, \texttt{approx/bestfirst.pl} and \texttt{approx/exact.pl}) timeout for builduing BDDs
The three semantic modules need to produce a grounding of the program in order to compute the semantics.
They require an extra file with extension \texttt{.uni} (for universe) in the same directory where the \texttt{.cpl} file is.
There are two ways to specify how to ground a program. The first consists in providing the list of constants to which each variable can be instantiated. For example, in our case the current directory will contain a file \texttt{coin.uni} that is a Prolog file containing facts of the form
\begin{verbatim}
universe(var_list,const_list).
\end{verbatim}
where \verb|var_list| is a list of variables names (each must be included in single quotes) and \verb|const_list| is a list of constants. The semantic modules generate the grounding by instantiating in all possible ways the variables of \verb|var_list| with the constants of \verb|const_list|. Note that the variables are identified by name, so a variable with the same name in two different clauses will be instantiated with the same constants.
The other way to specify how to ground a program consists in using mode and type information. For each predicate, the file \texttt{.uni} must contain a fact of the form
\begin{verbatim}
mode(predicate(t1,...,tn)).
\end{verbatim}
that specifies the number and types of each argument of the predicate. Then, the list of constants that
are in the domain of each type \texttt{ti} must be specified with a fact of the form
\begin{verbatim}
type(ti,list_of_constants).
\end{verbatim}
The file \texttt{.uni} can contain both universe and mode declaration, the ones to be used depend on the value of the parameter \texttt{grounding}: with value \texttt{variables}, the universe declarations are used, with value \texttt{modes} the mode declarations are used.
With \texttt{semcpl.pl} only mode declarations can be used.
When using \texttt{lpadsld.pl}, the bodies can contain the predicates \texttt{setof/3} and \texttt{bagof/3} with the same meaning as in Prolog. Existential quantifiers are allowed in both, so for example the query
\begin{verbatim}
setof(Z, (term(X,Y))^foo(X,Y,Z), L).
\end{verbatim}
returns all the instantiations of \texttt{Z} such that there exists an instantiation of \texttt{X} and \texttt{Y} for which \texttt{foo(X,Y,Z)} is true.
An example of the use of \texttt{setof} and \texttt{bagof} is in the file \texttt{female.cpl}:
\begin{verbatim}
male(C):M/P ; female(C):F/P:-
person(C),
setof(Male,known_male(Male),LM),
length(LM,M),
setof(Female,known_female(Female),LF),
length(LF,F),
P is F+M.
person(f).
known_female(a).
known_female(b).
known_female(c).
known_male(d).
known_male(e).
\end{verbatim}
The disjunctive rule expresses the probability of a person of unknown sex of being male or female depending on the number of males and females that are known.
This is an example of the use of expressions in the probabilities in the head that depend on variables in the body. The probabilities are well defined because they always sum to 1 (unless \texttt{P} is 0).
Another use of \texttt{setof} and \texttt{bagof} is to have an attribute depend on an aggregate function of another attribute, similarly to what is done in PRM and CLP(BN).
So, in the classical school example (available in \texttt{student.cpl}) you can find the following
clauses:
\begin{verbatim}
student_rank(S,h):0.6 ; student_rank(S,l):0.4:-
bagof(G,R^(registr_stu(R,S),registr_gr(R,G)),L),
average(L,Av),Av>1.5.
student_rank(S,h):0.4 ; student_rank(S,l):0.6:-
bagof(G,R^(registr_stu(R,S),registr_gr(R,G)),L),
average(L,Av),Av =< 1.5.
\end{verbatim}
where \verb|registr_stu(R,S)| expresses that registration \texttt{R} refers to student \texttt{S} and \verb|registr_gr(R,G)| expresses that registration \texttt{R} reports grade \texttt{G} which is a natural number. The two clauses express a dependency of the rank of the student from the average of her grades.
Another extension can be used with \texttt{lpadsld.pl} in order to be able to represent reference uncertainty of PRMs. Reference uncertainty means that the link structure of a relational model is not fixed but is uncertain: this is represented by having the instance referenced in a relationship be chosen uniformly from a set. For example, consider a domain modeling scientific papers: you have a single entity, paper, and a relationship, cites, between paper and itself that connects the citing paper to the cited paper. To represent the fact that the cited paper and the citing paper are selected uniformly from certain sets, the following clauses can be used (see file \verb|paper_ref_simple.cpl|):
\begin{verbatim}
uniform(cites_cited(C,P),P,L):-
bagof(Pap,paper_topic(Pap,theory),L).
uniform(cites_citing(C,P),P,L):-
bagof(Pap,paper_topic(Pap,ai),L).
\end{verbatim}
The first clauses states that the paper \texttt{P} cited in a citation \texttt{C} is selected uniformly from the set of all papers with topic theory.
The second clauses expresses that the citing paper is selected uniformly from the papers with
topic ai.
These clauses make use of the predicate
\begin{verbatim}
uniform(Atom,Variable,List)
\end{verbatim}
in the head, where \texttt{Atom} must contain \texttt{Variable}. The meaning is the following: the set of all the atoms obtained by instantiating \texttt{Variable} of \texttt{Atom} with a term taken from \texttt{List} is generated and the head is obtained by having a disjunct for each instantiation with probability $1/N$ where $N$ is the length of \texttt{List}.
A more elaborate example is present in file \verb|paper_ref.cpl|:
where the cited paper depends on the topic of the citing paper. In particular, if the topic is theory, the cited paper is selected uniformly from the papers about theory with probability 0.9 and from the papers about ai with probability 0.1. if the topic is ai, the cited paper is selected uniformly from the papers about theory with probability 0.01 and from the papers about ai with probability 0.99.
PRMs take into account as well existence uncertainty, where the existence of instances is also probabilistic. For example, in the paper domain, the total number of citations may be unknown and a citation between any two paper may have a probability of existing. For example, a citation between two paper may be more probable if they are about the same topic:
\begin{verbatim}
cites(X,Y):0.005 :-
paper_topic(X,theory),paper_topic(Y,theory).
cites(X,Y):0.001 :-
paper_topic(X,theory),paper_topic(Y,ai).
cites(X,Y):0.003 :-
paper_topic(X,ai),paper_topic(Y,theory).
cites(X,Y):0.008 :-
paper_topic(X,ai),paper_topic(Y,ai).
\end{verbatim}
This is an example where the probabilities in the head do not sum up to one so the null event is automatically added to the head.
The first clause states that, if the topic of a paper \texttt{X} is theory and of paper \texttt{Y} is theory, there is a probability of 0.005 that there is a citation from \texttt{X} to \texttt{Y}. The other clauses consider the remaining cases for the topics.
In the directory where Yap keeps the library files (usually \texttt{/usr/local/share/ Yap}) you can find the directory \texttt{cplint} that contains the files:
testcpl.pl, testsemlpadsld.pl, testsemlpad.pl testsemcpl.pl}: Prolog programs for testing the modules. They are executed when issuing the command \texttt{make installcheck} during the installation. To execute them afterwords, load the file and issue the command \texttt{t.}
\item\texttt{alarm.cpl}: representation of the Bayesian network in Figure 2 of
\cite{VenVer04-ICLP04-IC}.
\item\texttt{coin.cpl}: coin example from \cite{VenVer04-ICLP04-IC}.
\item\texttt{coin2.cpl}: coin example with two coins.
\item\texttt{dice.cpl}: dice example from \cite{VenVer04-ICLP04-IC}.
\item\verb|twosideddice.cpl, threesideddice.cpl| game with idealized dice with two or three sides. Used in the experiments in \cite{Rig-RCRA07-IC}.
\item\texttt{ex.cpl}: first example in \cite{Rig-RCRA07-IC}.
\item\texttt{exapprox.cpl}: example showing the problems of approximate inference (see \cite{Rig-RCRA07-IC}).
\item\texttt{exrange.cpl}: example showing the problems with non range restricted programs (see \cite{Rig-RCRA07-IC}).
\item\texttt{female.cpl}: example showing the dependence of probabilities in the head from variables in the body (from \cite{VenVer04-ICLP04-IC}).
\item\texttt{mendel.cpl, mendels.cpl}: programs describing the Mendelian rules of inheritance, taken from \cite{Blo04-ILP04WIP-IC}.
\item\verb|paper_ref.cpl, paper_ref_simple.cpl|: paper citations examples, showing reference uncertainty, inspired by \cite{Getoor+al:JMLR02}.
\item\verb|paper_ref_not.cpl|: paper citations example showing that negation can be used also for predicates defined by clauses with \texttt{uniform} in the head.
\item\texttt{school.cpl}: example inspired by the example \verb|school_32.yap| from the
source distribution of Yap in the \texttt{CLPBN} directory.
\item\verb|school_simple.cpl|: simplified version of \texttt{school.cpl}.
\item\verb|student.cpl|: student example from Figure 1.3 of \cite{GetFri01-BC}.
\item\texttt{win.cpl, light.cpl, trigger.cpl, throws.cpl, hiv.cpl,}\\\texttt{ invalid.cpl}: programs taken from \cite{DBLP:journals/tplp/VennekensDB09}. \texttt{invalid.cpl} is an example of a program that is invalid but sound.
The files \texttt{*.uni} that are present for some of the examples are used by the semantical modules. Some of the example files contain in an initial comment some queries together with their result.
\item Subdirectory \texttt{doc}: contains this manual in latex, html and pdf.
\texttt{cplint} contains the following learning algorithms:
\begin{itemize}
\item CEM (\texttt{cplint} EM): an implementation of EM for learning parameters that is based on \texttt{lpadsld.pl}\cite{RigDiM11-ML-IJ}
\item RIB (Relational Information Bottleneck): an algorithm for learning parameters based on the Information Bottleneck \cite{RigDiM11-ML-IJ}
\item EMBLEM (EM over Bdds for probabilistic Logic programs Efficient Mining): an implementation of EM for learning parameters that computes expectations directly on BDDs \cite{BelRig11-CILC11-NC,BelRig11-TR}
\item SLIPCASE (Structure LearnIng of ProbabilistiC logic progrAmS with Em over bdds): an algorithm for learning the structure of program that is based on EMBLEM \cite{BelRig11-ILP11-IC}
\end{itemize}
\subsection{Input}
To execute the learning algorithms, prepare four files in the same folder:
\begin{itemize}
\item\texttt{<stem>.kb}: contains the example interpretations
\item\texttt{<stem>.bg}: contains the background knowledge, i.e., knowledge valid for all interpretations
\item\texttt{<stem>.l}: contains language bias information
\item\texttt{<stem>.cpl}: contains the LPAD for you which you want to learn the parameters or the initial LPAD for SLIPCASE
\end{itemize}
where \texttt{<stem>} is your dataset name. Examples of these files can be found in the dataset pages.
In \texttt{<stem>.kb} the example interpretations have to be given as a list of Prolog facts initiated by
\texttt{begin(model(<name>)).} and terminated by \texttt{end(model(<name>)).} as in
\begin{verbatim}
begin(model(b1)).
sameperson(1,2).
movie(f1,1).
movie(f1,2).
workedunder(1,w1).
workedunder(2,w1).
gender(1,female).
gender(2,female).
actor(1).
actor(2).
end(model(b1)).
\end{verbatim}
The interpretations may contain a fact of the form
\begin{verbatim}
prob(0.3).
\end{verbatim}
assigning a probability (0.3 in this case) to the interpretations. If this is omitted, the probability of each interpretation is considered equal to $1/n$ where $n$ is the total number of interpretations. \verb|prob/1| can be used to set different multiplicity for the different interpretations.
In order for RIB to work, the input interpretations must share the Herbrand universe. If this is not the case, you have to translate the interpretations in this was, see for example the \texttt{sp1} files in RIB's folder, that are the results of the conversion of the first fold of the IMDB dataset.
\texttt{<stem>.bg} can contain Prolog clauses that can be used to derive additional conclusions from the atoms in
the interpretations.
\texttt{<stem>.l} contains the declarations of the input and output predicates, of the unseen predicates and the commands for setting the algorithms' parameters.
Output predicates are declared as
\begin{verbatim}
output(<predicate>/<arity>).
\end{verbatim}
and define the predicates whose atoms in the input interpretations are used as the goals for the prediction of which you want to optimize the parameters. Derivations for these goals are built by the systems.
Input predicates are those for the predictions of which you do not want to optimize the parameters. You can declare closed world input predicates with
\begin{verbatim}
input_cw(<predicate>/<arity>).
\end{verbatim}
For these predicates, the only true atoms are those in the interpretations, the clauses in the input program are not used to derive atoms not present in the interpretations.
Open world input predicates are declared with
\begin{verbatim}
input(<predicate>/<arity>).
\end{verbatim}
In this case, if a subgoal for such a predicate is encountered when deriving the atoms for the output predicates,
both the facts in the interpretations and the clauses of the input program are used.
For RIB, if there are unseen predicates, i.e., predicates that are present in the input program but not in the interpretations, you have to declare them with
\begin{verbatim}
unseen(<predicate>/<arity>).
\end{verbatim}
For SLIPCASE, you have to specify the language bias by means of mode declarations in the style of
specifies the atoms that can appear in the head of clauses, while
\begin{verbatim}
modeb(<recall>,<predicate>(<arg1>,...).
\end{verbatim}
specifies the atoms that can appear in the body of clauses.
\texttt{<recall>} can be an integer or \texttt{*} (currently unused).
The arguments are of the form
\begin{verbatim}
+<type>
\end{verbatim}
for specifying an input variable of type \texttt{<type>}, or
\begin{verbatim}
-<type>
\end{verbatim}
for specifying an output variable of type \texttt{<type>}.
\subsection{Parameters}
In order to set the algorithms' parameters, you have to insert in \texttt{<stem>.l} commands of the form
\begin{verbatim}
:- set(<parameter>,<value>).
\end{verbatim}
The available parameters are:
\begin{itemize}
\item\verb|depth| (values: integer or \verb|inf|, default value: 3): depth of derivations if \verb|depth_bound| is set to \verb|true|
\item\verb|single_var| (values: \verb|{true,false}|, default value: \verb|false|, valid for CEM only): if set to \verb|true|, there is a random variable for each clauses, instead of a separate random variable for each grounding of a clause
\item\verb|sample_size| (values: integer, default value: 1000): total number of examples in case in which the models in the \verb|.kb| file contain a \verb|prob(P).| fact. In that case, one model corresponds to \verb|sample_size*P| examples
\item\verb|epsilon_em| (values: real, default value: 0.1, valid for CEM only): if the difference in the log likelihood in two successive EM iteration is smaller
than \verb|epsilon_em|, then EM stops
\item\verb|epsilon_em_fraction| (values: real, default value: 0.01, valid for CEM only): if the difference in the log likelihood in two successive EM iteration is smaller
than \verb|epsilon_em_fraction|*(-current log likelihood), then EM stops
\item\verb|random_restarts_number| (values: integer, default value: 1, valid for CEM only): number of random restarts
\item\verb|setrand| (values: rand(integer,integer,integer)): seed for the random functions, see Yap manual for allowed values
\item\verb|minimal_step| (values: [0,1], default value: 0.005, valid for RIB only): minimal increment of $\gamma$
\item\verb|maximal_step| (values: [0,1], default value: 0.1, valid for RIB only): maximal increment of $\gamma$
\item\verb|logsize_fraction| (values: [0,1], default value 0.9, valid for RIB only): RIB stops when $\mathbf{I}(CH,T;Y)$ is above \verb|logsize_fraction| times its maximum value ($\log |CH,T|$, see \cite{DBLP:journals/jmlr/ElidanF05})
\item\verb|delta| (values: negative integer, default value -10, valid for RIB only): value assigned to $\log0$
\item\verb|epsilon_fraction| (values: integer, default value 100, valid for RIB only): in the computation of the step, the value of $\epsilon$ of \cite{DBLP:journals/jmlr/ElidanF05} is obtained as $\log |CH,T|\times$\verb|epsilon_fraction|
\item\verb|max_rules| (values: integer, default value: 6000, valid for RIB only): maximum number of ground rules. Used to set the size of arrays for storing internal statistics. Can be increased as much as memory allows.
The modules in the approx subdirectory use SimplecuddLPADs, a modification of the \href{www.cs.kuleuven.be/~theo/tools/simplecudd.html}{Simplecudd} library whose copyright is by Katholieke Universiteit Leuven and that follows the Artistic License 2.0.