yap-6.3/cplint/doc/manual.tex

% preambolo per doppia compilazione HTML/PDF
%\ifx\pdfoutput\undefined % compilazione htlatex
\documentclass{article}
\usepackage{graphicx}
\DeclareGraphicsExtensions{.png, .gif, .jpg}
\newcommand{\href}[2]{\Link[#1]{}{} #2 \EndLink}
\newcommand{\hypertarget}[2]{\Link[]{}{#1} #2 \EndLink}
\newcommand{\hyperlink}[2]{\Link[]{#1}{} #2 \EndLink}
%\else % compilazione pdflatex
%\documentclass[a4paper,12pt]{article}
%\usepackage[pdftex]{graphicx}
%\DeclareGraphicsExtensions{.pdf,.png,.jpg}
%\RequirePackage[hyperindex]{hyperref}
%\fi

%\documentclass{article}


\begin{document}
\title{\texttt{cplint} Version 1.0 Manual}


\author{Fabrizio Riguzzi\\
fabrizio.riguzzi@unife.it}

\maketitle


\section{Introduction}

\texttt{cplint} is an interpreter for LPADs \cite{VenVer03-TR,VenVer04-ICLP04-IC} and CP-logic programs \cite{VenDenBru-JELIA06,CP-logic-unp}. It is described in \cite{Rig-AIIA07-IC} and \cite{Rig-RCRA07-IC}. It is an adaptation of the interpreter for ProbLog \cite{DBLP:conf/ijcai/RaedtKT07}.

It was proved correct \cite{Rig-RCRA07-IC} for range restricted acyclic programs \cite{DBLP:journals/ngc/AptB91} without function symbols.

It is also able to deal with extensions of LPADs and CP-logic: the clause bodies can contain \texttt{setof} and \texttt{bagof}, the probabilities in the head may be depend on variables in the body and it is possible to specify a uniform distribution in the head with reference to a \texttt{setof} or \texttt{bagof} operator. These extended features have been introduced in order to represent CLP(BN) \cite{SanPagQaz03-UAI-IC} programs and PRM models \cite{Getoor+al:JMLR02}:
\texttt{setof} and \texttt{bagof} allow to express dependency of an attribute from an aggregate function of another attribute, as in CLP(BN)  and PRM, while the possibility of specifying a uniform distribution allows the use of the reference uncertainty feature of PRM.

These extensions are work in progress: they have been implemented but there is no paper yet that describes the semantics of the extended language.

%For program with function symbols, the semantics of LPADs and CP-logic are not defined. However, the interpreter accepts programs with function symbols and, if it does not go into a loop, it returns an answer. What is the meaning of this answer is subject of current study.

\section{Installation}
\texttt{cplint} is distributed to be run an a Linux machine. It has been tested under Debian 4.0 and Suse 10.2. Please contact the author for versions for Windows. Please contact the author also for the complete source code.
Download and install Yap Prolog version 5.1.2 from \href{http://www.ncc.up.pt/\%7Evsc/Yap/}{http://www.ncc.up.pt/\%7Evsc/Yap/}.
Make sure libglib1.2 is installed in your machine. It is a standard Linux package so, if it is not installed,  you can use your package manager to install it.
Download \texttt{cplint.zip} from \href{http://www.ing.unife.it/software/cplint/cplint.zip}{http://www.ing.unife.it/software/cplint/cplint.zip} and uncompress it.
By downloading \texttt{cplint.zip} you agree to the BSD-style license reported in section \ref{license}.

\section{Syntax}

Disjunction in the head is represented with a semicolon and atoms in the head are separated from probabilities by a colon. For the rest, the usual syntax of Prolog is used.
For example, the  CP-logic clause
$$h_1:p_1\vee \ldots \vee h_n:p_n\leftarrow b_1,\dots,b_m ,\neg c_1,\ldots,\neg c_l$$
is represented by
\begin{verbatim}
h1:p1 ; ... ; hn:pn :- b1,...,bm,\+ c1,....,\+ cl
\end{verbatim}
 No parentheses are necessary. The \texttt{pi} are numeric expressions that can involve
variables appearing in the body. It is up to the user to ensure that the numeric expressions are legal, i.e. that they sum up to less than one for every instantiation of the clause for which the body is true in an instance.

If the clause has an empty body, it can be represented like this
\begin{verbatim}
h1:p1 ; ... ;hn:pn.
\end{verbatim}
If the clause has a single head with probability 1, the annotation can be omitted and the clause takes the form of a normal prolog clause, i.e.
\begin{verbatim}
h1:- b1,...,bm,\+ c1,...,cl.
\end{verbatim}
stands for
\begin{verbatim}
h1:1 :- b1,...,bm,\+ c1,...,cl.
\end{verbatim}

The coin example of  \cite{VenVer04-ICLP04-IC} is represented as (see file \texttt{coin.cpl})
\begin{verbatim}
heads(Coin):1/2 ; tails(Coin):1/2:-
     toss(Coin),\+biased(Coin).

heads(Coin):0.6 ; tails(Coin):0.4:-
     toss(Coin),biased(Coin).

fair(Coin):0.9 ; biased(Coin):0.1.

toss(coin).
\end{verbatim}
The first clause states that if we toss a coin that is not biased it has equal probability of landing heads and tails. The second states that if the coin is biased it is slightly more probable that it lands heads. The third states that the coin is fair with probability 0.9 and biased with probability 0.1 and the last clause states that we toss a coin with certainty.


\section{Commands}
The program must be stored in a text file with extension \texttt{.cpl}. Suppose you have stored the example above in file \texttt{coin.cpl}.
Suppose also that \texttt{coin.cpl} is in the directory \texttt{cplint} that you obtained when you uncompressed \texttt{cplint.zip}. In order to answer queries from this program, you run \texttt{yap} from directory \texttt{cplint} and then load the interpreter with the command
\begin{verbatim}
compile(cplint).
\end{verbatim}
If \texttt{coin.cpl} is not in directory \texttt{cplint} then you must use the command
\begin{verbatim}
compile('path_to_cplint/cplint.pl').
\end{verbatim}
from the directory where \texttt{coin.cpl} is.
In this case \verb|path_to_cplint| is the path to \texttt{cplint.pl}.
Then you must parse the source file \texttt{coin.cpl}  with the command
\begin{verbatim}
p(coin).
\end{verbatim}
At this point you can pose query to the program. You have to use the predicate \texttt{s/2} (for solve) that takes as its first argument a conjunction of goals in the form of a list and returns the computed probability as its second argument. For example, the probability of the conjunction \texttt{head(coin), biased(coin)} can be asked with the query
\begin{verbatim}
s([head(coin),biased(coin)],P).
\end{verbatim}
For computing the probability of a conjunction given another conjunction you have to use the predicate \texttt{sc/3} (for solve conditional) that take takes as input the query conjunction as its first argument, the evidence conjunction as its second argument and returns the probability in its third argument.
For example, the probability of  the query \texttt{heads(coin)} given the evidence \texttt{biased(coin)} can be asked with the query
\begin{verbatim}
sc([heads(coin)],[biased(coin)],P).
\end{verbatim}
The package contains also a program \texttt{semantics.pl} that computes the probability of queries by using directly the semantics of LPADs (i.e. by generating all the ground instances and then testing the query with each of them). This is provided for testing purposes only.  After having compiled the program, you can use the same commands of \texttt{cplint.pl}. \texttt{semantics.pl} requires an extra file in the directory where \texttt{coin.cpl} is: a file with extension \texttt{.uni} (for universe) that contains, for each variable, the list of constants to which the variable can be instantiated. For example, in our case the current directory will contain a file \texttt{coin.uni} that is a Prolog file containing facts of the form
\begin{verbatim}
universe(VarList,ConstList).
\end{verbatim}
where \texttt{VarList} is a list of variables names (each must be included in single quotes) and \texttt{ConstList} is a list of constants. \texttt{semantics.pl} generates the grounding by instantiating in all possible ways the variables of \texttt{VarList} with the constants of \texttt{ConstList}. Note that the variables are identified by name, so a variable with the same name in two different clauses will be instantiated with the same constants.

You can test your installation of \texttt{cplint} by using the program \texttt{test.pl}:
compile \texttt{cplint.pl}, compile \texttt{test.pl} and exectue the query \texttt{t}.
A number of queries are executed against the example programs and the returned probabilities are checked: if \texttt{t.} succeeds, then \texttt{cplint} is working.


\section{Extensions}
In this section we will present the extensions to the syntax of LPADs and CP-logic programs that \texttt{cplint} can handle.

The first is the use of some standard Prolog predicates.
The bodies can contain the built-in predicates:
\begin{verbatim}
is/2
>/2
</2
>=/2
=</2
=:=/2
=\=/2
true/0
false/0
=/2
==/2
\=/2
\==/2
length/2
\end{verbatim}
The bodies can also contain the following
 library predicates:
\begin{verbatim}
member/2
max_list/2
min_list/2
nth0/3
nth/3
\end{verbatim}
plus the predicate
\begin{verbatim}
average/2
\end{verbatim}
that, given a list of numbers, computes its arithmetic mean.

Moreover, the bodies can contain the predicates \texttt{setof/3} and \texttt{bagof/3} with the same meaning as in Prolog. Existential quantifiers are allowed in both, so for example the query
\begin{verbatim}
setof(Z, (term(X,Y))^foo(X,Y,Z), L).
\end{verbatim}
returns all the instantiations of \texttt{Z} such that there exists an instantiation of \texttt{X} and \texttt{Y} for which \texttt{foo(X,Y,Z)} is true.

An example of the use of \texttt{setof} and \texttt{bagof} is in the file \texttt{female.cpl}:
\begin{verbatim}
male(C):M/P ; female(C):F/P:-
    person(C),
    setof(Male,known_male(Male),LM),
    length(LM,M),
    setof(Female,known_female(Female),LF),
    length(LF,F),
    P is F+M.

person(f).

known_female(a).

known_female(b).

known_female(c).

known_male(d).

known_male(e).
\end{verbatim}
The disjunctive rule expresses the probability of a person of unknown sex of being male or female depending on the number of males and females that are known.
This is an example of the use of expressions in the probabilities in the head that depend on variables in the body. The probabilities are well defined because they always sum to 1 (unless \texttt{P} is 0).

Another use of \texttt{setof} and \texttt{bagof} is to have an attribute depend on an aggregate function of another attribute, similarly to what is done in PRM and CLP(BN).

So, in the classical school example (available in \texttt{student.cpl}) you can find the following
clauses:
\begin{verbatim}
student_rank(S,h):0.6 ; student_rank(S,l):0.4:-
    bagof(G,R^(registr_stu(R,S),registr_gr(R,G)),L),
    average(L,Av),Av>1.5.

student_rank(S,h):0.4 ; student_rank(S,l):0.6:-
    bagof(G,R^(registr_stu(R,S),registr_gr(R,G)),L),
    average(L,Av),Av =< 1.5.
\end{verbatim}
where \verb|registr_stu(R,S)| expresses that registration \texttt{R} refers to student \texttt{S} and \verb|registr_gr(R,G)| expresses that registration \texttt{R} reports  grade \texttt{G} which is a natural number. The two clauses express a dependency of the rank of the student from the average of her grades.

Another extension has been introduced in order to be able to represent  reference uncertainty of PRMs. Reference uncertainty means that the link structure of a relational model is not fixed but is uncertain: this is represented by having the instance referenced in a relationship be chosen uniformly from a set. For example, consider a domain modeling scientific papers: you have a single entity, paper, and a relationship, cites, between paper and itself that connects the citing paper to the cited paper. To represent the fact that the cited paper and the citing paper are selected uniformly from certain sets, the following clauses can be used (see file \verb|paper_ref_simple.cpl|):
\begin{verbatim}
uniform(cites_cited(C,P),P,L):-
    bagof(Pap,paper_topic(Pap,theory),L).

uniform(cites_citing(C,P),P,L):-
    bagof(Pap,paper_topic(Pap,ai),L).
\end{verbatim}
The first clauses states that the  paper \texttt{P} cited in a citation \texttt{C} is selected uniformly from the set of all papers with topic theory.
The second clauses expresses that the citing paper is selected uniformly from the papers with
topic ai.

These clauses make use of the predicate
\begin{verbatim}
uniform(Atom,Variable,List)
\end{verbatim}
in the head, where \texttt{Atom} must contain \texttt{Variable}. The meaning is the following: the set of all the atoms obtained by instantiating \texttt{Variable} of \texttt{Atom} with a term taken from \texttt{List} is generated and the head is obtained by having a disjunct for each instantiation with probability $1/N$ where $N$ is the length of \texttt{List}.


A more elaborate example is present in file \verb|paper_ref.cpl|:
\begin{verbatim}
uniform(cites_citing(C,P),P,L):-
    setof(Pap,paper(Pap),L).

cites_cited_group(C,theory):0.9 ; cites_cited_group(C,ai):0.1:-
    cites_citing(C,P),paper_topic(P,theory).

cites_cited_group(C,theory):0.01;cites_cited_group(C,ai):0.99:-
    cites_citing(C,P),paper_topic(P,ai).

uniform(cites_cited(C,P),P,L):-
    cites_cited_group(C,T),bagof(Pap,paper_topic(Pap,T),L).
\end{verbatim}
where the cited paper depends on the topic of the citing paper. In particular, if the topic is theory, the cited paper is selected uniformly  from the papers about theory with probability 0.9 and from the papers about ai with probability 0.1. if the topic is ai, the cited paper is selected uniformly  from the papers about theory with probability 0.01 and from the papers about ai with probability 0.99.

PRMs take into account as well existence uncertainty, where the existence of instances is also probabilistic. For example, in the paper domain, the total number of citations may be unknown and  a citation between any two paper may have a probability of existing. For example, a citation between two paper may be more probable if they are about the same topic:
\begin{verbatim}
cites(X,Y):0.005 :-
    paper_topic(X,theory),paper_topic(Y,theory).

cites(X,Y):0.001 :-
    paper_topic(X,theory),paper_topic(Y,ai).

cites(X,Y):0.003 :-
    paper_topic(X,ai),paper_topic(Y,theory).

cites(X,Y):0.008 :-
    paper_topic(X,ai),paper_topic(Y,ai).
\end{verbatim}
This is an example of a CP-logic program, because the probabilities in the head do not sum up to one.
The first clause states that, if the topic of a paper \texttt{X} is theory and  of  paper \texttt{Y} is theory, there is a probability of 0.005 that there is a citation from \texttt{X} to \texttt{Y}. The other clauses consider the remaining cases for the topics.

\section{Parameters}
The proof procedure has two parameters that can be set with the command
\begin{verbatim}
set(parameter,value).
\end{verbatim}
from the Yap prompt after having compiled \texttt{cplint.pl}.
The current value can be read with
\begin{verbatim}
setting(parameter,Value).
\end{verbatim}
from the Yap prompt.
Available parameters:
\begin{itemize}
\item
	 \verb|epsilon_parsing|: if (1 - the sum of the probabilities of all the head atoms) is smaller than
    \verb|epsilon_parsing|
		then the clause is considered as an LPAD clause, otherwise it is considered as a CP-logic
		clause. Default value 0.00001
\item 	\texttt{savedot}: if true a graph representing the BDD is saved in the file \texttt{cpl.dot} in the current directory in dot format.
		The variables names are of the form \verb|Xn_m| where \texttt{n} is the number of the multivalued
		variable and \texttt{m} is the number of the binary variable. The correspondence of variables to
		clauses can be evinced from the list printed on the screen of the form
\begin{verbatim}
Variables: [(2,[X=2,X1=1]),(2,[X=1,X1=0]),(1,[])]
\end{verbatim}
		where the first element of each couple is the clause number of the input file (starting from 1).
		In the example above variable \texttt{X0} corresponds to clause \texttt{2} with the substitutions \texttt{X=2,X1=1},
		variable \texttt{X1} corresponds to clause \texttt{2} with the substitutions \texttt{X=1,X1=0} and
		variable \texttt{X2} corresponds to clause \texttt{1} with the empty substitution.
		You can view the graph with \texttt{graphviz} (\href{www.graphviz.org}{www.graphviz.org}) using the
		command
\begin{verbatim}
dotty cpl.dot &
\end{verbatim}
\end{itemize}


\section{Content of the Directory}
\begin{itemize}
\item \texttt{cplint.pl}: Prolog code of the interpreter.
\item \verb|cplint.so|: dynamic library that includes  CUDD. It is called by \texttt{cplint.pl}.
\item \texttt{semantics.pl}: Prolog program for computing the probability according to the semantics.
\item \texttt{test.pl}: Prolog program for testing the system.
\item Examples:
\begin{itemize}
\item \texttt{alarm.cpl}: representation of the Bayesian network in Figure 2 of
 \cite{VenVer04-ICLP04-IC}.
\item \texttt{coin.cpl}: coin example from   \cite{VenVer04-ICLP04-IC}.
\item \texttt{coin2.cpl}: coin example with two coins.
\item \texttt{dice.cpl}: dice example from \cite{VenVer04-ICLP04-IC}.
\item \verb|twosideddice.cpl,threesideddice.cpl|  game with idealized dice with two or three  sides. Used in the experiments in \cite{Rig-RCRA07-IC}.
\item \texttt{es.cpl}: first example in \cite{Rig-RCRA07-IC}.
\item \texttt{esapprox.cpl}: example showing the problems of approximate inference (see \cite{Rig-RCRA07-IC}).
\item \texttt{esrange.cpl}: example showing the problems with non range restricted programs (see \cite{Rig-RCRA07-IC}).
\item \texttt{female.cpl}: example showing the dependence of probabilities in the head from variables in the body (from \cite{VenVer04-ICLP04-IC}).
\item \texttt{mendel.cpl}: program describing the Mendelian rules of inheritance, taken from \cite{Blo04-ILP04WIP-IC}.
\item \verb|paper_ref.cpl,paper_ref_simple.cpl|: paper citations examples, showing reference uncertainty, inspired by \cite{Getoor+al:JMLR02}.
\item \verb|paper_ref_not.cpl|: paper citations example showing that negation can be used also for predicates defined by clauses with \texttt{uniform} in the head.
\item \texttt{school.cpl}: example inspired by the example \verb|school_32.yap| from the
source distribution of Yap in the \texttt{CLPBN} directory.
\item \verb|school_simple.cpl|: simplified version of \texttt{school.cpl}.
\item \verb|student.cpl|: student example from Figure 1.3 of \cite{GetFri01-BC}.
\end{itemize}
The files \texttt{*.uni} that are present for some of the examples are used  by \texttt{semantics.pl}. Some of the example files contain in an initial comment some queries together with their result.
\item \verb|doc|: directory that contains this manual in pdf and html
\end{itemize}
\section{License}
\label{license}

Copyright (c) 2007, Fabrizio Riguzzi

All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
\begin{itemize}
\item Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
\item Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
\item Neither the name of the University of Ferrara nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
\end{itemize}
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

\vspace{3mm}

The program uses the library \href{http://vlsi.colorado.edu/\%7Efabio/}{CUDD} for manipulating BDDs.
For the use of CUDD, the following license must be accepted:

\vspace{3mm}

Copyright (c) 1995-2004, Regents of the University of Colorado

All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:

\begin{itemize}
\item
Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
\item
Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
\item
Neither the name of the University of Colorado nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.
\end{itemize}
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.


\bibliographystyle{plain}
\bibliography{bib}

\end{document}