\documentclass[11pt]{article} \usepackage{times} \usepackage{pl} \usepackage{plpage} \usepackage{html} \sloppy \makeindex \onefile \htmloutput{html} % Output directory \htmlmainfile{index} % Main document file \bodycolor{white} % Page colour \renewcommand{\runningtitle}{CLIB -- System interfaces} \begin{document} \title{SWI-Prolog C-library} \author{Jan Wielemaker \\ HCS, \\ University of Amsterdam \\ The Netherlands \\ E-mail: \email{J.Wielemaker@cs.vu.nl}} \maketitle \begin{abstract} This document describes commonly used foreign language extensions to \url[SWI-Prolog]{http://www.swi-prolog.org} distributed as a package known under the name {\em clib}. The package defines a number of Prolog libraries with accompagnying foreign libraries. On Windows systems, the \pllib{unix} library can only be used if the whole SWI-Prolog suite is compiled using \url[Cywin]{http://www.cygwin.com}. The other libraries have been ported to native Windows. \end{abstract} \vfill \pagebreak \tableofcontents \vfill \vfill \newpage \section{Introduction} Many useful facilities offered by one or more of the operating systems supported by SWI-Prolog are not supported by the SWI-Prolog kernel distribution. Including these would enlarge the {\em footprint} and complicate portability matters while supporting only a limited part of the user-community. This document describes \pllib{unix} to deal with the Unix process API, \pllib{socket} to deal with inet-domain TCP and UDP sockets, \pllib{cgi} to deal with getting CGI form-data if SWI-Prolog is used as a CGI scripting language, \pllib{crypt} to provide password encryption and verification, \pllib{sha} providing cryptographic hash functions and \pllib{memfile} providing in-memorty pseudo files. \section{Unix Process manipulation library} The \pllib{unix} library provides the commonly used Unix primitives to deal with process management. These primitives are useful for many tasks, including server management, parallel computation, exploiting and controlling other processes, etc. The predicates are modelled closely after their native Unix counterparts. Higher-level primitives, especially to make this library portable to non-Unix systems are desirable. Using these primitives and considering that process manipulation is not a very time-critical operation we anticipate these libraries to be developed in Prolog. \begin{description} \predicate{fork}{1}{-Pid} Clone the current process into two branches. In the child, \arg{Pid} is unified to \const{child}. In the original process, \arg{Pid} is unified to the process identifier of the created child. Both parent and child are fully functional Prolog processes running the same program. The processes share open I/O streams that refer to Unix native streams, such as files, sockets and pipes. Data is not shared, though on most Unix systems data is initially shared and duplicated only if one of the programs attempts to modify the data. Unix \funcref{fork}{} is the only way to create new processes and fork/2 is a simple direct interface to it. \predicate{exec}{1}{+Command(...Args...)} Replace the running program by starting \arg{Command} using the given commandline arguments. Each command-line argument must be atomic and is converted to a string before passed to the Unix call \funcref{execvp}{}. Unix \funcref{exec}{} is the only way to start an executable file executing. It is commonly used together with fork/1. For example to start \program{netscape} on an URL in the background, do: \begin{code} run_netscape(URL) :- ( fork(child), exec(netscape(URL)) ; true ). \end{code} Using this code, netscape remains part of the process-group of the invoking Prolog process and Prolog does not wait for netscape to terminate. The predicate wait/2 allows waiting for a child, while detach_IO/0 disconnects the child as a deamon process. \predicate{wait}{2}{-Pid, -Status} Wait for a child to change status. Then report the child that changed status as well as the reason. \arg{Status} is unified with \term{exited}{ExitCode} if the child with pid \arg{Pid} was terminated by calling \funcref{exit}{} (Prolog halt/[0,1]). \arg{ExitCode} is the return=status. \arg{Status} is unified with \term{signaled}{Signal} if the child died due to a software interrupt (see kill/2). \arg{Signal} contains the signal number. Finally, if the process suspended execution due to a signal, \arg{Status} is unified with \term{stopped}{Signal}. \predicate{kill}{2}{+Pid, +Signal} Deliver a software interrupt to the process with identifier \arg{Pid} using software-interrupt number \arg{Signal}. See also on_signal/2. Signals can be specified as an integer or signal name, where signal names are derived from the C constant by dropping the \const{SIG} prefix and mapping to lowercase. E.g.\ \const{int} is the same as \const{SIGINT} in C. The meaning of the signal numbers can be found in the Unix manual.\footnote{kill/2 should support interrupt-names as well}. \predicate{pipe}{2}{-InSream, -OutStream} Create a communication-pipe. This is normally used to make a child communicate to its parent. After pipe/2, the process is cloned and, depending on the desired direction, both processes close the end of the pipe they do not use. Then they use the remaining stream to communicate. Here is a simple example: \begin{code} :- use_module(library(unix)). fork_demo(Result) :- pipe(Read, Write), fork(Pid), ( Pid == child -> close(Read), format(Write, '~q.~n', [hello(world)]), flush_output(Write), halt ; close(Write), read(Read, Result), close(Read) ). \end{code} \predicate{dup}{2}{+FromStream, +ToStream} Interface to Unix dup2(), copying the underlying filedescriptor and thus making both streams point to the same underlying object. This is normally used together with fork/1 and pipe/2 to talk to an external program that is designed to communicate using standard I/O. Both \arg{FromStream} and \arg{ToStream} either refer to a Prolog stream or an integer descriptor number to refer directly to OS descriptors. See also \file{demo/pipe.pl} in the source-distribution of this package. \predicate{detach_IO}{0}{} This predicate is intended to create Unix deamon-processes. It preforms two actions. First of all, the I/O streams \const{user_input}, \const{user_output} and \const{user_error} are closed and rebound to a Prolog stream that returns end-of-file on any attempt to read and starts writing to a file named \file{/tmp/pl-out.pid} (where is the process-id of the calling Prolog) on any attempt to write. This file is opened only if there is data available. This is intended for debugging purposes.% \footnote{More subtle handling of I/O, especially for debugging is required: communicate with the syslog deamon and optionally start a debugging dialog on a newly created (X-)terminal should be considered.} Finally, the process is detached from the current process-group and its controlling terminal. \end{description} \input{process.tex} \section{File manipulation library} The \pllib{files} library provides additional operations on files from SWI-Prolog. It is currently very incomplete. \begin{description} \predicate{set_time_file}{3}{+File, -OldTimes, +NewTimes} Query and set POSIX time attributes of a file. Both \arg{OldTimes} and \arg{NewTimes} are lists of option-terms. Times are represented in SWI-Prolog's standard floating point numbers. New times may be specified as \const{now} to indicate the current time. Defined options are: \begin{description} \termitem{access}{Time} Describes the time of last access of the file. This value can be read and written. \termitem{modified}{Time} Describes the time the contents of the file was last modified. This value can be read and written. \termitem{changed}{Time} Describes the time the file-structure itself was changed by adding (link()) or removing (unlink()) names. \end{description} Here are some example queries. The first retrieves the access-time, while the second sets the last-modified time to the current time. \begin{code} ?- set_time_file(foo, [acess(Access)], []). ?- set_time_file(foo, [], [modified(now)]). \end{code} \end{description} \section{Socket library} The \pllib{socket} library provides TCP and UDP inet-domain sockets from SWI-Prolog, both client and server-side communication. The interface of this library is very close to the Unix socket interface, also supported by the MS-Windows {\em winsock} API. SWI-Prolog applications that wish to communicate with multiple sources have three options: \begin{enumerate} \item Use I/O multiplexing based on wait_for_input/3. On Windows systems this can only be used for sockets, not for general (device-) file handles. \item Use multiple threads, handling either a single blocking socket or a pool using I/O multiplexing as above. \item Using XPCE's class \class{socket} which synchronises socket events in the GUI event-loop. \end{enumerate} \begin{description} \predicate{tcp_socket}{1}{-SocketId} Creates an \const{INET}-domain stream-socket and unifies an identifier to it with \arg{SocketId}. On MS-Windows, if the socket library is not yet initialised, this will also initialise the library. \predicate{tcp_close_socket}{1}{+SocketId} Closes the indicated socket, making \arg{SocketId} invalid. Normally, sockets are closed by closing both stream handles returned by open_socket/3. There are two cases where tcp_close_socket/1 is used because there are no stream-handles: \begin{itemize} \item After tcp_accept/3, the server does a fork/1 to handle the client in a sub-process. In this case the accepted socket is not longer needed from the main server and must be discarded using tcp_close_socket/1. \item If, after discovering the connecting client with tcp_accept/3, the server does not want to accept the connection, it should discard the accepted socket immediately using tcp_close_socket/1. \end{itemize} \predicate{tcp_open_socket}{3}{+SocketId, -InStream, -OutStream} Open two SWI-Prolog I/O-streams, one to deal with input from the socket and one with output to the socket. If tcp_bind/2 has been called on the socket. \arg{OutSream} is useless and will not be created. After closing both \arg{InStream} and \arg{OutSream}, the socket itself is discarded. \predicate{tcp_bind}{2}{+Socket, ?Port} Bind the socket to \arg{Port} on the current machine. This operation, together with tcp_listen/2 and tcp_accept/3 implement the {\em server}-side of the socket interface. If \arg{Port} is unbound, the system picks an arbitrary free port and unifies \arg{Port} with the selected port number. \arg{Port} is either an integer or the name of a registered service. See also tcp_connect/4. \predicate{tcp_listen}{2}{+Socket, +Backlog} Tells, after tcp_bind/2, the socket to listen for incoming requests for connections. \arg{Backlog} indicates how many pending connection requests are allowed. Pending requests are requests that are not yet acknowledged using tcp_accept/3. If the indicated number is exceeded, the requesting client will be signalled that the service is currently not available. A suggested default value is 5. \predicate{tcp_accept}{3}{+Socket, -Slave, -Peer} This predicate waits on a server socket for a connection request by a client. On success, it creates a new socket for the client and binds the identifier to \arg{Slave}. \arg{Peer} is bound to the IP-address of the client. \predicate[deprecated]{tcp_connect}{2}{+Socket, +Host:+Port} Connect \arg{Socket}. After successful completion, tcp_open_socket/3 can be used to create I/O-Streams to the remote socket. New code should use tcp_connect/4, which can be hooked to allow for proxy negotiation. \predicate{tcp_connect}{4}{+Socket, +Host:+Port, -Read, -Write} Client-interface to connect a socket to a given \arg{Port} on a given \arg{Host}. \arg{Port} is either an integer or the name of a registered service. The fragment below connects to the \url{http://www.swi-prolog.org} using the service name instead of the hardcoded number `80'. \begin{code} Adress = 'www.swi-prolog.org':http, tcp_socket(Socket), tcp_connect(Socket, Adress, Read, Write), \end{code} This predicate can be hooked by defining the multifile-predicate socket:tcp_connect_hook/4. This hook is specifically intented for proxy negotiation. The code below shows the structure of such a hook. The predicates \nopredref{proxy}{1} and \nopredref{proxy_connect}{3} must be provided by the user. \begin{code} :- multifile socket:tcp_connect_hook/4. socket:tcp_connect_hook(Socket, Address, Read, Write) :- proxy(ProxyAdress), tcp_connect(Socket, ProxyAdress), tcp_open_socket(Socket, Read, Write), proxy_connect(Address, Read, Write). \end{code} \predicate{tcp_setopt}{2}{+Socket, +Option} Set options on the socket. Defined options are: \begin{description} \termitem{reuseaddr}{} Allow servers to reuse a port without the system being completely sure the port is no longer in use. \termitem{nodelay}{} Same as \term{nodelay}{true} \termitem{nodelay}{Bool} If \const{true}, disable the Nagle optimization on this socket, which is enabled by default on almost all modern TCP/IP stacks. The Nagle optimization joins small packages, which is generally desirable, but sometimes not. Please note that the underlying TCP_NODELAY setting to setsockopt() is not available on all platforms and systems may require additional privileges to change this option. If the option is not supported, tcp_setopt/2 raises a domain_error exception. See \url[Wikipedia]{http://en.wikipedia.org/wiki/Nagle's_algorithm} for details. \termitem{broadcast}{} UDP sockets only: broadcast the package to all addresses matching the address. The address is normally the address of the local subnet (i.e. 192.168.1.255). See udp_send/4. \termitem{dispatch}{Bool} In GUI environments (using XPCE or the Windows plwin.exe executable) this flags defines whether or not any events are dispatched on behalf of the user interface. Default is \const{true}. Only very specific situations require setting this to \const{false}. \end{description} \predicate{tcp_fcntl}{3}{+Stream, +Action, ?Argument} Interface to the Unix \funcref{fcntl}{} call. Currently only suitable to deal switch stream to non-blocking mode using: \begin{code} ... tcp_fcntl(Stream, setfl. nonblock), ... \end{code} As of SWI-Prolog 3.2.4, handling of non-blocking stream is supported. An attempt to read from a non-blocking stream returns -1 (or \const{end_of_file} for read/1), but at_end_of_stream/1 fails. On actual end-of-input, at_end_of_stream/1 succeeds. \predicate{tcp_host_to_address}{2}{?HostName, ?Address} Translate between a machines host-name and it's (IP-)address. If \arg{HostName} is an atom, it is resolved using \funcref{gethostbyname}{} and the IP-number is unified to \arg{Address} using a term of the format \term{ip}{Byte1, Byte2, Byte3, Byte4}. Otherwise, if \arg{Address} is bound to a \functor{ip}{4} term, it is resolved by \funcref{gethostbyaddr}{} and the canonical hostname is unified with \arg{HostName}. \predicate{gethostname}{1}{-Hostname} Return the official fully qualified name of this host. This is achieved by calling gethostname() followed by gethostbyname() and return the official name of the host (\const{h_name}) of the structure returned by the latter function. \end{description} \subsection{Server applications} The typical sequence for generating a server application is defined below: \begin{code} create_server(Port) :- tcp_socket(Socket), tcp_bind(Socket, Port), tcp_listen(Socket, 5), tcp_open_socket(Socket, AcceptFd, _), \end{code} There are various options for . The most commonly used option is to start a Prolog thread to handle the connection. Alternatively, input from multiple clients can be handled in a single thread by listening to these clients using wait_for_input/3. Finally, on Unix systems, we can use fork/1 to handle the connection in a new process. Note that fork/1 and threads do not cooperate well. Combinations can be realised but require good understanding of POSIX thread and fork-semantics. Below is the typical example using a thread. Note the use of setup_call_cleanup/3 to guarantee that all resources are reclaimed, also in case of failure or exceptions. \begin{code} dispatch(AcceptFd) :- tcp_accept(AcceptFd, Socket, _Peer), thread_create(process_client(Socket, Peer), _, [ detached(true) ]), dispatch(AcceptFd). process_client(Socket, Peer) :- setup_call_cleanup(tcp_open_socket(Socket, In, Out), handle_service(In, Out), close_connection(In, Out)). close_connection(In, Out) :- close(In, [force(true)]), close(Out, [force(true)]). handle_service(In, Out) :- ... \end{code} \subsection{Client applications} The skeleton for client-communication is given below. \begin{code} create_client(Host, Port) :- setup_call_catcher_cleanup(tcp_socket(Socket), tcp_connect(Socket, Host:Port), exception(_), tcp_close_socket(Socket)), setup_call_cleanup(tcp_open_socket(Socket, In, Out), chat_to_server(In, Out), close_connection(In, Out)). close_connection(In, Out) :- close(In, [force(true)]), close(Out, [force(true)]). chat_to_server(In, Out) :- ... \end{code} To deal with timeouts and multiple connections, wait_for_input/3 and/or non-blocking streams (see tcp_fcntl/3) can be used. \subsection{The stream_pool library} The \pllib{streampool} library dispatches input from multiple streams based on wait_for_input/3. It is part of the clib package as it is used most of the time together with the \pllib{socket} library. On non-Unix systems it often can only be used with socket streams. With SWI-Prolog 5.1.x, multi-threading often provides a good alternative to using this library. In this schema one thread watches the listening socket waiting for connections and either creates a thread per connection or processes the accepted connections with a pool of \jargon{worker threads}. The library \pllib{http/thread_httpd} provides an example realising a mult-threaded HTTP server. \begin{description} \predicate{add_stream_to_pool}{2}{+Stream, :Goal} Add \arg{Stream}, which must be an input stream and ---on non-unix systems--- connected to a socket to the pool. If input is available on \arg{Stream}, \arg{Goal} is called. \predicate{delete_stream_from_pool}{1}{+Stream} Delete the given stream from the pool. Succeeds, even if \arg{Stream} is no member of the pool. If \arg{Stream} is unbound the entire pool is emtied but unlike close_stream_pool/0 the streams are not closed. \predicate{close_stream_pool}{0}{} Empty the pool, closing all streams that are part of it. \predicate{dispatch_stream_pool}{1}{+TimeOut} Wait for maximum of \arg{TimeOut} for input on any of the streams in the pool. If there is input, call the \arg{Goal} associated with add_stream_to_pool/2. If \arg{Goal} fails or raises an exception a message is printed. \arg{TimeOut} is described with wait_for_input/3. If \arg{Goal} is called, there is \emph{some} input on the associated stream. \arg{Goal} must be careful not to block as this will block the entire pool.% \footnote{This is hard to achieve at the moment as none of the Prolog read-commands provide for a timeout.} \predicate{stream_pool_main_loop}{0}{} Calls dispatch_stream_pool/1 in a loop until the pool is empty. \end{description} Below is a very simple example that reads the first line of input and echos it back. \begin{code} :- use_module(library(streampool)). server(Port) :- tcp_socket(Socket), tcp_bind(Socket, Port), tcp_listen(Socket, 5), tcp_open_socket(Socket, In, _Out), add_stream_to_pool(In, accept(Socket)), stream_pool_main_loop. accept(Socket) :- tcp_accept(Socket, Slave, Peer), tcp_open_socket(Slave, In, Out), add_stream_to_pool(In, client(In, Out, Peer)). client(In, Out, _Peer) :- read_line_to_codes(In, Command), close(In), format(Out, 'Please to meet you: ~s~n', [Command]), close(Out), delete_stream_from_pool(In). \end{code} \subsection{UDP protocol support} The current library provides limited support for UDP packets. The UDP protocol is a \emph{connection-less} and \emph{unreliable} datagram based protocol. That means that messages sent may or may not arrive at the client side and may arrive in a different order as they are sent. UDP messages are often used for streaming media or for service discovery using the broadcasting mechanism. \begin{description} \predicate{udp_socket}{1}{-Socket} Similar to tcp_socket/1, but create a socket using the \const{SOCK_DGRAM} protocol, ready for UDP connections. \predicate{udp_receive}{4}{+Socket, -Data, -From, +Options} Wait for and return the next datagram. The data is returned as a Prolog string object (see string_to_list/2). \arg{From} is a term of the format \mbox{ip(\arg{A},\arg{B},\arg{C},\arg{D}):\arg{Port}} indicating the sender of the message. \arg{Socket} can be waited for using wait_for_input/3. Defined \arg{Options}: \begin{description} \termitem{as}{+Type} Defines the returned term-type. \arg{Type} is one of \const{atom}, \const{codes} or \const{string} (default). \end{description} The typical sequence to receive UDP data is: \begin{code} receive(Port) :- udp_socket(S), tcp_bind(S, Port), repeat, udp_receive(Socket, Data, From, [as(atom)]), format('Got ~q from ~q~n', [Data, From]), fail. \end{code} \predicate{udp_send}{4}{+Socket, +Data, +To, +Options} Send a UDP message. Data is a string, atom or code-list providing the data. \arg{To} is an address of the form \arg{Host}:\arg{Port} where Host is either the hostname or a term ip/4. \arg{Options} is currently unused. A simple example to send UDP data is: \begin{code} send(Host, Port, Message) :- udp_socket(S), udp_send(S, Message, Host:Port, []), tcp_close_socket(S). \end{code} A broadcast is achieved by using \term{tcp_setopt}{Socket, broadcast} prior to sending the datagram and using the local network broadcast address as a ip/4 term. \end{description} The normal mechanism to discover a service on the local network is for the client to send a broadcast message to an agreed port. The server receives this message and replies to the client with a message indicating further details to establish the communication. \input{uri.tex} \section{CGI Support library} This is currently a very simple library, providing support for obtaining the form-data for a CGI script: \begin{description} \predicate{cgi_get_form}{1}{-Form} Decodes standard input and the environment variables to obtain a list of arguments passed to the CGI script. This predicate both deals with the CGI {\bf GET} method as well as the {\bf POST} method. If the data cannot be obtained, an \const{existence_error} exception is raised. \end{description} Below is a very simple CGI script that prints the passed parameters. To test it, compile this program using the command below, copy it to your cgi-bin directory (or make it otherwise known as a CGI-script) and make the query \verb$http://myhost.mydomain/cgi-bin/cgidemo?hello=world$ \begin{code} % pl -o cgidemo --goal=main --toplevel=halt -c cgidemo.pl \end{code} \begin{code} :- use_module(library(cgi)). main :- set_stream(current_output, encoding(utf8)), cgi_get_form(Arguments), format('Content-type: text/html; charset=UTF-8~n~n', []), format('~n', []), format('~n', []), format('Simple SWI-Prolog CGI script~n', []), format('~n~n', []), format('~n', []), format('

', []), print_args(Arguments), format('~n~n', []). print_args([]). print_args([A0|T]) :- A0 =.. [Name, Value], format('~w=~w
~n', [Name, Value]), print_args(T). \end{code} \subsection{Some considerations} Printing an HTML document using format/2 is not a neat way of producing HTML because it is vulnerable to required escape sequences. A high-level alternative is provided by \pllib{http/html_write} from the HTTP library. The startup-time of Prolog is relatively long, in particular if the program is large. In many cases it is much better to use the SWI-Prolog HTTP server library and make the main web-server relay requests to the SWI-Prolog webserver. See the SWI-Prolog \url[HTTP package]{http://www.swi-prolog.org/pldoc/package/http.html} for details. The CGI standard is unclear about handling Unicode data. The above two declarations ensure the CGI script will send all data in UTF-8 and thus provide full support of Unicode. It is assumed that browsers generally send form-data using the same encoding as the page in which the form appears, UTF-8 or ISO Latin-1. The current version of cgi_get_form/1 assumes the CGI data is in UTF-8. \section{MIME decoding library} MIME (Multipurpose Internet Mail Extensions) is a format for serializing multiple typed data objects. It was designed for E-mail, but it is also used for other applications such packaging multiple values using the HTTP POST request on web-servers. Double Precision, Inc.\ has produced the C-libraries rfc822 (mail) and rfc2045 (MIME) for decoding and manipulating MIME messages. The \pllib{mime} library is a Prolog wrapper around the rfc2045 library for decoding MIME messages. The general name `mime' is used for this library as it is anticipated to add MIME-creation functionality to this library. Currently the mime library defines one predicate: \begin{description} \predicate{mime_parse}{2}{Data, Parsed} Parse \arg{Data} and unify the result to \arg{Parsed}. \arg{Data} is one of: \begin{description} \termitem{stream}{Stream} Parse the data from \arg{Stream} upto the end-of-file. \termitem{stream}{Stream, Length} Parse a maximum of \arg{Length} characters from \arg{Stream} or upto the end-of-file, whichever comes first. \termitem{\arg{Text}}{} Atoms, strings, code- and character lists are treated as valid sources of data. \end{description} \arg{Parsed} is a tree structure of \term{mime}{Attributes, Data, PartList} terms. Currently either \arg{Data} is the empty atom or \arg{PartList} is an empty list.% \footnote{It is unclear to me whether a MIME note can contain a mixture of content and parts, but I believe the answer is `no'.} \arg{Data} is an atom holding the message data. The library automatically decodes \jargon{base64} and \jargon{quoted-printable} messages. See also the \const{transfer_encoding} attribute below. \arg{PartList} is a list of \functor{mime}{3} terms. \arg{Attributes} is a list holding a subset of the following arguments. For details please consult the RFC2045 document. \begin{description} \termitem{type}{Atom} Denotes the Content-Type, how the \arg{Data} should be interpreted. \termitem{transfer_encoding}{Atom} How the \arg{Data} was encoded. This is not very interesting as the library decodes the content of the message. \termitem{character_set}{Atom} The character set used for text data. Note that SWI-Prolog's capabilities for character-set handling are limited. \termitem{language}{Atom} Language in which the text-data is written. \termitem{id}{Atom} Identifier of the message-part. \termitem{description}{Atom} Descrptive text for the \arg{Data}. \termitem{disposition}{Atom} Where the data comes from. The current library only deals with `inline' data. \termitem{name}{Atom} Name of the part. \termitem{filename}{Atom} Name of the file the data should be stored in. \end{description} \end{description} \section{Password encryption library} \label{sec:crypt} The \pllib{crypt} library defines crypt/2 for encrypting and testing passwords. The clib package also provides crytographic hashes as described in \secref{sha} \begin{description} \predicate{crypt}{2}{+Plain, ?Encrypted} This predicate can be used in three modes. To test whether a password matches an encrypted version thereof, simply run with both arguments fully instantiated. To generate a default encrypted version of \arg{Plain}, run with unbound \arg{Encrypted} and this argument is unified to a list of character codes holding an encrypted version. The library supports two encryption formats: traditional Unix DES-hashes\footnote{On non-Unix systems, crypt() is provided by the NetBSD library. The license header is added at the end of this document.} and FreeBSD compatible MD5 hashes (all platforms). MD5 hashes start with the magic sequence \verb|$1$|, followed by an up to 8 character \jargon{salt}. DES hashes start with a 2 character \jargon{salt}. Note that a DES hash considers only the first 8 characters. The MD5 considers the whole string. Salt and algorithm can be forced by instantiating the start of \arg{Encrypted} with it. This is typically used to force MD5 hashes: \begin{code} ?- append("$1$", _, E), crypt("My password", E), format('~s~n', [E]). $1$qdaDeDZn$ZUxSQEESEHIDCHPNc3fxZ1 \end{code} \arg{Encrypted} is always an ASCII string. \arg{Plain} only supports ISO-Latin-1 passwords in the current implementation. \arg{Plain} is either an atom, SWI-Prolog string, list of characters or list of character-codes. It is not advised to use atoms, as this implies the password will be available from the Prolog heap as a defined atom. \end{description} \section{SHA1 and SHA2 Secure Hash Algorithms} \label{sec:sha} The library \pllib{sha} provides \jargon{Secure Hash Algorihms} approved by FIPS (\jargon{Federal Information Processing Standard}). Quoting \url[Wikipedia]{http://en.wikipedia.org/wiki/SHA-1}: \textit{``The SHA (Secure Hash Algorithm) hash functions refer to five FIPS-approved algorithms for computing a condensed digital representation (known as a message digest) that is, to a high degree of probability, unique for a given input data sequence (the message). These algorithms are called `secure' because (in the words of the standard), ``for a given algorithm, it is computationally infeasible 1) to find a message that corresponds to a given message digest, or 2) to find two different messages that produce the same message digest. Any change to a message will, with a very high probability, result in a different message digest.''} The current library supports all 5 approved algorithms, both computing the hash-key from data and the \jargon{hash Message Authentication Code} (HMAC). Input is text, represented as an atom, packed string object or code-list. Note that these functions operate on byte-sequences and therefore are not meaningful on Unicode text. The result is returned as a list of byte-values. This is the most general format that is comfortable supported by standard Prolog and can easily be transformed in other formats. Commonly used text formats are ASCII created by encoding each byte as two hexadecimal digits and ASCII created using \jargon{base64} encoding. Representation as a large integer can be desirable for computational processing. \begin{description} \predicate{sha_hash}{3}{+Data, -Hash, +Options} Hash is the SHA hash of Data. \arg{Data} is either an atom, packed string or list of character codes. \arg{Hash} is unified with a list of integers representing the hash. The conversion is controlled by Options: \begin{description} \termitem{algorithm}{+Algorithm} One of \const{sha1} (default), \const{sha224}, \const{sha256}, \const{sha384} or \const{sha512} \end{description} \predicate{hmac_sha}{4}{+Key, +Data, -HMAC, +Options} Quoting \url[Wikipedia]{http://en.wikipedia.org/wiki/HMAC}: \textit{``A keyed-hash message authentication code, or HMAC, is a type of message authentication code (MAC) calculated using a cryptographic hash function in combination with a secret key. As with any MAC, it may be used to simultaneously verify both the data integrity and the authenticity of a message. Any iterative cryptographic hash function, such as MD5 or SHA-1, may be used in the calculation of an HMAC; the resulting MAC algorithm is termed HMAC-MD5 or HMAC-SHA-1 accordingly. The cryptographic strength of the HMAC depends upon the cryptographic strength of the underlying hash function, on the size and quality of the key and the size of the hash output length in bits.''} \arg{Key} and \arg{Data} are either an atom, packed string or list of character codes. \arg{HMAC} is unified with a list of integers representing the authentication code. \arg{Options} is the same as for sha_hash/3, but currently only \const{sha1} and \const{sha256} are supported. \end{description} \subsection{License terms} The underlying SHA-2 library is an unmodified copy created by Dr Brian Gladman, Worcester, UK. It is distributed under the license conditions below. The free distribution and use of this software in both source and binary form is allowed (with or without changes) provided that: \begin{enumerate} \item distributions of this source code include the above copyright notice, this list of conditions and the following disclaimer; \item distributions in binary form include the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other associated materials; \item the copyright holder's name is not used to endorse products built using this software without specific written permission. \end{enumerate} ALTERNATIVELY, provided that this notice is retained in full, this product may be distributed under the terms of the GNU General Public License (GPL), in which case the provisions of the GPL apply INSTEAD OF those given above. \section{Memory files} The \pllib{memfile} provides an alternative to temporary files, intended for temporary buffering of data. Memory files in general are faster than temporary files and do not suffer from security riscs or naming conflicts associated with temporary-file management. They do assume proper memory management by the hosting OS and cannot be used to pass data to external processes using a file-name. There is no limit to the number of memory streams, nor the size of them. However, memory-streams cannot have multiple streams at the same time (i.e.\ cannot be opened for reading and writing at the same time). These predicates are first of all intended for building higher-level primitives. See also sformat/3, atom_to_term/3, term_to_atom/2 and the XPCE primitive pce_open/3. \begin{description} \predicate{new_memory_file}{1}{-Handle} Create a new memory file and return a unique opaque handle to it. \predicate{free_memory_file}{1}{+Handle} Discard the memory file and its contents. If the file is open it is first closed. \predicate{open_memory_file}{3}{+Handle, +Mode, -Stream} Open the memory-file. \arg{Mode} is currently one of \const{read} or \const{write}. The resulting \arg{Stream} must be closed using close/1. \predicate{open_memory_file}{4}{+Handle, +Mode, -Stream, +Options} Open a memory-file as open_memory_file/3. Options: \begin{description} \termitem{encoding}{+Encoding} Set the encoding for a memory file and the created stream. Encoding names are the same as used with open/4. By default, memoryfiles represent UTF-8 streams, making them capable of storing arbitrary Unicode text. In practice the only alternative is \const{octet}, turning the memoryfile into binary mode. Please study SWI-Prolog Unicode and encoding issues before using this option. \termitem{free_on_close}{+Bool} If \const{true} (default \const{false} and the memory file is opened for reading, discard the file (see free_memory_file/1) if the input is closed. This is used to realise open_chars_stream/2 in library(charsio). \end{description} \predicate{size_memory_file}{2}{+Handle, -Bytes} Return the content-length of the memory-file it \arg{Bytes}. The file should be closed and contain data. \predicate{atom_to_memory_file}{2}{+Atom, -Handle} Turn an atom into a read-only memory-file containing the (shared) characters of the atom. Opening this memory-file in mode \const{write} yields a permission error. \predicate{memory_file_to_atom}{2}{+Handle, -Atom} Return the content of the memory-file in \arg{Atom}. \predicate{memory_file_to_atom}{3}{+Handle, -Atom, +Encoding} Return the content of the memory-file in \arg{Atom}, pretending the data is in the given \arg{Encoding}. This can be used to convert from one encoding into another, typically from/to bytes. For example, if we must convert a set of bytes that contain text in UTF-8, open the memory file as octet stream, fill it, and get the result using \arg{Encoding} is \const{utf8}. \predicate{memory_file_to_codes}{2}{+Handle, -Codes} Return the content of the memory-file as a list of character-codes in \arg{Codes}. \predicate{memory_file_to_codes}{3}{+Handle, -Codes, +Encoding} Return the content of the memory-file as a list of character-codes in \arg{Codes}, pretending the data is in the given \arg{Encoding}. \end{description} \section{Time and alarm library} The \pllib{time} provides timing and alarm functions. \begin{description} \predicate{alarm}{4}{+Time, :Callable, -Id, +Options} Schedule \arg{Callable} to be called \arg{Time} seconds from now. \arg{Time} is a number (integer or float). \arg{Callable} is called on the next pass through a call- or redo-port of the Prolog engine, or a call to the PL_handle_signals() routine from SWI-Prolog. \arg{Id} is unified with a reference to the timer. The resolution of the alarm depends on the underlying implementation. On Unix systems it is based on setitimer(), on Windows on timeSetEvent() using a resolution specified at 50 milliseconds.\bug{The maximum time for timeSetEvent() used by the Windows application is 1000 seconds. Calling with a higher time value raises a \const{resource_error} exception.} Long-running foreign predicates that do not call PL_handle_signals() may further delay the alarm. \arg{Options} is a list of \term{\arg{Name}}{Value} terms. Defined options are: \begin{description} \termitem{remove}{Bool} If \const{true} (default \const{false}), the timer is removed automatically after fireing. Otherwise it must be destroyed explicitly using remove_alarm/1. \termitem{install}{Bool} If \const{false} (default \const{true}), the timer is allocated but not scheduled for execution. It must be started later using install_alarm/1. \end{description} \predicate{alarm}{3}{+Time, :Callable, -Id} Same as \term{alarm}{Time, Callable, Id, []}. \predicate{alarm_at}{4}{+Time, :Callable, -Id, +Options} as alarm/3, but \arg{Time} is the specification of an absolute point in time. Absolute times are specified in seconds after the Jan 1, 1970 epoch. See also date_time_stamp/2. \predicate{install_alarm}{1}{+Id} Activate an alarm allocated using alarm/4 with the option \term{install}{false} or stopped using uninstall_alarm/1. \predicate{install_alarm}{1}{+Id, +Time} As install_alarm/1, but specifies a new timeout value. \predicate{uninstall_alarm}{1}{+Id} Deactivate a running alarm, but do not invalidate the alarm identifier. Later, the alarm can be reactivated using either install_alarm/1 or install_alarm/2. Reinstalled using install_alarm/1, it will fire at the originally scheduled time. Reinstalled using install_alarm/2 causes the alarm to fire at the specified time from now. \predicate{remove_alarm}{1}{+Id} Remove an alarm. If it is not yet fired, it will not be fired any more. \predicate{current_alarm}{4}{?At, ?:Callable, ?Id, ?Status} Enumerate the not-yet-removed alarms. \arg{Status} is one of \const{done} if the alarm has been called, \const{next} if it is the next to be fired and \arg{scheduled} otherwise. \predicate{call_with_time_limit}{2}{+Time, :Goal} True if \arg{Goal} completes within \arg{Time} seconds. \arg{Goal} is executed as in once/1. If \arg{Goal} doesn't complete within \arg{Time} seconds (wall time), exit using the exception \const{time_limit_exceeded}. See catch/3. Please note that this predicate uses alarm/4 and therefore is \emph{not} capable to break out of long running goals such as sleep/1, blocking I/O or other long-running (foreign) predicates. Blocking I/O can be handled using the timeout option of read_term/3. \end{description} \section{Limiting process resources} The \pllib{rlimit} library provides an interface to the POSIX getrlimit()/setrlimit() API that control the maximum resource-usage of a process or group of processes. This call is especially useful for servers such as CGI scripts and inetd-controlled servers to avoid an uncontrolled script claiming too much resources. \begin{description} \predicate{rlimit}{3}{+Resource, -Old, +New} Query and/or set the limit for \arg{Resource}. Time-values are in seconds and size-values are counted in bytes. The following values are supported by this library. Please note that not all resources may be available and accessible on all platforms. This predicate can throw a variety of exceptions. In portable code this should be guarded with catch/3. The defined resources are: \begin{quote} \begin{tabular}{ll} \const{cpu} & CPU time in seconds \\ \const{fsize} & Maximum filesize \\ \const{data} & max data size \\ \const{stack} & max stack size \\ \const{core} & max core file size \\ \const{rss} & max resident set size \\ \const{nproc} & max number of processes \\ \const{nofile} & max number of open files \\ \const{memlock} & max locked-in-memory address \\ \end{tabular} \end{quote} When the process hits a limit POSIX systems normally send the process a signal that terminates it. These signals may be catched using SWI-Prolog's on_signal/3 primitive. The code below illustrates this behaviour. Please note that asynchronous signal handling is dangerous, especially when using threads. 100\% fail-safe operation cannot be guaranteed, but this procedure will inform the user properly `most of the time'. \begin{code} rlimit_demo :- rlimit(cpu, _, 2), on_signal(xcpu, _, cpu_exceeded), ( repeat, fail ). cpu_exceeded(_Sig) :- format(user_error, 'CPU time exceeded~n', []), halt(1). \end{code} \end{description} \section*{NetBSD Crypt license} \begin{code} * Copyright (c) 1989, 1993 * The Regents of the University of California. All rights reserved. * * This code is derived from software contributed to Berkeley by * Tom Truscott. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. \end{code} \printindex \end{document}