DistributePP
A distributed parallel processing tool for MATLAB® by Michael D. DeVore


DistributePP is a parallel processing package for MATLAB® intended to support coarse-grained parallelism across a heterogeneous computing network with access to a shared file system. It is designed to be:
Lightweight
The entire package consists of eight source files (5 MATLAB scripts and 3 C routines) and fewer than 400 lines of code. Installation is simple: just add the DistributePP directory to the MATLAB search path.
Robust
File system locking controls server coordination so server processes may be killed and started with impunity. Node crashes do not interfere with the proper operation of server processes on other nodes. Such crashes typically do not result in unsatisfied requests because restarted processes or server processes on other nodes will automatically service the abandoned requests. Decentralized coordination leaves the underlying file system as the only single point of failure, and this can be eliminated with file system redundancy.
Flexible
The software supports many-to-many client and server interaction. Client requests are serviced in the order posted and new clients may participate at any time. Newly started server processes immediately begin to satisfy any pending requests.


If you use the DistributePP package, like or dislike the package, have any comments, feedback, or suggestions, or if you have found useful extensions, I would love to hear from you. Please contact me by e-mail or through the link to my homepage above.

Terms of Use

This software is distributed freely for use by anyone for any purpose provided that:
  1. any modified files are documented internally to clearly indicate that they have been modified from the original release; and
  2. it is recognized that no liability is assumed by Michael D. DeVore nor by Washington University for the use or misuse of the software.
The intention of point 1 is that there not be any confusion as to the originality of the software in case modified versions of the code are circulated. The intention of point 2 should be clear.

Operation

To use the software, server processes are started on all machines which will participate in the computation. These processes may be started as background processes and there may be multiple server processes on any machine. To request that one of these server processes carry out a computation, a client invokes a function similar to MATLAB's feval() function. A description of the request is returned to the client which may carry out other processing while waiting for the request to complete. The description may be used by the client to check the completion status of the request, to block the client until the request is completed, and/or to retrieve the result of a completed request.

Server Processes

To start a server process, log into the machine on which the process is to execute, launch MATLAB, ensure that the DistributePP package is in the search path, and execute the function PP_SERVER. This function accepts two parameters:
Options
This is a string of the form 'opt1=val1&opt2=val2&...' whose items customize the behavior of this server process. The option names opt1, etc. refer to properties that can be customized, and the values val1, etc. specify their values. Currently two options are available:
Communication Directory
This is a path to a directory that will be used for communication between client and server processes. A server will watch only a single communication directory and a client may specify the directory to which a request will be posted. This allows the set of server processes to be partitioned arbitrarily with clients making requests of alternate classes of servers. If unspecified, the current default directory will be used.
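To make the option-string format concrete, the following is a minimal sketch of how a string of the form 'opt1=val1&opt2=val2' can be split into name/value pairs. It is only an illustration and is not the parser used inside DistributePP; the option names NAME and PAUSE are borrowed from the bgserver.m example in this section, and the value node1 is made up.

```matlab
% Illustrative only: split an option string of the form
% 'opt1=val1&opt2=val2' into a struct of name/value pairs.
% This is NOT the parser used inside DistributePP.
optString = 'NAME=node1&PAUSE=15';         % 'node1' is a made-up value

opts = struct();
pairs = strsplit(optString, '&');          % {'NAME=node1', 'PAUSE=15'}
for k = 1:numel(pairs)
    if isempty(pairs{k}), continue; end    % tolerate an empty option string
    [name, rest] = strtok(pairs{k}, '=');  % rest still begins with '='
    opts.(name) = rest(2:end);             % values are kept as text
end

disp(opts)
```

Values stay as text in this sketch; a caller that needs a number would convert it explicitly, for example with str2double.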
Server processes may be run as background jobs under Unix and Unix-like operating systems. Simply create a MATLAB script that invokes the PP_SERVER() function and then quits, as in the script bgserver.m below:
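% Make the DistributePP functions visible to this MATLAB session.
addpath DistributePP

% Use the machine's hostname (minus the trailing newline) as the server name.
[LV_STATUS,LV_HOSTNAME]=unix('hostname')
LV_HOSTNAME = LV_HOSTNAME(1:(end-1));

% Serve requests from the default communication directory, then exit MATLAB.
PP_SERVER(['NAME=' LV_HOSTNAME '&PAUSE=15'],'')
quit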
Then launch MATLAB as a background process taking standard input from the script and sending standard output and errors to a log file, as:
matlab < bgserver.m >& bgserver.log &
If the machine is remote, you can use ssh to login and start the server in one step:
ssh node 'cd bgpath; matlab < bgserver.m >& `hostname`.log &'
where node is the name of the remote machine, bgpath is the location of the file bgserver.m and the output is redirected to a log file bearing the name of the server on which it executes.

If a request results in an error, the server process checks specifically whether the error was caused by an out-of-memory condition. If so, the request is left unsatisfied and is not removed from the communication directory. Another server process, if one exists, will attempt to satisfy the request. A given server process will not pick up the same request more than once. If the request terminates with any other error, the error message is recorded and the request is treated as satisfied. No other server process will attempt to pick up the request.
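The error policy described above can be sketched as follows. This is hypothetical code rather than an excerpt from PP_SERVER: the variable names are invented for illustration, and testing the error identifier against 'MATLAB:nomem' is only one assumed way of detecting an out-of-memory condition.

```matlab
% Sketch of the error policy; hypothetical, not taken from PP_SERVER.
% funcName and args stand in for the contents of a request.
funcName = 'sum';
args = {1:10};

result = [];
errorText = '';
leaveForOthers = false;   % true: keep the request posted for another server

try
    result = feval(funcName, args{:});
catch err
    if strcmp(err.identifier, 'MATLAB:nomem')
        % Out of memory: do not mark the request as satisfied.
        leaveForOthers = true;
    else
        % Any other error: record the message and treat the request as done.
        errorText = err.message;
    end
end
```

Separating the out-of-memory case from all other errors is what lets a machine with more memory retry the request, while a request that fails for a deterministic reason is not retried indefinitely.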

After satisfying each request, a server process checks whether it should terminate. The function PP_STOP_SERVERS() can be used to indicate that one or more server processes should exit. Specify the communication directory and the name of the server as parameters to the function. If no server name is given, all servers using the communication directory will exit. This provides an orderly termination for the server processes. The function operates by creating an indicator file in the communication directory: STOP_name.ind, where name is the name of the server that should terminate, or STOP_ALL.ind to indicate that all servers should terminate. You must manually remove these files before starting new server processes with matching names in the same communication directory.
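Because leftover stop indicators must be removed by hand, a small cleanup step before restarting servers may be convenient. The sketch below uses only standard MATLAB file functions and runs against a scratch directory that it creates itself; in practice commDir would be the real communication directory. The server name NODE_1 in the file name is only an example.

```matlab
% Demonstration against a scratch directory (a stand-in for a real
% communication directory) holding two fake stop indicators.
commDir = tempname;
mkdir(commDir);
fclose(fopen(fullfile(commDir, 'STOP_ALL.ind'), 'w'));
fclose(fopen(fullfile(commDir, 'STOP_NODE_1.ind'), 'w'));

% Remove every stop indicator so that new servers can start.
stopFiles = dir(fullfile(commDir, 'STOP_*.ind'));
for k = 1:numel(stopFiles)
    delete(fullfile(commDir, stopFiles(k).name));
end

% Confirm that nothing is left behind.
remaining = dir(fullfile(commDir, 'STOP_*.ind'));
fprintf('%d stop indicator(s) remaining\n', numel(remaining));
```

This should be run only after the servers being stopped have actually exited; otherwise a server that has not yet checked for its indicator would keep running.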

Client Processes

To make a request for computation to be performed by one of the server processes, a client need only call the function PP_FEVAL() which has a syntax similar to that of the MATLAB function feval(). Along with some options, the client specifies the name of a MATLAB function to be executed and the parameters that should be passed to the function. PP_FEVAL() captures the state of MATLAB's search path (so the server can properly locate the requested function), prepares a request for a server to satisfy, and returns a descriptor for the request to the client. The descriptor can be used by the client to check on the status of the request, to block until the request is satisfied, and/or to collect the return value of the function when it has completed. PP_FEVAL() accepts the following parameters:
Options
This is a string of the form 'opt1=val1&opt2=val2&...' whose items customize the behavior of this request. The option names opt1, etc. refer to properties that can be customized, and the values val1, etc. specify their values. Currently two options are available:
Communication Directory
This is a path to a directory used for communication between client and server processes. If unspecified, the current default directory will be used.
Function Name
This is the name of the function to be evaluated by a server process. This parameter may refer to any MATLAB function that is visible in the context of the client, including built-in, external, user-supplied, and compiled functions. The only requirement is that the function yield a single return value.
Arguments
The remaining arguments to PP_FEVAL() are passed unmodified to the requested function when it is invoked by a server process.
As an example, the following function call initiates a request that the sum of the integers from 1 to 10 be computed, and it returns without waiting for processing to begin:
>> myDescriptor = PP_FEVAL('','','sum',1:10);
The current status of one or more requests can be checked with the function PP_GET_STATUS() which returns the value 1 or 0 for each descriptor indicating whether or not it has completed, the name of the server process which picked up the request, and the date and time processing began. For example,
>> [myStatus,myNode,myTime]=PP_GET_STATUS(myDescriptor)

myStatus =

     1


myNode = 

    'NODE_1'


myTime = 

    '02-Feb-2002 12:35:21'

The result of a computation can be retrieved when processing is completed by calling PP_GET_RESULTS() with a list of descriptors. An optional polling time parameter can be specified, which dictates how often the routine checks for completion of the requests. If it is specified, the routine will not exit until all requests have been completed. Otherwise, the routine returns computation results from all requests that have been satisfied. For each descriptor, PP_GET_RESULTS() returns a 1 or 0 indicating whether or not the request has been completed, the return value from the requested function, and a text string indicating the nature of any errors that may have occurred during processing. For example,
>> [myStatus,myResult,myError]=PP_GET_RESULTS(myDescriptor)

myStatus =

     1


myResult = 

    [55]


myError = 

    {''}
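Putting the three calls together, a client can post a request, continue with other work, and collect the result once it is ready. The sketch below reuses only the call forms shown in the examples above; the one-second polling interval is arbitrary, and indexing myResult and myError as cell arrays is an assumption based on how their values are displayed. It needs the DistributePP package on the search path and at least one running server watching the default communication directory.

```matlab
% Post the request; PP_FEVAL returns without waiting for a server.
myDescriptor = PP_FEVAL('', '', 'sum', 1:10);

% Poll for completion; other client-side work could go inside this loop.
while true
    myStatus = PP_GET_STATUS(myDescriptor);
    if all(myStatus == 1), break; end
    pause(1);   % arbitrary polling interval for this sketch
end

% Collect the result and report any error recorded by the server.
[myStatus, myResult, myError] = PP_GET_RESULTS(myDescriptor);
if isempty(myError{1})
    disp(myResult{1});
else
    disp(['Request failed: ' myError{1}]);
end
```

If no server ever completes the request, for instance because none is running, the loop does not terminate, so a real client would likely add a timeout.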

Communication Files

A list of all files placed in the communication directory is given below:
prefix.seq
These files contain the state of various numeric sequence generators. These generators are used to ensure that unique names are generated for server processes and processing requests.
prefix_seq.req
A file indicating the presence of a processing request made by a client. The prefix portion of the name is taken from an optional argument to PP_FEVAL(), and the sequence number seq is generated so that the request name is unique. This file is locked by the server which is currently processing the request and is deleted by the server when processing is successfully completed. In future versions, this file may be used by the client to contain a list of specifications a server must be able to meet (minimum memory requirements, processor type, etc.). Any server session which cannot meet those specifications will not pick up the request.
prefix_seq.mat
A MATLAB data file which contains the name of the function to be evaluated, the parameters to that function, the search path of the requesting client, and any specified options. When processing of the request is complete, the server process records in this file the function result and any error messages generated while processing.
prefix_seq.prc
An indicator file created by a server process indicating that the request has been picked up and that processing has commenced. It is deleted by the client when the processing results are captured.
servername.log
A log file with the processing history of the respective server process.
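The file descriptions above suggest a simple way to inspect the state of a communication directory by hand. The sketch below classifies requests by which files are present; the classification is inferred from those descriptions and has not been checked against the package source, and the job_N file names are invented. It runs against a scratch directory populated with empty stand-in files.

```matlab
% Scratch directory with empty stand-in files (names are invented).
commDir = tempname;
mkdir(commDir);
fakeFiles = {'job_1.req', 'job_2.req', 'job_2.prc', 'job_3.prc'};
for k = 1:numel(fakeFiles)
    fclose(fopen(fullfile(commDir, fakeFiles{k}), 'w'));
end

% Collect request names (file names without extension) for each file type.
req = dir(fullfile(commDir, '*.req'));
prc = dir(fullfile(commDir, '*.prc'));
reqNames = regexprep({req.name}, '\.req$', '');
prcNames = regexprep({prc.name}, '\.prc$', '');

% Inferred states:
waiting  = setdiff(reqNames, prcNames);    % posted, not yet picked up
running  = intersect(reqNames, prcNames);  % picked up, still processing
finished = setdiff(prcNames, reqNames);    % completed, results not yet collected

fprintf('waiting: %d, running: %d, finished: %d\n', ...
        numel(waiting), numel(running), numel(finished));
```

A request left behind after an out-of-memory error may not fit cleanly into these three groups, since the description does not say which files remain in that case.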

Limitations

Download

The DistributePP package is distributed as a .zip file which can be expanded with unzip. It includes compiled output from the C programs for the Sun Solaris and SGI platforms. If you need to build for another platform, execute the C shell script BUILD_DistributePP. To use the package, simply place the DistributePP directory in your MATLAB search path.

Click here to download DistributePP