Saturday, February 25, 2006

Clustering programs

Lately, I have been working on refining my clustering program so that it functions without the need for sockets. After wrestling with them, I have found that file IO works better and more predictably than sockets. This does not imply that sockets are never useful; rather, I am dealing with machines that cannot accept inbound connections on any port, and that are equally restricted from opening outbound connections. My only link to these machines is through SSH.
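Since files are the only channel, the central machine has to discover new work products by polling a drop directory rather than listening on a port. The sketch below is one minimal way to do that; the function name and the idea of tracking already-seen files in a set are my own illustration, not necessarily how the actual program does it:

```python
import os

def poll_for_files(directory, seen):
    """Return files that appeared in `directory` since the last poll.

    `seen` is a set the caller keeps between polls; it is updated in
    place, so each new file is reported exactly once.
    """
    current = set(os.listdir(directory))
    new = sorted(current - seen)
    seen |= current
    return new
```

In practice the caller would run this in a loop with a sleep between polls; the point is that a shared directory plus periodic listing replaces a listening socket entirely.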

The current plan has the central machine sending each of these machines not only its input but also a request-file. This file tells the servers (1) what the program is, (2) what the program's input is, and (3) where the result should go. The "result" is not the output of the program but a low-bandwidth notification to the central machine that the work has completed. The central machine later probes for notifications, and once all of them have arrived, an aggregation process runs. As an added side feature, a request-file may allow for sabotage: as soon as one server finishes, it is obligated to kill the other distributed processes, and the sabotaging program then sends fake notifications on behalf of all of them.
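A request-file carrying those three fields, plus the central machine's probe for completion, could look something like the sketch below. The field names, the `.done` notification convention, and both function names are assumptions of mine for illustration, not the actual format the program uses:

```python
import os

def write_request(path, program, input_file, notify_dir):
    """Write a request-file with the three fields the plan calls for:
    the program, its input, and where the completion notice goes."""
    with open(path, "w") as f:
        f.write(f"program: {program}\n")
        f.write(f"input: {input_file}\n")
        f.write(f"notify: {notify_dir}\n")

def all_done(notify_dir, expected_hosts):
    """Central machine's probe: has every worker dropped its
    low-bandwidth notification file into the notify directory?"""
    present = set(os.listdir(notify_dir))
    return all(f"{host}.done" in present for host in expected_hosts)
```

Aggregation would then be gated on `all_done(...)` returning true, which also shows why the sabotage trick works: anything able to create the right `.done` files can satisfy the probe.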

The client is being written in Python and Common Lisp. For flexibility, the server components were written in Perl, the assumption being that these machines should already have Perl installed. The most interesting aspect of coding this so far has been managing enormous file streams effectively. Mismanaging this one crucial process makes the rest of the program pointless: it doesn't matter how fast your server programs parse their pieces of the input if the function that splits the original input does so in a horribly inefficient way.
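One way to keep the split step from becoming that bottleneck is to stream the input rather than slurp it: a single pass over the file, writing lines round-robin to the part files, so memory use stays constant no matter how large the input is. This is a sketch of the general technique under that assumption, not the program's actual splitter:

```python
def split_round_robin(src_path, n_parts, out_prefix):
    """Split a large input into n part files in one streaming pass,
    holding only one line in memory at a time."""
    outs = [open(f"{out_prefix}.{i}", "w") for i in range(n_parts)]
    try:
        with open(src_path) as src:
            # Iterating the file object reads lazily, line by line.
            for i, line in enumerate(src):
                outs[i % n_parts].write(line)
    finally:
        for f in outs:
            f.close()
```

The contrast is with something like `open(src).read().split(...)`, which materializes the whole input (and every piece) in memory at once and scales badly exactly when a cluster is most useful.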
