Monday, September 14, 2009

Pipes and Filters

One system that I had a hand in designing and building (based in no small part on work done by another member of this class, actually) used the same basic concept to perform automated remote backups driven by a few simple user inputs. Users could set up when and how often the task should run, then select the files or databases they needed backed up, along with a destination site (typically a separate server). We could be handed any number of data sets, had three different database styles to deal with, and had no idea what order they would be done in, so we built independent modules, each capable of processing one kind of data, plus a data file containing the listing for repeated lookups. The resulting pipeline worked out its ordering dynamically, but referenced a static listing at each stage.
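To make the shape of that design concrete, here is a minimal Python sketch. It is not the original system's code: the kind names (`file`, `db_a`), the handler functions, and the in-memory `listing` are all hypothetical stand-ins, and the handlers only return a description instead of copying anything. It shows one independent module per kind of data, with the pipeline order taken from whatever the listing happens to contain.

```python
from typing import Callable, Dict, List, Tuple

# A handler takes (source, destination) and reports what it did.
# Real modules would copy files or dump databases; these are placeholders.
Handler = Callable[[str, str], str]


def backup_file(source: str, destination: str) -> str:
    return f"copied file {source} to {destination}"


def backup_database_a(source: str, destination: str) -> str:
    return f"dumped database {source} to {destination}"


# One independent module per kind of data (hypothetical kind names).
HANDLERS: Dict[str, Handler] = {
    "file": backup_file,
    "db_a": backup_database_a,
}


def build_pipeline(listing: List[Tuple[str, str]]) -> List[Callable[[str], str]]:
    """Turn the static listing of (kind, source) pairs into ordered stages."""
    stages = []
    for kind, source in listing:
        handler = HANDLERS[kind]
        # Bind handler and source now so each stage keeps its own values.
        stages.append(lambda dest, h=handler, s=source: h(s, dest))
    return stages


def run_pipeline(stages: List[Callable[[str], str]], destination: str) -> List[str]:
    """Run every stage in order against the same destination."""
    return [stage(destination) for stage in stages]
```

Calling `run_pipeline(build_pipeline(listing), "backup-server")` runs the stages in listing order, so supporting a new kind of data only means registering one more handler.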

For multiprocessing, the best configuration would be one where the pipeline uses the Active style: each filter keeps running, so performance isn't hurt by having to start a process or thread every time it's needed. That pays off when the length of time a filter spends working outweighs the time it takes to push data to it and pull the results back out of it.
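A small Python sketch of the Active style, assuming one thread per filter connected by `queue.Queue` objects. The names `active_filter`, `run_active_pipeline`, and the `_STOP` sentinel are made up for illustration. Each thread is started once and then blocks waiting for input, so no per-item startup cost is paid.

```python
import queue
import threading

# Sentinel passed down the pipeline to tell each filter to shut down.
_STOP = object()


def active_filter(transform, inbox, outbox):
    """Start a long-lived thread that pulls from inbox and pushes to outbox."""
    def loop():
        while True:
            item = inbox.get()  # blocks while idle, no busy-waiting
            if item is _STOP:
                outbox.put(_STOP)  # forward shutdown to the next stage
                break
            outbox.put(transform(item))

    worker = threading.Thread(target=loop, daemon=True)
    worker.start()
    return worker


def run_active_pipeline(items, transforms):
    """Chain one active filter per transform and collect the final output."""
    queues = [queue.Queue() for _ in range(len(transforms) + 1)]
    workers = [
        active_filter(t, queues[i], queues[i + 1])
        for i, t in enumerate(transforms)
    ]
    for item in items:
        queues[0].put(item)
    queues[0].put(_STOP)

    results = []
    while True:
        item = queues[-1].get()
        if item is _STOP:
            break
        results.append(item)
    for worker in workers:
        worker.join()
    return results
```

Because every stage is a single thread reading a FIFO queue, items come out in the order they went in. In CPython, threads mainly help when the filters wait on I/O; CPU-heavy filters would need processes instead, which makes pushing data in and out more expensive.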

The pipeline as a whole, though, should be started by a user-triggered action. Since we cannot predict when the task will start, leaving a thread spinning and eating up resources is not the best plan. This means we have to make sure that starting the process is relatively inexpensive.
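One way to picture that trade-off, as a sketch only: the hypothetical `OnDemandPipeline` class below holds no thread while idle and creates one only when `trigger` is called. It assumes `trigger` is called from a single thread (for example, a UI event handler) and that the stages are plain callables.

```python
import threading


class OnDemandPipeline:
    """Runs its stages in a worker thread that exists only while there is work."""

    def __init__(self, stages):
        self._stages = stages
        self._worker = None
        self.results = []

    def trigger(self, items):
        """Start a run; return False if a previous run is still in progress."""
        if self._worker is not None and self._worker.is_alive():
            return False
        self._worker = threading.Thread(target=self._run, args=(list(items),))
        self._worker.start()
        return True

    def _run(self, items):
        # Push each item through every stage in order, then let the thread exit.
        for item in items:
            for stage in self._stages:
                item = stage(item)
            self.results.append(item)

    def wait(self):
        """Block until the current run, if any, has finished."""
        if self._worker is not None:
            self._worker.join()
```

Between runs nothing is spinning; the only cost paid per trigger is creating one thread, which is the "relatively inexpensive" start the design needs.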
