

5. The FAKtory Pipeline: Processing and Assembling Fragments

The FAKtory realizes a general framework for controlling a process pipeline in which fragments pass through a number of processing stages, each of which augments a fragment's record. The pipeline may be customized to contain any number of stages of different types, and each stage may be individually programmed to perform a specific task, such as vector prescreening, low-quality trimming, common repetitive element detection, etc. Each stage can operate on fragments in one of three modes depending on the extent to which the user wishes to supervise its operation. Fragments can be advanced through stages of the pipeline, rolled back to previous stages for reprocessing, and discarded if found defective. Problems that occur during the processing of fragments can be reviewed and remedied during the operation or can be logged for later inspection. This framework is quite general and independent of the FAKtory's specific application domain. It permits us to add additional types of stages within this framework as might be required by evolving DNA sequencing protocols or customer demand.

One may customize project pipelines to consist of any number of stages, provided that the first stage is always an Input stage and the last two stages are an Overlap stage followed by an Assembly stage. The intervening stages can be of type clip, vector, tag, or prescreen and can occur in any number and order. The exact nature of these stages will be discussed later, but we note for now that the clip, vector, and tag stage types are all actually special instances of the more general prescreen stage type. Before data enters a project, setting up the pipeline is simply a matter of opening the Pipeline Customization panel and then adding cartoon boxes for the desired stage types to, or deleting them from, a picture of the current pipeline. Each stage in the cartoon of the pipeline must be given a unique stage name. After the stages, their names, and their order have been settled on, each stage must be individually configured. Once fragments begin to flow through the pipeline, the pipeline is no longer reconfigurable (but the individual stages are).
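The layout rules above can be summarized in a short Python sketch. It is only an illustration of the constraints as stated in this section, not FAKtory code: the `Stage` class, the `validate_pipeline` function, and the lowercase type spellings are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass

# Intermediate stage types named in the text (spelling is this sketch's choice).
MIDDLE_TYPES = {"clip", "vector", "tag", "prescreen"}


@dataclass(frozen=True)
class Stage:
    name: str  # unique stage name chosen by the user
    kind: str  # "input", "clip", "vector", "tag", "prescreen", "overlap", or "assembly"


def validate_pipeline(stages):
    """Return a list of rule violations; an empty list means the layout is legal."""
    if len(stages) < 3:
        return ["need at least Input, Overlap, and Assembly stages"]
    problems = []
    kinds = [s.kind for s in stages]
    if kinds[0] != "input":
        problems.append("first stage must be an Input stage")
    if kinds[-2:] != ["overlap", "assembly"]:
        problems.append("last two stages must be Overlap then Assembly")
    # Everything between Input and the final Overlap/Assembly pair.
    for s in stages[1:-2]:
        if s.kind not in MIDDLE_TYPES:
            problems.append(f"stage {s.name!r} has illegal intermediate type {s.kind!r}")
    names = [s.name for s in stages]
    if len(set(names)) != len(names):
        problems.append("stage names must be unique")
    return problems


pipeline = [
    Stage("in", "input"),
    Stage("trim", "clip"),
    Stage("vec", "vector"),
    Stage("ovl", "overlap"),
    Stage("asm", "assembly"),
]
print(validate_pipeline(pipeline))
```

The check is run once over the whole layout, which matches the rule that the pipeline is fixed once fragments begin to flow through it.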

During pipeline processing, a fragment may arrive at a stage whose operation cannot be applied to the fragment, or whose operation is applicable but yields suspect results. The former events are termed errors and the latter warnings. Collectively they are called exceptions. As a fragment passes through a stage, the status of the operation on that fragment -- OK, Warning, or Error -- is recorded and attached to the fragment, along with any diagnostic messages in the case of an exception. Fragments tagged with an error in a given stage cannot advance through later stages in the pipeline, but fragments tagged only with warnings can. At any moment the FAKtory knows for each fragment: (1) the most recent stage it has passed through, and (2) the stage status and messages (if any) for every stage it has passed through. Item (1) can be used in a FAKtory query with the built-in function stage().
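As a rough illustration of the per-fragment bookkeeping just described, the following Python sketch keeps one status record per stage. The class and method names are hypothetical; in particular, the `stage()` method here only loosely mirrors the built-in query function of the same name and is not its actual implementation.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    OK = "OK"
    WARNING = "Warning"
    ERROR = "Error"


@dataclass
class StageResult:
    stage: str
    status: Status
    messages: list = field(default_factory=list)  # diagnostics, present only for exceptions


@dataclass
class Fragment:
    name: str
    history: list = field(default_factory=list)  # one StageResult per stage passed, in order

    def record(self, stage, status, *messages):
        """Attach the outcome of one stage to this fragment."""
        self.history.append(StageResult(stage, status, list(messages)))

    def stage(self):
        """Most recent stage passed through, or None if the fragment has no history."""
        return self.history[-1].stage if self.history else None

    def can_advance(self):
        """Errors block further progress; warnings do not."""
        return all(r.status is not Status.ERROR for r in self.history)


frag = Fragment("frag001")
frag.record("Input", Status.OK)
frag.record("Clip", Status.WARNING, "short read after clipping")
print(frag.stage(), frag.can_advance())
```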

Before a set of fragments is ordered to move through some pipeline stages, a user can select, on the Fragments panel, to monitor the process in one of three modes -- Auto, Supervised, or Manual. In Auto mode, all processing takes place without any supervision from the user. Any errors or warnings are noted by placing them on the FAKtory's global review log. The user is not otherwise informed that any exceptions have occurred. In Supervised mode, the user is presented with each warning or error situation, and given an opportunity either to troubleshoot it immediately or to defer the decision until later by noting it as in Auto mode. Finally, in Manual mode, the user inspects the results of every stage on every fragment, and is given the opportunity to modify the results if desired. FAKtory pipeline processing proceeds stage-by-stage rather than fragment-by-fragment, so that in the two interactive modes, the user can conveniently focus on a review packet of the fragments requiring examination for a given stage, one stage at a time, in pipeline order. One may also select to process fragments in a Custom mode, in which case fragments pass through each stage in the (possibly different) mode specified for that stage on the Pipeline Modes Preferences panel.
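One way to picture the difference between the modes is as a routing rule applied to the results of a single stage. The Python sketch below is an assumption-laden illustration, not FAKtory's logic: `route_results`, its arguments, and the plain-string statuses are hypothetical, and `stage_modes` stands in for the per-stage settings of the Pipeline Modes Preferences panel.

```python
def route_results(mode, stage, results, stage_modes=None):
    """Split one stage's results into (review_packet, logged_events).

    mode:        "Auto", "Supervised", "Manual", or "Custom"
    results:     list of (fragment_name, status) with status "OK", "Warning", or "Error"
    stage_modes: dict of stage name -> mode; required only when mode is "Custom"
    """
    effective = stage_modes[stage] if mode == "Custom" else mode
    packet, logged = [], []
    for fragment, status in results:
        is_exception = status in ("Warning", "Error")
        if effective == "Manual":
            packet.append((fragment, stage))   # every result is inspected
        elif effective == "Supervised" and is_exception:
            packet.append((fragment, stage))   # only exceptions are shown
        elif effective == "Auto" and is_exception:
            logged.append((fragment, stage))   # noted on the review log, not shown
    return packet, logged


results = [("f1", "OK"), ("f2", "Warning"), ("f3", "Error")]
print(route_results("Supervised", "Clip", results))
print(route_results("Custom", "Clip", results, {"Clip": "Auto"}))
```

In Supervised mode the user may still defer an event by noting it; that step happens during the review session and is not modeled here.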

The flow of fragments through a given pipeline is effected via controls on the Fragments panel. In a forward flow through the pipeline, called an advance, the fragments involved can come from the FAKtory's database, from external input files, or from both, depending on the control settings. If some of the fragments are being input, then depending on settings in the Input Preferences panel, either (1) a dialog appears in which the user can specify the files to be input, or (2) the FAKtory automatically imports all input files in a current directory (also settable) that have not yet been imported. If the source of fragments is the FAKtory database, then the desired fragments should be the contents of the current fragment selection. To initiate the advance, one simply picks a pipeline stage, say S, in the Advance menu. The selected fragments do not all have to be in the same initial pipeline stage: each fragment is advanced from whatever stage it last passed through until it passes through stage S. Fragments already in an error state, or those that become so during processing, do not proceed beyond the offending stage. If a selected fragment has already been through stage S, then the command has no effect upon it. Fragments being imported through files always pass through the first Input stage, whereupon they become part of FAKtory's database. Fragments may also be rolled back to previous stages or discarded from the database (conceptually a roll back to a stage prior to Input). While fine-grained control of the fragment flow is possible with this scheme, note that selecting to advance to the Assembly stage with the source specified (by default) to be all un-imported files causes all available files to be input, processed, and assembled.
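The advance semantics described above can be sketched in Python as follows. This is a simplified model under stated assumptions, not FAKtory's implementation: fragments are plain dictionaries, `advance` and `run_stage` are hypothetical names, and a fragment that fails in a stage is treated as having passed through that stage with an Error status. The outer loop runs over stages so that processing proceeds stage-by-stage rather than fragment-by-fragment.

```python
def advance(fragments, pipeline, target, run_stage):
    """Advance each fragment from its last completed stage through `target`.

    fragments: list of dicts with keys
               "name", "last" (index of last stage passed, -1 if none), "error" (bool)
    pipeline:  list of stage names in pipeline order
    run_stage: callable (stage_name, fragment) -> "OK" | "Warning" | "Error"
    """
    goal = pipeline.index(target)
    for i in range(goal + 1):              # one stage at a time, in pipeline order
        for frag in fragments:
            # Skip fragments in an error state and fragments not waiting at stage i.
            # A fragment already at or past `target` never matches, so it is untouched.
            if frag["error"] or frag["last"] != i - 1:
                continue
            status = run_stage(pipeline[i], frag)
            frag["last"] = i
            if status == "Error":
                frag["error"] = True       # does not proceed beyond the offending stage


pipeline = ["Input", "Clip", "Vector", "Overlap", "Assembly"]
frags = [
    {"name": "a", "last": 0, "error": False},
    {"name": "b", "last": 2, "error": False},
]
advance(frags, pipeline, "Vector", lambda stage, frag: "OK")
print(frags)
```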

Both the review packets generated during interactive mode pipeline advances and the FAKtory review log of noted events can be examined in order to oversee operations or make decisions about how to deal with pipeline exceptions. Both review packets and the global review log are conceptually lists of (fragment, stage) pairs, which we call events. These review lists can be processed with the Review Summary panel and the Review Events panel. The Review Summary panel presents a review list as a scrollable list of the events, with associated messages if they involve exceptions, further organized and displayable according to the stage(s) involved. The Review Events panel displays one event from the review list at any given moment, and permits the user to move forward and backward through the review list with a set of console buttons on its lower border. Within the Review Events panel is an editing panel specific to the stage for the event currently under examination. For example, for a clip stage the editing panel shows the waveform or sequence of the fragment and the regions clipped by the stage. The user can modify the clip if desired. Thus the Review Summary panel is a way to manipulate the entire review list, and the Review Events panel is a way to walk through a review list, editing the results of a stage on a fragment if desired. Note that the two panels work in concert: one can switch from one panel to the other and back again during the processing of a given review list.

To determine the disposition of the events in a review list, the user can mark each event with a status of Note, Approve, or Discard from either review panel. When one exits either panel, thus ending a review session, all noted events are posted to FAKtory's review log (if they are not already on it), the fragments of all discarded events are discarded from FAKtory's database, and the fragments of all approved events are given a stage status of OK for the stage of the approved event and resubmitted to the pipeline.
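The disposition rules applied at the end of a review session can be sketched in Python as below. All names (`Mark`, `close_review_session`, and the container arguments) are hypothetical, and the containers are simple stand-ins for FAKtory's review log, database, and pipeline resubmission queue. The sketch covers only what the paragraph above states; for instance, it makes no assumption about whether an approved event is removed from the review log.

```python
from enum import Enum


class Mark(Enum):
    NOTE = "Note"
    APPROVE = "Approve"
    DISCARD = "Discard"


def close_review_session(events, review_log, database, stage_status, resubmit):
    """Dispose of marked events when a review panel is exited.

    events:       dict mapping (fragment, stage) -> Mark
    review_log:   list of (fragment, stage) events already noted
    database:     set of fragment names currently stored
    stage_status: dict mapping (fragment, stage) -> status string
    resubmit:     list collecting fragments to send back into the pipeline
    """
    for (fragment, stage), mark in events.items():
        if mark is Mark.NOTE:
            if (fragment, stage) not in review_log:   # post only if not already logged
                review_log.append((fragment, stage))
        elif mark is Mark.DISCARD:
            database.discard(fragment)                # fragment leaves the database
        elif mark is Mark.APPROVE:
            stage_status[(fragment, stage)] = "OK"    # exception is overridden
            resubmit.append(fragment)


log, db, status, queue = [("f1", "Clip")], {"f1", "f2", "f3"}, {("f3", "Vector"): "Warning"}, []
close_review_session(
    {("f1", "Clip"): Mark.NOTE, ("f2", "Clip"): Mark.DISCARD, ("f3", "Vector"): Mark.APPROVE},
    log, db, status, queue,
)
print(log, sorted(db), status, queue)
```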

In overview, the review system functions as follows. During an interactive mode advance, the Review Events panel is brought up on each review packet as it arises, effectively initiating an interactive review session. The user reviews the events, possibly editing and then approving some events, marking others as discardable, and postponing decisions on yet others by marking them as noted. If need be, the user can flip to the Review Summary view of the packet if it helps to visualize the decision process. The review session is completed by exiting the current review panel. At that point, the events in the review packet are disposed of according to their event marks and the packet disappears. The other scenario involves a user deciding, at a convenient time, to review the events that have been posted to the FAKtory's review log. This is done by pressing a button on the Fragments panel to invoke either of the review panels on this special review list. The remainder of the review session, including the exit from it, is identical to the interactive case.


