LONI Pipeline User Guide

  1. Introduction
  2. Installation
    1. Requirements
    2. Downloading
    3. Setup and launching
  3. Interface overview
    1. Server library
    2. Personal library
    3. Workflow area
    4. Connection manager
    5. Preferences
    6. Search feature
    7. Checking for latest updates
    8. Running from the command line
  4. Building a workflow
    1. Dragging in modules
    2. Connecting modules
    3. Setting parameter values
    4. Data sources and sinks
    5. Processing multiple inputs
    6. Enable/Disable parameters
    7. Annotations
    8. Variables
    9. IDA
  5. Execution
    1. Validation
    2. Executing a workflow
    3. Pause and Stop buttons
    4. Viewing output
    5. Debugging execution
    6. Report a Bug
    7. Client Disconnect/Reconnect
  6. Creating modules
    1. Simple modules
      1. Module tab
        1. General module information
        2. Citation information
      2. Parameters tab
        1. Executable location
        2. General parameter information
        3. Parameter arguments size
        4. Parameter types
        5. User defined filetypes
        6. Advanced parameter information
          1. Select dependencies
          2. Transformations
    2. Module groups
  7. Advanced Topics
    1. Syncing Execution Flow

1. Introduction

The LONI Pipeline is a workflow processing application that can be used to wrap any executable for use in the environment. No change to your programs (adding code, implementing interfaces, etc.) is required for you to use the program within the LONI Pipeline. All that you need is an understanding of the program's command line usage and you can begin using it in the Pipeline. For a visual introduction to using the Pipeline, you can take a look at some screencasts which cover a variety of different Pipeline topics.

2. Installation

2.1 Requirements

The only requirement of the Pipeline client is an installation of JRE 1.5 or higher, which can be downloaded from Sun. In terms of memory consumption, it's unlikely that you'll need to worry about having sufficient RAM to run the Pipeline.

2.2 Downloading

To get the latest version of the LONI Pipeline, go to the Pipeline web site and click on the download link in the navbar at the top.

2.3 Setup and launching

OS X: To install the program, double click the disk image file you downloaded, and drag the LONI Pipeline application into the Applications folder. Once the program is done copying you can unmount (eject) the disk image and throw it in the trash. To start the Pipeline, just go to your Applications folder and double-click on the LONI Pipeline application.

Windows: To install on Windows, double-click the installer and follow the on-screen instructions. Once it finishes installing, you can delete the installer and launch the program from the Start menu->Programs->LONI Pipeline.

Linux/Unix: Extract the contents of the file to a location on disk, and execute the PipelineGUI script. Make sure you have the java binary in your path.
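
If you are unsure whether the java binary is in your path, a quick check along the following lines can help. This is only a sketch; the function name have_java is made up for this example and is not part of the Pipeline download.

```bash
#!/bin/bash
# Report whether a java binary can be found on the PATH.
# have_java is a hypothetical helper name used only in this example.
have_java() {
    command -v java >/dev/null 2>&1
}

if have_java; then
    echo "java found at: $(command -v java)"
else
    echo "java not found on the PATH; install a JRE or adjust PATH" >&2
fi
```

If the check reports that java is missing, install a JRE or add its bin directory to your PATH before running the PipelineGUI script.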

3. Interface overview

Picture of the LONI Pipeline

3.1 Server library

When you launch the Pipeline you'll notice the area to the left which just says 'Package' in it. This is the server library where you can see a list of all the modules available to you when you connect to different servers. Since this is your first time launching the program, you have nothing listed there, but we'll connect to a server soon to get access to some tools.

If you want access to the tools LONI has made available through its Pipeline server, apply for an account. After you connect to the server (cranium.loni.ucla.edu), you will see the library populated with all the tools available on that server.

Even after you disconnect from a Pipeline server, the list of tools on that server will still remain cached on your computer so you can construct workflows while you are offline.

3.2 Personal library

The Personal library can be accessed by going to 'Window->Personal Library.' This is where your library of workflows and personally created modules is stored. Simply select a directory (or use the default one specified in the preferences) and save all your personal modules and workflows in there. Then when you open up your personal library, you'll see all the modules in there, and you can drag them in or open up copies of them quickly and easily.

The main feature of the personal library is that it can be used in just the same way as the server library to construct workflows. If you have defined lots of modules that describe executables on your local computer or even module groups, you can drag in copies of them to create even more sophisticated workflows.

3.3 Workflow area

The large area with a faucet logo on it is the workflow area. You can open a workflow or start a new workflow from here by clicking on one of the buttons at the center of the area.

The workflow area also shows a list of recently accessed workflows, but it is empty for now because we haven't used any workflows yet.

3.4 Connection manager

Often you will want to connect to different Pipeline servers to get access to their tools and build workflows out of them. To bring up your list of connections, go to the 'Window' menu and click on 'Connections...'. Alternatively, you can click on the disconnected circles at the bottom right of the window, and in the popup menu click on 'Connections...'.

In here you can add a connection to any Pipeline server that you want to access. If you don't know of any servers you can add the LONI Pipeline server (cranium.loni.ucla.edu) but you will need to apply for an account to actually connect to it. Once you've entered the connection, go ahead and click 'Connect' then close the dialog. After 30 seconds or so you'll notice that your server library has been populated with tools from the server.

3.5 Preferences

Bring up the preferences dialog:
OS X: Go to the 'LONI Pipeline' menu and select 'Preferences.'
Windows: Go to 'Tools'->'Options'
Linux/Unix: Go to 'Edit'->'Preferences'

Cache Tab:

  • The execution cache directory is where all intermediate data is written to while executing a workflow.
  • The library module cache can be cleared here (similar to a web browser cache).

Execution Tab:

  • Maximum simultaneous jobs is the number of jobs that will run in parallel when executing a workflow that has many jobs that are ready to submit at once. Once this limit is reached, the remaining jobs are queued until running ones complete.
  • It doesn't hurt to leave DRMAA enabled, but if you have a grid engine installed (SGE, Condor, Torque/PBS...) it may cause problems, so you can disable it.

3.6 Search feature

The Search feature in both the Library and Personal Library panels allows you to query for the modules that reside in the respective panel. The Search function returns results drawn from each module's name, author list, citations, tags, description, and parameter fields.

3.7 Checking for latest updates

In the Help menu, you can check to see if you have the latest version of the Pipeline client by clicking on "Check for Updates..." - if you do need to download the client, you can find it on the Pipeline website at http://www.loni.ucla.edu/Collaboration/Pipeline/Pipeline_Download.jsp.

3.8 Running from the command line

Sometimes you may not want to use the graphical interface to the Pipeline (you may have several workflows that have minor differences in them, but you want to run them all at once). To launch the Pipeline's command line interface, you need to go into the directory where your Pipeline.jar file is located:

  • OS X: Inside the application bundle at LONI Pipeline.app/Contents/Resources/Java.
  • Windows: In your install location, which is usually C:\Program Files\LONI\Pipeline\.

Once inside that directory, you can simply launch it with java -cp Pipeline.jar ui.cli.Main (Linux/Unix users can take advantage of the startCLI.sh script included with the download). If you provide no arguments, you will be presented with usage instructions. Here are a couple of notes to supplement the usage text:

  • The -sync flag is required for a workflow that executes locally, otherwise the Pipeline will exit and the execution will not complete.
  • You can bind values to different parameters in the workflow using the syntax shown in the usage instructions. All bindings specified on the command line are in addition to any values already bound to them. Multiple bindings to the same parameter id will also be added onto the same parameter.
  • Some example bindings:
    • MyModule.OutputFile_0=pipeline://localhost//home/user/Desktop/myData.nii.gz
    • YourModule.InputImage_0=pipeline://localhost/C:\data\yourData.nii.gz
    • HerModule.InputAtlas_1=pipeline://cranium.loni.ucla.edu//usr/local/data/atlas1.img
  • The output of -display is printed in XML for easy parsing.
  • -validate prints all missing-parameter-binding exceptions in XML. All other exceptions are printed at the beginning in human-readable form.
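
The example bindings above all appear to follow the pattern <parameter id>=pipeline://<server>/<path>. The sketch below composes such strings in a shell script. The pattern is inferred from the three examples only (the usage instructions remain the authoritative reference), and the function name make_binding is made up for this illustration.

```bash
#!/bin/bash
# Hypothetical helper: compose a parameter binding in the form used by the
# example bindings: <parameter id>=pipeline://<server>/<path>
# The pattern is inferred from the examples, not taken from the usage text.
make_binding() {
    local param_id="$1" server="$2" path="$3"
    printf '%s=pipeline://%s/%s\n' "$param_id" "$server" "$path"
}

make_binding MyModule.OutputFile_0 localhost /home/user/Desktop/myData.nii.gz
make_binding HerModule.InputAtlas_1 cranium.loni.ucla.edu /usr/local/data/atlas1.img
```

Because the path of a Unix file already starts with a slash, the composed string contains a double slash after the server name, just as in the examples.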

4. Building a workflow

For this example, we're going to build a workflow from modules provided to us by the LONI Pipeline server. You don't need to use the LONI server to create workflows though, and you can make your own modules as described later in this guide. First, open a new workflow by going to File->New.

4.1 Dragging in modules

Go to the server library at the left and expand the 'AIR' package. Click on the 'Align Linear' module and drag it into the workflow canvas that you just opened. Next drag in the 'Reslice AIR' module under the same package. Your screen should look something like this.

[Figure: the 'Align Linear' and 'Reslice AIR' modules on the workflow canvas]

Please note that in the current release of the LONI Pipeline, all modules that are used in a workflow must be from the same server. For example, you cannot mix modules from the LONI Pipeline server and modules from the Acme Pipeline server. We plan on releasing multi-server functionality in the future, but do not have a timeline at this point for the feature.

4.2 Connecting modules

Each module in a workflow can have some inputs and outputs. The inputs are on the top, and the outputs on the bottom. Go ahead and connect the output of the 'Align Linear' to the input of 'Reslice AIR.'

When you attempt to make a connection, the Pipeline does some initial checking to make sure the connection is valid. For example, it won't let you connect a file type parameter to a number type parameter, or connect an output to another output.

Side note: the Pipeline supports the connection of a single output parameter to multiple input parameters, as well as the connection of multiple output parameters to a single input parameter. In the first case, the value of the output parameter is simply fed into all of the subsequent input parameters. In the latter case, the multiple outputs are all executed as a part of one command using the input parameter module's executable.

[Figures: one output connected to multiple inputs; multiple outputs connected to one input]

4.3 Setting parameter values

Now we need to set the values of each of the input parameters on the 'Align Linear' module. Double-click on the leftmost parameter and select an image atlas. This is a neuroimaging-specific file type, so you may not have one. You can double-click on each parameter afterwards and enter a value for each one.

Once you've set the inputs of 'Align Linear' you'll want to specify a destination for the output of the 'Reslice AIR.' Double-click on its output parameter and specify the path and a filename you want the file to be written to.

Note that you can mix data that is located on your computer and the computer that the server resides on, and the Pipeline will take care of moving data back and forth for you. For example, the input to the 'Align Linear' could be located on your local drive, but you could set the output of the 'Reslice AIR' to be written to some location on the Pipeline server or vice versa.

4.4 Data sources and sinks

Sometimes you will want to use a single piece of data as an input to multiple modules in a workflow, or you just want to make the workflow easier to understand. In these cases you can take advantage of sources and sinks. Just right-click on any blank space in the workflow canvas and select 'Add Data Source.' In the dialog that opens, enter some information about the data source, and then click on the 'Data' tab. From here, you can click on 'Add files' at the bottom of the dialog and add multiple files to the list, or you can just type in the path to a file manually. Note that at the bottom there is an option for a server in case you want the data source to represent data on another computer.

Using this same method, you can right-click on the canvas and select 'Add Data Sink' for use in your workflow.

4.5 Processing multiple inputs

One of the strengths of the LONI Pipeline is its ability to simplify processing of multiple pieces of data by using the same workflow you use to process a single input. The only change you need to make is to create a data source to hold the multiple inputs. The data source can then be used as the input to any module in the workflow.

You can even provide multiple inputs to multiple parameters. For example, if one parameter on a module has a data source feeding in 4 inputs and another parameter also has a data source feeding in 4 inputs, the Pipeline will submit 4 instances of that module for execution, with each pair of inputs submitted together. If you were to bind 4 inputs to one data source and 5 inputs to another, the Pipeline would submit 20 instances of the module for execution. In that case the commands are composed of the Cartesian product of the inputs provided (every input of one parameter combined with every input of the other).
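
The counting rule described above can be sketched in a few lines of shell. This only illustrates the arithmetic for two parameters and is not Pipeline code; the function name instance_count is made up for this example.

```bash
#!/bin/bash
# Illustration of the instance-count rule for two parameters (not Pipeline code):
# input lists of equal size are paired one to one; lists of different
# sizes are combined, giving size_a * size_b instances.
instance_count() {
    local size_a="$1" size_b="$2"
    if [ "$size_a" -eq "$size_b" ]; then
        echo "$size_a"
    else
        echo $(( size_a * size_b ))
    fi
}

instance_count 4 4   # prints 4
instance_count 4 5   # prints 20
```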

Alternately, you can use a .list file (a file ending with a .list extension which contains the path to all input files) to specify multiple input files.
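
As a rough sketch, a .list file could be generated with a short shell function such as the one below. It assumes that the file holds one path per line and that the inputs are .img files; both are assumptions made for this example, and the function name make_list_file is hypothetical.

```bash
#!/bin/bash
# Write the path of every .img file found under a directory into a .list
# file, sorted, one path per line.
# Assumptions for this sketch: one path per line, inputs end in .img.
make_list_file() {
    local data_dir="$1" list_file="$2"
    find "$data_dir" -type f -name '*.img' | sort > "$list_file"
}

# Example call (paths are placeholders):
#   make_list_file /home/user/data /home/user/data/inputs.list
```

Pass an absolute directory if you want absolute paths in the list, since find reports paths relative to the directory argument it was given.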

Note that the cardinality of modules will be matched up whenever possible in the workflow, and whenever there is a mismatch, the inputs will be multiplied. Here is an example to illustrate.

[Figure: first example workflow] In this workflow the Pipeline will execute 4 instances of every module.
[Figure: second example workflow] In this workflow modules A and C will have 4 instances. Module D will have 5 instances and module B will have 20 instances.

Also, it is worth mentioning that it is valid to connect two output parameters to the same input parameter. Let's look at the example below:

[Figure: the outputs of modules A and B both connected to one input of module C]

Let's say that module A creates an output file called A_OUTPUT and module B creates an output called B_OUTPUT. Module C describes the GNU copy command, and has two input parameters - Source and Target, both taking one argument. The output parameters of module A and B are connected to module C's Source input parameter. Finally, let module C's Target parameter be bound to some target path, "/nethome/users/someuser/".

The resulting execution is as follows - module A and B will run and create their respective output files, and module C will then execute two commands:

cp A_OUTPUT /nethome/users/someuser/
cp B_OUTPUT /nethome/users/someuser/

If the site where you run this workflow has a cluster, the Pipeline will run both commands concurrently; if a cluster is not available, the commands will run in series, and the Pipeline will wait for both to complete before moving on to any subsequent modules.

4.6 Enable/Disable parameters

Most modules have 2-3 required parameters on them, and several more optional parameters. If you want to exercise any of those additional options, simply double-click on the module and you'll see a list of all the required and optional parameters for that module. For each additional option you want to use just click on the box on the left side of its name to enable it. Conversely, to disable it click on the box again. Notice that you are not able to disable parameters that are required.

4.7 Annotations

As your workflow grows, you may at times forget what a particular section of it was meant to do. To help jog your memory, you can add annotations to your workflow as reminders for yourself or as notes for other people who use your workflow. To add an annotation, right-click on an empty area of the canvas and select 'Add Annotation.' Type your text into the dialog that pops up and click OK. You should see a translucent box appear in your workflow where you clicked. You can move the annotation around by just clicking and dragging.

4.8 Variables

To make things easier when entering values for module parameters, you can define variables to represent a path name that can then be used as the input or output to a module parameter. You can access the variables window by going to Window -> Variables. Click on the Add button, then type in the Name (whatever you want to call the variable) and the Value (the path associated with the variable). If you want to continue adding more variables, click on the Add button again; otherwise, simply close the Variables dialog box. Now, in order to use a variable in your workflow, you use the convention {variableName} as the value for your input and output parameters (i.e. surround the variable name with curly braces). The Pipeline will substitute the variable's actual path when it executes the workflow.
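
To picture what the {variableName} convention does, here is a small shell sketch of the substitution. It is an illustration only and does not reflect how the Pipeline implements variables; the function name expand_variable, the variable name dataDir, and the paths are all made up.

```bash
#!/bin/bash
# Illustration only: replace every {name} in a parameter value with the
# path stored for that variable.
expand_variable() {
    local value="$1" name="$2" path="$3"
    local token="{$name}"
    printf '%s\n' "${value//"$token"/$path}"
}

# A variable dataDir with the value /home/user/data, used in a parameter value:
expand_variable '{dataDir}/subject1.img' dataDir /home/user/data
```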

4.9 IDA

The Pipeline has the capability to utilize data from the LONI Image Data Archive (IDA). You can download files from the IDA database, but there is no way to upload to the IDA database at the moment due to restrictions with IDA. In order to establish a connection to the database, go to Tools -> IDA Database. Enter in your username and password and click Connect. You will see on the right pane the data that you have access to through the IDA (you will have to either upload your own data through the IDA web interface at https://www.loni.ucla.edu/ida/login.jsp, or log into IDA and put existing files into your account). Select the files that you want to process with the Pipeline, and specify a path for the files to be downloaded to (the files must be pulled from the IDA database to some location either on your local machine or onto a server). If the destination is remote, check the Remote box and specify the server name. Click on Download, and the files will be put in the directory you specified.

Once this operation is complete, look into the directory and you will see your IDA files along with a .list file that has been generated to point at the paths for each of your files. Note: the list file generated pulls in all of the contents of whatever directory you specify, so it is wise to point the download location to a newly created empty directory. You can now use the contents of this list file in a Data Source in your workflow.

5. Execution

5.1 Validation

When you execute a workflow, the first thing the Pipeline does is validate it and check for errors, but you can also validate without actually executing the workflow. To start the standalone validation go to Execution->Validate, and validation will begin automatically. If a connection to a server is needed, the Pipeline will prompt you for a username and password. If any errors are found, a dialog will pop up listing all the errors found in the workflow.

If your workflow is very large, you may want to run validation periodically as you build it to catch errors early. This is optional, because validation runs again at the beginning of execution (unless it's disabled in the preferences).

5.2 Executing a workflow

Once you've completed your workflow, you can execute the workflow by simply clicking on the 'Play' button at the bottom of the workflow area. If the program needs a connection to a server, it will prompt you for a username and password. If you've already stored a username and password to the server in your list of connections, then it will automatically connect for you.

Once all necessary connections have been made and validation has completed the workflow will begin to execute.

5.3 Pause and Stop buttons

While a workflow is executing, you can press the Pause or Stop button if desired. The Pause button pauses the execution of a workflow: no further modules are submitted for execution. You can even quit the Pipeline at this time and still return to your paused workflow later. To resume a paused workflow, open up the Pipeline client if needed (the workflows will pop back up), and simply press the Play button. The workflow will resume processing and modules will once again be submitted for execution.

If you press the Stop button, then execution of the workflow is permanently stopped. There is no way to resume execution of the workflow at the point when you pressed Stop.

5.4 Viewing output

As the modules continue executing you can view the output and error streams of any completed module. You can bring up the log viewer by going to Window->Log Viewer or, more easily, by right-clicking on the module that you want to view information about and clicking on 'Show Output Logs.' This will bring up the log viewer and set its focus on the module that was clicked.

Once the log viewer is open, in the left hand column you can select the instance of the module that you want to view output for. In our example, we only have one instance, but if you provided multiple inputs to a single parameter of a module you would see just as many instances in the viewer's left hand column. Additionally, you can filter the column to just the instances that you're interested in seeing by typing in a range into the 'Instances' textfield and hitting enter. Some examples of possible ranges could be (assuming the module had 100 instances):

  • 0-50,88
  • 1,5,8,44,22-33,99
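
To make the range notation concrete, the following shell sketch expands such a filter into the individual instance numbers it selects. It assumes that a range like 22-33 includes both end points; that is an assumption of this example, and the function expand_instances is not part of the Pipeline.

```bash
#!/bin/bash
# Sketch: expand an 'Instances' filter such as "0-50,88" into the instance
# numbers it selects, one per line. Ranges are assumed to be inclusive.
expand_instances() {
    local spec="$1" part start end i
    local IFS=','
    for part in $spec; do
        if [[ "$part" == *-* ]]; then
            start="${part%-*}"
            end="${part#*-}"
            for (( i = start; i <= end; i++ )); do
                echo "$i"
            done
        else
            echo "$part"
        fi
    done
}

expand_instances "3,7-9"   # prints 3, 7, 8 and 9 on separate lines
```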

When you select a particular instance, you will see information about the server that it resides on in the 'Info' tab on the right hand side. An important piece of information you'll see is the command string that was submitted to the server to execute that instance of the module. This is useful if you're ever trying to debug a module definition or determine the cause of a problem in your Pipeline server setup.

The output and error log tabs each contain the data captured from the application's output and error streams, respectively. The output files tab contains a list of all the files created by that instance of the module, and allows you to download them to your local system, by selecting the files you want and clicking 'Get Files.' If you want to get all the output files of all the instances of a module, select all the instances you want in the left-hand column, then select all the output files in the right-hand tab and click 'Get Files.'

5.5 Debugging execution

Inevitably, some of the instances (or all of them) of a module will fail sometimes and the module will have a red ring around it denoting the failure. In this case, using the log viewer as mentioned in the previous section will show all the failed instances of the module highlighted in red. With the information from the output and error stream you can diagnose nearly all the problems you may encounter while executing a workflow.

5.6 Report a Bug

If you find a bug in the Pipeline, you can file a bug report through the Pipeline client. Select Help -> Report a Bug from the top menu bar. If desired, fill out the optional fields for name, email and Pipeline server username. You can also attach the workflow being processed and enter in any details about the bug. Please be as specific as possible in your bug description. Submitting the form will send the Pipeline team an email with all of the information, allowing us to debug your problem.

5.7 Client Disconnect/Reconnect

If you want to start a workflow and then check its progress on a different computer (e.g. you start the workflow at work and want to check on the results from home), the Pipeline can accommodate this. After you have pressed play on the workflow and it is executing, quit the Pipeline. Make sure that you do NOT press stop, otherwise the workflow will stop running. Your workflow continues to execute even though the window is no longer open. To see the executing workflow again, start up the Pipeline client and use the Connection Manager to connect to the server on which you are running the workflow. Your workflow will pop up automatically along with any other active workflows that may be running, and you can continue to monitor its progress.

Note that in order to keep a completed workflow from popping up the next time you connect to the server, you must press the Reset button.

6. Creating modules

If you're going to be executing local executables in workflows, or setting up your own server, you will need to learn how to make module definitions. There are two types of modules that we can create: simple modules and module groups.

6.1 Simple modules

To create a simple module definition, open a workflow and then right-click on any blank part of the canvas. In the popup menu, click 'Create Module Definition' and you should be presented with a module definition window.

6.1.1 Module tab

When creating a module, whether it's a simple module or a module group, you will always encounter this tab for adding information about a module. While none of it is required, it helps to have the information because an unmarked circle in a workflow isn't helpful to anyone.

6.1.1.1 General module information

  • Module Authors is a list of all the authors who contributed in describing the executable's Pipeline definition (this would include you :-) )
  • Executable Authors is a list of all the programmers who contributed to writing the executable code.
  • Package is the name of the suite that the executable is a part of. For example, Align Linear is a part of the AIR package, Mincblur is a part of the MNI package, etc.
  • Version can refer to the package version or the individual executable version depending on how the developer manages their versioning. Use your best judgement to decide what would help users of your module definition more.
  • Name is the human readable name of the executable that you're describing.
  • Description should describe what the program does and any pertinent information that might help a user who wants to use the module.
  • Icon In the top right corner of the tab is a large square button. Click on it to select an image for use as the icon of this module. You don't have to worry about adjusting the size of the image to any special dimension (the Pipeline will take care of that for you), but the larger the image, the longer it takes for it to be moved into memory and resized and displayed.

6.1.1.2 Citation information

When creating a module definition, it's a good idea to enter citations of the papers/presentations/etc. that were used to develop the module. When this information has been entered, users can easily be linked to the citation material through the use of Digital Object Identifiers (DOI) or PubMed IDs.

To add a citation to the module, click on the 'Edit' button next to the citations pane. A new dialog will appear, and you can click the 'Add' button and type in a citation in the new text box that appears below. If you want linkable DOIs or PubMed IDs just make sure to type them in the format defined in the window, and the Pipeline will take care of the rest. An example citation could look like:

Linus Torvalds, Bruce Schneier, Richard Stallman. Really cool research topic.
In Journal of High Regard, vol. 2, issue 3, pages 100-105. 
University of Southern California, April 2007. 10.1038/30974xj298 PMID: 3097817

You can even enter your citation information in BibTeX format. When you've entered them all, click OK and you will see links to the DOIs and PMIDs that you've written into the citations.

6.1.2 Parameters tab

The parameters tab contains information describing the command line syntax of the executable you're describing. As a learning aid, we can use a fictional program called foo with a command line syntax of:

 foo [-abcd -e arg -farg1 arg2 arg3] file1 [file2 ...] -o outputFileArg

You'll notice our program has several optional parameters at the beginning with only two required parameters towards the end. Now let's go about describing this in the Pipeline.

6.1.2.1 Executable location

The first thing you'll want to do is specify the location of the executable. If this is a program on your local computer, just browse to the location of the program and select it. Please note that jar files (Java executables) cannot be directly executed through the Pipeline. You will need to wrap those in a script that launches the program. Here is an example of such a script:

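#!/bin/bash
# Forward all arguments to the jar; quoting "$@" keeps arguments with spaces intact.
/path/to/jre/java -jar MyJarFile.jar "$@"
# Return the jar program's exit status to the caller.
exit $?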

The sample script executes the jar file and passes all arguments passed to it directly on to the jar file, and finally returns the same value that the jar program returned. Most likely, you won't encounter many jar files so you won't even have to worry about this.

If you're setting up a server and defining modules for use on it, make sure you check the 'Remote' box and type the server address into the box next to it. Also make sure that the path to the executable is its path on the computer the server is running on.

6.1.2.2 General parameter information

If we look back at our fictional program command line syntax, we see it has 8 total parameters. Let's start by adding the first 4 which are:

  • -a
  • -b
  • -c
  • -d

All four are optional and don't require any additional arguments, so go ahead and click the 'Add' button 4 times to add 4 new parameters. Now for each parameter, edit the name to something meaningful and then in the bottom half of the window change the 'Arguments' selector box to '0', which tells the Pipeline that these parameters don't take any arguments from the user. Additionally, for each parameter, set the 'Switch' field in the lower part of the dialog to the appropriate value (-a or -b or -c or -d). At this point you may want to fill in a description for each parameter, so users will know what they do when they are turned on.

Because these parameters don't take any arguments we don't need to set the 'Type.' So far your screen should look something like the following figure:

[Figure: the parameters tab after adding the -a, -b, -c and -d parameters]

Now that we've added the first four, let's work on the next two parameters: -e and -f. Click 'Add' once for each parameter, and the Pipeline will add 2 more new parameters for you. Notice the order that you define the parameters, because that order is what the Pipeline will use to construct the command that gets issued to the system when it's executing workflows. In case any of your parameters are out of order, just click and drag them each into the order that you want.

Again, both of these parameters are optional so there's no need to check the 'Required' box in the parameter table. However, each of these are 'String' type parameters, so change the type from the default 'File' to 'String.' Also, notice that the -e takes in 1 argument and the -f takes in 3 arguments. Adjust each accordingly like you did with the previous parameters. Finally, enter the switch for each and give a helpful description of what each one does, so the end user can figure out how to work with the module.

There's something peculiar about the -f parameter and that's that it does not have a space separating it from its arguments on the command line. To tell the Pipeline about this in the module definition, uncheck the checkbox labeled 'Space after switch.'

Let's add the next parameter (the input files), so click 'Add' to place another parameter into the definition. Notice that this parameter takes 1 or more files, so we should set the 'Arguments' selector box to 'Infinite.' Also, because this parameter takes files as its arguments, we leave the 'Type' set to the default. However, we can tell the Pipeline a little more about this parameter by selecting the specific type of file that the program expects, so let's select 'Text file.' This helps the Pipeline check for valid connections between different modules, and helps users select files from their computer to bind to this parameter when using the module. If the file type needed for a parameter that you're defining is not listed, you can just leave it set to 'File,' which will accept any type of file.

Go ahead and add the last parameter (-o outputArgFile) to the definition. Make sure to uncheck the input checkbox in the parameter table next to this parameter. Your definition should look something like this:

[Figure: screenshot of the completed parameter definition]
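
To make the role of parameter order and the 'Space after switch' checkbox more concrete, here is a minimal Python sketch of how a command line could be assembled from a parameter table. This is not the Pipeline's actual code: the Parameter class, the build_command function, the ./example executable, and all argument values are hypothetical. It also assumes that when 'Space after switch' is unchecked, only the first argument is joined to the switch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Parameter:
    """Hypothetical stand-in for one row of the parameter table."""
    switch: str                                   # e.g. "-e"; "" means no switch
    values: List[str] = field(default_factory=list)
    space_after_switch: bool = True               # mirrors the checkbox


def build_command(executable: str, parameters: List[Parameter]) -> List[str]:
    """Assemble command tokens in the order the parameters are defined."""
    argv = [executable]
    for p in parameters:
        tokens = list(p.values)
        if p.switch:
            if p.space_after_switch or not tokens:
                tokens.insert(0, p.switch)
            else:
                # Assumption: only the first argument is fused to the switch.
                tokens[0] = p.switch + tokens[0]
        argv.extend(tokens)
    return argv


# Illustrative values only; the real program's arguments are not known here.
params = [
    Parameter("-e", ["abc"]),                                   # 1 argument
    Parameter("-f", ["x", "y", "z"], space_after_switch=False),  # 3 arguments
    Parameter("", ["in1.txt", "in2.txt"]),                      # 'Infinite' files
    Parameter("-o", ["outputArgFile"]),
]
command = build_command("./example", params)
print(" ".join(command))
```

Reordering the entries in params changes the order of the tokens in the result, which is why the order of the parameter table matters.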

6.1.2.3 Parameter arguments size

Every parameter in the Pipeline needs to be assigned the number of arguments that it accepts (enumerated types are set to 1 automatically). In most cases this is simply some constant number (1, 2, 3, 4, 5, ...) or it can be any number (Infinite). Sometimes the arguments size of a parameter depends on another parameter, and for this case there is a special arguments value of 'n'. If you select 'n' for the arguments size, a text field labeled 'Base' will appear, where you enter the name of the parameter that the 'n' sized parameter is based on. Then when the module is executed in a workflow, the 'n' sized parameter will have a number of arguments equal to that of the base parameter, which for any practical purpose should have its arguments size set to 'Infinite.' Let's demonstrate this with an example.

Suppose you have a program that can take in a (theoretically) unlimited number of inputs on the command line, and will process each of those inputs and create a corresponding output. Its command line syntax would look like the following:

./foo -inputs in1 in2 in3 in4... inn -outputs out1 out2 out3 out4... outn

So if we have 25 input files, we'll have 25 output files. To describe this in the Pipeline, make a new module with two parameters; one input and one output. Make the arguments size of the input 'Infinite' and the arguments size of the output 'n' with its base set to the name of the input parameter. Your module should then look something like the next figure:

[Figure: module with an 'Infinite' input parameter and an 'n' sized output parameter]
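
The relationship between an 'n' sized parameter and its base can be sketched in a few lines of Python. This is only an illustration of the rule described above, not the Pipeline's implementation; the function name build_foo_command is hypothetical, and ./foo is the example program from this section.

```python
from typing import List


def build_foo_command(inputs: List[str], outputs: List[str]) -> List[str]:
    """Build the ./foo command; outputs is 'n' sized with inputs as its base."""
    if len(outputs) != len(inputs):
        raise ValueError(
            f"expected {len(inputs)} outputs to match the base parameter, "
            f"got {len(outputs)}"
        )
    return ["./foo", "-inputs", *inputs, "-outputs", *outputs]


inputs = [f"in{i}" for i in range(1, 4)]    # in1, in2, in3
outputs = [f"out{i}" for i in range(1, 4)]  # out1, out2, out3
foo_command = build_foo_command(inputs, outputs)
print(" ".join(foo_command))
```

With 25 inputs the same function would require exactly 25 outputs, which is the constraint that the 'n' arguments size expresses.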

6.1.2.4 Parameter types

When you come across programs that need other types of parameters, refer to this list for information about each type supported by the Pipeline:

Directory
Choose this type for inputs when a program expects the path to an _already existing_ directory.
Choose it as an output parameter if the program expects a path to write data out to. Please note that the Pipeline will not create output directories for programs. It will specify a path for a directory to be created at when generating commands, but the actual directory creation is left up to the program.
Enumerated
This should be used for input parameters that accept an option that can only come from a limited set. For example, a program might accept only one of the following: "xx", "yy", "zz".
File
The most common type of parameter, but can be further categorized by choosing a file type defined in the Pipeline. In the future, support will be added for user defined filetypes. (NOTE: Choosing file types allows the Pipeline to establish connections between complementary parameters, and appends the appropriate extension to intermediate files being created between modules, which some programs rely on.)
Number
Either an integer or a float
String
Any string of characters required by parameters

6.1.2.5 User defined filetypes

If you have a module that has an input parameter of type File, you must specify at least one file type for the parameter. It can be the generic File, or a specific type of file. If you need to define a new file type, click on the New button. Enter the name, a description of the file type, the extension, and also any needed file(s) that have to be associated with this file type. Click OK, and the newly defined file type will be added as one of the options in the Acceptable file types window. Please note: the Pipeline determines filetype compatibility between connected parameters solely by checking for matching file extensions. The name and description of filetypes are not compared during compatibility tests.
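
The extension-only compatibility rule can be sketched as follows. This is a rough Python illustration under stated assumptions, not the Pipeline's code: the FileType class, the is_compatible function, and the sample file types are hypothetical, and representing the generic File type with an empty extension is an assumption made only for this sketch.

```python
from dataclasses import dataclass
from typing import Iterable

GENERIC_EXTENSION = ""  # assumption: the generic 'File' type has no extension


@dataclass(frozen=True)
class FileType:
    name: str
    description: str
    extension: str


def is_compatible(output_types: Iterable[FileType],
                  input_types: Iterable[FileType]) -> bool:
    """Compare extensions only; names and descriptions are ignored."""
    accepted = {t.extension for t in input_types}
    if GENERIC_EXTENSION in accepted:
        return True  # the generic File type accepts any file
    return any(t.extension in accepted for t in output_types)


# Hypothetical file types for illustration.
volume_a = FileType("Image volume", "defined by one user", "img")
volume_b = FileType("My volume", "defined by another user", "img")
text = FileType("Text file", "plain text", "txt")
generic = FileType("File", "any file", GENERIC_EXTENSION)

print(is_compatible([volume_a], [volume_b]))  # same extension
print(is_compatible([text], [volume_b]))      # different extension
print(is_compatible([text], [generic]))       # generic input
```

Note that volume_a and volume_b are treated as compatible even though their names and descriptions differ, because only the extension is checked.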


6.1.2.6 Advanced parameter information

While describing executables for use in the Pipeline, you will inevitably come across the need to use some of the advanced parameter features in the Pipeline. Right-click a simple module and select 'Edit Module Definition' to bring up the editing dialog for the module. Click on the Parameters tab, select the parameter you want to edit, and then click on the 'Advanced...' button at the bottom right of the dialog.

6.1.2.6.1 Select dependencies

On the left side of the advanced parameter dialog, you'll find a list of all the parameters in the module, except for the parameter that you're currently editing. By checking the box for a parameter, you mark it as a dependency: you're telling the Pipeline that if a user enables the current parameter (the one you're editing), then the user must also enable the checked parameters.
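
As a rough illustration of what a dependency means, the following Python sketch reports which required parameters are still disabled. It is not how the Pipeline implements the check; the function missing_dependencies and the parameter names are hypothetical.

```python
from typing import Dict, Set


def missing_dependencies(enabled: Set[str],
                         dependencies: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """For each enabled parameter, list its dependencies that are not enabled."""
    problems = {}
    for name in enabled:
        missing = dependencies.get(name, set()) - enabled
        if missing:
            problems[name] = missing
    return problems


# Hypothetical module: enabling 'smooth' requires 'kernel' and 'mask'.
deps = {"smooth": {"kernel", "mask"}}

print(missing_dependencies({"smooth", "kernel"}, deps))
print(missing_dependencies({"smooth", "kernel", "mask"}, deps))
```

An empty result means every enabled parameter has all of its dependencies enabled as well.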

6.1.2.6.2 Transformations

Sometimes an executable will take in an input and will automatically create an output whose name is just some variation of the input name. Let's use an example:

./foo infile

Let's assume the program gives the output the same name as the input but with .out appended to it. To handle this, create an output parameter in the 'Parameters' tab and then click on the 'Advanced...' button for the output parameter. In the 'Transformations' area of the dialog, set the base to the name of the input parameter. Then select the 'Append' transformation operation from the selection box and type in .out for the value. Click 'Add' and you're done! You've just created a side effect output. Note that as a result of specifying a base parameter in this dialog, the Pipeline will not place this parameter on the command line. It will simply use the transformed name as the location of the output and pass that on to successive modules for usage. Here are descriptions of how each transformation works:

Append
Add a string or regular expression to the end of the filename. Example: append:xxx
/tmp/myfile.img becomes /tmp/myfile.imgxxx
Prepend
Add a prefix string or regular expression to the beginning of the filename. Example: prepend:xxx
/tmp/myfile.img becomes /tmp/xxxmyfile.img
Replace
Replaces every occurrence of the find value with the replace value.
Example: find:my replace:your
/tmp/myfile.img becomes /tmp/yourfile.img
Subtract
Remove the string or regular expression from the end of the filename. If the string is not found at the end of the filename, nothing will happen.
Example: Subtract .img
/tmp/myfile.img becomes /tmp/myfile
Example: Subtract .hdr
/tmp/myfile.img stays as /tmp/myfile.img

Note that the transformation operations are only applied to the filename of the base parameter, not the entire path. Also, if you don't specify a base parameter, then the Pipeline will put this parameter on the command line, and will apply the transformations to the path string that gets passed on to the next module. If the parameter is an input, the transformations are applied to the incoming path string and then put on the command line. The transformations never rename the actual file; they only change the way it is referenced on the command line.
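
The four operations can be sketched in Python to show that they act on the filename only and leave the directory part of the path alone. This is a simplified illustration, not the Pipeline's implementation: the function names are hypothetical, only plain strings are handled (regular expressions are left out), and POSIX style paths are assumed.

```python
import posixpath


def append(name: str, value: str) -> str:
    return name + value


def prepend(name: str, value: str) -> str:
    return value + name


def replace(name: str, find: str, replacement: str) -> str:
    return name.replace(find, replacement)


def subtract(name: str, value: str) -> str:
    # Only remove the value when the filename actually ends with it.
    if value and name.endswith(value):
        return name[:-len(value)]
    return name


def transform(path: str, operation, *args: str) -> str:
    """Apply one operation to the filename, keeping the directory unchanged."""
    directory, filename = posixpath.split(path)
    return posixpath.join(directory, operation(filename, *args))


print(transform("/tmp/myfile.img", append, "xxx"))         # /tmp/myfile.imgxxx
print(transform("/tmp/myfile.img", prepend, "xxx"))        # /tmp/xxxmyfile.img
print(transform("/tmp/myfile.img", replace, "my", "your"))  # /tmp/yourfile.img
print(transform("/tmp/myfile.img", subtract, ".img"))      # /tmp/myfile
print(transform("/tmp/myfile.img", subtract, ".hdr"))      # /tmp/myfile.img
```

The side effect output from the ./foo example corresponds to transform(path, append, ".out") applied to the path of the input parameter.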

6.2 Module groups

As you continue to use the Pipeline, you will notice that your workflows are overflowing with modules. You might also find the same grouping of a few modules in many of your workflows, performing the same basic operation in all of them. In the spirit of promoting reusability and clean looking workflows, the Pipeline can represent a group of modules as a single module in a workflow. To demonstrate, let's use an example that is a combination of multiple modules available in the LONI Pipeline server library. If you don't have an account on the server, just follow along in the program and check the screenshots provided.

First off, make sure you've connected to the LONI Pipeline server before so you have the LONI library. Now we're going to create a reusable module group that performs an image registration and reslice.

  1. Drag the 'Align Linear' and 'Reslice AIR' modules into a new workflow
  2. Connect the output of 'Align Linear' to the input of 'Reslice AIR.'
  3. Double-click on the 'Module Number' parameter of 'Align Linear' and set it to any one of the values (doesn't matter what you set it to for this exercise)
  4. Right-click on the output of 'Reslice AIR' and click 'Export Parameter.' This will make the parameter visible on the outer module group (you'll see what that means in a second)
  5. Repeat step 4 on the 'Standard Volume' and 'Reslice Volume' parameters of the 'Align Linear' module as well.
  6. Now go to 'File->Properties' so we can fill in some info about this. Give the module group a name and a description and whatever else you want to fill in. You can even add an icon if you want. When you're done, click OK.
  7. Save the workflow into your personal library directory.

Now if we want to use this module group inside other workflows, all we have to do is open up the personal library, and drag in the module we just made (if your personal library was already open, click the refresh button in your personal library after you save the workflow for the module group to become visible). By default, it will be listed under the package name specified. If you did not specify a package name, it will be under 'Unknown.' Once you've found it, drag it into a workflow and bask in the fruits of your labor :-).

[Figure: the module group placed in a workflow, showing only the exported parameters]

As you can see, only the parameters that you exported are visible on your module group. This allows you to hide the complexity of the inner modules, which is quite beneficial when you encapsulate very large and complex workflows. You could theoretically have a module group that contains dozens of modules with just a single input and output if your task allowed for or benefited from it.

Now it's nice to be able to hide all that complexity in a workflow, but sometimes you really need to get into it. Just double-click on a module group and you'll zoom into the module and see its contents. The clickable 'Module Groupings' bread crumb bar at the top of the workflow lets you traverse through the levels of the workflow that you're viewing.


7. Advanced Topics

7.1 Syncing Execution Flow

If your pipeline requires the sequential execution of modules, but the modules do not have any dependencies on each other to regulate the ordering of the execution, then there is a way you can construct your workflow to preserve the order of execution. Let us look at a concrete example.

Module A feeds into Module B which in turn feeds into Module C. Module A creates an output directory /Test/MyDirectory/, which is used by both Module B and Module C. However, Module B does not create any output, and therefore there is no dependency between Module B and Module C, meaning that Module C is not guaranteed to wait until Module B completes before it starts executing. This is illustrated below:

[Figure: 01_Advanced_Syncing]

If you require Module C to execute after Module B completes, then you can create a connection between Module A and Module C, and the connection between Module B and Module C becomes a dummy connection that will force Module C to wait until Module B completes. In order to configure the dummy connection, set the output parameter of Module B to be of type File, with the number of arguments at 0. Next, set the input parameter of Module C to type File, also with the number of arguments at 0. See the modified workflow below (the dummy connection is highlighted):

[Figure: 02_Advanced_Syncing]

Now, both Module B and Module C can use the output directory that Module A creates, and the workflow will guarantee that Module C only executes after Module B is complete. This allows for syncing of modules that do not have a direct dependency on each other.
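
The effect of the dummy connection can be illustrated with a small Python sketch that lists which modules are allowed to start once a given set of modules has finished. This is a toy model, not the Pipeline's scheduler: the runnable function and the dictionaries are hypothetical, and the sketch assumes that without the dummy connection Module C effectively waits only for Module A.

```python
from typing import Dict, Set


def runnable(waits_for: Dict[str, Set[str]], finished: Set[str]) -> Set[str]:
    """Modules that have not run yet and whose prerequisites have all finished."""
    return {
        module
        for module, prerequisites in waits_for.items()
        if module not in finished and prerequisites <= finished
    }


# Each module maps to the set of modules it must wait for.
without_dummy = {"A": set(), "B": {"A"}, "C": {"A"}}
with_dummy = {"A": set(), "B": {"A"}, "C": {"A", "B"}}  # B -> C dummy connection

print(sorted(runnable(without_dummy, {"A"})))  # B and C may both start
print(sorted(runnable(with_dummy, {"A"})))     # only B may start
print(sorted(runnable(with_dummy, {"A", "B"})))  # now C may start
```

In the second model Module C becomes runnable only after both Module A and Module B have finished, which is the ordering the dummy connection is meant to enforce.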

Last updated 04 February 2008 at 14:54 PST