27 March 2013

Galaxy on the Cloud with CloudMan

CloudMan is a tool that facilitates Galaxy deployments in cloud environments. Quoting the brief description from the official web site:

CloudMan is a cloud manager that orchestrates the steps required to provision and manage compute clusters on a cloud infrastructure, all through a web browser. In minutes, you can use a vanilla SGE cluster or a complete bioinformatics analysis environment.

As such, it was an obvious choice for deploying our Galaxy-based Image Processing and Analysis Toolkit on the NeCTAR cloud. Although spawning new clusters is quite easy once CloudMan is set up on a cloud, a fair amount of work is required to get to that stage, including:

  • creating images and volume snapshots with our customized Galaxy
  • setting up the CloudMan/Galaxy configuration files in object storage
  • … and contributing to the CloudMan project itself to fix some issues with OpenStack compatibility

The good news is that we have managed to deploy CloudMan and our customized Galaxy with image processing tools on an OpenStack cloud. We can now build on-demand Galaxy/SGE (Sun Grid Engine) clusters with up to 20 nodes.

In this configuration, Galaxy uses the DRMAA specification to submit computational tasks to SGE, which schedules them for execution on the cluster. Given a sufficient cloud resource allocation, this lets us scale the application as the number of users grows.
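To give a feel for what a DRMAA submission looks like, here is a minimal sketch using the third-party drmaa Python bindings. It is not Galaxy's own job runner code. The helper names (sge_native_spec, submit_and_wait), the queue name "all.q" and the parallel environment name "mpi" are assumptions made for illustration; the real names depend on how the SGE cluster is configured.

```python
def sge_native_spec(queue=None, parallel_env=None, slots=1):
    """Build a qsub-style option string for SGE.

    -q selects a queue and -pe requests a parallel environment with a
    number of slots. The queue and environment names are site-specific.
    """
    parts = []
    if queue:
        parts += ["-q", queue]
    if parallel_env:
        parts += ["-pe", parallel_env, str(slots)]
    return " ".join(parts)


def submit_and_wait(command, args=(), native_spec=""):
    """Submit one job through DRMAA and block until it finishes.

    Requires the third-party 'drmaa' package and the scheduler's
    libdrmaa library, so the import is kept local to this function.
    """
    import drmaa

    with drmaa.Session() as session:
        template = session.createJobTemplate()
        try:
            template.remoteCommand = command
            template.args = list(args)
            template.joinFiles = True  # merge stdout and stderr
            if native_spec:
                template.nativeSpecification = native_spec
            job_id = session.runJob(template)
            info = session.wait(job_id, drmaa.Session.TIMEOUT_WAIT_FOREVER)
            return job_id, info.exitStatus
        finally:
            session.deleteJobTemplate(template)
```

On a cluster node, a call such as submit_and_wait("/bin/hostname", native_spec=sge_native_spec(queue="all.q")) would hand the job to SGE and return its exit status once it completes.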

We have also enabled MPI support in the cloud SGE, which we can now use to run MPI-based tools (e.g. some of the X-TRACT components) that use the cluster resources to speed up parallelizable computation.

CloudMan relies heavily on volumes (cloud block storage) and volume snapshots, which are not currently publicly available in the NeCTAR cloud. Experimental support has been around for a while, and it seems that a production version may be available soon, at which point we can migrate our deployment to NeCTAR.

Alternatively, we are considering modifying CloudMan to work without volumes.

Neuro-imaging pipelines

We have implemented a number of pipelines as part of our NeCTAR RT035 project. Brief descriptions of these pipelines are given below.

The SUVR tool provides intensity normalisation of PET images for quantitative purposes.
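As a rough illustration of the idea, a standardised uptake value ratio (SUVR) is usually obtained by dividing every voxel by the mean uptake in a reference region. The sketch below works on a flattened list of voxel values and is not the tool's actual implementation; the function name suvr and the choice of reference mask are assumptions.

```python
def suvr(pet_values, reference_mask):
    """Divide each PET voxel by the mean uptake inside the reference region.

    pet_values: flat list of voxel intensities.
    reference_mask: flat list of truthy/falsy flags of the same length,
    marking the voxels that belong to the reference region.
    """
    if len(pet_values) != len(reference_mask):
        raise ValueError("image and mask must have the same number of voxels")
    reference = [v for v, inside in zip(pet_values, reference_mask) if inside]
    if not reference:
        raise ValueError("reference region is empty")
    reference_mean = sum(reference) / len(reference)
    if reference_mean == 0:
        raise ValueError("reference region has zero mean uptake")
    return [v / reference_mean for v in pet_values]
```

After normalisation, voxels inside the reference region average 1.0, which makes values comparable across subjects and scans.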

The Registration tool allows the user to perform affine or rigid transforms when registering two images together.
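The difference between the two transform types can be shown in a few lines. A rigid transform only rotates and translates, so it preserves distances, while an affine transform may also scale and shear. The sketch below uses 2D homogeneous matrices for brevity (the real tool works on 3D images), and all function names are made up for this example.

```python
import math


def rigid_2d(angle_deg, tx, ty):
    """Return a 3x3 homogeneous matrix: rotation by angle_deg, then translation."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, tx],
            [s, c, ty],
            [0.0, 0.0, 1.0]]


def apply_transform(matrix, point):
    """Apply a 3x3 homogeneous transform to a 2D point."""
    x, y = point
    return (matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
            matrix[1][0] * x + matrix[1][1] * y + matrix[1][2])


def is_rigid(matrix, tol=1e-9):
    """True if the linear part is a pure rotation (orthonormal, determinant +1)."""
    a, b = matrix[0][0], matrix[0][1]
    c, d = matrix[1][0], matrix[1][1]
    orthonormal = (abs(a * a + c * c - 1.0) < tol
                   and abs(b * b + d * d - 1.0) < tol
                   and abs(a * b + c * d) < tol)
    return orthonormal and abs((a * d - b * c) - 1.0) < tol
```

A matrix such as [[2, 0, 0], [0, 1, 0], [0, 0, 1]] stretches the x axis, so it is a valid affine transform but is_rigid reports False for it.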

The Segmentation tool allows the user to segment the brain in a given MRI image.

Alzheimer’s disease and other neurodegenerative diseases are associated with the loss of grey matter in the cortex. It is therefore necessary to quantify this loss. We use the cortical thickness estimation (CTE) tool to perform this analysis.

Overview of main functions of CTE implemented:
  • Atlas registration
    • Align an atlas image to a target image
  • Segmentation
    • Segment the MRI into grey matter (GM), white matter (WM) and cerebrospinal fluid (CSF)
  • Bias Field Correction
    • Estimate and remove the smooth intensity inhomogeneity (bias field) from the image
  • Partial Volume Estimation
    • Quantify the amount of partial voluming inside each voxel
  • Topology Correction
    • Correct the topology of the brain segmentation to ensure that it is genus zero
  • Thickness Estimation
    • Compute the thickness of the cortex for each grey matter voxel
Outputs from the CTE pipeline (above) are used as inputs to the CTE surface pipeline, which transfers all cortical thickness values generated by the CTE pipeline to a common template mesh.

Overview of main functions implemented for CTE Surface:
  • Cortical Surface Extraction
    • Extract a 3D mesh from the brain segmentation
  • Topological Correction
    • Remove holes and handles from the mesh
  • Biomarker mapping on cortical surface
    • Map various values onto a mesh, e.g. thickness, PET values, MR intensity
  • Surface registration
    • Align the meshes of any given subject to a template to obtain a correspondence across subjects
  • Transfer of biomarkers on template surface
    • Map all values from all subjects to a common space where they can be compared
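Both pipelines include a topology correction step, and the "genus zero" requirement can be checked with the Euler characteristic: for a closed, connected, orientable triangle mesh, V - E + F = 2 - 2g, where g is the number of handles. The sketch below is only an illustration of that formula, not the correction algorithm used in the pipeline; the function names and the small test meshes are made up for this example.

```python
def genus(faces):
    """Genus of a closed, connected, orientable triangle mesh.

    faces: iterable of (a, b, c) vertex index triples.
    Uses the Euler characteristic: V - E + F = 2 - 2 * genus.
    """
    faces = list(faces)
    vertices = set()
    edges = set()
    for a, b, c in faces:
        vertices.update((a, b, c))
        # Store each edge as an unordered pair so shared edges count once.
        edges.update(frozenset(pair) for pair in ((a, b), (b, c), (a, c)))
    euler = len(vertices) - len(edges) + len(faces)
    if (2 - euler) % 2 != 0:
        raise ValueError("mesh is not a closed orientable surface")
    return (2 - euler) // 2


def torus_faces(n, m):
    """Triangulated n-by-m grid wrapped in both directions (n, m >= 3)."""
    faces = []
    for i in range(n):
        for j in range(m):
            a = i * m + j
            b = ((i + 1) % n) * m + j
            c = ((i + 1) % n) * m + (j + 1) % m
            d = i * m + (j + 1) % m
            faces.append((a, b, c))
            faces.append((a, c, d))
    return faces


# A tetrahedron is topologically a sphere (no handles).
TETRAHEDRON = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
```

Here genus(TETRAHEDRON) gives 0, while genus(torus_faces(3, 3)) gives 1 because a torus has one handle. A cortical surface mesh with a non-zero genus still contains handles that need to be removed.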

Galaxy allows us to create a workflow by joining two or more pipelines together. We use workflows to connect the CTE with CTE surface pipelines as shown in the picture:

Galaxy-based workflows for CTE

The partial volume effect is the loss of apparent activity in small objects or regions because of the limited resolution of the imaging system. It occurs in medical imaging such as positron emission tomography (PET). The method to correct for the partial volume effect is referred to as the partial volume correction.
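A one-dimensional toy example makes the effect easy to see. The sketch below blurs a signal with a simple moving average standing in for the scanner's limited resolution. This is an assumption made for simplicity (a real scanner's blurring is closer to a Gaussian), and the function name box_blur and the numbers are made up for illustration.

```python
def box_blur(signal, width):
    """Moving-average blur with an odd window width and zero padding."""
    if width % 2 == 0:
        raise ValueError("width must be odd")
    half = width // 2
    blurred = []
    for i in range(len(signal)):
        total = 0.0
        for offset in range(-half, half + 1):
            j = i + offset
            if 0 <= j < len(signal):
                total += signal[j]
        blurred.append(total / width)
    return blurred


# Two objects with the same true activity (10) but different sizes.
small_object = [0.0] * 10 + [10.0] * 2 + [0.0] * 10
large_object = [0.0] * 10 + [10.0] * 10 + [0.0] * 10

small_peak = max(box_blur(small_object, 5))
large_peak = max(box_blur(large_object, 5))
```

The large object keeps its full peak value of 10.0, while the small object, which is narrower than the blur window, peaks at only 4.0. That apparent loss of activity is what partial volume correction tries to undo.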

Overview of main functions implemented for PET PVC:
  • PVC registration
    • Registration of the PET Image to its corresponding MRI
  • Segmentation
    • Segmentation of the MRI into GM, WM and CSF
  • Partial Volume Correction (PVC)
    • Correction for spill-in and spill-over of the PET image using the MRI segmentation
Currently, all pipelines described above use version 3.20.1 of the Insight Toolkit (ITK). We are migrating them to the latest version of ITK (4.3).

26 March 2013

Imaging workflows in Galaxy

We use Galaxy to provide user-friendly access to our imaging tools. Galaxy allows you to create processing workflows by linking different tools to one another in a simple graphical way.

Constructing workflows in Galaxy is very easy. 

As you can see, it works from a web browser. The tools are in the left pane. This example uses Cellular Imaging tools (I'll give you a tour of them in my next post).
To keep you interested, we will soon post a series of video tutorials on how to use Galaxy for image analysis.