To provide access to HPC from the Fiji environment, we use the in-house HEAppE Middleware framework, which allows end users to access an HPC system through web services and remotely execute pre-defined tasks. HEAppE is designed to be universal and applicable to various HPC architectures. It also maps external users to the internal cluster service accounts that are used for the actual job submission to the cluster, which simplifies access to the computational resources from both the security and the administrative point of view. For security purposes, users are permitted to run only a pre-prepared set of so-called command templates. Each command template defines an arbitrary script or executable file to be run on the cluster, a set of input parameters modifiable at runtime, any dependencies or third-party software it might require, and the type of queue that should be used for the processing.
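The structure of a command template can be illustrated with a minimal Python sketch. This is not HEAppE's actual data model or API: the class, its field names, the script name <code>run_pipeline.sh</code> and the queue name <code>production</code> are all hypothetical and only mirror the four elements listed above (executable, runtime parameters, dependencies, queue type).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommandTemplate:
    """Hypothetical model of a command template; all field names are illustrative."""
    name: str
    executable: str            # script or executable run on the cluster
    parameters: tuple          # names of parameters the user may set at runtime
    dependencies: tuple = ()   # third-party software the script requires
    queue: str = "default"     # type of queue used for the processing

    def build_command(self, **values):
        """Build the argument list, rejecting anything the template does not declare."""
        unknown = set(values) - set(self.parameters)
        missing = set(self.parameters) - set(values)
        if unknown or missing:
            raise ValueError(f"unknown={sorted(unknown)} missing={sorted(missing)}")
        args = [self.executable]
        for name in self.parameters:
            args += [f"--{name}", str(values[name])]
        return args


# Hypothetical template for launching the processing pipeline
template = CommandTemplate(
    name="run_spim_pipeline",
    executable="run_pipeline.sh",
    parameters=("config", "cores"),
    dependencies=("snakemake",),
    queue="production",
)
command = template.build_command(config="config.yaml", cores=24)
```

Because the user supplies only values for declared parameters, and never the executable itself, the set of actions that can be triggered remotely stays limited to what the administrators prepared in advance.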
We developed a Fiji plugin built on HEAppE, which enables users to steer workflows running on a remote HPC resource. As a representative workflow we use a Snakemake-based SPIM data processing pipeline operating on large image datasets. The Snakemake workflow engine resolves dependencies between subsequent steps and executes in parallel any tasks that are independent of each other, such as the processing of individual time points of a time-lapse acquisition.
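The idea of dependency-driven parallel execution can be sketched with Python's standard library. This is not how Snakemake itself is implemented; it is a simplified illustration that runs tasks in "waves", where every task in a wave has all of its prerequisites finished. The task names and the helper <code>run_in_waves</code> are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter  # Python 3.9+


def run_in_waves(dependencies, run_task, max_workers=4):
    """Run tasks wave by wave; tasks within one wave have no unmet dependencies."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            ready = sorted(sorter.get_ready())
            list(pool.map(run_task, ready))  # blocks until the whole wave has finished
            sorter.done(*ready)
            waves.append(ready)
    return waves


# Each task maps to the set of tasks it depends on (hypothetical names).
# Three time points are converted and registered independently, then merged.
dependencies = {
    "register_t0": {"convert_t0"},
    "register_t1": {"convert_t1"},
    "register_t2": {"convert_t2"},
    "merge_xml": {"register_t0", "register_t1", "register_t2"},
}
waves = run_in_waves(dependencies, run_task=lambda name: name)
```

In this example the three conversions form the first wave, the three registrations the second, and the single merge step the third. A real scheduler would typically start a task as soon as its own prerequisites finish rather than waiting for a whole wave.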
== SPIM data processing pipeline ==
The sheer volume of SPIM data requires conversion from raw microscopy data to Hierarchical Data Format (HDF5) for efficient input/output access and visualization in Fiji's BigDataViewer (BDV) [https://imagej.net/BigDataViewer#Publication]. BDV uses an XML file to store experiment metadata (e.g. the number of angles, time points and channels). Although the conversion to HDF5 is parallelizable, updating the XML file further downstream in the pipeline is not: per-time-point XML files have to be created and then merged after the registration and fusion steps are complete. Consequently, the parallel processing of individual time points on an HPC resource (conversion to HDF5, registration, fusion and deconvolution) is interrupted by non-parallelizable steps (time-lapse registration and XML merging).
Pipeline input parameters are entered by the user into a config.yaml configuration file. In the first step, the .czi raw data are resaved into the HDF5 container in parallel on the cluster. Similarly, the individual time points are registered in parallel on the cluster using fluorescent beads as fiducial markers. Subsequently, a non-parallel job executed by Snakemake consolidates the registration XML files into a single file, followed by time-lapse registration using the beads segmented during the spatial registration step. After this, the pipeline diverges into either parallel content-based fusion or parallel multi-view deconvolution. To achieve this divergence in practice, the Snakemake pipeline is launched from the Fiji plugin as two separate jobs using two different config.yaml files, set to execute content-based fusion and deconvolution respectively. In the final stage of the pipeline, the fusion/deconvolution output is saved into a new HDF5 container. The figure below shows the results of registration, fusion and deconvolution at different time points.
== HPC Cluster ==
Execution of the Snakemake pipeline from the implemented Fiji plugin was tested on the Salomon supercomputer, which consists of 1 008 compute nodes, each equipped with 2x12-core Intel Haswell processors and 128 GB RAM, providing a total of 24 192 compute cores of x86-64 architecture and 129 TB RAM. Furthermore, 432 nodes are accelerated by two Intel Xeon Phi 7110P accelerators with 16 GB RAM each, providing an additional 52 704 cores and 15 TB RAM. The total theoretical peak performance reaches 2 000 TFLOPS. The system runs Red Hat Linux.
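The aggregate core and memory figures follow directly from the per-node specification. The short calculation below reproduces them; the number of cores per accelerator is not stated in the text and is only inferred here from the quoted totals, and decimal units (1 TB = 1 000 GB) are assumed.

```python
# Host (CPU) partition
nodes = 1008
cores_per_node = 2 * 12          # two 12-core processors per node
ram_per_node_gb = 128

cpu_cores = nodes * cores_per_node              # total x86-64 cores
cpu_ram_tb = nodes * ram_per_node_gb / 1000     # decimal terabytes (assumption)

# Accelerated partition
accelerated_nodes = 432
accelerators = accelerated_nodes * 2            # two accelerators per node
accelerator_cores_total = 52704                 # figure quoted in the text

# Cores per accelerator implied by the quoted total (inferred, not stated)
cores_per_accelerator, remainder = divmod(accelerator_cores_total, accelerators)

print(cpu_cores, round(cpu_ram_tb), accelerators, cores_per_accelerator)
```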
The pipeline was tested on a dataset used in experiments run on the Madmax cluster at MPI-CBG [https://imagej.net/Automated_workflow_for_parallel_Multiview_Reconstruction]. The Madmax cluster had 44 nodes with two Intel Xeon E5-2640, 2.5 GHz CPUs with 6 cores each (average CPU PassMark 9 498). In comparison, Salomon nodes are equipped with two Intel Xeon E5-2680v3, 2.5 GHz CPUs with 12 cores each (average CPU PassMark 18 626). Salomon thus runs a newer generation of Xeon processors (Haswell), providing roughly double the performance of the Sandy Bridge architecture used on Madmax.
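The comparison can be checked from the quoted PassMark scores alone. The sketch below computes the ratio of the two scores and, as a rough illustration only, divides each score by the core count of the CPU; the per-core figure is a naive derived quantity and not a published benchmark.

```python
# Figures quoted in the text (average CPU PassMark score, cores per CPU)
madmax = {"passmark": 9498, "cores": 6}      # Intel Xeon E5-2640
salomon = {"passmark": 18626, "cores": 12}   # Intel Xeon E5-2680v3

# Ratio of the per-CPU scores
ratio = salomon["passmark"] / madmax["passmark"]

# Naive score per core, for illustration only
madmax_per_core = madmax["passmark"] / madmax["cores"]
salomon_per_core = salomon["passmark"] / salomon["cores"]

print(f"score ratio: {ratio:.2f}")
print(f"score per core: {madmax_per_core:.0f} vs {salomon_per_core:.0f}")
```

On these numbers the per-CPU score ratio is close to two, while the naive per-core scores are similar, which suggests that most of the difference comes from the doubled core count at the same nominal clock frequency.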