Please note: This version of the software is outdated. We highly recommend using the new automated workflow, which is more user friendly, efficient and versatile thanks to full automation of the processing.
Latest release: May 2013
Introduction
Light sheet microscopy such as SPIM produces enormous amounts of data, especially when used in long-term time-lapse mode. In order to view and, in some cases, analyze the data, it is necessary to process them. This involves registration of the views within each time-point, correction of sample drift across the time-lapse, fusion of the data into a single 3D image per time-point (which may require multiview deconvolution) and 3D rendering of the fused volumes. Here we describe how to perform such processing in parallel on a cluster computer.
We will use data derived from the Lightsheet Z.1, a commercial realisation of SPIM offered by Zeiss. Lightsheet Z.1 data can be truly massive, and cluster computing may well be the only way to deal with the data deluge coming off the microscope.
Every cluster is different, both in terms of the hardware used and the software running on it, particularly the scheduling system. Here we use a cluster computer at the MPI-CBG that consists of 44 nodes, each with 12 Intel Xeon E5-2640 cores running at 2.50 GHz and 128GB of memory. The cluster nodes have access to 200TB of data storage provided by a dedicated Lustre server architecture. For more info on Lustre see here; suffice to say that it is optimised for high-performance input/output (read/write) operations, which is crucial for SPIM data volumes.
Each node of this cluster runs the CentOS 6.3 Linux distribution. The queuing system running on the MPI-CBG cluster is LSF (Load Sharing Facility). The basic principles of job submission are the same across queuing systems, but the exact syntax will of course differ.
Note on versions
SPIM registration is software under ongoing development. The original set of plugins, gathered under SPIM registration, was replaced in 2014 by a new set of plugins gathered under Multiview reconstruction. Moreover, the cluster pipeline was changed to use a centralised, Linux-style master file. In 2015 this pipeline was reimplemented as an automated workflow using the workflow manager Snakemake, which maps and dispatches the workflow logic automatically, either on a single machine or on an HPC cluster. Therefore there are 4 versions available. We highly recommend using the latest version:
Original SPIM registration pipeline - contains the most detailed description of the cluster pipeline using SPIM registration plugins. If you do not have much HPC/Linux experience start here.
NEW PIPELINE - also uses the SPIM registration plugins and introduces the master file; it is less verbose and requires some experience with the command line and HPC.
So, if you are new, read a bit of chapter 1 (original pipeline) to get familiar and then skip to chapter 3 (Multiview reconstruction pipeline), which is more up-to-date. To understand how the master file works, refer to chapter 2 (NEW PIPELINE).
Original SPIM registration pipeline
Pre-requisites
Saving data on the Lightsheet Z.1
The Lightsheet Z.1 data are saved into the proprietary Zeiss file format *.czi. Zeiss is working with Bio-Formats to make the .czi files compatible with open-source platforms including Fiji. At the moment Fiji can only open .czi files that are saved as a single file per view, where the left and right illumination images have been fused into one image inside the Zeiss ZEN software. This situation is going to change; for now, if you want to process the data with Fiji, save them in that way (TBD).
Getting familiar with Linux command line environment
It is very likely that the cluster computer does not run ANY Graphical User Interface and relies exclusively on the command line. Steering a cluster from the command line is fairly easy - I use about 10 different commands to do everything I need to do. Since the Linux command line may be unfamiliar to most biologists, we have started a separate Linux command line tutorial page that explains the bare essentials.
Transferring data
First we have to get the data to the cluster. This is easier said than done, because we are potentially talking about terabytes of data. Moving data over 10Gb Ethernet is highly recommended; otherwise the data transfer will take days.
Please note that currently the Zeiss processing computer does not support data transfer while the acquisition computer is acquiring, which means that you need to include the transfer time when booking the instruments. Transferring 5TB of data over a shared 1Gb network connection will take a while…
Installing Fiji on the cluster
In case you use the MPI-CBG cluster ‘madmax’ you might spare yourself some minutes and just hijack Pavel’s well-maintained Fiji installation. Just skip the Fiji installation section and do not change the path to the Fiji executables (/sw/people/tomancak/packages/...) used in the example scripts shown below.
Change to a directory where you have sufficient privileges to install software.
In all likelihood you will need the Linux (64-bit) version (unless, of course, you are using some sort of Windows/Mac cluster). Unzip and unpack the tarball.
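For example, assuming the downloaded archive is named fiji-linux64.tar.gz (adjust to the file you actually downloaded):

tar -xzf fiji-linux64.tar.gz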
Change to the newly created Fiji.app directory and update Fiji from the command line.
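In recent Fiji versions the command-line updater can be invoked through the launcher roughly like this (in older installations the launcher may be called fiji-linux64 instead):

cd Fiji.app
./ImageJ-linux64 --update update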
The output that follows may have some warnings and errors, but as long as it says somewhere “Done: Checksummer” and “Done: Downloading…” everything should be fine.
Done, you are ready to use Fiji on the cluster.
Renaming files
We need to change the file name from a simple index to a pattern that contains the time point and the angle information.
The output files of the Zeiss SPIM look like this:
In this example we have 5 angles. The files displayed show the first time point of this time series. The first file (which does not contain an index) is the master file. This file would open all subsequent files in the Zeiss software, but it also contains the first angle of the first time point. We need to give this file the index (0) in order to use it. Neglecting this file will result in a frame shift in the data.
Now we can rename the files using the following shell script. Make a script with the name rename-zeiss-files.sh and modify the angles, the last index and the source pattern.
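A hedged sketch of such a rename-zeiss-files.sh; the base name, the angles and the last index are examples and must be adapted to your dataset (remember to first add the (0) index to the master file as described above):

#!/bin/bash
# Map the running Zeiss index (angles cycle fastest) onto spim_TL{t}_Angle{a}.czi names.
angles=(325 235 280)            # acquisition angles, in acquisition order
num_angles=${#angles[@]}
last_index=719                  # last index = timepoints * angles - 1
source="2013-05-02_Tassos"      # assumed base name of the .czi files written by ZEN

for i in $(seq 0 $last_index); do
    tp=$(( i / num_angles + 1 ))
    angle=${angles[$(( i % num_angles ))]}
    mv "${source}($i).czi" "spim_TL${tp}_Angle${angle}.czi"
done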
The files should now be named like this:
To check if all time points and angles of a time point are present, you can use the following script. Modify the time points and the number of angles.
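A hedged sketch of such a check script (adjust the angles and the number of time-points to your dataset):

#!/bin/bash
# Report time-points with missing files for any of the listed angles.
angles="325 235 280"
for tp in $(seq 1 240); do
    for a in $angles; do
        if [ ! -e "spim_TL${tp}_Angle${a}.czi" ]; then
            echo "time-point ${tp} is missing angle ${a}"
        fi
    done
done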
The script will report the specific time points that are missing or that have missing angles.
Saving data as tif
As a first step we will open the .czi files and save them as .tif. This is necessary because Fiji’s bead based registration currently cannot open the .czi files. Opening hundreds of files several GB each sequentially and re-saving them as tif may take a long time on a single computer. We will use the cluster to speed-up that operation significantly.
The Lustre filesystem on the MPI-CBG cluster is built to handle such a situation, where hundreds of nodes simultaneously read and write big files. If your cluster is using a Network File System (NFS), this may not be such a good idea…
We have a 240 time-point, 3-view dataset (angles 325, 235 and 280) in a directory
we create a subdirectory jobs/resaving and change to it
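For example:

mkdir -p jobs/resaving
cd jobs/resaving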
Now we create a bash script create-resaving-jobs that will generate the so-called job files that will be submitted to the cluster nodes (I use nano but any editor will do; using nano, type nano create-resaving-jobs and cut&paste the script below into that file).
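A hedged sketch of what create-resaving-jobs might look like; the Fiji launcher path, the memory options and the launcher flags are assumptions and must be adapted to your installation:

#!/bin/bash
# Sketch: write one resave-<number>.job file per time-point for a single angle.
dir="/projects/tomancak_lightsheet/Tassos"                    # where the .czi files live
jobs="/projects/tomancak_lightsheet/Tassos/jobs/resaving"     # where the .job files go
fiji="/sw/people/tomancak/packages/Fiji.app/fiji-linux64"     # assumed path to the Fiji launcher
angle=280                                                     # angle processed by this batch

for i in $(seq 1 240); do
    job="$jobs/resave-$i.job"
    cat > "$job" << EOF
#!/bin/bash
xvfb-run -a -s "-screen 0 1280x1024x24" $fiji -Xms10g -Xmx10g -Ddir=$dir -Dtimepoint=$i -Dangle=$angle -- --no-splash $jobs/resaving.bsh
EOF
    chmod a+x "$job"
done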
We customize the script by editing the parameters inside it. One can think of it as a template that is used as a starting point to adapt to the particular situation. For instance, we can change the directory where the data are to be found (dir), the place where the output will go (jobs), the number of time-points to process (for i in $(seq 1 240)) and, most importantly, the angle to be processed (-Dangle=280). The strategy we follow here is to create jobs to process one angle at a time for all available time-points.
In order to be able to run this (and other scripts we will create further below), you might have to execute the following command: chmod a+x create-resaving-jobs. Finally execute the script by calling ./create-resaving-jobs (you will have to be in the folder containing the script).
This will generate 240 resave-<number>.job files in the current directory
Running this job on any cluster node will launch Fiji in a so-called virtual frame buffer (the nodes don’t have graphics capabilities enabled, but we can simulate that) and then, inside Fiji, it will launch a BeanShell script called resaving.bsh, passing it three parameters: the directory (/projects/tomancak_lightsheet/Tassos), the time-point (38) and the angle (280).
Let's create that script in the current directory
The t_begin=1000 t_end=1000 parameters are passed to the Bio-Formats Importer. This is a hack: the .czi files think they are part of a long time-lapse despite having been saved as single, per-angle .czi files. In order to trick Bio-Formats into opening just the timepoint which contains actual data, we set the time coordinate way beyond the actual length of the time-course (in this case 240). This results in Bio-Formats importing the “last” timepoint in the series, which contains the data. This will change!
Now we need to create yet another bash script (last one) called submit-jobs
This will look into the current directory for all files ending with .job (we created them before) and submit all of them to the cluster with the bsub command; a sketch of the script follows the list of options below.
-q short selects the queue to which the job will be submitted (this one allows jobs that run up to 4 hours on MPI-CBG cluster).
-n 1 specifies how many processors the job will request, in this case just one (we will only open and save one file)
-R span[hosts=1] says that if we were requesting more than one processor, they would be on a single physical machine (host).
-o "out.%J" will create an output file called out.<job_number> in the current directory
-e "err.%J" will send errors to the file called err.<job_number> in the current directory
${1}/$file will evaluate to ./resave-<number>.job i.e. the bash script that the cluster node will run - see above
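Putting these options together, a hedged sketch of submit-jobs (the loop itself is an assumption about how the original script was written):

#!/bin/bash
# Submit every .job file found in the directory given as the first argument.
for file in $(ls "${1}" | grep "\.job$"); do
    bsub -q short -n 1 -R "span[hosts=1]" -o "out.%J" -e "err.%J" "${1}/$file"
done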
Let's recapitulate. We have created create-resaving-jobs that, when executed, creates many resave-<number>.job files. Those are going to be submitted to the cluster using submit-jobs and on the cluster nodes will run resaving.bsh using Fiji and the specified parameters.
So let’s run it. We need to issue the following command
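./submit-jobs .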
The dot at the end tells submit-jobs where to look for .job files, i.e. in the current directory. What you should see is something like this:
We can monitor running jobs with
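bjobs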
or whatever your submission system offers. At the end of the run we will have a lot of err.<job_number> and out.<job_number> files in the working directory.
err.445490
out.445490
err.445491
out.445491
....
The err.* files are hopefully empty. The out.* files contain the Fiji log output, if any; in this case it should look something like this. Most importantly, in the directory /projects/tomancak_lightsheet/Tassos we now have for each .czi file a corresponding .tif file, which was the goal of the whole exercise.
We can remove the .czi files (rm *.czi) as we do not need them anymore (but check some of the tifs first!).
Now we must repeat the whole procedure for the other two angles (325 and 235). Open create-resaving-jobs and change 280 to 325 and follow the recipe again. There are of course ways to automate that.
On our cluster powered by the Lustre filesystem the resaving operation takes only minutes. Imagine what is happening - up to 480 processors are accessing the file system, reading .czi files and immediately resaving them to that very same filesystem as .tif - all at the same time. The files are 1.8GB each. Beware: this may not work at all on lesser filesystems - the Lustre is made for this.
Registration
SPIM registration consists of within time-point registration of the views followed by across time-point registration of the time-series. Both are achieved using Fiji’s bead based SPIM registration plugin. The per-time-point registration is a pre-requisite for time-lapse registration. For detailed overview see here.
Bead-based multi-view registration
The first real step in the SPIMage processing pipeline, after re-saving as .tif, is to register the views within each timepoint. We will use for that the bead-based registration plug-in in Fiji. The principles of the plug-in are described here, while the parameters are discussed here.
This description focuses on cluster processing and is less verbose; for details see the section on resaving, as the principles are the same.
In a directory jobs/registration create bash script create-registration-jobs
Run it to create 240 registration-<number>.job bash scripts
which will run registration.bsh using Fiji
on a cluster node when submitted by submit-jobs
Some tips and tricks
the bead based registration code is NOT multi-threaded, thus 1 processor is sufficient (bsub -n 1)
the registration needs enough memory on the node to simultaneously open all views (3x1.8GB here). Since our nodes have 128GB of shared memory, this is not really an issue here; we can run registrations on all 12 cores of one machine at the same time.
the crucial parameter for bead-based registration is channel_0_threshold=0.0069; determine it on a local workstation using the Fiji GUI. Clusters typically do not have a graphical interface.
Time-lapse registration
Once the per-time-point registration is finished it is necessary to register all the time-points in the time-series to a reference time-point (to remove potential sample drift during imaging). The parameters for time series registration are described here.
The time-series registration is not really a cluster type of task as it is run on a single processor in a linear fashion. But since until now we have everything on the cluster filesystem it is useful to execute it here. Note: I do not mean that timelapse registration cannot be parallelized, we just have not implemented it because it runs fairly fast in the current, linear fashion.
It is a very bad idea to execute anything other than submitting jobs on a cluster head node. LSF offers a useful alternative - a special interactive queue allowing us to connect directly to a free node of the cluster and execute commands interactively:
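bsub -q interactive -Is bash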
We are now on node 27 and can use the filesystem as if we were on the head node (not every queuing system will enable this).
We create a bash script timelapse.interactive
It calls time-lapse.bsh, which will run Fiji with the appropriate parameters for the timelapse registration plug-in
Executing the timelapse.interactive
will start a long stream of timelapse registration output. It's a good idea to redirect it to a file like this:
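For example (the log file name is arbitrary):

./timelapse.interactive > timelapse.log 2>&1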
We can just as well run the timelapse registration from the head node by issuing
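A hedged example, reusing the bsub options introduced above (choose a queue with a long enough run-time limit for your time series):

bsub -n 1 -o "out.%J" -e "err.%J" ./timelapse.interactive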
In this case the output will go into out.<job_number> file in the working directory.
Tips and tricks
The crucial parameter of timelapse registration is reference_timepoint=709. It could be either a timepoint with low registration error or a timepoint in the middle of the time series.
It is important to specify the z_resolution in timelapse.bsh (specify_calibration_manually xy_resolution=1.000 z_resolution=3.934431791305542), otherwise the plugin will open every raw data file to read the metadata which can take quite long.
the xy_resolution can be set to 1 since the plugin only uses the ratio between xy and z
For very long time-series where the sample potentially jumps in the field of view it may be necessary to register several segments of the series separately.
Fusion
In multi-view SPIM imaging fusion means combination of registered views into a single output image. Fiji currently implements two distinct fusion strategies: content based fusion and multi-view deconvolution. For detailed overview see SPIM registration page.
Content based multiview fusion
After registration we need to combine the views into a single output image. The content based fusion algorithm in Fiji solves that problem by evaluating local image entropy and weighing differentially the information in areas where several views overlap. For details see here.
As before, we create a directory jobs/fusion and in there a bash script create-fusion-jobs
that will generate many fusion-<number>.job scripts
Each of these will run fusion.bsh
on a cluster node when submitted by submit-jobs
Tips and tricks:
Fusion is memory intensive no matter what.
The content based fusion will necessarily degrade image quality. Thus it only makes sense to fuse the image for visualization purposes such as 3D rendering.
It is not necessary or even possible to 3D render the full resolution data. Thus we use the downsample_output=4 option to make it 4 times smaller.
The downsampling also reduces the storage requirements for the fused data which can be unrealistic for full resolution data (tens of terabytes).
The fusion code is multi-threaded, therefore we request 12 processors on one host (bsub -n 12 -R span[hosts=1]) and request as much memory as possible (fiji-linux64 -Xms100g -Xmx100g). Requesting all 12 cores of a host guarantees that all the memory of a single node is available for the job (128GB). It may be difficult to get that when others are running small, single-processor jobs on the cluster.
The integral image mediated weighting is much faster than the traditional Gaussian method; for large images it may be the only option, as one can run out of even 128GB of RAM with these data.
Multiview deconvolution
Another, more advanced, way to fuse the registered data is multiview deconvolution which is described here.
The deconvolution can be executed either on the CPU (Central Processing Unit - i.e. the main processor of the computer) or on GPU (Graphical Processing Unit - i.e. the graphics card). The pre-requisite for the GPU processing is to have one or more graphics cards capable of CUDA such as NVIDIA Tesla or Quadro or GeForce. Since the GPU accelerated multi-view deconvolution is not yet published and the necessary C code has to be obtained from Stephan Preibisch by request we will focus for now on deconvolution using CPU.
The GPU mediated deconvolution is faster, but currently only by a factor of 2-3 and so the CPU version makes sense, especially when you have a big cluster of CPUs and no or few GPUs.
Multiview deconvolution on CPU
In contrast to the multiview fusion plugin described above, Stephan Preibisch, in his infinite wisdom ;-), did not implement the option to scale down the data before deconvolution starts. Since deconvolution is a very expensive operation, it will take a very long time (hours) on full resolution data. If the sole purpose of fusing the data by deconvolution is to render them in 3D, the full resolution images are not necessary, ergo we need to downsample. Fortunately Stephan implemented a workaround in the form of a script that prepends a transformation (such as scaling) to the raw SPIM data.
The script can be found under Plugins › SPIM registration › Utilities › Apply external transformation (or press L and type Apply external transformation). The initial dialog is reminiscent of SPIM registration, the screen that comes after that is not.
What we are looking at is the so called Affine Transformation Matrix that will be pre-concatenated to the transformation matrix in the registration files from bead based registration. The m00, m11 and m22 entries of the matrix represent the scaling of the image and so by setting all three of them to 0.5 we will downscale the image by a factor of 2.
The output of running the Apply external transformation will look like this:
Pre-concatenating model:
3d-affine: (0.5, 0.0, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.5, 0.0)
Applying model to: spim_TL400_Angle1.tif.registration.to_400
Applying model to: spim_TL400_Angle2.tif.registration.to_400
Applying model to: spim_TL400_Angle3.tif.registration.to_400
Applying model to: spim_TL400_Angle4.tif.registration.to_400
Applying model to: spim_TL400_Angle5.tif.registration.to_400
Applying model to: spim_TL400_Angle6.tif.registration.to_400
and the registration files spim_TL400_Angle1.tif.registration.to_400 in the registration/ directory will now end with something like this:
Now this looks elegant, but there are several caveats. The pre-concatenation of transformation models is not reversible (or at least not easily in the current code framework), so before applying the external transformation we recommend archiving the old, unmodified registration files, for example by packaging them into a tar archive and decompressing it later to get back the original, unaltered transformation models.
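For example (the archive name is arbitrary):

tar -cjf registration_backup.tar.bz2 registration/

and later, to restore the originals:

tar -xjf registration_backup.tar.bz2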
The second issue, AND AN IMPORTANT ONE: the new transformation (scaling) must be applied to every timepoint in the registered time-series, INCLUDING the reference time-point. For good measure, it is also necessary to apply the transformation to the original, non-time-series .registration files of the reference time-point ONLY. Don’t ask me why… These two steps (pre-concatenating transformation models to the reference time-point just once) are really not clusterizable, so we recommend doing them manually in Fiji on a local machine and copying the modified registration files to the registration/ directory on the cluster. Yes, it is clunky, but it's better than nothing.
Now we are ready for the cluster mediated deconvolution on the downscaled data. By now you should know the drill… Create a directory jobs/deconvolution and in there a bash script create-deconvolution-jobs
that will generate many deconvolution-<number>.job scripts
Note the new parameter iter which specifies how many iterations of the multiview deconvolution we want to run. This should be determined empirically on a local GUI Fiji set-up.
Each of the deconvolution-<number>.job scripts will run deconvolution.bsh
Stuff that matters here are the following parameters:
number_of_iterations=10 specifies the number of iterations (10 is a good guess)
compute_on=[CPU (Java)] here we indicate that we want to use CPU
compute=[in 512x512x512 blocks] most likely we will have to compute in blocks unless we have really a lot of memory available.
fiji.plugin.Multi_View_Deconvolution.psfSize = 31; this parameter should be considered advanced for now, it specifies the size of the area used to extract the Point Spread Function (PSF) from the beads in the image. Default is 19.
otherwise the parameters are similar to the content based fusion or are constants.
on a cluster node when submitted by submit-jobs
Tips and tricks:
Multiview deconvolution needs as much memory as possible.
The memory requirements can be mitigated by using smaller blocks and the processing will take longer.
The output deconvolved image will have extremely compressed dynamic range, i.e. will look pitch black upon opening. Set the min and max to 0.0 and 0.05 to see anything.
The PSFs of the beads will become smaller (ideally points) but brighter.
The image will appear much more noisy compared to content fused or raw data.
The deconvolution.bsh script by default downscales the images before deconvolution commences. If you want to do that, do not forget to first manually downscale the reference time-point (as described above - both the original and the time-lapse registration versions), use it to define the crop area on a local machine and transfer the .registration and .registration.to_<reference timepoint> files FOR THE REFERENCE TIME-POINT to the cluster.
In fact, it is best to perform the entire deconvolution of the reference time-point locally and transfer the results to the cluster. First of all, it's good to experiment with the number of iterations and to look at what the deconvolution does to the data. Second, on the cluster we apply the downscaling to ALL the time-points - including the reference to which we already applied the transformation on our local machine (see the tip above) - so the reference time-point would end up downscaled twice. If you don't get it - call me ;-).
In order to deconvolve full-resolution data there is no need for the previous step; however, the pre-concatenation macro MUST BE commented out in the deconvolution script. Otherwise things will get really weird!
Multiview deconvolution on GPU
Coming soon.
3D rendering
Finally, we want to generate a beautiful 3D rendering of the downsampled, fused data and show it as movies at conferences… ;-).
The preparation phase of 3D rendering is a bit more complicated. We will use the interactive Stack Rotation plugin to position the specimen the way we want to render it and then send it to 3DViewer plugin. Here is the recipe:
Open the fused image stack and launch the Interactive Stack Rotation plugin (Plugins › Transform › Interactive Stack Rotation). Note: Familiarize yourself with the keystrokes that navigate the Interactive Stack Rotation. This is an extremely powerful way of looking at nearly isotropic 3D data coming from SPIM. A more advanced version honoring these keystroke conventions is coming to Fiji soon (by Tobias Pietzsch).
Use the key commands to rotate the specimen into the position from which you want to 3D render it. Note that the top slice (the one with the lowest z-index) will face towards you when rendering in the 3D Viewer.
Record the transform by pressing E. The transformation matrix will appear in the Fiji log window.
Copy the transform into line 41 of the render.bsh script shown below (read the comments if unsure).
Press ↵ Enter to apply the transformation to the stack.
Now use the rectangle tool to define a crop area that will include the specimen with minimal background. Write down the x, y coordinates, width and height of the crop area and paste them into the render.bsh script (line 128). Note: A more efficient way to capture the numbers is to start the macro recorder beforehand and simply copy and paste them from the macro recorder window.
Apply crop (Image › Crop).
Determine the z-indices where the specimen starts and ends and paste them into the render.bsh script (line 131).
Run Duplicate command (Image › Duplicate) and enter the z-index as range (for example 20-200). A tightly cropped specimen stack should be the result of this series of operations.
Adjust brightness and contrast on the stack to see the data well, perhaps slightly saturating and write the min and max into the render.bsh script (line 31).
Launch the 3d Viewer and experiment with threshold (3d Viewer then Edit › Adjust Threshold) and transparency (3DViewer then Edit › Change Transparency) and enter them into the render.bsh script (lines 154 and 156).
Finally, modify the dimensions of the Snapshot that the 3D Viewer takes to match the dimensions of the crop area (width and height) on line 161.
We are ready to begin the cluster processing by creating our old friend, the create-render-job bash script in a directory jobs/3d_rendering
which will create render-<number>.job
Each bash script passes the directory, timepoint and rendering angle parameters to the render.bsh BeanShell script. The script is a little more complicated than before: it combines Saalfeld's BeanShell magic with my clumsy macro programming. It is necessary to change the parameters inside the script according to the recipe above for each individual rendering run.
We submit it as usual using ./submit-jobs .
Tips and tricks
This approach to making 3D rendering movies is still a hack, although it's better than a pure macro and TransformJ. We are working on a better solution.
We use Interactive Stack Rotation to pre-process the image by rotating it to the desired position and then call the 3D Viewer to render it at that position while zooming in a bit.
You can also use the TransformJ commands to rotate your fused stack to the orientation you like, possibly crop it if it's too big and then open it in the 3D Viewer. Recording this as a macro and making it work (à la Stephan Preibisch) is possible, but it is incredibly laborious and nerve-wracking.
Whoever wants to rewrite the macro parts of render.bsh into a real script is VERY welcome.
Even better would be, obviously, to pass the transformation matrix to the 3D Viewer, but it proved unreliable.
The key to making it work on a cluster is to provide specific parameters about the screen size to the xvfb-run script (-a -s "-screen 0 1280x1024x24"). Otherwise it doesn’t work. Thanks to Stephan Saalfeld for figuring it out.
The cluster makes it fairly easy to experiment with parameters and angles of view - on a single computer the same task would take days and since we are using ImageJ macro you would not be able to touch the computer. MPI-CBG cluster renders 800+ timepoints in half an hour even under full load from other users.
The 3D rendering is relatively complex (we are working on a simpler solution) but extremely rewarding. Drosophila embryogenesis movie coming soon here.
Processing 2 channels
This part will deal with the processing of SPIM data with 2 channels. The registration and fusion work very similarly and need only a few adjustments to the scripts above, which I will point out specifically. There are 2 main differences:
The Zeiss SPIM currently does not allow exporting individual channels when acquiring 2 channels in fast frame mode. Thus we need to split the channels and save them as separate .tif files. When using the sequential mode you can skip this step.
In our case only 1 channel will have beads visible. Thus we will perform the registration only on this channel. The fusion program however, requires that registration files are present for both channels. To work around that, we will just duplicate the registration files from the channel that contains the beads.
Separating the channels
Rename and save the data as .tif following the steps described above. The data should be present now in the following format:
Each file contains 2 channels, which we need to split into individual files before we can proceed. We will save the individual channels into a new subdirectory:
To separate the channels we create a bash script create-split-jobs. Save this script in a new subdirectory in the jobs directory.
The create-split-jobs script will create the jobs that will be sent to the cluster. You will need to edit the directory, the number of time points and the angles.
Run this script:
The create-split-jobs script uses split.bsh (written by Stephan Saalfeld). This script splits each file into individual channels and saves them into the directory split.
Using the script “submit-jobs” will send the jobs to the cluster.
It pays off to do a test run with a small set. You can then determine the rough runtime and memory requirements of the jobs by looking at the output files. Providing this information allows the queuing system to schedule your jobs faster and eliminates failed jobs. The splitting will be rather fast (1-2 min) and will require very little memory.
“-W 00:05” will limit the wall time of the job to 5 minutes.
Run the script by:
Finally, you should find the individual channels in the directory split. The channel information is added to the time point and angle information in the file names.
You can now remove the .tif files that contain both channels. Determine which channel contains the beads. The multi-view registration and the time-lapse registration will be performed only on this channel.
Multi-view registration for 2 channels
In this example the beads are visible in channel 1. Therefore, we will proceed to register this channel. In the create-registration-jobs script, modify the directory, the number of time points and the angles (-Dangles).
The registration.bsh stays in principle the same as for the single channel. The only thing you need to modify is pattern_of_spim, otherwise the program will not recognise the files. Just add the name of the channel to the file name.
As before, change the z-resolution, the radii channel_0_radius_1 and channel_0_radius_2, and the channel_0_threshold according to the parameters you determined manually.
The submit-jobs script is modified for the requirements of the registration. Determine these parameters with a small set before you apply them to all files.
-n 5 use one processor per angle.
-W 00:15 Walltime of the job restricted to 15 min.
-R rusage[mem=10000] 10000MB of memory is required.
The registration files should now be written in the directory registration. For each angle of each time point 3 registration files should be present:
Time-lapse registration for 2 channels
In the script timelapse.interactive modify -Ddir=, -Dtimepoint=, -Dreferencetp= (choose a good time point as reference) and -Dangles=.
Analogous to the multi-view registration, add the channel information to the file name in the pattern_of_spim part of the script.
Give the z-resolution, channel_0_radius_1, channel_0_radius_2 and the channel_0_threshold as before.
Again, modify the submit-jobs script according to the needs of your timelapse registration. For my example these modifications worked:
An additional registration file will be created in the directory registration.
Duplicate registration files
Since the fusion requires the presence of registration files for both channels, we will duplicate the existing files of channel 1 and save them as registration files for channel 0. The following script, duplicate_rename_registration.bsh, will do just that. Create this script in the jobs directory.
You will need to modify the time points, the angles, the reference time point used in registration.to_{your reference} and the directory.
The script will copy the existing files and save them under a new name with just the channel name changed.
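What the script does can be sketched in bash as follows (hedged; the exact file names depend on how split.bsh named your channels):

for f in registration/spim_TL*_Angle*_Channel1.tif.registration*; do
    cp "$f" "${f/Channel1/Channel0}"
done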
Execute the script:
Now for each channel of each angle and timepoint registration files should be present.
Fusion for 2 channels
The create_fusion_jobs for 2 channels works the same as for the single channel fusion. Just modify the directory, the number of time points, the angles under -Dangles and choose a cropping area (-Dx, -Dy, -Dz, -Dw, -Dh, -Dd).
Execute script
The fusion.bsh script needs to be set for multi channel registration. Under select_channel, Multi-channel registration=[Individual registration of channel 1] registration=[Individual registration of channel 1] will be set. downsample_output in this case is set to 4.
The submit-jobs script is modified according to the requirements of the fusion:
-W 00:20
-R rusage[mem=50000]
These settings worked in my example, but again I would recommend adjusting them based on a test run with a small set of your own data.
The fused images will be saved into separate subdirectories for each time point into the output directory.
NEW PIPELINE
The new pipeline is centered around a configuration file, the master file, that contains all the relevant processing parameters. It makes processing significantly more efficient, since for each dataset mainly this one file is edited, instead of the scripts of each processing step individually.
The master file has two parts. The first part contains all the relevant processing parameters for each individual processing step. The second part contains some more advanced settings and the links for the job scripts and directories.
The new pipeline also comes with a new set of scripts that are specifically modified to be used with the master file. The general idea is to keep these job scripts, together with the master file, independent from the dataset. The scripts will use the master file as the source of the processing parameters. The jobs will be created and executed within the job directories just as before, and the success of the jobs can be assessed with the output and error files. The master file can be saved and can serve as documentation of the processing.
Currently the master file is usable for the following steps.
Single-channel processing:
Rename .czi files
Resave .czi files
Resave .ome.tiff files
Multi-view registration
Timelapse registration
Content based multi-view fusion
External transformation
Multi-view deconvolution
3D-rendering
Export to hdf5 format
Multi-channel processing:
Rename .czi files
Resave .czi files
Registration
Timelapse registration
Content based multi-view fusion
3D-rendering for 2 channels
Export to hdf5 format
All the scripts work with padded zeros.
At the moment this tutorial is written for advanced users who have already used the previous pipeline. For a more detailed introduction, please read the description of the previous pipeline.
Master file
There are two parts in this file:
Processing Parameters
Directories for scripts and advanced settings for processing
The first part contains everything relevant for processing and will be modified for each dataset. It is further structured according to each processing step.
The second part contains the links to the working directories and scripts. Since the job scripts should rest at one particular location, these links need to be changed the first time you start processing. This part also contains more advanced settings for registration, fusion and deconvolution, which should only be touched if you fully understand these steps.
We will discuss each section of this file with the associated processing step.
First time using the master file
Upon using the master file the first time please change the links for Fiji, the working directories and scripts in the second part of the file:
Then you need to change the link to the master file in each shell script, particularly in the create-jobs scripts.
This link is given in the 3rd line of each shell script. For example in the rename-zeiss-files.sh:
Or the create-resaving-jobs file:
These settings only need to be changed once, if the job scripts and the master file stay in the same directories.
Single-channel Processing
First steps
We start by defining the general parameters of the spim dataset in the master file. First we give the directory that contains the data. Then we define the number of timepoints and angles; our example has 3 timepoints and 3 angles. Finally we need to modify the pattern of the spim data and define the reference timepoint and the calibration. For the sake of demonstration we included padded zeros in the pattern and will use padded zeros throughout the tutorial:
Rename .czi files
The example dataset is in the aforementioned directory:
It is very important to note that the Lightsheet Z.1 writes the first angle of the first timepoint without index. Thus we need to add (0) as the first index to this file. Forgetting this step will lead to a frameshift in the dataset during renaming.
For renaming the .czi files the relevant section in the master file looks like this:
The first index is 0. Since we have 3 timepoints with 3 angles, we have 9 files in total; thus the last index is 8. Then we define how the new names will start: the first timepoint will be 1 and there are three angles, thus (1 2 3). For demonstration we will use padded zeros; pad_rename_czi="2" means the output will look like this: 01. The source pattern states how the old .czi files are named and the target_pattern defines how the new files will be named. It is important that these patterns are correct.
For renaming the .czi files we use the rename-zeiss-files.sh script:
The script should now get all the necessary parameters from the master file. Execute the script (use chmod a+x when executing a script the first time).
The .czi files are now renamed accordingly:
Resave .czi files
The next step is to resave the .czi files as .tif files. In the relevant part in the master file we just need to specify the angles and if we used padded zeros:
For creating the jobs for resaving use the create-resaving-jobs scripts:
Execute this script:
This should create jobs for each .czi file.
Each job file should contain the relevant parameters for the job, where to find Fiji and the actual job script:
The necessary parameters are passed from the jobs file to the resaving.bsh script upon processing the job. The script resaving.bsh looks as follows:
Submit the job by executing the submit script submit-jobs:
The .tif files can now be found in the data directory together with the .czi files:
After this step inspect the .tif files and check if all of them are present and that they correspond to the .czi files. If that is the case transfer the .czi files onto tape for long term storage.
To check if all the files are present use the checkpoint.sh script:
This script will return the number of the timepoint that is missing or has missing angles.
Resave .ome.tiff files
To resave .ome.tiff files as .tif we use the same part in the master file as when you would resave .czi files. In the relevant part in the master file we just need to specify the angles and if there are padded zeros:
The create-ometiff-jobs file. Create the resaving jobs by executing this script:
The resaving-ometiff.bsh script:
Submit the job files using the submit-jobs script:
Multi-view registration
For the registration use the relevant part in the master file. Change the pattern of the spim data accordingly. You can choose between Difference-of-Mean and Difference-of-Gaussian registration; change the parameters accordingly. It is important to comment out, for example, the Difference-of-Gaussian parts in the registration.bsh script when you want to use the Difference-of-Mean registration. There will be an error otherwise.
The create-registration-jobs script. Execute this script for creating the registration jobs.
The registration.bsh script. It is important here to comment out the indicated parts depending on the registration method, otherwise the script may not work properly. For the Difference-of-Mean registration, comment out the Difference-of-Gaussian parts (lines 33-34 and 68-69). For Difference-of-Gaussian, comment out lines 28-30 and 65-67.
Submit the jobs by using the submit-jobs script:
Timelapse registration
The timelapse registration is using the registration parameters specified before and the reference timepoint specified in the general parameters. The part in the master file for timelapse registration looks like this:
For creating the timelapse registration job (only one job) execute the create-timelapse-jobs script:
For the time-lapse.bsh script comment out the same parts that you commented out for the registration.bsh script.
Submit the register-timelapse.job to the cluster using the submit-jobs script:
Content based multi-view fusion
The relevant part in the master file:
For single-channel data use select_channel="Single-channel". For using the timelapse registration select registration_fusion="\"Time-point registration (reference=1) of channel 0\"" and specify the correct reference timepoint in (reference=1). If you want to use the registration of the individual timepoints for the fusion, select registration_fusion="\"Individual registration of channel 0\"" instead.
Specify how much you want to downsample the fused output. However, always use the full-resolution cropping parameters when defining the cropping area.
Execute the create_fusion_jobs script for writing the fusion jobs:
The fusion.bsh script; for single-channel data, comment out the additional "registration=[" + registration_fusion + "]" + " " + line:
For submitting the fusion jobs execute the submit-jobs script:
Multi-view deconvolution
Before performing multi-view deconvolution we need to apply the external transformation to the registration files to downsample the deconvolution input. Make a copy of the registration files as a backup before applying the external transformation.
External transformation
For the external transformation, specify in the master file the pattern of the spim files and the timepoints that need to be transformed; for downsampling by a factor of 2, use 0.5:
The create_external_transformation script:
The external_transformation.bsh script:
Submit the external transformation job to the cluster with the submit-jobs script:
Deconvolution
For deconvolution specify again the spim pattern, the number of iterations and the cropping parameters. The cropping area is defined on the downsampled data; therefore, divide the full-resolution cropping area by the factor by which you downsampled the registration files.
The create_deconvolution_jobs script:
The deconvolution.bsh script:
Submit the deconvolution jobs using the submit-jobs script:
3d Rendering
In the master file, specify the working directory and the script for rendering single-channel data. Under source_rendering give the directory containing the fusion or deconvolution output. Under target_directory give the name of a directory within the data directory where the results should be saved; this directory will be created for you. Specify the number of frames and the min/max values for setting the brightness and contrast of the rendering.
At the moment it is not possible to set the orientation or rotation parameters from the master file; we will work on this part. Thus you need to modify the render-mov1.bsh accordingly.
The create_render_jobs script:
In the single-render-mov.bsh script, for a fixed orientation modify the transformation matrix and comment out the rotation function (lines 99-103) as well as the rotation command (line 116: transform) in the rendering part of the script. For rotation, comment out the transformation matrix (lines 93-94) and the orientation command (line 115: orientation) in the rendering part of the script.
Submit the rendering jobs by using the submit-jobs script:
Hdf5 export
The part in the master file covering the hdf5 export:
The first step is to determine the number of necessary jobs. Execute the run_numjobs script. This job runs through the export.bsh script and calculates the number of necessary jobs.
This script will write a job file getnumjobs which will be sent directly to the cluster.
The output of this job will be the numjobsout file:
You can check that all parameters are correct. The important line is the last one, number of jobs: 3. This means we need to adjust the master file accordingly:
There must always be a “0” job. This job generates the .xml file. The other jobs will write .h5 files that contain the actual data. The rest works analogously to the other parts of the pipeline. Create the jobs with the create_export_jobs script. This script will also create a new directory within the spim data directory.
Each job will use the export.bsh script:
Send the jobs to the cluster using the submit-jobs script:
Multi-channel Processing
The master file has all the necessary information to easily switch between single-channel and multi-channel data. You just need to make the correct settings in the master file and use 2 additional scripts for the current pipeline to process multi-channel datasets. In this chapter I will point out the necessary changes specifically.
First steps
Since we have already set up the master file and the scripts properly, the only things we need to change this time are the processing parameters (see First time using the master file).
Change the data directory dir=. The example dataset has 3 timepoints and 5 angles. For multi-channel processing select the option for multi-channel data: pattern_of_spim="spim_TL{tt}_Angle{a}_Channel{c}.tif". Also change the reference timepoint and the calibration settings.
The dataset is in the specified directory:
Add the (0) index to the first .czi file:
Rename .czi files
The renaming in the multi-channel data follows the exact same principle as in the single-channel data. Just modify the master file accordingly and then execute the rename-zeiss-files.sh script.
The .czi files should now be renamed:
Resave .czi files
The resaving also relies on the same scripts as for the single-channel data. Specify the correct angles and the correct parameter for the padded zero:
Create the jobs by executing the create-resaving-jobs script and submit them by using the submit-jobs script.
Split channels
The channels are then split into separate files. The algorithm will output the files with the following naming pattern:
In the master file, specify the number of angles and give a name for a new directory within the data directory where you want to save the resulting files. This directory will be created for you:
The create-split-jobs script:
The split.bsh script:
Submit the jobs using the submit-jobs script:
The split files will be now saved in a new directory:
We will proceed to work with the files in which the channels are split. Since these files are in a different directory, we need to set a different data directory in the master file. From then on, all the output will be saved into this directory:
Multi-view registration
In the example dataset the beads were only visible in Channel1. We will perform a single-channel bead registration only on this channel. We therefore need to specify the spim data pattern accordingly:
Change the detection parameters for the chosen detection method.
Execute the create-registration-jobs script to create the jobs for registration and send them to the cluster by executing the submit-jobs script.
Timelapse registration
The time-lapse registration uses the already defined multi-view registration. Specify the timepoints you want to use for timelapse registration.
Create the register-timelapse.job by executing the create-timelapse-jobs and then submit them to the cluster.
Duplicate registration files
For further processing we need registration files for both channels. Therefore we duplicate the existing registration files and rename them for the missing channel. We advise making a backup of the registration files at this point.
In the master file specify which channel was registered (channel_source) and which channel still needs registration files (channel_target).
For duplicating the registration files just execute the dublicate_rename_registration.sh script.
Content based multi-view fusion
For the content based multi-view fusion use select_channel="Multi-channel". Specify the registration, the downsampling and the cropping accordingly:
In the fusion.bsh script, uncomment the additional "registration=[" + registration_fusion + "]" + " " + line (line 46).
Create the fusion jobs by executing the create_fusion_jobs and submit them to the cluster.
3D-rendering for 2 channels
The relevant part in the master file:
First specify the directory of the jobs. The 2-channel rendering uses the multi-render-mov.bsh script; you need to select this script for rendering. Specify which output you want to process and where, within the original directory, you want to save the results of the rendering. Finally, give the number of frames, the min and max values and the number of slices of the output.
The create-render-jobs script:
The jobs will use the multi-render-mov.bsh script for rendering. The position can be set by changing the transformation matrix (lines 141-143). For rotation, uncomment lines 147-149.
Execute the create-render-jobs script and submit the jobs to the cluster with the submit-jobs script.
Hdf5 export
Change the necessary parameters in the master file:
The getnumjobs file:
The output of the getnumjobs:
Modify the master file accordingly. Create the jobs using the create_export_jobs script and submit them to the cluster.
New Multiview Reconstruction pipeline
The key change in the Multiview Reconstruction (MVR) pipeline is that all results are written into an XML. This poses new problems for cluster processing, because several concurrently running jobs need to update the same file.
Stephan Preibisch solved that problem by allowing one XML file to be written per job (usually a timepoint) and then merging the job-specific XMLs into one XML for the entire dataset.
In practice it means the following steps need to be executed:
Define XML dataset - creates one XML for the entire timelapse
Re-save data as HDF5 - converts data into HDF5 container optimised for fast access in BigDataViewer
Run per time-point registrations - creates as many XMLs as there are timepoints
Merge XMLs - consolidates the per-timepoint XMLs back into a single XML
Some new parameters are introduced and some old parameters change names. Therefore, use the master file described in this chapter to process with the MVR pipeline.
Define XML
First step in Multiview Reconstruction is to define an XML file that describes the imaged dataset. This is very flexible and can be adapted to datasets with several angles, channels, illumination sides and timepoints. The relevant portion of the master file looks like this:
. .
and describes a multi-timepoint time-lapse with a single channel, one illumination direction and multiple angles. (Note that the timepoints and angles are defined elsewhere, in the general part of the master file.)
The parameters in the master file are sourced by a create-dataset-jobs bash script
which creates a create-dataset.job bash script that passes the parameters to Fiji by executing define_xml.bsh beanshell script
Since in this case it makes no sense to parallelise, it is best to launch the create-dataset.job in interactive mode on one of the nodes of the cluster (ideally not the headnode). On our cluster this will look like this:
[tomancak@madmax define_xml]$ ./create-dataset-jobs
/projects/tomancak_lightsheet/Valia/Valia/new_pipeline/jobs_master_beta_2.0/define_xml//create-dataset.job
[tomancak@madmax define_xml]$ bsub -q interactive -Is bash
Job <484001> is submitted to queue <interactive>.
<<Waiting for dispatch ...>>
<<Starting on n42>>
[tomancak@n42 define_xml]$ ./create-dataset.job
12 cores available for multi-threading
type of dataset=Image Stacks (ImageJ Opener)
xml filename=dataset.xml
multiple_timepoints=YES (one file per time-point)
multiple_channels=NO (one channel)
multiple_illumination_directions=NO (one illumination direction)
multiple_angles=YES (one file per angle)
dir=/projects/tomancak_lightsheet/Valia/Valia/raw/
pattern_of_spim=spim_TL{t}_Angle{a}.tif
timepoint=1-715
angles=1,2,3,4,5,6
xy_resolution=1
z_resolution=3.497273
imglib_container=ArrayImg (faster)
1
Minimal resolution in all dimensions over all views is: 1.0
(The smallest resolution in any dimension; the distance between two pixels in the output image will be that wide)
Saved xml '/projects/tomancak_lightsheet/Valia/Valia/raw/dataset.xml'.
End result should be a dataset.xml created in the directory where the raw data reside.
Tips and tricks:
In order to change the definition of the dataset, define it locally with the GUI and the macro recorder turned on and copy/paste the relevant macro parameters into the master file.
Macro commands that consist of strings are usually surrounded by square brackets []. Do NOT put the brackets into the master file, they are provided by the BeanShell script.
Re-save as HDF5
This step is optional at this point. Re-saving to HDF5 can be done also after registration or not at all.
The purpose of this step is to convert the raw light sheet data (either .czi or .tif) into the HDF5 container that is optimised for fast viewing through the BigDataViewer Fiji plugin.
Relevant portion of the master file looks like this:
As usual, we create cluster jobs per timepoint by sourcing the master file parameters with create_export_jobs
Note that we first run a job with parameter run_only_job_number set to 0. This creates the master dataset.h5 file.
The rest of the hdf5-<number>.job bash scripts execute export.bsh BeanShell using Fiji
The hdf5-<number>.job bash scripts will be submitted to the cluster with the following submit-jobs script
and generate in the raw data directory a series of .h5 files. Each file contains the raw data for one time-point. At this point without any registration.
From now on, the data are in the HDF5 container (unregistered) and can be viewed in BigDataViewer. In the next step we register the data by running the registration pipeline and updating the XML.
Multiview registration
We now have two .xml files: dataset.xml, created during the define XML step, and hdf5_dataset.xml, created after re-saving to HDF5. Let's first make a copy of dataset.xml
and copy the hdf5_dataset.xml into dataset.xml
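For example (the name of the backup copy is arbitrary):

cp dataset.xml dataset_definition_backup.xml
cp hdf5_dataset.xml dataset.xml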
Like this we have a back-up of the two intermediate state XMLs and a dataset.xml to use as input for registration.
The parts of the master file relevant for multiview registration look as follows:
and
The parameters are read from the master file by the create-registration-jobs script
which generates registration_<number>.job bash scripts that launch registration.bsh in Fiji on the cluster
Note that the registration bash script executes 3 macro commands:
Toggle Cluster Processing - activates cluster processing which makes cluster specific parameters of registration available
Detect Interest Points for Registration - detects beads or sample features used for registration
Register Dataset based on Interest Points - does the actual registration using the detected interest points
The registration_<number>.job scripts are submitted to the cluster with submit_jobs bash
The result of the registration is 10 XML files, one for each timepoint, in the raw data directory:
The per timepoint XMLs need to be merged into a single output XML. This can be done at any point of the cluster run, i.e. not all XMLs need to exist to perform the merge and the merge can be performed multiple times. It however makes sense to wait until all per-timepoint XMLs are created.
The merge step has a single specific parameter in the master file:
The corresponding job-creation script creates merge.job, which will execute merge_xml.bsh on a cluster node using Fiji:
import ij.IJ;
import ij.ImagePlus;
import java.lang.Runtime;
import java.io.File;
import java.io.FilenameFilter;

runtime = Runtime.getRuntime();
System.out.println(runtime.availableProcessors() + " cores available for multi-threading");

xml_path = System.getProperty("xml_path");
xml_filename = System.getProperty("xml_filename");

System.out.println("directory=" + xml_path);

IJ.run("Merge Cluster Jobs",
	"directory=" + xml_path + " " +
	"filename_contains=job_ " +
	"filename_also_contains=.xml " +
	"display " +
	// "delete_xml's " +
	"merged_xml=registered_" + xml_filename);

/* shutdown */
runtime.exit(0);
merge.job should be executed on the cluster in interactive mode (see here).
The result of the merge is registered_dataset.xml. This is the final product of the registration pipeline. The results of the registration can be viewed using BigDataViewer.
Tips and tricks
the per-timepoint XML files can be deleted after the merge.
regardless of whether or not the per-timepoint files are deleted, new per-timepoint XMLs can be added by re-running the merge.job