FijiArchipelago is a plugin that brings Cluster functionality to Fiji.
== Overview ==
FijiArchipelago is a tool designed to make it easy for programmers to export Fiji/ImageJ functionality over a network to several other computers.
The "root node," or the computer on which the cluster is started, operates as a server. "Client nodes" must be able to reach that server over a network, and must also have access to a shared network file resource.
Client machines are started automatically through a user interface. Fiji is started on each client by opening a remote shell (currently ssh, using JSch), then running fiji with arguments that tell the client how to reach the server. This project currently works for machines that share the same local network, but it is also intended to be used eventually on HPC clusters with a qsub or similar architecture, specifically the University of Texas's TACC. This work is ongoing.
== Requirements ==
* Server and clients should all have the same version of Fiji installed.
* FijiArchipelago makes use of ssh and ssh key pair authentication, so the server must have a private key file that matches a public key in authorized_keys on the client.
* Clients must be able to access the server at the configured port.
* Server and clients must have access to a shared network file server if file transfer is required.
So far, this has been tested only on Linux machines, but it should be platform-independent.
== Usage ==
To start a Cluster, navigate to Plugins->Cluster->Start Cluster...
==== Start Cluster Dialog ====
* Server Port Number: The port that the FijiArchipelago server will listen on.
* Remote Machine User Name: The username to use to start fiji on remote machines.
* SSH Private Key File: The location of the private key file to use for authentication. Currently, this has been tested only with password-less keys, but it should work with password-protected keys as well.
* Local Exec Root: The folder containing the fiji executable.
* Local File Root: A folder that is shared over the network.
* Default Exec Root for Remote Nodes: The default folder for the fiji location on remote machines.
* Default File Root for Remote Nodes: The default network share location for remote nodes. This should reference the same resource as Local File Root.
==== Configure Nodes Dialog ====
[[File:FijiArchipelagoShot02.png|500px]]
===== Add a node =====
Click the Add Node... button to add a new node.
* Hostname: The hostname of the new client node. This hostname is used for ssh purposes.
* User name: The user name to use for ssh access. This defaults to the name entered in the previous dialog.
* Port: The ssh port for this machine, with a default of 22.
* Number of Threads: The number of desired threads to use on this machine.
* Remote Exec Root: The folder containing the fiji executable on this client.
* Remote File Root: The folder on this client corresponding to the location of the shared resource folder entered as Local File Root in the previous dialog.
Click OK.
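The relationship between Local File Root and Remote File Root can be illustrated with a small path translation. This is only a sketch of why the two roots must point at the same shared resource; it is not taken from the FijiArchipelago source. The class name <code>FileRootMapping</code>, the method <code>toRemote</code>, and all paths are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class FileRootMapping {

    /**
     * Hypothetical helper: rewrites a path under the local file root so that
     * it points at the same file under a client's remote file root.
     */
    static Path toRemote(Path localRoot, Path remoteRoot, Path localFile) {
        if (!localFile.startsWith(localRoot)) {
            throw new IllegalArgumentException(
                    localFile + " is not inside the shared folder " + localRoot);
        }
        // Path of the file relative to the shared folder, e.g. stack/img0001.tif
        Path relative = localRoot.relativize(localFile);
        return remoteRoot.resolve(relative);
    }

    public static void main(String[] args) {
        // Example values only: the same share mounted at different locations.
        Path localRoot = Paths.get("/mnt/share");
        Path remoteRoot = Paths.get("/nfs/data/share");
        Path image = Paths.get("/mnt/share/stack/img0001.tif");

        System.out.println(toRemote(localRoot, remoteRoot, image));
    }
}
```

A file outside the shared folder has no counterpart on the client, which is why the sketch rejects it rather than guessing a location.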
===== Load from File / Save to File =====
The nodes entered on this dialog may be saved to a configuration file for later use. Multiple cluster files may be loaded, to add several groups of similar machines to use with the cluster. For instance, you might save host01, host02, ... host10 to fiji.cluster, and different-host01, different-host02, ... different-host10 to fiji-different.cluster, then load both files later to add all twenty hosts to one FijiArchipelago configuration.
===== Start the Cluster =====
Once you press OK on the Configure Nodes Dialog window, each listed host will be contacted via ssh using the username and private key file that were provided. FijiArchipelago will attempt to start an instance of fiji in headless mode, which should then contact your local computer on the indicated port to submit itself as ready to accept jobs.
A window will appear with a big "Stop Cluster" button in it. Click this button to stop the cluster and all instances of fiji on your client machines.
==== SIFT Extraction Example ====
Use File->Import->Image Sequence... to import a virtual stack of many images.
Click Plugins->Cluster->Benchmark... to run a SIFT benchmark of your cluster against your local machine. This will start a cluster if there isn't one already. SIFT features will be extracted from all images in the stack using default parameters over the cluster, then using all available cores on your local machine (including virtual, or hyper-threaded cores).
=== Programmers ===
FijiArchipelago implements [http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/ExecutorService.html ExecutorService]. Unlike a typical ExecutorService, any Callable or Runnable submitted to a Cluster will be exported over the network to a remote machine.
To make this work, any objects submitted to a Cluster are serialized and transmitted to a remote instance of Fiji. The returned result is serialized remotely, then retrieved by the local (root) node. Any submitted Callable or Runnable must therefore implement Serializable; failure to do so will result in a NotSerializableException at runtime. A consequence of serialization is that the deep equality of objects is not preserved. In other words, a Callable that is designed to return an object that was instantiated prior to submission will effectively return a clone.
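The clone effect can be reproduced without a cluster by pushing an object through standard Java serialization, which is the mechanism described above. The following is a minimal sketch using only the JDK. The class names <code>SerializationRoundTrip</code> and <code>ReturnList</code> are made up for illustration, and the in-memory round trip merely stands in for the network hop to a remote node.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.concurrent.Callable;

public class SerializationRoundTrip {

    /** Hypothetical job: returns the very list it was constructed with. */
    static class ReturnList implements Callable<ArrayList<String>>, Serializable {
        private static final long serialVersionUID = 1L;
        private final ArrayList<String> list;

        ReturnList(ArrayList<String> list) {
            this.list = list;
        }

        @Override
        public ArrayList<String> call() {
            return list;
        }
    }

    /** Serializes an object to bytes and reads it back, imitating one network hop. */
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T obj)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (T) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ArrayList<String> original = new ArrayList<>();
        original.add("tile-001.tif");

        // Hop 1: the job travels to the "remote" side.
        ReturnList remoteJob = roundTrip(new ReturnList(original));
        // Hop 2: the result travels back to the root node.
        ArrayList<String> result = roundTrip(remoteJob.call());

        System.out.println("same contents: " + original.equals(result)); // true
        System.out.println("same instance: " + (original == result));    // false
    }
}
```

The returned list has the same contents as the original but is a different instance, so code that relies on <code>==</code> or on mutating the submitted object in place will not behave as it would with a local ExecutorService.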
Currently, only one Cluster may be operated as a server at a time. This instance is referenced by Cluster.getCluster(). Cluster.activeCluster() indicates whether there is an active Cluster already.
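Because a Cluster is used through the ExecutorService interface, client code can be written against that interface alone. The sketch below uses a local thread pool as a stand-in so that it runs without FijiArchipelago installed. The class <code>SubmitExample</code> and the job <code>SquareJob</code> are hypothetical; with the plugin, the executor would presumably be the instance returned by Cluster.getCluster(), after checking Cluster.activeCluster().

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SubmitExample {

    /** Hypothetical job. It is Serializable so that it could be sent to a remote node. */
    static class SquareJob implements Callable<Integer>, Serializable {
        private static final long serialVersionUID = 1L;
        private final int value;

        SquareJob(int value) {
            this.value = value;
        }

        @Override
        public Integer call() {
            return value * value;
        }
    }

    /** Submits one job per input, then collects the results in submission order. */
    static List<Integer> squareAll(ExecutorService executor, int[] inputs)
            throws Exception {
        List<Future<Integer>> futures = new ArrayList<>();
        for (int v : inputs) {
            futures.add(executor.submit(new SquareJob(v)));
        }
        List<Integer> results = new ArrayList<>();
        for (Future<Integer> f : futures) {
            results.add(f.get()); // blocks until that job has finished
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in executor (assumption): a local pool instead of a Cluster.
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            System.out.println(squareAll(executor, new int[] {1, 2, 3})); // [1, 4, 9]
        } finally {
            executor.shutdown();
        }
    }
}
```

Keeping job classes small, immutable, and Serializable from the start makes it straightforward to move the same code from a local pool to a cluster.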
An example may be found in [https://github.com/fiji/fiji/blob/master/src-plugins/Fiji_Archipelago/src/main/java/archipelago/example/Cluster_SIFT.java Cluster_SIFT]. An example that demonstrates the breakage of deep equality may be found in [https://github.com/fiji/fiji/blob/master/src-plugins/Fiji_Archipelago/src/main/java/archipelago/example/Equality_Example.java Equality_Example].
===== Planned Features =====
* On-the-fly addition of new cluster nodes - complete
* Volunteer cluster nodes, in other words, the ability to operate an instance of fiji in client mode through the plugin menu. - complete
* The ability to detect crashed nodes and re-queue their jobs - complete
* Menu-click supercomputing, or the ability to submit to a qsub-enabled HPC cluster from a local machine.
* Security - Use ssh streams or SSLSockets to transfer objects, potentially allowing insecure Sockets as a non-default option. - complete
Note: as of 04/2013 this wiki article is out of date. I'll correct this in the coming weeks. (-Larry).