HPC Workflow Manager

General Information

The HPC Workflow Manager Client supports two workflow types:

  • SPIM; and
  • Macro.

This guide will only explain how to use the newly added Macro workflow type.

How to use

How to start the plugin

From the Fiji menu bar select Plugins > Multiview Reconstruction > HPC Workflow Manager and fill in the Login dialog that will appear. For example, see the filled in dialog in Figure 1.

How to login

You need to enter the username, password, and email for your account. If this is the first time you are using this installation of the program, create a new directory anywhere to use as a working directory. If you have used HPC Workflow Manager in the past, you can reuse an existing working directory. Select the working directory by clicking the browse button or typing its path. The directory must already exist.

Press "Ok" and the dialog will should disappear, and a progress dialog should appear. If not, then a new message should inform you of the error made during filling in the dialog.

How to create a new job

After the connection to the HPC Cluster is made and the jobs are downloaded from the cluster, you should see a window like the one in Figure 2. If this is the first time you have run this plugin, the table will be empty.

Right click in the empty table, or on an empty row of the table, to display the context menu; an example of the context menu is shown in Figure 3.

Select the first option “Create a new job”. The “Create job” window will appear. From the “Workflow Type” section select the “Macro Execution” option.

In the input data location you must provide a directory that contains your Macro script (it should be named “user.ijm”). If this is the first time you are using the HPC Workflow plugin with Macro support, you can use the example found at the following link: https://github.com/MKrumnikl/Ij1MPIWrapper/tree/addFeatureScatter/src/main/resources/ExampleScripts/HelloWorld
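
The script itself is an ordinary ImageJ1 macro. Purely as an illustration (use the real “HelloWorld” example from the link above for actual runs), a minimal “user.ijm” might look like the sketch below; the helper names parGetRank() and parGetSize() are assumptions standing in for the parallelism functions the plugin appends on upload, not confirmed API names:

  // Minimal illustrative "user.ijm" (a sketch, not the official example).
  // parGetRank() / parGetSize() are ASSUMED names for helpers that the
  // plugin appends to the script on upload; they are not defined here.
  rank = parGetRank();  // index of this compute node
  size = parGetSize();  // total number of nodes assigned to the job
  print("Hello from node " + rank + " of " + size);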

In the node configuration select four nodes (4) by pressing the up arrow in the numeric field four times.
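
Requesting four nodes means the parallelism helpers will see four cooperating processes. As a sketch only (reusing the assumed parGetRank()/parGetSize() helpers from above), a data-parallel macro could split its work round-robin by node index:

  // Illustrative round-robin split of 16 images across the 4 requested
  // nodes; the helper names remain assumptions, as noted above.
  nImages = 16;
  for (i = parGetRank(); i < nImages; i += parGetSize()) {
      print("node " + parGetRank() + " would process image " + i);
  }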

In the “Output data location” section leave the default option, “Output data location”, selected.

Now, the filled-in form should look like Figure 4. If you are using Linux save the “HelloWorld” example script in your home directory (“~/HelloWorld/user.ijm”) and use that path instead of “C:/Documents/HelloWorld”. When you are sure that the form is filled-in correctly press the “Create” button.

How to start a job

If you have created a new job, the main window should look roughly like Figure 5.

Here you can see the following columns:

  • “Job ID” - Job’s identification number;
  • “Status” – The job’s current status which can be:
    • “Unknown” – the state of the job is not known;
    • “Configuring” – the job is being configured;
    • “Queued” – the job is in a queue and will be executed when nodes become available;
    • “Running” – the job was started and is currently running;
    • “Finished” – the job has stopped running successfully, completing its tasks;
    • “Failed” – the job has stopped running unsuccessfully, it did not complete its tasks;
    • “Canceled” – the job was stopped by the user; and
    • “Disposed” – the job was disposed.
  • “Creation time” – the time when the job was created.
  • “Start time” – the time when the job was last started.
  • “End time” – the time when the job last ended.
  • “Upload” – whether the job was uploaded.
  • “Download” – whether the job was downloaded.
  • “Workflow Type”- whether it is SPIM or Macro workflow type.

Right click on the new job to display the context menu (of Figure 3). You will notice that there are new enabled items.

Before you can start the job, you need to upload your script (“user.ijm”). To do this you must select the “Upload data” item from the context menu.

A timer will appear in the “Upload” column. When the upload has completed, the cell that corresponds to the job should read “Done” (Figure 6).

Now that the script file is uploaded, the job can be started.

To keep the user’s source code cleaner and easier to understand, the special functions that make parallelism available are appended to the user script on upload, and a new file called “mpitest.txt” is created; this is the file that will be executed on the cluster.
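
The exact content of the appended block is internal to the plugin. Purely as a conceptual sketch (the function name and the per-node file layout below are assumptions, not the plugin’s actual code), an appended progress helper could look like this:

  // Conceptual sketch only; NOT the code the plugin really appends.
  // parReportProgress() and the progress-file layout are assumptions.
  function parReportProgress(taskId, percent) {
      // Each node appends its progress to its own file, which the
      // Job dashboard can later poll (see Inspecting progress).
      file = getDirectory("current") + "progress_" + parGetRank() + ".txt";
      File.append(taskId + "," + percent, file);
  }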

To inspect the submitted file (for example for debugging) you can right click the job and select “Open macro in Editor” where you can see the contents of the user script along with the appended function definitions that provide parallelism.

Finally, to start the job, right click on it and select the “Start job” item from the context menu.

Inspecting progress

There are two ways to inspect the progress of a job.

The first is to look at the “Status” of a job. This shows whether the job is running on the HPC Cluster or not. In the case of Figure 7, the job is “Queued”.

However, this is a very coarse-grained view of the job’s progress; once the job starts running, the status does not provide any useful information until the job has ended (“Finished”, “Failed”, etc.).

The second way is to open the “Job dashboard” for the desired job, either by double clicking the job’s row or by right clicking it and selecting the “Job dashboard” context menu item. Note that the job must be in the “Running” state for this functionality to work; you may open the window earlier, and it will automatically start displaying progress when the state changes.

Select the “Macro Progress” tab and ignore the rest of the tabs for now (see the Job dashboard section for descriptions of the other tabs).

To view the progress, click on the “Macro Progress” tab if it is not already selected (it should be selected by default). Please be patient while the progress is loading. There is a status bar in the lower right corner of the window where you can monitor the process of retrieving the progress from the HPC Cluster (progress is stored in a separate progress file for each compute node on which the job runs).
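
How fine-grained this view is depends on what the script reports. As a hedged sketch, reusing the assumed parReportProgress() helper sketched earlier (an assumed name, not a confirmed API), a loop in “user.ijm” could report per-task progress like this:

  // Illustrative only; parReportProgress() is an assumed helper name.
  for (i = 0; i < 10; i++) {
      // ... one unit of work would go here ...
      parReportProgress(1, (i + 1) * 10);  // task 1 is (i+1)*10 percent done
  }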

Job dashboard

In the “Job dashboard” there are five tabs: “Macro Progress”, “Error output”, “Other output”, “Job directories” and “Data upload”.

  • “Macro Progress” – described in the previous section, Inspecting progress;
  • “Error output” – the error output and warnings redirected live from the HPC Cluster;
  • “Other output” – the redirected standard output from the cluster;
  • “Job directories” – a listing of the job directories (Input, Output and Working); and
  • “Data upload” – a listing of the files that were uploaded.

How to download the results

Once the job has finished, you can right click it and select the “Download result” item, which will now have become available.

When the timer in the “Download” column has finished and the state is “Done”, the files will have been transferred. You can see the downloaded files by right clicking the job and selecting the “Open job sub-directory” item.