2017-12-04 - Fiji + KNIP hackathon


From Monday, December 4, 2017 through Friday, December 15, 2017, the Max Planck Institute of Molecular Cell Biology and Genetics hosts ~50 developers at the Center for Systems Biology in Dresden, Germany for a hackathon to develop ImageJ2 and Fiji core infrastructure and plugins.

Hackathon chat on Gitter

https://gitter.im/fiji/hackathon_dd_2017


Voluntary hackathon calendar

https://tinyurl.com/ybjcq9qw

Hackathon google doc

https://docs.google.com/document/d/1h4uCt4PAEdeGQQwwVZC73o_Anq6-AI2KZiSzxY8kWng/edit

Hackathon on Twitter (#hackdd17)

https://twitter.com/hashtag/hackdd17?vertical=default&src=hash

Technical Discussions

N5, not HDF5

For more info on N5, check out the GitHub repository at https://github.com/saalfeldlab/n5.

  • feels like HDF5, but stores chunks (blocks) as separate files in the file system
  • is a Java library, but Constantin Pape has already written a C++/Python implementation, z5, which also supports the "zarr" format: https://github.com/constantinpape/z5
  • attributes are stored in an additional JSON file
  • Discussion: should we define a standard now for how data are stored, to prevent the emergence of a zoo of different flavors, as happened with HDF5?
    • how to handle time series where each timestep / angle could have a different image size
    • if we want a general "N5 viewer" for images, we'd have to add calibration data
    • store this information alongside the N5 dataset, because an N5 dataset behaves more like a dataset within an HDF5 file
    • perhaps version the metadata format? Because a duck is not always a duck...
  • why another file format?
    • parallel writes (awesome for clusters with a shared filesystem)
    • there is a special type for label blocks
    • blocks can have a halo
    • the block grid does not need to be filled densely; some blocks can be missing
    • couldn't this just be another flavor of HDF5?
  • are parallel writes to the same block prevented by some kind of lock?
  • the HDF5 team should be included in the discussions so we can learn from their mistakes - there is lots of information on parallel writing of HDF5 files out there
  • try writing an N5 dataset into a FUSE filesystem/file? Could this be a workaround for the many-small-files issue?

Hackathon Progress