Public data sets

Revision as of 16:09, 5 June 2012 by Rueden (Update URLs)

Do you need image data to try your algorithms on? Do you lack expert ground truth to test your methods? No problem! Here is a list of public data sets available from the Fiji community and other sources:

30 sections from a serial section Transmission Electron Microscopy (ssTEM) data set of the Drosophila first instar larva ventral nerve cord (VNC). The microcube measures approximately 2 x 2 x 1.5 microns, with a resolution of 4 x 4 x 50 nm/pixel.
The challenge: use this data set to train machine learning software for the purpose of automatic segmentation of neural structures in ssTEM.
The images are representative of real-world images: there is a bit of noise; there are image registration errors; there is even a small stitching error in one section. None of these caused any difficulty for the expert human neuroanatomist who manually labeled each element in the image stack. A software application that aims to remove or reduce human involvement must be able to cope with all of these issues.
See other manually segmented serial section TEM data sets.
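For the ssTEM stack described above, the quoted physical size follows from the voxel size and the number of samples along each axis. The sketch below checks that arithmetic. The function names are illustrative, and the pixel count along x and y is only what the approximate 2 micron figure implies, not a value stated for the data set.

```python
# Voxel size quoted for the ssTEM stack, in nanometres (x, y, z).
VOXEL_NM = (4.0, 4.0, 50.0)
N_SECTIONS = 30  # number of serial sections in the stack


def extent_microns(n_samples, spacing_nm):
    """Physical extent covered by n_samples at the given spacing, in microns."""
    return n_samples * spacing_nm / 1000.0


def samples_for_extent(extent_um, spacing_nm):
    """Number of samples needed to span extent_um microns at spacing_nm."""
    return round(extent_um * 1000.0 / spacing_nm)


# 30 sections at 50 nm each give the 1.5 micron depth quoted above.
z_extent = extent_microns(N_SECTIONS, VOXEL_NM[2])

# Assumption: a 2 micron side at 4 nm/pixel implies roughly 500 pixels.
xy_pixels = samples_for_extent(2.0, VOXEL_NM[0])

print(z_extent)   # 1.5
print(xy_pixels)  # 500
```

The strong anisotropy (50 nm in z against 4 nm in x and y) is worth keeping in mind when designing features for a segmentation method, since neighboring sections are much farther apart than neighboring pixels.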
The data set consists of C. elegans embryos coexpressing tubulin and histone GFP. All image planes were collected at 512x512 resolution in 8-bit grayscale.
4D data set, kindly provided by Dirk Sieger and Francesca Peri, EMBL.
Macrophages in 4D
The challenge: trace the macrophages in 4D, and measure their volumes, surface areas, positions, and pixel intensities.
The file is a TIFF hyperstack that can be loaded directly into the 3D Viewer.
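Once the cells at one time point have been segmented, the measurements named in the challenge can be computed per object. The following is a minimal sketch in plain Python, assuming a label volume stored as nested lists indexed `[z][y][x]` with 0 as background. The function name, the data layout, and the voxel size are hypothetical, and the surface is a crude estimate that sums exposed voxel faces, which overestimates the area of smooth shapes.

```python
from collections import defaultdict


def measure_objects(labels, intensity, voxel_size=(1.0, 1.0, 1.0)):
    """Measure each labeled object in a 3D volume.

    labels and intensity are nested lists indexed [z][y][x]; label 0 is
    background. voxel_size is (dx, dy, dz) in physical units (assumed).
    """
    dx, dy, dz = voxel_size
    nz, ny, nx = len(labels), len(labels[0]), len(labels[0][0])
    # Each neighbor offset paired with the area of the voxel face it crosses.
    steps = [((1, 0, 0), dy * dz), ((-1, 0, 0), dy * dz),
             ((0, 1, 0), dx * dz), ((0, -1, 0), dx * dz),
             ((0, 0, 1), dx * dy), ((0, 0, -1), dx * dy)]
    acc = defaultdict(lambda: {"n": 0, "sum_i": 0.0, "sx": 0.0,
                               "sy": 0.0, "sz": 0.0, "surface": 0.0})
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                lab = labels[z][y][x]
                if lab == 0:
                    continue
                a = acc[lab]
                a["n"] += 1
                a["sum_i"] += intensity[z][y][x]
                a["sx"] += x
                a["sy"] += y
                a["sz"] += z
                for (ox, oy, oz), area in steps:
                    px, py, pz = x + ox, y + oy, z + oz
                    inside = 0 <= px < nx and 0 <= py < ny and 0 <= pz < nz
                    # A face is exposed if the neighbor is outside the
                    # volume or belongs to a different label.
                    if not inside or labels[pz][py][px] != lab:
                        a["surface"] += area
    results = {}
    for lab, a in acc.items():
        n = a["n"]
        results[lab] = {
            "volume": n * dx * dy * dz,
            "surface": a["surface"],
            "centroid": (a["sx"] / n * dx, a["sy"] / n * dy, a["sz"] / n * dz),
            "mean_intensity": a["sum_i"] / n,
        }
    return results


# Toy example: a 2x2x2 block of label 1 inside a 4x4x4 volume.
size = 4
labels = [[[0] * size for _ in range(size)] for _ in range(size)]
intensity = [[[10] * size for _ in range(size)] for _ in range(size)]
for z in (1, 2):
    for y in (1, 2):
        for x in (1, 2):
            labels[z][y][x] = 1

stats = measure_objects(labels, intensity, voxel_size=(1.0, 1.0, 2.0))
print(stats[1])
```

With the anisotropic voxel size used here, the block measures 2 x 2 x 4 units, so the sketch reports a volume of 16 and a surface of 40.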
The key idea is to set up the tracking system so that it is general and requires minimal user interaction. For example, a user clicks on one cell in one time point, and the program should find all other time points of the same cell. Then the program should learn about the statistics of that cell, and offer the means to automatically find all other cells in the volume and track them as well. In summary, a machine learning approach to 4D tracking.
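The "click one cell, find it at all other time points" step can be illustrated with a much simpler baseline than the machine learning approach described above: nearest-centroid linking. The sketch below assumes that cell centroids have already been detected at every time point. The function name, the input layout, and the `max_jump` threshold are hypothetical.

```python
import math


def track_from_seed(centroids_per_frame, seed_frame, seed_index, max_jump):
    """Follow one cell through time by nearest-centroid linking.

    centroids_per_frame: list over time points; each entry is a list of
    (x, y, z) centroids. Returns {frame: index of the linked centroid}.
    The track stops in a direction as soon as no centroid lies within
    max_jump of the previous position.
    """
    track = {seed_frame: seed_index}
    for step in (1, -1):  # forwards in time, then backwards
        frame = seed_frame
        position = centroids_per_frame[seed_frame][seed_index]
        while 0 <= frame + step < len(centroids_per_frame):
            frame += step
            candidates = centroids_per_frame[frame]
            if not candidates:
                break
            best = min(range(len(candidates)),
                       key=lambda i: math.dist(position, candidates[i]))
            if math.dist(position, candidates[best]) > max_jump:
                break
            track[frame] = best
            position = candidates[best]
    return track


# Toy example: two cells over three time points, listed in varying order.
frames = [
    [(0.0, 0.0, 0.0), (10.0, 10.0, 10.0)],
    [(10.5, 10.0, 10.0), (1.0, 0.0, 0.0)],
    [(2.0, 0.5, 0.0), (11.0, 10.0, 10.0)],
]

# The "user click": the second centroid at time point 1.
track = track_from_seed(frames, seed_frame=1, seed_index=1, max_jump=3.0)
print(sorted(track.items()))
```

A baseline like this fails when cells move farther than `max_jump` between time points, or when they touch or divide. Those are the cases where learning the statistics of the clicked cell, as proposed above, would be expected to help.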
  • The Cell Image Library has many images related to cell biology. The images come with one of four licenses: Public Domain, Creative Commons Attribution, Creative Commons Attribution Non-Commercial Share-Alike, and Copyrighted (you need to explicitly ask the contributor whether you may use their image). This website allows you to contribute your own data sets, too!