Public data sets
Revision as of 17:44, 6 March 2013 by Albertcardona (new FIBSEM data set)
Do you need image data to try your algorithms on? Do you lack expert ground truth to test your methods? No problem! Here is a list of public data sets available from the Fiji community and other sources:
- Segmented ssTEM stack of neural tissue, thanks to Albert Cardona.
  - 30 sections from a serial section Transmission Electron Microscopy (ssTEM) data set of the Drosophila first instar larva ventral nerve cord (VNC). The microcube measures approximately 2 x 2 x 1.5 microns, with a resolution of 4 x 4 x 50 nm/pixel.
  - The challenge: use this data set to train machine learning software for the purpose of automatic segmentation of neural structures in ssTEM.
  - The images are representative of actual images in the real world: there is a bit of noise; there are image registration errors; there is even a small stitching error in one section. None of these led to any difficulties in the manual labeling of each element in the image stack by an expert human neuroanatomist. Software that aims to remove or reduce human intervention must be able to cope with all these issues.
- Sample data at LOCI, in a variety of file formats.
- Sample OME-TIFF data on openmicroscopy.org, thanks to Josh Bembenek.
  - The dataset consists of tubulin histone GFP coexpressing C. elegans embryos. All image planes were collected at 512x512 resolution in 8-bit grayscale.
- Migrating macrophages in response to stimuli.
  - 4D data set, kindly provided by Dirk Sieger and Francesca Peri, EMBL.
  - The challenge: trace the macrophages in 4D, and measure their volumes, surface areas, positions and pixel intensities.
  - The file is a TIFF hyperstack that can be loaded directly into the 3D Viewer.
  - The key idea is to set up the tracking system so that it requires minimal user interaction and is general. For example, a user clicks on one cell at one time point, and the program should find the same cell at all other time points. Then the program should learn the statistics of that cell, and offer the means to automatically find all other cells in the volume and track them as well. In summary, a machine learning approach to 4D tracking.
- The Cell Image Library has many images related to cell biology; the images come with one of four licenses: Public Domain, Creative Commons Attribution, Creative Commons Attribution Non-Commercial Share-Alike, and Copyrighted (you need to ask the contributor explicitly whether you may use their image). This website allows you to contribute your own data sets, too!
- The Open Connectome Project hosts a 10-terabyte data set of the mouse visual cortex (Bock et al., 2011) with CATMAID, and will offer means to run arbitrary programs on the data very soon.
- A 3D electron microscopy dataset of rodent brain, courtesy of Graham Knott and Marco Cantoni at EPFL. Specifically, it is a FIBSEM volume measuring 5x5x5 micrometers, taken from the CA1 hippocampus region of the brain and imaged at the extraordinary resolution of 5x5x5 nanometers per voxel! In addition, Aurelien Lucchi from Pascal Fua's lab has made available, on the same page, image volumes containing labels for mitochondria.
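
To get a feel for how large these volumes are before downloading them, the nominal figures quoted above can be turned into approximate voxel counts. The following is a minimal Python sketch, not part of any of the data set pages: the function name `voxel_counts` is made up for illustration, and the 1 byte per voxel used for the memory estimate is an assumption (only the C. elegans data set is stated to be 8-bit). The actual image files may have somewhat different dimensions than these rounded figures suggest.

```python
def voxel_counts(extent_um, voxel_nm):
    """Approximate number of voxels along each axis.

    extent_um: physical size per axis, in micrometers
    voxel_nm:  voxel size per axis, in nanometers
    """
    # 1 micrometer = 1000 nanometers
    return tuple(round(e * 1000 / v) for e, v in zip(extent_um, voxel_nm))


# Nominal figures from the list above.
sstem = voxel_counts((2, 2, 1.5), (4, 4, 50))   # ssTEM stack of the Drosophila VNC
fibsem = voxel_counts((5, 5, 5), (5, 5, 5))     # FIBSEM volume of CA1 hippocampus

# Total voxels in the FIBSEM volume, and a rough memory footprint
# ASSUMING 1 byte per voxel (8-bit grayscale); the source does not state this.
fibsem_total = fibsem[0] * fibsem[1] * fibsem[2]
fibsem_gib = fibsem_total / 2**30

print("ssTEM  (x, y, z):", sstem)
print("FIBSEM (x, y, z):", fibsem)
print("FIBSEM voxels:", fibsem_total, "approx. GiB at 8 bit:", round(fibsem_gib, 2))
```

The z count for the ssTEM stack comes out at 30, matching the 30 sections mentioned above, which is a quick sanity check that the quoted extent and resolution are consistent with each other.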