
Topic Introduction

Software Tools, Data Structures, and Interfaces for Microscope Imaging

Nico Stuurman and Jason R. Swedlow

Adapted from Live Cell Imaging, 2nd edition (ed. Goldman et al.). CSHL Press, Cold Spring Harbor, NY, USA, 2010.

The arrival of electronic photodetectors in biological microscopy has led to a revolution in the application of imaging in cell and developmental biology. The extreme photosensitivity of electronic photodetectors has enabled the routine use of multidimensional data acquisition spanning space, time, and spectral range in live cell and tissue imaging. These techniques have provided key insights into the molecular and structural dynamics of living biology. However, digital photodetectors offer another advantage: They provide a linear mapping between the photon flux coming from the sample and the electronic signal they produce. Thus, an image presented as a visual representation of the sample is also a quantitative measurement of photon flux. These quantitative measurements are the basis of subsequent processing and analysis to improve signal contrast, to compare changes in the concentration of signal, and to reveal changes in cell structure and dynamics. For this reason, many laboratories and companies have committed their resources to software development, resulting in the availability of a large number of image-processing and analysis packages. In this article, we review the software tools for image data analysis that are now available and give some examples of their use in imaging experiments to reveal new insights into biological mechanisms. In our final section, we highlight some of the new directions for image analysis that are significant unmet challenges and present our own ideas for future directions.

BACKGROUND

A hallmark of scientific experiment is the quantitative comparison of a control condition and a measure of a change or difference after some perturbation. In biology, microscopes are used to visualize the structure and behavior of cells, tissues, and organisms, and to assess changes before, during, or after a perturbation. A microscope collects light from a sample and forms an image, which is a representation of the sample, biased by any contrast mechanisms used to emphasize specific aspects of the sample. For the first 300 years of microscopy, this image was recorded with pencil and paper, and this artistic representation was then shared with others. The addition of photographic cameras on microscopes enabled mass reproduction and substantially reduced, but by no means eliminated, the viewer's bias in recording the image for the first time. Even though the microscope image was directly projected onto the recording medium, the nonlinear response of film to photon flux and its relative insensitivity in low-light applications limited the application of microscopy for quantitative analysis.

DIGITAL IMAGES

What Is a Digital Image?

Microscope digital images are measurements of photon flux across a defined grid or area. They are recorded using either a detector that is an array of photosensitive elements, or pixels, that records a whole field simultaneously, or a single-point detector that is scanned, usually as a raster, across the sample field to create a full image. The recorded value at each pixel in the image is a digitized measurement of photon flux at a specific point and corresponds to the voltage generated by electrons liberated by photons interacting with the detector surface. Computer software is used to display, manipulate, and store the array of measured photon fluxes as what we recognize as a digital microscope image.

The Multidimensional Five-Dimensional Image

Each array of pixels generates a two-dimensional (2D) image, a representation of the sample. However, it is now common to assemble these 2D micrographs into larger entities. For example, a series of 2D micrographs taken at a defined focus interval can be thought of as a single three-dimensional (3D) image that represents the 3D cell under the objective lens. Alternatively, a series of 2D micrographs taken at a single focal position at defined time intervals forms a different kind of 3D image: a time-lapse movie at a defined focal plane. It is also possible to record a focal series over time and to create a four-dimensional (4D) movie. Any of these approaches can be further expanded by recording different contrast methods; the most common, by far, is the use of multiple fluorophores to simultaneously record the concentrations of different molecules. In the limit, this generates a five-dimensional (5D) image. We have chosen to use the singular "image" to emphasize the integrated nature of this data structure and that the individual time points, focal planes, and spectral measurements all are part of a single measurement.

Regardless of the specific details of an experiment, an image actually has all these dimensions, but some are of unitary extent. In the simplest case, the recording of a fluorescence signal from a single wavelength and focus position at a specific time generates a 5D image: The focus, time, and wavelength dimensions all exist but simply have extent 1. Thus, recording more than one fluorophore extends the spectral dimension, just as recording a time series extends the time dimension. In this approach, extents change but the intrinsic dimensionality of the image does not. The advantage of this approach is that it provides a single data structure for all data storage, display, and processing. For example, processing of a 5D image only requires definition of focal planes, time points, and wavelengths. In most cases, a single application, aware of the 5D form of the data file, suffices to handle data of different extents.
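The single-data-structure idea is easy to make concrete. The sketch below is ours, for illustration only; the class name, field layout, and "ZCT" plane ordering are assumptions rather than any particular package's design. It stores every acquisition, from a single snapshot to a full multidimensional series, in one container in which unused dimensions simply have extent 1:

```java
// Minimal 5D image container: X, Y, Z (focus), C (channel), T (time).
// A single snapshot is simply an Image5D with sizeZ = sizeC = sizeT = 1.
public class Image5D {
    public final int sizeX, sizeY, sizeZ, sizeC, sizeT;
    private final short[][] planes; // one 2D plane per (z, c, t) combination

    public Image5D(int sizeX, int sizeY, int sizeZ, int sizeC, int sizeT) {
        this.sizeX = sizeX; this.sizeY = sizeY;
        this.sizeZ = sizeZ; this.sizeC = sizeC; this.sizeT = sizeT;
        this.planes = new short[sizeZ * sizeC * sizeT][sizeX * sizeY];
    }

    // Map (z, c, t) to a plane index; this "ZCT" ordering is a convention
    // that must be recorded in the metadata so readers interpret it correctly.
    public int planeIndex(int z, int c, int t) {
        return z + sizeZ * (c + sizeC * t);
    }

    public short[] getPlane(int z, int c, int t) {
        return planes[planeIndex(z, c, t)];
    }
}
```

Code that processes such an object never needs special cases for 2D, 3D, or 4D data; it simply loops over extents that may happen to be 1.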

One of the most difficult parts of working with these data structures is the lack of a defined nomenclature for referring to the data. Images that sample space only are sometimes referred to as "3D images" or "stacks." Time-lapse images are often referred to as "movies" or "4D images." Time-lapse data can be stored in their original format or compressed and assembled into a single file and stored in proprietary formats (QuickTime, AVI, WAV, etc.). These compressed formats are convenient in that they substantially reduce the size of files and are supported by common presentation software (e.g., PowerPoint and Keynote), but they do not necessarily retain the pixel data in a form that preserves the integrity of the original data measurements. It is important to be aware of the distinction between compression methods that are lossless (i.e., the original data can be restored) and lossy (often much better at reducing storage but losing the ability to restore the original data).

Monochrome versus Color

Microscope images can be either monochrome or color, and it is critical to know the derivation and description of the data in an image to understand what is actually being measured. Monochrome images are single-channel images and are the most direct mapping of the photon flux measurements recorded by the photoelectronic detector. They are used as the basis of more elaborate displays using color to encode different channels or different lookup tables. Color images may be created based on the display of multiple monochrome images; however, images may also be stored as color (e.g., JPEG). Analysis of color images (e.g., images of histology sections) is possible but often starts by decomposing the color image into the individual RGB (red–green–blue) channels and by processing them separately. Analysis based on differences in intensity should be undertaken with caution, as the files that store color images rarely retain the quantitative mapping of photon flux measured by the detector.
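As a sketch of the decomposition step just described (a generic illustration, not any package's API), splitting a packed 24-bit RGB image into its component channels is a matter of bit masking:

```java
// Split packed 0xRRGGBB pixels (e.g., from a decoded JPEG) into separate
// red, green, and blue channel arrays that can be processed independently.
public static int[][] splitRGB(int[] packed) {
    int n = packed.length;
    int[] r = new int[n], g = new int[n], b = new int[n];
    for (int i = 0; i < n; i++) {
        r[i] = (packed[i] >> 16) & 0xff; // red byte
        g[i] = (packed[i] >> 8) & 0xff;  // green byte
        b[i] = packed[i] & 0xff;         // blue byte
    }
    return new int[][] { r, g, b };
}
```

Note that the 8-bit values recovered this way have usually passed through gamma correction and possibly lossy compression, which is exactly why intensity-based analysis of such images deserves caution.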


Bit Depth

When stored in digital form, data from electronic detectors are stored in bits or, more formally, a base-2 representation of the quantitative measurements from the photodetector. By convention, a byte is a sequence of 8 bits and thus can represent numerical values from 0 to 2^8 − 1, or 255. Data that can be represented in this range are referred to as 8-bit data and have a bit depth of 8 bits. Most scientific-grade charge-coupled device (CCD) cameras digitize their data to either 12 bits (data range from 0 to 2^12 − 1, or 4095) or 16 bits (data range from 0 to 2^16 − 1, or 65,535). When stored in computer memory, 8-bit data map easily to a single byte, whereas 12-bit or 16-bit data must be stored in 2 bytes in sequence to properly represent the data. In general, data-acquisition software handles storage of these data without any intervention from the user. However, when moving data between different software programs, unexpected transformations can occur. For example, data recorded with a 12-bit CCD camera will appear as 16-bit data to any visualization or analysis program that reads it. Most microscopy software tools handle this difference properly; however, some (such as Photoshop) display an image assuming a possible dynamic range of 2^16, requiring the user to manually change the display settings. For these reasons, knowing the bit depth of one's data is helpful in understanding what is actually displayed.
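The display problem described above has a simple remedy: rescale the display mapping to the actual data range rather than the full range of the storage type. A minimal sketch (our illustration, not any program's implementation):

```java
// 12-bit camera data occupy values 0-4095 but are stored in 16-bit words.
// A viewer that assumes the full 16-bit range (0-65535) shows them nearly
// black; stretching the measured range onto the display fixes this without
// altering the stored data.
public static int[] toDisplayValues(short[] raw) {
    int max = 1;
    for (short v : raw) max = Math.max(max, v & 0xffff); // unsigned read
    int[] display = new int[raw.length];
    for (int i = 0; i < raw.length; i++)
        display[i] = 255 * (raw[i] & 0xffff) / max; // linear stretch to 8 bits
    return display;
}
```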

Metadata (Nonpixel Data)

A critical component of any microscope digital image is the image "metadata", that is, the nonpixel data that describe the image pixel data or binary data. Metadata can include a large number of measurements and quantities about the image. The most important are the dimensions of the image, the bit depth or number of bits per pixel, and the physical size of each pixel in the sample; there are a large number of others. These are the most basic forms of metadata and largely relate to the image itself. Proper display and analysis of the 5D image described above depends on a known specification that records the extent of each of the dimensions and as much information about sampling intervals, descriptions of spectral properties, imaging contrast modes, etc., as possible. In addition, many imaging files also store information describing the image acquisition, detailing the settings on the microscope used to record the image. In general, image metadata are a critical component of the image and are certainly mandatory for systematic use of software tools for image display and analysis.

Proprietary Formats

Microscope digital images must be written as a file on a file system, and this file is written in a defined specification or file format so that the data can be read by another software program. File formats include facilities to store the image data themselves as well as a selection of image metadata defining aspects of the image and the data-acquisition process. Almost all packages support the concept of 5D data, as described above, but the details of file types used are quite variable: Some store the data in a single file, whereas some use a directory on a file system to store all the individual frames of an image in separate files. Many of these derive from commercial software used to run commercial turnkey acquisition systems. Each commercial software package uses its own proprietary file format, and the rapid growth in imaging products has spawned a large number of image file formats. Our count has identified at least 60 different formats used in biological microscopy. This creates a significant problem for anyone wanting to transfer data between software packages or operating systems.

File Format Tools: Bio-Formats

Knowing the correct image metadata is essential for working with images for visualization and processing. The metadata describe fundamental details such as the size of the image, the position of the origin, and the number of bytes per pixel in the image file, and also can contain critical information about experimental settings (e.g., the optical section thickness); thus, having access to the metadata associated with any image is critical for properly analyzing and viewing microscope images.

Most software packages write image data and metadata in their own format, and this has led to a plethora of image file formats across life sciences microscopy. Many of these derive from commercial software for which each commercial package uses its own proprietary file format; in addition, there are a small number of public projects developing image-processing software for biological microscopy, and, again, many of these have their own file format. This creates a significant problem for anyone wanting to access data written by one software package in another software package; this lack of standardization continues to plague the field. By far, the best way to deal with this problem is to use a publicly available library that can translate a large number of proprietary file formats into a standardized data structure available to essentially any software. This library, called Bio-Formats (http://www.openmicroscopy.org/site/products/bio-formats), is the result of an open project founded at the University of Wisconsin in Madison and by Glencoe Software Incorporated. The library is written in Java and is an open-source and open-development resource publicly available under the GPL (General Public License) (http://www.gnu.org/copyleft/gpl.html). As of this writing (mid-2009), this library reads image metadata and data for 70 different open and proprietary file formats and can be used within ImageJ, MATLAB, and many other popular image-processing programs.
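In practice, reading an arbitrary microscopy file through Bio-Formats takes only a few lines. The sketch below uses the library's ImageReader, which selects the appropriate format reader automatically; the file name is a placeholder, and minor API details may differ between Bio-Formats releases:

```java
import loci.formats.ImageReader;

public class ReadAnyFormat {
    public static void main(String[] args) throws Exception {
        ImageReader reader = new ImageReader();
        reader.setId("cells.lsm"); // any supported open or proprietary format
        // The 5D shape discussed above, recovered from the file's metadata:
        System.out.println("X x Y: " + reader.getSizeX() + " x " + reader.getSizeY());
        System.out.println("Z/C/T: " + reader.getSizeZ() + "/"
                + reader.getSizeC() + "/" + reader.getSizeT());
        byte[] firstPlane = reader.openBytes(0); // raw pixels of plane 0
        reader.close();
    }
}
```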

Standardized Formats

The panoply of microscopy image file formats creates problems for the scientist who requires data written with one software package to be read by another software package. Moreover, as imaging becomes more of a mainstay of systems biology, the development and application of new processing algorithms is hampered by the burden of supporting many different file formats. For these reasons, a standardized format that can be used by all software tools is required. This format must capture the image metadata using a commonly accepted specification and also store the binary data (the actual pixels) in a commonly accessible form.

In 2005, the Open Microscopy Environment (OME; http://openmicroscopy.org) proposed an XML (Extensible Markup Language)-based metadata specification known as the OME Data Model (Goldberg et al. 2005) (http://ome-xml.org). This specification provides a mechanism for describing most common modes of microscopy (fluorescence, phase, differential interference contrast) and can record imaging system parameters for most imaging methods (wide field, confocal, multiphoton). The metadata and binary image data can be written as a stand-alone format (OME–XML) that captures a full 5D image in a single XML document with the binary image data stored as compressed base-64. This is convenient for transport but performs poorly for analysis and visualization. An alternative is to use the metadata specification in OME–XML and to store this in the header of a TIFF (Tagged Image File Format) file. This hybrid file format is known as OME–TIFF and has a number of distinct advantages:

• Image planes are stored within one multipage TIFF file or across multiple TIFF files. Any image organization is feasible.

• A complete OME–XML metadata block describing the image is embedded in each TIFF file's header. Thus, even if some of the TIFF files in a 5D image are misplaced, the metadata remain intact.

• The OME–XML metadata block in an OME–TIFF file is identical to the metadata contained in an OME–XML file and can refer to pixel data stored in single or multiple TIFF files.

• The only conceptual difference between OME–XML and OME–TIFF is that instead of encoding pixels as base-64 chunks within the XML, as OME–XML does, OME–TIFF uses the standard TIFF mechanism for storing one or more image planes in each of the constituent file(s).

• Storing image data in TIFF is a de facto standard, and essentially all image-handling software can read TIFF-based formats; thus, adoption and integration of OME–TIFF is straightforward.
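To give a sense of how OME–TIFF export looks in code, the hedged sketch below writes a small 5D image with Bio-Formats; the helper calls follow the loci.formats API as we understand it, and argument details may vary between releases:

```java
import loci.common.services.ServiceFactory;
import loci.formats.MetadataTools;
import loci.formats.meta.IMetadata;
import loci.formats.out.OMETiffWriter;
import loci.formats.services.OMEXMLService;

public class WriteOmeTiff {
    public static void main(String[] args) throws Exception {
        int x = 256, y = 256, z = 5, c = 2, t = 3;
        // Build the OME-XML metadata block that will live in the TIFF header.
        OMEXMLService service =
                new ServiceFactory().getInstance(OMEXMLService.class);
        IMetadata meta = service.createOMEXMLMetadata();
        MetadataTools.populateMetadata(meta, 0, "demo", false, "XYZCT",
                "uint16", x, y, z, c, t, 1);
        OMETiffWriter writer = new OMETiffWriter();
        writer.setMetadataRetrieve(meta);
        writer.setId("demo.ome.tif");
        byte[] plane = new byte[x * y * 2]; // one blank uint16 plane
        for (int p = 0; p < z * c * t; p++)
            writer.saveBytes(p, plane); // real code writes acquired planes
        writer.close();
    }
}
```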


A full description of the OME Data Model (Goldberg et al. 2005) may be obtained at the website http://ome-xml.org. OME–XML and OME–TIFF are supported by a number of commercial software providers and can be used as an export format from ImageJ and MATLAB (see below).

DATA ACQUISITION

Acquiring a Digital Image

Digital microscope images can be acquired either by projecting the magnified image of the object directly onto the camera or by scanning a sample and measuring light intensity at each spatial element using a fast detector such as a photomultiplier tube or avalanche photodiode. In both cases, intensity data end up in computer memory, organized in rows and columns, from where they can be displayed, stored, or otherwise manipulated. When using scanning for image formation, control of the scanning device (which often consists of mirrors mounted on galvanometers) needs to be coordinated with the timing of the intensity measurements such that the computer knows which intensity measurement belongs to which spatial element.
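The bookkeeping for a point-scanning system can be sketched in a few lines (a generic illustration; real systems must also handle bidirectional scanning, flyback, and timing jitter):

```java
// Detector samples arrive as a 1D stream; the software maps each sample to
// its spatial element using the known raster geometry.
public static short[][] assembleRaster(short[] stream, int width, int height) {
    short[][] image = new short[height][width];
    int n = Math.min(stream.length, width * height);
    for (int i = 0; i < n; i++) {
        image[i / width][i % width] = stream[i]; // slow axis / fast axis
    }
    return image;
}
```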

Maximizing Information Content

The information content of the acquired images is determined by the intensity of the captured photon flux (signal), noise in the signal, and noise in the detection system. Noise in the signal itself is called shot noise and increases with increasing signal intensity. However, because shot noise grows only as the square root of the signal strength, increasing the signal strength will increase the signal-to-noise ratio (SNR). Noise in the detection system most often consists of a signal-independent constant (lowering this constant is worth a premium!) and possibly a component linearly dependent on signal strength. Therefore, detector-system-generated noise can also be overcome by increasing the signal strength. These considerations make it desirable to acquire images in which the maximum detected signals are close to the saturation point of the detector. It is important that the saturation of the detector not be surpassed, as such clipping will result in an unreliable representation of the object. So, how does one go about increasing the detected signal?

The obvious way to do so is to increase the photon flux itself, which can usually be accomplished by increasing the intensity of the illumination source. One can also increase the integration time on the detector (by increasing the exposure time of the camera or by scanning more slowly in a raster-scanning system). A drawback of both methods is the increased photon dose on the sample, which increases the likelihood of photon-induced damage and photobleaching. Moreover, increasing integration time might lead to motion blur in live samples. Thus, the information content of live cell images is almost always limited by sample-imposed constraints on photon dose. Another approach is to sacrifice spatial resolution, for instance, by binning multiple pixels on a camera. In all cases, it is essential to optimize the light path between the sample and the detector so that minimal light loss is incurred. This includes using objective lenses with high numerical aperture in fluorescence imaging so that a larger fraction of the light emitted by the sample is collected, reducing optical aberrations such as spherical aberration, and optimizing the design of optical filters and/or other elements to maximize light throughput.
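The arithmetic behind these recommendations is worth making explicit. Under a simple noise model (our illustration: Poisson shot noise plus a constant read noise, ignoring dark current and any signal-dependent detector noise), SNR grows as the signal grows:

```java
// Shot noise follows Poisson statistics and grows as sqrt(signal), so SNR
// improves with signal even though the total noise rises.
public static double snr(double signalElectrons, double readNoiseElectrons) {
    double shotNoiseSq = signalElectrons; // Poisson variance equals the mean
    double totalNoise = Math.sqrt(shotNoiseSq
            + readNoiseElectrons * readNoiseElectrons); // add in quadrature
    return signalElectrons / totalNoise;
}
// snr(100, 5)   is about  8.9
// snr(10000, 5) is about 99.9 : a 100x signal increase buys roughly 10x SNR
// once shot-noise limited; binning 2x2 pixels quadruples per-pixel signal.
```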

Autofocus

When images are taken at multiple positions, and/or when there is focus drift during the time frame of acquisition (caused, for instance, by slight changes in environmental temperature), an autofocus mechanism is needed. There are two principal methodologies for autofocus used in microscopy: image-based autofocus and reflection-based autofocus. In image-based autofocus, a series of pictures is taken at different focus positions, and the position of best focus is calculated based on image content (Rabut and Ellenberg 2004). The methodology consists of assigning a focus factor to each acquired image, for instance, by finding maximum intensity, sharp edges, or using other image-analysis techniques, and then estimating the position of best focus from the relation of focus position and focus factor. In reflection-based autofocus systems, a light beam (often infrared) is directed through the objective toward the sample. This light beam is reflected back (in the case of air objectives, mainly from the air–coverslip interface; for oil-immersion objectives, from the coverslip–cell medium interface) and detected by a sensor. By combining measurement of this reflection signal with movement of the focus stage, it is possible to find such a reflecting surface and/or to lock onto that surface (employing a direct feedback loop between sensor and focus stage). Currently available implementations differ in their ability to apply an offset between the point of optical focus and the reflecting surface, as well as in the speed of the feedback loop between sensor and focus drive. Image-based autofocus is often much slower than reflection-based autofocus and also results in exposure of the sample to illuminating light (rather than the more benign and lower-dose infrared light used in reflection-based autofocus), which can result in phototoxicity and photobleaching. However, when the distance between sample and surface varies, reflection-based autofocus by itself cannot be satisfactory. Development of robust autofocus mechanisms has driven automated high-content/high-throughput image-based screening technologies, and the increasing availability of autofocus on research microscopes is fueling automated high-throughput approaches in live cell imaging.
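A minimal sketch of image-based autofocus follows (our illustration: the Stage and Camera interfaces are hypothetical placeholders for real device control, and intensity variance stands in for the many possible focus metrics):

```java
interface Stage  { void moveTo(double z); }
interface Camera { short[] snap(); }

public class ImageBasedAutofocus {
    // Score each z position with a contrast metric and keep the best.
    public static double bestFocus(Stage stage, Camera cam,
            double zStart, double zStep, int steps) {
        double bestZ = zStart, bestScore = -1;
        for (int i = 0; i < steps; i++) {
            double z = zStart + i * zStep;
            stage.moveTo(z);
            double score = variance(cam.snap());
            if (score > bestScore) { bestScore = score; bestZ = z; }
        }
        return bestZ; // production code refines this with a fit around the peak
    }

    static double variance(short[] img) {
        double sum = 0, sumSq = 0;
        for (short v : img) { double d = v & 0xffff; sum += d; sumSq += d * d; }
        double mean = sum / img.length;
        return sumSq / img.length - mean * mean;
    }
}
```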

Computer Control of Imaging Devices and Peripherals

Current research microscopes are equipped with a multitude of computer-controllable components such as x–y stages, focus drives, shutters, filter wheels, etc. These motorized components make possible the fully automated acquisition of 5D images. First, the user specifies a protocol that defines the desired sequence of component movements and image acquisitions. For instance, to obtain a 5D image using a camera-equipped wide-field epifluorescence microscope, software will move the x–y stage to the desired position, move the z drive to the start position, put the desired dichroic mirror and excitation and emission filters in place, open the correct shutter, start exposure of the camera, close the shutter, read out the image from the camera, move another dichroic mirror and filters into place, open the shutter, start camera exposure, close the shutter, read out the image, move the z drive to the next position, and so on. It is the task of acquisition software to carry out such sequences of instrument-control events as quickly and as reproducibly as possible, yet be flexible and easy to configure for the end user.
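The sequence narrated above maps naturally onto nested loops. In the compact sketch below, the device interfaces are hypothetical placeholders, and the Image5D container from the earlier sketch is reused; real acquisition software wraps each step in synchronization, error handling, and timing control:

```java
interface ZDrive          { void moveTo(double z); }
interface ChannelSelector { void select(int channel); } // dichroic + filters
interface ShutteredCamera { short[] exposeAndRead(); }  // shutter, expose, read

public class AcquireStack {
    // Acquire one time point of a multichannel z-stack.
    static void acquireZC(ZDrive zDrive, ChannelSelector channels,
            ShutteredCamera camera, Image5D dest, int t,
            double zStart, double zStep) {
        for (int z = 0; z < dest.sizeZ; z++) {
            zDrive.moveTo(zStart + z * zStep);
            for (int c = 0; c < dest.sizeC; c++) {
                channels.select(c);
                short[] plane = camera.exposeAndRead();
                System.arraycopy(plane, 0, dest.getPlane(z, c, t),
                        0, plane.length);
            }
        }
    }
}
```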

Software Tools for Image Acquisition

Many microscope image-acquisition software packages are available. In those cases where the interface to the equipment is proprietary, acquisition software can only be provided by the hardware vendor. This is currently almost universally the case with commercial scanning (confocal) microscopes (a notable exception is the software ScanImage from the group of Karel Svoboda at Janelia Farm). Camera-based systems are often more open, and interface descriptions are available for many motorized microscopes and scientific-grade cameras, as well as for most peripheral equipment such as shutters, filter wheels, and stages. Accordingly, a number of third-party software packages for camera-based microscope image acquisition are available (examples are MetaMorph from Molecular Devices, Image-Pro Plus from Media Cybernetics, Volocity from Improvision/Perkin-Elmer, and SlideBook from Intelligent Imaging Innovations, Inc.). In addition, there is a growing trend among microscope companies to produce software that only works with their own microscopes (examples are AxioVision from Carl Zeiss and NIS-Elements from Nikon).

In all cases, the number of different types of microscope equipment supported by each software package is limited because of the large variety of such equipment and the lack of computer interface standards. Also, with very few exceptions, it is impossible for third parties to add support for a microscopy-related device to an existing software package. Most of the software packages only work within the Microsoft Windows operating system, and programmatic access to them is usually highly limited, causing advanced imaging laboratories to forgo these packages and code their own in programming environments such as LabVIEW. Because of these constraints, it is typical in research-laboratory environments to have a different software package run each microscope system, drastically increasing training requirements and frustrating researchers, who spend more time becoming familiar with a particular user interface than learning the principles of operating a microscope. Because of the low volume of sales in comparison to consumer software, the price of image-acquisition software is relatively high ($5000–$15,000), and quality can be disappointing.

To alleviate these issues, a group (including one of the authors) in the laboratory of Ron Vale at the University of California, San Francisco started development of open-source software for microscope control named µManager (http://micro-manager.org) in 2005. The software is cross-platform (runs on Windows, Mac OS X, and Linux), has a simple user interface, and allows for programmatic extensions in many different ways. It now supports a large number of devices used in camera-based microscopy imaging. Most importantly, µManager has an open programming interface to devices, allowing anyone to add support for a novel device to µManager. A significant number of such "device adapters" have already been contributed by third parties. This programming interface also allows for programmatic abstraction of devices such as cameras, shutters, and stages so that programs or scripts written for the µManager Core work with all hardware supported by µManager, greatly expanding their usefulness. Not only does this give advanced-imaging laboratories a way to make their developments quickly available to other scientists, but it also gives microscopists a common interface for microscope equipment that will hopefully grow into an industry-wide standard, reducing the cost and increasing the quality of software for microscopy.
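To give a flavor of this device abstraction, the sketch below snaps an image through µManager's Core API; the calls shown are part of the CMMCore interface as we understand it, and the configuration file name is a placeholder:

```java
import mmcorej.CMMCore;

public class SnapWithMMCore {
    public static void main(String[] args) throws Exception {
        CMMCore core = new CMMCore();
        core.loadSystemConfiguration("MMConfig_demo.cfg"); // defines devices
        core.setExposure(50.0); // ms; identical call for any supported camera
        core.snapImage();       // trigger an exposure
        Object pixels = core.getImage(); // array type depends on the camera
        System.out.println("Snapped " + core.getImageWidth() + " x "
                + core.getImageHeight() + " image");
    }
}
```

Because the script addresses the abstract camera rather than a specific model, it runs unchanged on any hardware µManager supports.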

IMAGE PROCESSING

Image-Processing Fundamentals

Image processing is a broad term that encompasses a large number of different kinds of image manipulation. In general, image-processing routines can be characterized as either linear or nonlinear. Linear approaches retain the proportionality of relative intensities, whereas nonlinear approaches may change the relative intensities. For this reason, intensity-based measurement and analysis can only be performed after processing with linear approaches. However, nonlinear approaches are often quite useful for segmenting images to create masks that can be used to define the boundaries of objects for signal quantification.

Nonlinear Contrast Enhancement

There is a range of nonlinear approaches for enhancing contrast in images; the best comprehensive resource for these schemes is John Russ's thorough presentation of image processing (Russ 2007). They range from edge-detection schemes to convolution and other filtering techniques. It is important to distinguish between approaches that only work on 2D images and those that consider the full 3D volume. In general, these approaches are used either to generate a representation of the data that is easier to appreciate visually or to generate masks to use for further quantification of the original image.
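A representative member of this filtering family is 3x3 kernel convolution, sketched below for 2D (our illustration; borders are skipped for brevity). With a smoothing kernel the operation is linear; replacing the weighted sum with, say, the median of the neighborhood makes it nonlinear:

```java
// Apply a 3x3 convolution kernel to a 2D image (border pixels left at 0).
public static double[][] convolve3x3(double[][] img, double[][] kernel) {
    int h = img.length, w = img[0].length;
    double[][] out = new double[h][w];
    for (int y = 1; y < h - 1; y++)
        for (int x = 1; x < w - 1; x++) {
            double sum = 0;
            for (int ky = -1; ky <= 1; ky++)
                for (int kx = -1; kx <= 1; kx++)
                    sum += img[y + ky][x + kx] * kernel[ky + 1][kx + 1];
            out[y][x] = sum;
        }
    return out;
}
// e.g., uniform smoothing: all nine kernel weights equal to 1.0 / 9.0
```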

Deconvolution

Deconvolution techniques make use of knowledge of the point-spread function of the microscope to improve the SNR and contrast in an image. There is a wide variety of deconvolution techniques that have been implemented in most open and commercial image-processing packages. In general, deconvolution approaches can be classified as either deblurring techniques or restoration techniques. Deblurring techniques use the point-spread function to estimate blurring and subtract the blurring from the original image. By contrast, restoration techniques also use the point-spread function but attempt to calculate the distribution of intensity in the sample based on the point-spread function. Restoration techniques are usually iterative calculations and therefore require substantial processing power. Full descriptions of these techniques and their applications in biological microscopy are available (Swedlow et al. 1997; Wallace et al. 2001).
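To make the idea of iterative restoration concrete, here is a hedged 1D sketch in the Richardson-Lucy style, one classic restoration algorithm (not necessarily what any given package implements). It assumes a symmetric, normalized PSF, so convolution and correlation coincide; real implementations work in 2D/3D, usually via FFTs, and add regularization:

```java
public class RichardsonLucy1D {
    public static double[] restore(double[] observed, double[] psf, int iters) {
        int n = observed.length;
        double[] estimate = new double[n];
        java.util.Arrays.fill(estimate, 1.0); // flat starting estimate
        for (int it = 0; it < iters; it++) {
            double[] blurred = convolve(estimate, psf); // forward model
            double[] ratio = new double[n];
            for (int i = 0; i < n; i++)
                ratio[i] = observed[i] / Math.max(blurred[i], 1e-12);
            double[] correction = convolve(ratio, psf); // psf assumed symmetric
            for (int i = 0; i < n; i++) estimate[i] *= correction[i];
        }
        return estimate;
    }

    // Convolution with a symmetric kernel (edges truncated).
    static double[] convolve(double[] f, double[] psf) {
        int n = f.length, half = psf.length / 2;
        double[] out = new double[n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < psf.length; j++) {
                int k = i + j - half;
                if (k >= 0 && k < n) out[i] += f[k] * psf[j];
            }
        return out;
    }
}
```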

Image-Processing Platforms

A wide variety of commercial and open image-processing platforms is available for processing biological images. Photoshop is a standard image-processing tool that is often used for simple image enhancement, cropping, and color-mapping changes. Photoshop only handles 2D images but is a very powerful application; unfortunately, Photoshop can also be used to substantially change the appearance of an image. It is important to remember that the images being processed are actually data, and the original appearance of the data must be preserved. A very clear and definitive definition of appropriate uses of image processing in cell and developmental biology has been published by The Journal of Cell Biology (Rossner and Yamada 2004).

There are a number of commercial image-processing packages that are dedicated to biological microscopy. These are too numerous to delineate here but are described on most vendors' websites. In almost all cases, they handle multidimensional time-lapse and 3D images (or the 5D image) and also provide sophisticated image-processing and visualization tools. To complicate matters, all commercial image-acquisition software comes bundled with substantial image-processing capabilities. These commercial packages have varying support for file formats beyond their own; the user should carefully compare the ability to move data between these packages.

A very commonly used open platform for image processing in microscopy is ImageJ. ImageJ is an open Java-based program that has a pluggable architecture and, for this reason, has become a popular tool used for most noncommercial image processing and analysis. ImageJ is free to download (http://rsbweb.nih.gov/ij/) and includes an extensive library of plug-in functions that are developed by the community and extend the basic functionality of ImageJ (for instance, the aforementioned acquisition software µManager can run as an ImageJ plug-in). ImageJ can be used for image acquisition, image analysis, image processing, and final production of figures. ImageJ has been maintained by Wayne Rasband (National Institutes of Health [NIH]). There are now a number of efforts going forward to extend the architecture and functionality of ImageJ using modern software programming techniques.

ANALYSIS

Object Definition and Measurement

Analysis of biological image data usually proceeds by identifying the objects to be measured and then actually measuring their properties. Identification of objects is called segmentation. There is a wide range of segmentation tools available, and most image-processing packages offer many different methodologies to define the boundaries of objects. As noted above, it is important to know whether the algorithm used is working in 2D or 3D. A full description of segmentation algorithms is available. In general, segmentation methods define an object based on intensity boundaries but can include more sophisticated measurements. Almost always, the parameters used for segmentation are empirical and defined by a user's knowledge of the objects they want to measure. The end product of segmentation is a defined object in space and/or time with defined boundaries of what is included in the object and what is excluded. Thus, segmentation is the basis for further analysis of object intensities and object tracking.
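The simplest intensity-based segmentation reduces to a threshold applied pixel by pixel, as in the sketch below (our illustration; real pipelines add smoothing, connected-component labeling, and size filters, and must be explicit about whether they operate in 2D or 3D):

```java
// Everything at or above an empirically chosen threshold joins the object
// mask; the mask then drives intensity measurement and tracking.
public static boolean[][] segmentByThreshold(double[][] img, double threshold) {
    int h = img.length, w = img[0].length;
    boolean[][] mask = new boolean[h][w];
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            mask[y][x] = img[y][x] >= threshold;
    return mask;
}
```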

For single-molecule imaging, a standard approach is the fitting of a Gaussian-shaped function to every bona fide molecule measured in the image. This fitted image is then used for further characterization.

For live cell imaging, a very common analysis is the tracking of objects as they move through space across time. A wide variety of tracking algorithms is available. The simplest ones take an object at time t and identify its nearest neighbor at time t + 1. These neighbor-based techniques (e.g., Platani et al. 2002) are appropriate only for situations in which there is a small amount of movement between frames and the objects in the image are fairly sparsely distributed. More sophisticated tracking tools are necessary for more difficult images. One tool that one of us (J.R.S.) has used successfully combines a global-minimization approach and quite accurate gap filling to produce powerful tracking performance (Jaqaman et al. 2008).
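The nearest-neighbor linking step is easily sketched (our illustration; no gap closing or conflict resolution is attempted, which is exactly where the more sophisticated approaches cited above come in):

```java
// Link each object at time t to the closest object at time t + 1.
// Positions are {x, y} pairs; returns, for each object at t, its match at t+1.
public static int[] linkNearest(double[][] posT, double[][] posT1) {
    int[] link = new int[posT.length];
    for (int i = 0; i < posT.length; i++) {
        double best = Double.MAX_VALUE;
        for (int j = 0; j < posT1.length; j++) {
            double dx = posT[i][0] - posT1[j][0];
            double dy = posT[i][1] - posT1[j][1];
            double d2 = dx * dx + dy * dy;
            if (d2 < best) { best = d2; link[i] = j; }
        }
    }
    return link;
}
```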

Object Measurement

Once objects are defined, a wide range of measurements can be performed on them. Most commonly, the intensity contained within an object is measured and then corrected by the background intensity found around the object. More sophisticated analysis includes measurement of the lifetime of the fluorophore within the object. Fluorescence lifetime measurements are quite sensitive to changes in the environment of a fluorophore and are increasingly used for analysis of Förster resonance energy transfer (Wouters et al. 2001).
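Combining a segmentation mask with the background correction described above gives the canonical object-intensity measurement, sketched here (our illustration; production code typically samples the background from a ring around each object rather than from the whole image):

```java
// Integrated object intensity, corrected by the mean background level
// measured outside the mask.
public static double correctedIntensity(double[][] img, boolean[][] mask) {
    double objSum = 0, bgSum = 0;
    int objN = 0, bgN = 0;
    for (int y = 0; y < img.length; y++)
        for (int x = 0; x < img[0].length; x++)
            if (mask[y][x]) { objSum += img[y][x]; objN++; }
            else            { bgSum += img[y][x]; bgN++; }
    double bgMean = bgN > 0 ? bgSum / bgN : 0;
    return objSum - bgMean * objN;
}
```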

A large number of parameters characterizing the shape of an object and the distribution of intensity within it can also be calculated. Shape parameters include measurements of elongation, skew, kurtosis, and deviations from an ideal circle. The distribution of intensities within an object is often referred to as texture and can be calculated by fitting classes of polynomials to the object. There is a large number of different types of features that can be calculated for each object, and, often, defining the calculations to be performed is an empirical task of identifying which derived parameters can be used in an image-based measurement.

Machine Learning Methods

Over the past 10 years, a number of groups have applied well-established machine learning tools to the distribution of signals within microscope images. The first studies by the Murphy laboratory demonstrated the potential of this approach by identifying subcellular distributions that humans could not even visualize (Boland and Murphy 2001). Since then, a number of groups have helped develop these concepts and have applied them to a large number of biological problems from single-cell analysis to histological sections to high-content screening (Neumann et al. 2006; Zhou and Peng 2007; Orlov et al. 2008). All of these methods calculate a large number of intensity, texture, and shape-based features and then use various well-established methods for defining which features can discriminate between multiple different classes. These tools can potentially be quite powerful and can be used to define classes that might not be obvious to the user.

Tools for Image Analysis

There is a large number of commercial and open packages for multidimensional image analysis. ImageJ is an open framework that includes plug-ins for quite powerful image analysis. In addition, the MATLAB and IDL (Interactive Data Language) frameworks provide scripting environments for sophisticated image analysis. All of these tools are commonly used in modern biological research. Again, these very powerful tools should always be used with caution to ensure that no perturbations are introduced into the data.

Critical Applications for Image Analysis

Image processing and analysis are now ubiquitous in cell and developmental biology. We have described the basic tools that are available in most commercial and open image-processing packages and used for most analysis. However, there are a number of much more sophisticated tools that deserve mention as examples of what is possible when careful image acquisition and advanced mathematics are brought to bear.

The first example is fluorescent-speckle microscopy. This approach is used to characterize the dynamics of polymers, especially the cytoskeletal systems in living cells (Danuser and Waterman-Storer 2006). The approach has been especially powerful for analyzing the dynamics of microtubules and actin in the mitotic spindle and during cell motility. These methods depend on the use of advanced image acquisition and object identification and tracking. The Danuser and Waterman laboratories have pioneered the application of these techniques and have revealed substantial new understanding of the properties and dynamics of the cytoskeleton.

One of the most powerful uses of live imaging is the measurement of the dynamics of molecules or macromolecular complexes in living cells. With the maturation of laser-scanning confocal microscopes, it became routine to monitor the movement of fluorescently labeled molecules using fluorescence recovery after photobleaching (FRAP). This method was first applied to measure the dynamics of molecules in 2D monolayers (Axelrod et al. 1976), but it has been extended to analyze molecular dynamics in a wide variety of cells and tissues (Lippincott-Schwartz et al. 2001). The method measures the recovery of fluorescent protein into a bleached region. The most commonly reported values are the t1/2 of the recovery and the mobile fraction, that is, the fraction of the labeled protein that is mobile on the timescale being measured. More sophisticated analyses can reveal the characteristics of binding of the fluorescent protein to its receptors and the presence of multiple forms of the protein with distinct mobilities (Braga et al. 2004; Rabut et al. 2004; Sprague and McNally 2005). These approaches provide a quantitative basis to compare molecular association in living cells.
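Assuming the common single-exponential recovery model F(t) = F0 + (Finf − F0)(1 − e^(−kt)) (an assumption of this sketch, not a claim about any cited study), the two standard readouts follow directly from the fitted parameters:

```java
// fPre: pre-bleach intensity; f0: intensity just after the bleach;
// fInf: recovery plateau; k: fitted rate constant (1/s).
public static double[] frapReadouts(double fPre, double f0,
        double fInf, double k) {
    double mobileFraction = (fInf - f0) / (fPre - f0); // fraction that recovers
    double tHalf = Math.log(2) / k;                    // time to half recovery
    return new double[] { mobileFraction, tHalf };
}
```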

The use of imaging for large-scale assays is becoming more and more routine. A large number of small interfering RNA screens have revealed a substantial number of new components of critical cellular machinery and new pathways and networks. Signaling and motility have been examined in Drosophila S2 cells (Ramadan et al. 2007), and the machinery that is responsible for driving the function of the mitotic spindle has been extensively examined using genome-wide knockdown in Drosophila and in human cells (Bettencourt-Dias et al. 2004; Neumann et al. 2006; Goshima et al. 2007). These studies have revealed a substantial number of new components of the mitotic spindle. In all cases, new combinations of object identification, feature selection, and machine learning have been used to identify new components and pathways. These assays stand as a testament to the kind of sophisticated analysis that is possible when advanced image processing is brought to bear in biological imaging.

IMAGE DATA MANAGEMENT

In our own laboratories, it is quite common for a single graduate student or postdoctoral fellow to generate >500 GB of images, and, inevitably, they find themselves struggling to manage the collection of image data files associated with their experiments. For this reason, image data management has become a critical issue that can hinder biological discovery. Data can be organized on the file systems that are in common use in all laboratories or by using specific data-management applications. In this section, we review these facilities and their strengths and weaknesses.

Data Management on the File System

Storing data on the file system is perhaps the simplest way to manage large volumes of image data. It allows users to make use of very familiar tools such as file names and directories to organize and track data. For sophisticated multistep analysis workflows, simply adding specified monikers to file names or putting results files in specific directories is often sufficient to identify data files. However, file-system-based data management does not usually allow multiple users to visualize and, for example, annotate and analyze data on the file system, simply because the tools that are needed to track who does what to each file are not available. Most importantly, access to the data depends on directly accessing the file system, which may not be possible for users or collaborators who do not have sufficient privileges or who are not within a laboratory's firewall.

Data-Management Applications

An alternative to using the file system involves building applications for the management of large-scale data sets from imaging experiments. In general, these applications are based on a server–client architecture in which an application is built and runs on a central server and then delivers information out to client applications connecting over the network. Most commonly, those client applications are web browsers, but image clients can certainly be built in any programming language. The use of this type of architecture enables a single server application to serve multiple users and to provide remote access in a flexible and easily accessible way.

Server–client applications are usually built as a series of layers, often in a so-called three-tier architecture. The foundation, or bottom, of the application is a database application that stores data and the links between different data elements using a relational database. At the next level, a middleware application provides the definitions and the policies to communicate with external clients, thus delivering the relational database to the so-called front-end or client applications. These provide a user-facing tool that enables the use of the data held in the database. The server–client relationships are at the heart of most data-management applications, and the overall design and tools to build them are well standardized and are available quite freely as open-source applications.

Both authors have been involved in software projects that deliver data-management tools for imaging in microscopy. The Scientific Image DataBase (SIDB; http://sidb.sourceforge.net/) was a first-generation open-source data-management tool, built with help from Nico Stuurman, that uses PHP scripts communicating with a PostgreSQL database and a web-browser-based front-end user interface. This first-generation application showed the usefulness of data management but also demonstrated the difficulty of developing a full-fledged data-management program for images that supports large numbers of file formats and provides facilities for viewing, managing, and analyzing large repositories of image data.

In 2001, Jason R. Swedlow, along with cofounders Ilya Goldberg and Peter Sorger, initiated the founding of the OME (http://openmicroscopy.org). OME is an open, grant-funded software development project to build data-management tools for life sciences imaging. OME has released the Bio-Formats library described above as well as a series of applications for image data management in the biological sciences. The basic principle of OME is to develop tools that allow interoperability between software applications. Rather than build specific applications for any scientific domain, OME focuses its efforts on applications that are as generic as possible and serve as interfaces between existing software tools or provide interfaces for future tools to use. In addition, OME releases specifications for common file formats to promote data sharing between individuals and individual software applications.

Since 2000, OME has released data-management applications to support working with large numbers of images and large image sets as well as supporting interfaces for analysis (Swedlow et al. 2009). From 2000 to 2005, OME released the OME Server, a Perl application for image data management. Since 2007, OME has released Open Microscopy Environment Remote Objects (OMERO), a Java Enterprise application for image-data management. All OME data-management applications rely on a server–client architecture in which a database and middleware application provide all of the storage, referencing, and access to image data and metadata and allow connections to the server from client applications running on the user's desktop or laptop computers using a standard Internet connection. OMERO is built to provide a rapid, extensible, and scalable data-management solution that supports access from a wide variety of programming and application environments. Based on Java, OMERO runs on all major operating systems, is now running in hundreds of laboratories worldwide, and is available for download at http://openmicroscopy.org.

CONCLUSION

Modern microscopy, and live cell imaging in particular, requires the use of software tools for data acquisition, processing, and analysis. In many cases, a result from an imaging experiment is only apparent after a significant number of steps of processing and analysis. The technology available to the modern microscopist is developing rapidly, and the arrival of open software tools will provide much more flexible environments for scientific investigation and discovery. There is an ongoing evolution across the whole domain of commercial and open imaging software tools, allowing for increasingly sophisticated experiments and deeper insights into cellular mechanisms and pathways. We look forward to these advances and the discoveries they will enable.

ACKNOWLEDGMENTS

Work by N.S. on µManager is supported by grant 1R01RB007187 from the NIH. Work in the J.R.S. laboratory is supported by the Wellcome Trust (085982), Cancer Research UK (C303/A5434), and the Biotechnology and Biological Sciences Research Council (BB/G01518X/1).

REFERENCES

Axelrod D, Koppel DE, Schlessinger J, Elson E, Webb WW. 1976. Mobility measurement by analysis of fluorescence photobleaching recovery kinetics. Biophys J 16: 1055–1069.

Bettencourt-Dias M, Giet R, Sinka R, Mazumdar A, Loc WG, Balloux F, Zafiropoulos PJ, Yamaguchi S, Winter S, Carthew RW, et al. 2004. Genome-wide survey of protein kinases required for cell cycle progression. Nature 432: 980–987.

Boland MV, Murphy RF. 2001. A neural network classifier capable of recognizing the patterns of all major subcellular structures in fluorescence microscope images of HeLa cells. Bioinformatics 17: 1213–1223.

Braga J, Desterro JM, Carmo-Fonseca M. 2004. Intracellular macromolecular mobility measured by fluorescence recovery after photobleaching with confocal laser scanning microscopes. Mol Biol Cell 15: 4749–4760.

Danuser G, Waterman-Storer CM. 2006. Quantitative fluorescent speckle microscopy of cytoskeleton dynamics. Annu Rev Biophys Biomol Struct 35: 361–387.

Goldberg IG, Allan C, Burel J-M, Creager D, Falconi A, Hochheiser HS, Johnston J, Mellen J, Sorger PK, Swedlow JR. 2005. The Open Microscopy Environment (OME) Data Model and XML file: Open tools for informatics and quantitative analysis in biological imaging. Genome Biol 6: R47.

Goshima G, Wollman R, Goodwin SS, Zhang N, Scholey JM, Vale RD, Stuurman N. 2007. Genes required for mitotic spindle assembly in Drosophila S2 cells. Science 316: 417–421.

Jaqaman K, Loerke D, Mettlen M, Kuwata H, Grinstein S, Schmid SL, Danuser G. 2008. Robust single-particle tracking in live-cell time-lapse sequences. Nat Methods 5: 695–702.

Lippincott-Schwartz J, Snapp E, Kenworthy A. 2001. Studying protein dynamics in living cells. Nat Rev Mol Cell Biol 2: 444–456.

Neumann B, Held M, Liebel U, Erfle H, Rogers P, Pepperkok R, Ellenberg J. 2006. High-throughput RNAi screening by time-lapse imaging of live human cells. Nat Methods 3: 385–390.

Orlov N, Shamir L, Macura T, Johnston J, Eckley DM, Goldberg IG. 2008. WND-CHARM: Multi-purpose image classification using compound image transforms. Pattern Recogn Lett 29: 1684–1693.

Platani M, Goldberg I, Lamond AI, Swedlow JR. 2002. Cajal body dynamics and association with chromatin are ATP-dependent. Nat Cell Biol 4: 502–508.

Rabut G, Ellenberg J. 2004. Automatic real-time three-dimensional cell tracking by fluorescence microscopy. J Microsc 216: 131–137.

Rabut G, Doye V, Ellenberg J. 2004. Mapping the dynamic organization of the nuclear pore complex inside single living cells. Nat Cell Biol 6: 1114–1121.

Ramadan N, Flockhart I, Booker M, Perrimon N, Mathey-Prevot B. 2007. Design and implementation of high-throughput RNAi screens in cultured Drosophila cells. Nat Protoc 2: 2245–2264.

Rossner M, Yamada KM. 2004. What's in a picture? The temptation of image manipulation. J Cell Biol 166: 11–15.

Russ JC. 2007. The image processing handbook. CRC Press, Boca Raton, FL.

Sprague BL, McNally JG. 2005. FRAP analysis of binding: Proper and fitting. Trends Cell Biol 15: 84–91.

Swedlow JR, Sedat JW, Agard DA. 1997. Deconvolution in optical microscopy. In Deconvolution of images and spectra (ed. Jansson PA), pp. 284–309. Academic Press, New York.

Swedlow JR, Goldberg IG, Eliceiri KW. 2009. Bioimage informatics for experimental biology. Annu Rev Biophys 38: 327–346.

Wallace W, Schaefer LH, Swedlow JR. 2001. A working person's guide to deconvolution in light microscopy. BioTechniques 31: 1076–1097.

Wouters FS, Verveer PJ, Bastiaens PI. 2001. Imaging biochemistry inside cells. Trends Cell Biol 11: 203–211.

Zhou J, Peng H. 2007. Automatic recognition and annotation of gene expression patterns of fly embryos. Bioinformatics 23: 589–596.

Cite this article as: Cold Spring Harb Protoc; 2012; doi:10.1101/pdb.top067504

© 2012 Cold Spring Harbor Laboratory Press