Simulation of modern gene activity measurements: From microarrays to microscopy (Genevieve)
Project leader: Professor Olli YLI-HARJA
Institute of Signal Processing, Tampere University of Technology http://www.cs.tut.fi/sgn/csb/
Doctoral students of the project:
Daniel Nicorici
Matti Nykter
MSc graduations during the project:
Pekka Ruusuvuori
Jyrki Selinummi
Antti Lehmussola
Jenni Seppälä
Miika Ahdesmäki
Kaisa-Leena Taattola
Sakari Palokangas
Other researchers of the project:
Timo Erkkilä
Xiaofeng Dai
Key words: simulation, microarray, gene activity, measurements, microscopy, cell population
Project description and main results:
Whatever we measure, in biological research or elsewhere, the measurement result is never ideal; it is only a distorted representation of the underlying reality. It is therefore necessary to understand the properties of the measurement system thoroughly, so that as many error sources as possible can be taken into account. In the Genevieve project, we used simulation to study how different measurement systems affect systems biology research.
We selected two commonly used measurement techniques, microarrays and fluorescence microscopy, and studied them in detail by simulation. During the project, we designed data simulators for gene microarrays (cDNA, oligonucleotide) and protein microarrays (lysate). For fluorescence microscopy, we implemented a simulator for images of cell populations, including cell arrays. The simulation approach has several benefits. First, new data can be generated in practically unlimited amounts, free of the financial and time constraints of laboratory experiments. Second, we have total control of the experiments, and a sophisticated simulator can be used in education to demonstrate different features of the measurement systems. Finally, simulation provides ground truth information that can be used to validate analysis methods and to assess the effects of preprocessing and quality control.
Examples of the problems we addressed by simulation:
1. What is the effect of artifacts and imaging noise in different microarray experiments?
2. Does the selected cell enumeration algorithm perform satisfactorily?
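As a minimal illustration of how simulated ground truth can answer a question such as whether a cell enumeration algorithm performs satisfactorily, the following Python sketch generates a synthetic image with a known number of cells and checks a simple counting routine against that number. It is a toy example under stated assumptions, not the method used in the project's software: the function names simulate_cells and count_cells, the disk-shaped cells, the additive Gaussian noise model, the thresholding approach, and all parameter values are hypothetical choices made for this example.

```python
import random


def simulate_cells(width, height, n_cells, radius, noise_sd, seed=0):
    """Return (image, true_count): bright disks on a dark background plus Gaussian noise.

    Assumed toy model: cells are non-touching disks of intensity 1.0.
    """
    rng = random.Random(seed)
    centers = []
    attempts = 0
    min_dist_sq = (2 * radius + 2) ** 2  # keeps disks separated by a background gap
    while len(centers) < n_cells:
        attempts += 1
        if attempts > 10000:
            raise ValueError("could not place all cells; image too small")
        cx = rng.randint(radius + 1, width - radius - 2)
        cy = rng.randint(radius + 1, height - radius - 2)
        if all((cx - x) ** 2 + (cy - y) ** 2 > min_dist_sq for x, y in centers):
            centers.append((cx, cy))

    image = [[0.0] * width for _ in range(height)]
    for cx, cy in centers:
        for y in range(cy - radius, cy + radius + 1):
            for x in range(cx - radius, cx + radius + 1):
                if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                    image[y][x] = 1.0

    # Additive Gaussian noise stands in for imaging noise (an assumption).
    for y in range(height):
        for x in range(width):
            image[y][x] += rng.gauss(0.0, noise_sd)
    return image, len(centers)


def count_cells(image, threshold=0.5, min_area=5):
    """Count 4-connected regions above the threshold that have at least min_area pixels."""
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    count = 0
    for y0 in range(height):
        for x0 in range(width):
            if seen[y0][x0] or image[y0][x0] <= threshold:
                continue
            # Flood fill one connected region and measure its area.
            stack = [(x0, y0)]
            seen[y0][x0] = True
            area = 0
            while stack:
                x, y = stack.pop()
                area += 1
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and not seen[ny][nx] and image[ny][nx] > threshold):
                        seen[ny][nx] = True
                        stack.append((nx, ny))
            if area >= min_area:  # ignore tiny noise specks
                count += 1
    return count


if __name__ == "__main__":
    # Compare the estimated count with the known ground truth at several noise levels.
    for noise_sd in (0.0, 0.1, 0.3, 0.5):
        image, truth = simulate_cells(120, 120, 12, 4, noise_sd, seed=1)
        print(noise_sd, truth, count_cells(image))
```

Because the true count is known by construction, the counting error can be tabulated as a function of the noise level, which is not possible with real images unless they are annotated by hand.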
During the project, three software packages were developed. All three are open source and freely available for download.
Simulation of microarray data and images with realistic characteristics:
http://www.cs.tut.fi/sgn/csb/mamodel/
Computational framework for simulating fluorescence microscopy images with cell populations:
http://www.cs.tut.fi/sgn/csb/simcep/
Software for quantification of labeled cells by automated image analysis:
http://www.cs.tut.fi/sgn/csb/cellc/
We also integrated the main results into Medicel Oy's Integrator platform, and combined a microarray simulation model with a model for simulating genetic regulatory networks from the TEKES Neobio project Dynette. The combination of these models aims at producing realistic data, that is, data with realistic statistical and biological characteristics.
Selected publications:
A. Lehmussola, P. Ruusuvuori, J. Selinummi, H. Huttunen, and O. Yli-Harja. Computational framework for simulating fluorescence microscope images with cell populations. IEEE Transactions on Medical Imaging, 26(7):1010–1016, 2007.
A. Lehmussola, P. Ruusuvuori, and O. Yli-Harja. Evaluating the performance of microarray segmentation algorithms. Bioinformatics, 22(23):2910–2917, Dec 2006.
M. Nykter, T. Aho, M. Ahdesmäki, P. Ruusuvuori, A. Lehmussola, and O. Yli-Harja. Simulation of microarray data with realistic characteristics. BMC Bioinformatics, 7:349, Jun 2006.
J. Selinummi, J. Seppälä, O. Yli-Harja, and J. A. Puhakka. Software for quantification of labeled bacteria from digital microscope images by automated image analysis. BioTechniques, 39(6):859–863, Dec 2005.