GeneSapiens was created at an interdisciplinary interface

Published on 2009-07-08

The GeneSapiens database, created through a cooperative effort by researchers at the Institute for Molecular Medicine Finland (FIMM), the VTT Technical Research Centre of Finland and the Tampere University of Technology, contains all available information on the expression of 17,330 human genes in both healthy and diseased tissues.

Sami Kilpinen, M.Sc., a bioinformatician and PhD student at FIMM, is the first author of the paper, which has 17 co-authors representing the University of Helsinki, the Tampere University of Technology and VTT. The team has created a database with which one can quickly and easily investigate the expression of any human gene in 43 normal and 132 diseased human tissues.

Although similar data compilation projects have been carried out before, GeneSapiens has the decisive advantage that data from all studies, tissue types and diseases are fully integrated and can be easily explored through a biologist-friendly user interface at www.genesapiens.org.

"From the database, it is possible to retrieve a single integrated view that indicates how a specific gene is expressed across all human tissues," Kilpinen explains.

Kilpinen got the idea of creating GeneSapiens when he saw how difficult it was to use the gene information in public databases for research purposes.

"The information was created using several different methods, and was scattered. However, this information provides great advantages for researchers. That is why I started to compile gene expression data from all possible sources."

The first task was to harmonise the terminology. Most of this work was carried out by Kilpinen's research colleague Kalle Ojala, M.Sc., together with VTT researchers. Each study had to be examined to establish how it had been conducted and precisely what the researchers meant by the terms they used.

"This was very time-consuming and required a lot of patience," Ojala admits.

Although all the data compiled by Kilpinen had been produced with microarrays, the results were not directly comparable because of differences in the methods and technologies employed. A mathematical model therefore had to be developed to convert data from different platforms into a comparable format.
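
To give a rough idea of what making platforms comparable can involve, here is a minimal sketch of one generic technique: centering each gene's values within each platform and then shifting them to a common level. This is an illustration only and is not the model developed for GeneSapiens, which the article does not describe. The function name, the platform labels, the gene name and the numbers are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def center_by_platform(measurements):
    """Shift each gene's values so every platform shares the same per-gene mean.

    measurements: list of (platform, gene, log_expression) tuples.
    Illustrative only; not the GeneSapiens normalisation model.
    """
    # Collect the values of each gene separately for each platform.
    by_group = defaultdict(list)
    for platform, gene, value in measurements:
        by_group[(platform, gene)].append(value)
    group_mean = {key: mean(vals) for key, vals in by_group.items()}

    # Common target level per gene: the mean of its platform means.
    platform_means = defaultdict(list)
    for (platform, gene), m in group_mean.items():
        platform_means[gene].append(m)
    gene_target = {gene: mean(ms) for gene, ms in platform_means.items()}

    # Remove the platform-specific offset and move to the common level.
    return [
        (platform, gene, value - group_mean[(platform, gene)] + gene_target[gene])
        for platform, gene, value in measurements
    ]


# Hypothetical data: the same gene measured on two made-up platforms.
example = [
    ("platform_A", "GENE1", 8.0),
    ("platform_A", "GENE1", 10.0),
    ("platform_B", "GENE1", 3.0),
    ("platform_B", "GENE1", 5.0),
]
adjusted = center_by_platform(example)
```

After the adjustment, both hypothetical platforms report the same average level for the gene while the differences between samples within each platform are preserved. A simple scheme like this assumes that each platform measured a similar mix of samples, which is one reason real integration work is considerably harder.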

"This part took three years, and it was carried out in cooperation with the Tampere University of Technology, and particularly with researcher Reija Autio," says Kilpinen.

When the project had reached a stage where it was offered for publication, the researchers encountered a lot of prejudice.

"Before this, an integrated database like this has been seen as a kind of unreachable Holy Grail; there are many who have not believed that it would be possible to create such a database."

The journal Genome Biology published the study in September 2008, after the researchers had answered the reviewers' critical questions in detail and assured them of the project's reliability.

After publication, the article quickly became one of the journal's 20 most accessed articles. By June 2009, a total of 770 users from 18 countries had registered to use the database, which is intended for researchers carrying out academic research.

A joint project of biologists and mathematicians

GeneSapiens was a typical multidisciplinary project: it combined biology with mathematics and computer science, and a lot of the early efforts were funded by the Academy of Finland systems biology program.

"In many cases, the expertise of research teams is related to either biology or mathematics, and it is not easy to combine the fields of know-how. FIMM provides the possibility to operate at the interdisciplinary interface just like this," Kilpinen says.

Kilpinen and Ojala are both biologists by education, but they both also have competence in mathematics and computer science. In fact, Kilpinen says that he had to act as a sort of interpreter between the biologists at VTT and the mathematicians at the Tampere University of Technology, since these two groups were not always able to find a common language.

The development process of GeneSapiens also touched upon medicine, but its actual effect will be seen once the project is at the stage of clinical applications. "Of course, the objective is to also improve diagnostics so that the gene information would benefit patient treatment, but we still have a long way to go," says Kilpinen.

Text and photo: Päivi Lehtinen