In recent years, molecular biology has changed from a single-experiment science into a high-throughput endeavour. Although the genomic revolution is rooted in medicine and biotechnology, it is the environmental, and specifically the marine, sector that currently delivers the largest quantity of data. The ability to readily sequence DNA samples from natural environments, with or without prior cultivation, is an unprecedented resource for investigating microbial diversity and function at the molecular level.
The Microbial Genomics and Bioinformatics Group focuses on developing enabling technologies that transform the wealth of sequence and contextual data into biological knowledge. To reach this goal, we are developing a bioinformatic workbench that integrates the diversity and abundance of organisms with their static and expressed functional potential (their genes) in relation to the prevailing environmental conditions.
An integrated view of the complex interplay between organisms, genes and their surrounding environment is the first step towards the statistical analysis and modelling of complex metabolic processes and networks (ecosystems biology). It will help to reveal the key genes involved in central ecosystem processes and provide hints about their potential functions. The results will not only lead to a better understanding of the marine environment and its impact on human welfare in times of global climate change, but also deliver new targets for medical and biotechnological applications.
Environmental genomics was introduced at MPI-Bremen in 2000 by selecting three environmentally relevant marine bacteria for whole-genome sequencing, annotation and functional analysis. At that time the organisms of interest were two sulfate-reducing bacteria (Desulfotalea psychrophila and Desulfobacterium autotrophicum) and one Planctomycete (Rhodopirellula baltica SH 1T, formerly Pirellula sp. strain 1). The project, named REGX (Real Environmental GenomiX), received six years of funding from the Federal Ministry of Education and Research (BMBF).
Within the project we focussed on technology development and competence building by setting up a bioinformatic analysis pipeline for whole genomes. The pipeline covers gene-finding, automatic annotation, metabolic reconstruction, genome linguistics and comparative genomics. Furthermore, to cope with the increasing amount of data coming from annotation and functional analysis (transcriptomics/proteomics), we started developing new software solutions for visualisation and data integration.
The REGX project was only the starting point. In 2005 we became partners in the Network of Excellence "Marine Genomics Europe". Since then we have processed more than a dozen genomes from environmental organisms, funded by Genoscope, the Moore Foundation and the Max Planck Society.
To identify genes vital for the environmental adaptation of relevant marine microorganisms, Rhodopirellula baltica SH 1T is under continuous investigation. Based on its high-quality genome sequence, whole-genome microarrays have been set up and evaluated for adaptation experiments. Our main research interest is to gain more insight into the potential functions of the large number of hypothetical and conserved hypothetical genes found in this organism. We propose that these genes of as yet unknown function are the key to understanding niche adaptations.
Metagenomics
With the competence gained from the complete genome analysis of cultivated bacteria, we were able to start analysing genomic fragments from uncultured microorganisms, an approach known as metagenomics. Currently, we are involved in several in-house and external collaborations.
In October 2008 we started the MIMAS (Microbial Interactions in MArine Systems) project, funded by the BMBF, to investigate the diversity and function of the microbial community in the North Sea at Helgoland Roads. In addition to metagenomics, metatranscriptomics based on next-generation sequencing technology is now being carried out.
Since July 2009 we have been partners in the EU-funded MAMBA (Marine Metagenomics for new Biotechnological Applications) project. We will provide the bioinformatic backbone for sequence data analysis and for screening for new enzymes for biotechnological applications.
Ecosystems Biology
After nearly a decade of marine molecular research, it is now time to bridge the gap between marine diversity research, marine (meta)genomics and the environment. An integration of (i) data on the diversity and spatio-temporal abundance of microorganisms with (ii) genomic data and (iii) further oceanographic information on the chemistry, physics, geology and biology of the habitat can be readily achieved based on geographic location and sampling time, see www.megx.net. This work was started within the EU project MetaFunctions, coordinated by the Microbial Genomics Group. To achieve these goals, the acquisition and standardisation of contextual data is a prerequisite. To move this forward, the group participates in the international Genomic Standards Consortium. Furthermore, since October 2009 we have been a partner in the EU-funded EuroFleets project, where we will provide expertise in bioinformatics for environmental data integration.
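As an illustration of how geographic location and sampling time can serve as the common key for integration, the following minimal Python sketch attaches environmental measurements to a sequence sample when they were taken close by in space and time. It is not the megx.net implementation; the `Record` type, the `match_context` function and the distance and time thresholds are hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Record:
    """A sample or measurement described by where and when it was taken."""
    label: str
    lat: float   # latitude in decimal degrees
    lon: float   # longitude in decimal degrees
    time: datetime


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points on a sphere."""
    earth_radius_km = 6371.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * earth_radius_km * asin(sqrt(a))


def match_context(sample, measurements,
                  max_km=10.0, max_dt=timedelta(days=1)):
    """Return measurements taken within max_km and max_dt of the sample.

    The thresholds are illustrative assumptions, not recommended values.
    """
    return [
        m for m in measurements
        if haversine_km(sample.lat, sample.lon, m.lat, m.lon) <= max_km
        and abs(m.time - sample.time) <= max_dt
    ]


# Hypothetical example data: one sequence sample and three measurements.
sample = Record("metagenome_A", 54.18, 7.90, datetime(2009, 4, 7, 10, 0))
measurements = [
    Record("temperature_nearby", 54.19, 7.91, datetime(2009, 4, 7, 12, 0)),
    Record("temperature_far_away", 60.00, 5.00, datetime(2009, 4, 7, 12, 0)),
    Record("temperature_too_late", 54.19, 7.91, datetime(2009, 4, 17, 12, 0)),
]
matched = match_context(sample, measurements)
```

Only the measurement that is both nearby and close in time is kept. In practice the thresholds would depend on the parameter in question, since water temperature and sediment geology, for example, vary on very different scales.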
In summary, the cultivation-dependent and cultivation-independent approaches to ecological genomics will provide new, unbiased insights into the genes involved in the processes driven by environmental organisms. In the long run, we hope that this will give us the key to opening the black box of microbial ecology. As technology developers, we create tools and software pipelines that help biologists cope with the flood of data in the life sciences. Turning data into biological knowledge is our primary goal.
Phylogenetic inference has a long tradition within the group. We use and develop the software package ARB for phylogenetic tree reconstruction based on ribosomal RNAs and functional genes.
To cope with the increasing amount of ribosomal RNA data, we have set up the SILVA system for quality-checked, aligned small and large subunit ribosomal RNA sequences. We are pleased that the SILVA project and its databases have become an internationally accepted resource, with around 250,000 hits and more than 40,000 visitors per year.
As a spin-off from SILVA, “The All-Species Living Tree” project provides a manually curated subset designed specifically to serve the microbial taxonomist community. The aim of the project is to reconstruct a single 16S rRNA tree harbouring all sequenced type strains of the hitherto classified species of Archaea and Bacteria.
Furthermore, we offer international workshops on the theory and practical applications of phylogenetic tree reconstruction. To date, more than 500 researchers have participated in these workshops, held in Europe and the US. For more information, please refer to the company Ribocon.
We have set up a complete bioinformatic pipeline for (meta)genome research. It starts with gene-finding, for which we integrated several gene-finders into our metatool MORFind. We have also installed, maintained and adapted the annotation system GenDB. Based on this system, we developed the MicHanThi tool for automatic annotation and JCoast for visualisation and manual annotation.
To cope with the deluge of data from next-generation sequencing, new software pipelines to store, normalise, cluster, classify and analyse the sequence data are currently under development. Tools to facilitate data analysis, visualisation and metabolic reconstruction have been implemented and are currently being extended.
The organisation of small and large sequence fragments into bins of taxonomically related sequences (taxobins) is crucial for linking the primarily uncultured functional diversity in the environment with the diversity and abundance of the organisms. However, the bioinformatic organisation of these “data clouds” is a computationally demanding task that is still only partially solved. At the protein level, BLAST and Pfam hits are used to deduce the taxonomic affiliation of metagenome sequences. At the DNA level, oligonucleotide compositional asymmetries are used for taxonomic sequence classification. For this task, a Markov model-based tool for fast sequence composition similarity searches has been developed, as well as two variants of Self-Organising Maps (SOMs). All of these tools are currently integrated in the metatool TaxoMeter.
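To illustrate the general idea behind composition-based classification at the DNA level, the sketch below trains a fixed-order Markov model on reference sequences for each candidate bin and assigns a fragment to the bin whose model gives it the highest log-likelihood. This is a generic textbook approach written in Python, not the algorithm used in TaxoMeter; the function names, the model order, the pseudocount and the toy sequences are all assumptions made for the example.

```python
from collections import defaultdict
from math import log

ALPHABET = "ACGT"


def train_markov(sequences, order=2):
    """Count how often each nucleotide follows each context of length `order`."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        seq = seq.upper()
        for i in range(order, len(seq)):
            context, nxt = seq[i - order:i], seq[i]
            # Skip windows containing ambiguous bases such as N.
            if all(c in ALPHABET for c in context + nxt):
                counts[context][nxt] += 1
    # Convert to plain dicts so that lookups never create new entries.
    return {context: dict(row) for context, row in counts.items()}


def log_likelihood(seq, model, order=2, pseudocount=1.0):
    """Log-probability of `seq` under `model`, smoothed with a pseudocount."""
    seq = seq.upper()
    total = 0.0
    for i in range(order, len(seq)):
        context, nxt = seq[i - order:i], seq[i]
        if not all(c in ALPHABET for c in context + nxt):
            continue
        row = model.get(context, {})
        numerator = row.get(nxt, 0) + pseudocount
        denominator = sum(row.values()) + pseudocount * len(ALPHABET)
        total += log(numerator / denominator)
    return total


def classify(fragment, models, order=2):
    """Return the name of the model that best explains the fragment."""
    return max(models,
               key=lambda name: log_likelihood(fragment, models[name], order))


# Toy reference data for two hypothetical bins with different composition.
models = {
    "AT_rich_bin": train_markov(["ATATATTTAAATATATTAAT" * 5]),
    "GC_rich_bin": train_markov(["GCGCGGCCGCGCCGGCGC" * 5]),
}
assignment_1 = classify("ATTATATAAT", models)
assignment_2 = classify("GCCGCGGCGC", models)
```

Real genomes differ far more subtly than these toy sequences, so practical classifiers typically use longer oligonucleotides, much larger reference sets and a rejection threshold for fragments that no model explains well.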
At the beginning of 2005 we founded Ribocon GmbH as a spin-off of the Microbial Genomics and Bioinformatics Group for knowledge transfer and product development. Our competences focus on the bioinformatic analysis of genes and genomes, phylogenetic inference, the software package ARB and the SILVA databases. Currently, the management team consists of Dr. Jörg Peplies (CEO), Arno Geerds (CFO) and Prof. Frank Oliver Glöckner.