RNAProNet (funded by the German Federal Ministry of Education and Research)
The precise regulation of cellular processes is key to orchestrating hundreds of proteins, RNAs and metabolites in each cell to ensure smooth functioning of the cell or the entire organism. Over the last two decades, RNA has been shown to be one of the most important regulatory classes of molecules, interacting with proteins, other RNAs, DNA and metabolites to form complex, non-hierarchical regulatory networks.
A variety of specific, high-throughput experimental methods are now available to study the different interactions, often accompanied by specially adapted software. This heterogeneity makes it difficult to integrate all interaction types into an overarching RNA-centric network of regulatory processes.
The aim of this collaborative project is to develop algorithms and pipelines to analyse RNA-based regulation and, in particular, to integrate it into existing protein-based regulatory systems in order to provide a comprehensive picture of regulatory networks with RNA as a central component.
Our responsibilities in this collaborative project are the development of a workflow for the automated reconstruction of transcriptomes in pro- and eukaryotes and the implementation of algorithms for the analysis of RNA-RNA interaction data based on high-throughput sequencing.
MeDaMCAn (funded by the DFG within the framework of SPP2141 "More than defence: the multiple facets of CRISPR-Cas")
The discovery of a prokaryotic adaptive immune system, the so-called CRISPR-Cas system, has revolutionised genetic engineering. The underlying mechanism makes it possible to edit genomic sequences with nucleotide precision, to insert genetic material at precisely defined positions, and to delete unwanted genomic stretches accurately.
The original function of these systems in bacteria and archaea is to defend against invading DNA, such as viruses and plasmids, and, more importantly, to confer immunity against these elements. The specificity of CRISPR-Cas is mediated by the so-called CRISPR RNAs (crRNAs), which are complementary to the foreign DNA. These crRNAs form a complex with the Cas proteins that recognises the invading DNA, and one of the Cas proteins, the Cas endonuclease, cuts the DNA, thereby promoting its degradation. Variants of CRISPR-Cas systems have also been identified that are involved in gene regulatory functions in the cell.
The main challenge in the analysis of CRISPR-Cas systems is to find their targets. Because CRISPR-Cas is a defence system against foreign DNA, the targeted entities are under high selective pressure to circumvent this targeting. If they manage to escape, the targeted region has acquired mutations, which makes it harder to find by sequence similarity. If they fail to escape, they become extinct. For these reasons, and because viral genomes are underrepresented in existing sequence databases, the search for CRISPR-Cas targets is difficult. Here, metagenomic data have the advantage that they represent a genomic snapshot of a defined habitat at a defined timepoint. They are therefore likely to also contain information about virus-host interactions as they occur in the course of a CRISPR-Cas response to infection. Additionally, the vast majority of prokaryotic species are not culturable, so metagenomic data also offer the potential to increase the diversity of known CRISPR-Cas systems.
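To illustrate why escape mutations complicate a similarity-based target search, the following minimal sketch scans a sequence for sites that match a spacer up to a given number of mismatches. It is an illustration only, not the method used in the project; the function names, the toy sequences and the `max_mismatches` threshold are hypothetical, and a realistic search would also consider the reverse strand, insertions and deletions.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_protospacers(spacer: str, sequence: str, max_mismatches: int = 2):
    """Return (position, mismatches) for every window of `sequence`
    that matches `spacer` with at most `max_mismatches` substitutions.

    Hypothetical helper for illustration; forward strand only.
    """
    k = len(spacer)
    hits = []
    for i in range(len(sequence) - k + 1):
        d = hamming(spacer, sequence[i:i + k])
        if d <= max_mismatches:
            hits.append((i, d))
    return hits


# Toy data (invented): the target site carries one escape mutation (T -> A).
spacer = "ACGTTGCA"
viral_fragment = "TTTACGTAGCATTT"

exact_hits = find_protospacers(spacer, viral_fragment, max_mismatches=0)
tolerant_hits = find_protospacers(spacer, viral_fragment, max_mismatches=1)
```

In this toy case an exact search finds nothing, whereas allowing a single mismatch recovers the mutated site, at the cost of more spurious hits as the threshold grows.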
Metagenomic datasets are usually analysed with graph-based assembly algorithms, which assemble the sequencing reads into longer genomic fragments. These fragments then serve as the basis for further analyses. The problem with this assembly step is that it relies on heuristics, which lead to data loss.
In the proposed project, we therefore want to develop methods for a comprehensive analysis of metagenomic datasets that dispense with the assembly step. The goal is an exhaustive analysis of metagenomic datasets in order to derive the basic properties of as many CRISPR-Cas systems as possible. One question, for example, is whether defence against invading DNA is the main task of CRISPR-Cas, or whether other functions, e.g. gene regulation, are more common than currently known.
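As a rough illustration of what working directly on unassembled reads can look like, the sketch below extracts candidate spacers from individual reads by locating consecutive copies of a known repeat sequence. This is an assumption-laden toy example, not the approach developed in the project: the repeat is assumed to be known in advance and to occur without sequencing errors, and all names, sequences and length limits are hypothetical.

```python
from collections import Counter


def spacers_from_read(read: str, repeat: str, min_len: int = 20, max_len: int = 50):
    """Return the sequences found between consecutive exact copies of
    `repeat` in a single read (hypothetical helper, exact matching only)."""
    positions = []
    start = read.find(repeat)
    while start != -1:
        positions.append(start)
        start = read.find(repeat, start + 1)

    spacers = []
    for left, right in zip(positions, positions[1:]):
        candidate = read[left + len(repeat):right]
        # Keep only candidates within a plausible spacer length range.
        if min_len <= len(candidate) <= max_len:
            spacers.append(candidate)
    return spacers


def count_spacers(reads, repeat: str, **limits):
    """Count candidate spacers over all reads, without any assembly step."""
    counts = Counter()
    for read in reads:
        counts.update(spacers_from_read(read, repeat, **limits))
    return counts


# Toy data (invented; real repeats and spacers are considerably longer).
repeat = "GTTTTAGA"
reads = [
    repeat + "ACGTAC" + repeat + "TTGGCC" + repeat,
    "AAA" + repeat + "ACGTAC" + repeat,
    "CCCCCCCCCCCC",  # read without any repeat
]
spacer_counts = count_spacers(reads, repeat, min_len=4, max_len=10)
```

Because every read is inspected on its own, no spacer is lost to assembly heuristics, although spacers that span read boundaries are missed by this simple scheme.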