Bacteriophages, or phages for short, are viruses that infect bacteria, which is why they have attracted renewed interest for therapeutic applications in recent years. Phage therapy could be a promising option, especially against antibiotic-resistant pathogens. To date, however, little research has been done on phages and their influence on microbiomes in the environment and in humans.
Although more and more sequence data from microorganisms are available, there has been no systematic investigation of which phages occur in them. To close this gap, scientists have collaborated with nanozoo to develop the software What the Phage. The software provides a workflow that uses machine learning and other algorithms to detect phages and predict possible new variants from sequence data.
In What the Phage, several programs were combined into a single workflow and optimized so that users can evaluate their data as quickly and simply as possible. The researchers deliberately chose a modular open-source design so that the workflow can be continuously expanded and improved, for example with the prediction of prophages.
ConsensusPrime facilitates primer design for new PCR assays
PCR, the polymerase chain reaction, is considered the methodological gold standard for precise molecular laboratory tests in infectiology. The method amplifies sections of genetic material that are characteristic of the pathogen being sought so that they can be detected.
But for the eponymous enzyme, DNA polymerase, to know which DNA segments it needs to duplicate, scientists design so-called primers: short DNA segments to which the polymerase binds. With ConsensusPrime, researchers from the Optical Molecular Diagnostics and Systems Technology Department at Leibniz IPHT have developed a bioinformatics tool that enables faster and more efficient design of primers.
When designing primers, researchers are often faced with the challenge that they want to detect or distinguish not just one, but several very closely related pathogen strains. At the same time, however, a primer must also be adapted as specifically as possible to the target genome. Researchers therefore often look for a so-called consensus primer, i.e. a sequence that has the greatest similarity to several strains but is nevertheless optimally suited as a primer.
“Without our software, you usually have to manually select the sequence segments that show similarities. This is not only imprecise, but also hardly manageable with large amounts of data and very time-consuming. ConsensusPrime filters out unsuitable primers directly and precisely calculates the genomic segments with the greatest similarities. With our pipeline, we thus obtain a consensus sequence for the optimal primer,” explains Dr. Maximilian Collatz from Leibniz IPHT.
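The idea of a consensus sequence can be illustrated with a small Python sketch. It is not the ConsensusPrime pipeline itself and makes several simplifying assumptions: the strain sequences are already aligned to equal length, the consensus is a plain majority vote per position, and the "best" primer region is simply the window with the highest summed conservation. The function names `consensus` and `best_window` are hypothetical and chosen for illustration only.

```python
from collections import Counter


def consensus(aligned):
    """Majority-vote consensus of equal-length, pre-aligned sequences.

    Returns the consensus string and, per position, the fraction of
    sequences that agree with the consensus base.
    """
    if not aligned:
        raise ValueError("need at least one sequence")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")

    bases, conservation = [], []
    for column in zip(*aligned):
        # Most frequent symbol in this alignment column (gap characters,
        # if present, are treated like any other symbol in this sketch).
        base, count = Counter(column).most_common(1)[0]
        bases.append(base)
        conservation.append(count / len(aligned))
    return "".join(bases), conservation


def best_window(aligned, k):
    """Start index and consensus sequence of the most conserved k-mer window."""
    cons, conservation = consensus(aligned)
    if not 0 < k <= len(cons):
        raise ValueError("window length must be between 1 and the alignment length")
    start = max(
        range(len(cons) - k + 1),
        key=lambda i: sum(conservation[i:i + k]),
    )
    return start, cons[start:start + k]


if __name__ == "__main__":
    # Three made-up, pre-aligned strain fragments (illustrative data only).
    strains = ["ACGTACGTAA", "ACGTACGTTA", "ACCTACGTAA"]
    print(consensus(strains)[0])
    print(best_window(strains, 5))
```

A real primer design additionally has to check properties such as melting temperature, GC content, and specificity against non-target genomes, which this sketch deliberately leaves out.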
About the InfectoGnostics Research Campus Jena
The InfectoGnostics Research Campus Jena is a public-private partnership breaking new ground in the diagnostics of infections and pathogens, such as viruses, bacteria, and fungi. InfectoGnostics is funded by the Federal Ministry of Education and Research (Bundesministerium für Bildung und Forschung, BMBF) under the funding initiative “Research Campus – Public-Private Partnership for Innovation” with additional support from the state of Thuringia. About half of the required budget is financed by the participating partners. www.infectognostics.de
What the Phage:
Mike Marquet et al., GigaScience, Volume 11, giac110, 2022
More information on GitHub:
What the Phage | https://github.com/replikation/What_the_Phage
Mike Marquet, Martin Hölzer, Mathias W Pletz, Adrian Viehweger, Oliwia Makarewicz, Ralf Ehricht, Christian Brandt. 2022. "What the Phage: a scalable workflow for the identification and analysis of phage sequences." GigaScience, Volume 11, giac110. DOI: 10.1093/gigascience/giac110

ConsensusPrime | https://github.com/mcollatz/ConsensusPrime
Maximilian Collatz, Sascha D. Braun, Stefan Monecke, and Ralf Ehricht. 2022. "ConsensusPrime—A Bioinformatic Pipeline for Ideal Consensus Primer Design." BioMedInformatics 2, no. 4: 637-642. DOI: 10.3390/biomedinformatics2040041