Johannes Söding
Phylograd: fast column-specific calculation of substitution model gradients
Background: Most popular tools for reconstructing phylogenetic trees from multiple sequence alignments use a model of molecular evolution in which a single substitution matrix or a small set of fixed matrices are shared between all columns.…
Evaluation of metagenome binning: advances and challenges
Several recent deep learning methods for metagenome binning claim improvements in the recovery of high-quality metagenome-assembled genomes. These methods differ in their approaches to learn the contig embeddings and to cluster them. Rapid…
Enhancing genome recovery across metagenomic samples using MAGmax
Summary: The number of metagenome-assembled genomes (MAGs) is rapidly increasing with the growing scale of metagenomic studies, driving fast progress in microbiome research. Sample-wise assembly has become the standard due to its computatio…
De novo discovery of conserved gene clusters in microbial genomes with Spacedust
Metagenomics has revolutionized environmental and human-associated microbiome studies. However, the limited fraction of proteins with known biological processes and molecular functions presents a major bottleneck. In prokaryotes and viruse…
SoftAlign: End-to-end protein structures alignment
With the recent breakthrough of highly accurate structure prediction methods, there has been a rapid growth of available protein structures. Efficient methods are needed to infer structural similarity within these datasets. We present an e…
Evaluation of Metagenome Binning: Advances and Challenges
Background: Several recent deep learning methods for metagenome binning claim improvements in the recovery of high quality metagenome-assembled genomes. These methods differ in their approaches to learn the contig embeddings and to cluster …
Rapid and sensitive protein complex alignment with Foldseek-Multimer
Advances in computational structure prediction will vastly augment the hundreds of thousands of currently available protein complex structures. Translating these into discoveries requires aligning them, which is computationally prohibitive…
Stochastic Variational Inference for Structured Additive Distributional Regression
In structured additive distributional regression, the conditional distribution of the response variables given the covariate information and the vector of model parameters is modelled using a P-parametric probability density function where…
Aggregating gut: on the link between neurodegeneration and bacterial functional amyloids
Amyloids are insoluble protein aggregates with a cross-beta structure, which are traditionally associated with neurodegeneration. Similar structures, named functional amyloids, expressed mostly by microorganisms, play important physiologic…
UniOP: a universal operon prediction for high-throughput prokaryotic (meta-)genomic data using intergenic distance
The study of the deluge of metagenomic and genomic sequences is challenging due to the severe lack of function information. Predicting operons, groups of functionally related genes in prokaryotic genomes, is critical for bridging this gap.…
De novo discovery of conserved gene clusters in microbial genomes with Spacedust
Metagenomics has revolutionized environmental and human-associated microbiome studies. However, the limited fraction of proteins with known biological process and molecular functions presents a major bottleneck. In prokaryotes and viruses,…
Strain-resolved de-novo metagenomic assembly of viral genomes and microbial 16S rRNAs
Background: Metagenomics is a powerful approach to study environmental and human-associated microbial communities and, in particular, the role of viruses in shaping them. Viral genomes are challenging to assemble from metagenomic samples du…
CarpeDeam: A De Novo Metagenome Assembler for Heavily Damaged Ancient Datasets
De novo assembly of ancient metagenomic datasets is a challenging task. Ultra-short fragment size and characteristic postmortem damage patterns of sequenced ancient DNA molecules leave current tools ill-equipped for ideal assembly. We pres…
Rapid and Sensitive Protein Complex Alignment with Foldseek-Multimer
Advances in computational structure prediction will vastly augment the hundreds of thousands of currently-available protein complex structures. Translating these into discoveries requires aligning them, which is computationally prohibitive…
TransAnnot—a fast transcriptome annotation pipeline
Summary: The annotation of deeply sequenced, de novo assembled transcriptomes continues to be a challenge as some of the state-of-the-art tools are slow, difficult to install, and hard to use. We have tackled these issues with TransAnnot, a…
DescribePROT in 2023: more, higher-quality and experimental annotations and improved data download options
The DescribePROT database of amino acid-level descriptors of protein structures and functions has been substantially expanded since its release in 2020. This expansion includes a substantial increase in the size, scope, and quality of the underly…
RNA sequencing indicates widespread conservation of circadian clocks in marine zooplankton
Zooplankton are important eukaryotic constituents of marine ecosystems characterized by limited motility in the water. These metazoans predominantly occupy intermediate trophic levels and energetically link primary producers to higher trop…
Cooperativity boosts affinity and specificity of proteins with multiple RNA-binding domains
Numerous cellular processes rely on the binding of proteins with high affinity to specific sets of RNAs. Yet most RNA-binding domains display low specificity and affinity in comparison to DNA-binding domains. The best binding motif is typi…
Cln5 represents a new type of cysteine-based S-depalmitoylase linked to neurodegeneration
Genetic CLN5 variants are associated with childhood neurodegeneration and Alzheimer’s disease; however, the molecular function of ceroid lipofuscinosis neuronal protein 5 (Cln5) is unknown. We solved the Cln5 crystal structure and identifi…
Fast and accurate protein structure search with Foldseek
As structure prediction methods are generating millions of publicly available protein structures, searching these databases is becoming a bottleneck. Foldseek aligns the structure of a query protein against a database by describing the ami…
Going to extremes – a metagenomic journey into the dark matter of life
The Virus-X—Viral Metagenomics for Innovation Value—project was a scientific expedition to explore and exploit uncharted territory of genetic diversity in extreme natural environments such as geothermal hot springs and deep-sea ocean ecosy…
Tejaas: reverse regression increases power for detecting trans-eQTLs
Trans-acting expression quantitative trait loci (trans-eQTLs) account for ≥70% expression heritability and could therefore facilitate uncovering mechanisms underlying the origination of complex diseases. Identifying trans-eQTLs is cha…
Thermodynamic modeling reveals widespread multivalent binding by RNA-binding proteins
Motivation: Understanding how proteins recognize their RNA targets is essential to elucidate regulatory processes in the cell. Many RNA-binding proteins (RBPs) form complexes or have multiple domains that allow them to bind to RNA in a mult…
Bayesian Markov models improve the prediction of binding motifs beyond first order
Transcription factors (TFs) regulate gene expression by binding to specific DNA motifs. Accurate models for predicting binding affinities are crucial for a quantitative understanding of transcriptional regulation. Motifs are commonly descr…
SpacePHARER: sensitive identification of phages from CRISPR spacers in prokaryotic hosts
Summary: SpacePHARER (CRISPR Spacer Phage–Host Pair Finder) is a sensitive and fast tool for de novo prediction of phage–host relationships via identifying phage genomes that match CRISPR spacers in genomic or metagenomic data. SpacePHARER …