Tal Pupko
The role of plant polyploidy in the structure of plant-pollinator communities
Polyploidization is a major macromutation, bearing notable genomic and ecological consequences. While the impact of polyploidy on plant abiotic niches is well studied, our understanding of its consequences on biotic interactions, and parti…
M1CR0B1AL1Z3R 2.0: an enhanced web server for comparative analysis of bacterial genomes at scale
Large-scale analyses of bacterial genomic datasets contribute to the comprehensive characterization of complex microbial dynamics among different strains and species. Such analyses often include open reading frame extraction, orthogroup in…
Effectidor II: a pan-genomic AI-based algorithm for the prediction of type III secretion system effectors
Motivation: Type III secretion systems are used by many Gram-negative bacteria to inject type 3 effectors (T3Es) directly into eukaryotic cells, promoting disease or provoking immune response. Because of these opposing evolutionary forces, …
BetaAlign: a deep learning approach for multiple sequence alignment
Motivation: Multiple sequence alignments (MSAs) are extensively used in biology, from phylogenetic reconstruction to structure and function prediction. Here, we suggest an out-of-the-box approach for the inference of MSAs, which relies on a…
Protein2Text: Providing Rich Descriptions for Protein Sequences
Understanding the functionality of proteins has been a focal point of biological research due to their critical roles in various biological processes. Unraveling protein functions is essential for advancements in medicine, agriculture, and…
Phylogenetic Analysis of 590 Species Reveals Distinct Evolutionary Patterns of Intron–Exon Gene Structures Across Eukaryotic Lineages
Introns are highly prevalent in most eukaryotic genomes. Despite the accumulating evidence for benefits conferred by the possession of introns, their specific roles and functions, as well as the processes shaping their evolution, are still…
Insertions and Deletions: Computational Methods, Evolutionary Dynamics, and Biological Applications
Insertions and deletions constitute the second most important source of natural genomic variation. Insertions and deletions make up to 25% of genomic variants in humans and are involved in complex evolutionary processes including genomic r…
Evolutionary Insights from the Mitochondrial Genome of Oikopleura dioica: Sequencing Challenges, RNA Editing, Gene Transfers to the Nucleus, and tRNA Loss
Sequencing the mitochondrial genome of the tunicate Oikopleura dioica is a challenging task due to the presence of long poly-A/T homopolymer stretches, which impair sequencing and assembly. Here, we report on the sequencing and annotation …
A machine-learning-based alternative to phylogenetic bootstrap
Motivation: Currently used methods for estimating branch support in phylogenetic analyses often rely on the classic Felsenstein’s bootstrap, parametric tests, or their approximations. As these branch support scores are widely used in phylog…
The Tree Reconstruction Game: Phylogenetic Reconstruction Using Reinforcement Learning
The computational search for the maximum-likelihood phylogenetic tree is an NP-hard problem. As such, current tree search algorithms might result in a tree that is a local optimum, not the global one. Here, we introduce a paradigm shift f…
Evolutionary Insights from the Mitochondrial Genome of Oikopleura dioica: Sequencing Challenges, RNA Editing, Gene Transfers to the Nucleus, and tRNA Loss
Sequencing the mitochondrial genome of the tunicate Oikopleura dioica is a challenging task due to the presence of long poly-A/T homopolymer stretches, which impair sequencing and assembly. Here, we report on the sequencing and annotation …
Evolution of myxozoan mitochondrial genomes: insights from myxobolids
Genetic and Functional Diversity Help Explain Pathogenic, Weakly Pathogenic, and Commensal Lifestyles in the Genus Xanthomonas
The genus Xanthomonas has been primarily studied for pathogenic interactions with plants. However, besides host and tissue-specific pathogenic strains, this genus also comprises nonpathogenic strains isolated from a broad range of hosts, s…
Effect of tokenization on transformers for biological sequences
Motivation: Deep-learning models are transforming biological research, including many bioinformatics and comparative genomics algorithms, such as sequence alignments, phylogenetic tree inference, and automatic classification of protein func…
BetaAlign: a deep learning approach for multiple sequence alignment
The multiple sequence alignment (MSA) problem is a fundamental pillar in bioinformatics, comparative genomics, and phylogenetics. Here we characterize and improve BetaAlign, the first deep learning aligner, which substantially deviates fro…
Statistical framework to determine indel-length distribution
Motivation: Insertions and deletions (indels) of short DNA segments, along with substitutions, are the most frequent molecular evolutionary events. Indels were shown to affect numerous macro-evolutionary processes. Because indels may span m…
A machine-learning based alternative to phylogenetic bootstrap
A data-driven approach to estimate branch support values with a probabilistic interpretation
Effect of Tokenization on Transformers for Biological Sequences
Deep learning models are transforming biological research. Many bioinformatics and comparative genomics algorithms analyze genomic data, either DNA or protein sequences. Examples include sequence alignments, phylogenetic tree inference and…
Evaluation of the Ability of AlphaFold to Predict the Three-Dimensional Structures of Antibodies and Epitopes
Being able to accurately predict the three-dimensional structure of an antibody can facilitate fast and precise antibody characterization and epitope prediction, with important diagnostic and clinical implications. In the current study, we…
Comparative sequence analysis of pPATH pathogenicity plasmids in Pantoea agglomerans gall-forming bacteria
Acquisition of the pathogenicity plasmid pPATH that encodes a type III secretion system (T3SS) and effectors (T3Es) has likely led to the transition of a non-pathogenic bacterium into the tumorigenic pathogen Pantoea agglomerans. P. agglo…
Genetic and functional diversity help explain pathogenic, weakly pathogenic, and commensal lifestyles in the genus Xanthomonas
The genus Xanthomonas has been primarily studied for pathogenic interactions with plants. However, besides host and tissue specific pathogenic strains, this genus also comprises nonpathogenic strains isolated from a broad range of hosts, s…
Complete genome sequence of an Israeli isolate of Xanthomonas hortorum pv. pelargonii strain 305 and novel type III effectors identified in Xanthomonas
Xanthomonas hortorum pv. pelargonii is the causative agent of bacterial blight in geranium ornamental plants, the most threatening bacterial disease of this plant worldwide. Xanthomonas fragariae is the causative agent of angular leaf spot…
GenomeFLTR: filtering reads made easy
In the last decade, advances in sequencing technology have led to an exponential increase in genomic data. These new data have dramatically changed our understanding of the evolution and function of genes and genomes. Despite improvements …
Identification of selective sweeps in bacteria
Selective sweeps occur when a beneficial mutation spreads rapidly throughout the population due to natural selection. Searching for selective sweeps has proved to be one of the most fruitful ways to detect the footprints selection leaves o…
The tree reconstruction game: phylogenetic reconstruction using reinforcement learning
We propose a reinforcement-learning algorithm to tackle the challenge of reconstructing phylogenetic trees. The search for the tree that best describes the data is algorithmically challenging, thus all current algorithms for phylogeny reco…
Using evolutionary data to make sense of macromolecules with a “face‐lifted” ConSurf
The ConSurf web‐server for the analysis of proteins, RNA, and DNA provides a quick and accurate estimate of the per‐site evolutionary rate among homologues. The analysis reveals functionally important regions, such as catalytic and ligand‐b…
The evolutionary dynamics that retain long neutral genomic sequences in face of indel deletion bias: a model and its application to human introns
Insertions and deletions (indels) of short DNA segments are common evolutionary events. Numerous studies showed that deletions occur more often than insertions in both prokaryotes and eukaryotes. It raises the question why neutral sequence…
Natural language processing approach to model the secretion signal of type III effectors
Type III effectors are proteins injected by Gram-negative bacteria into eukaryotic hosts. In many plant and animal pathogens, these effectors manipulate host cellular processes to the benefit of the bacteria. Type III effectors are secrete…
An Approximate Bayesian Computation Approach for Modeling Genome Rearrangements
The inference of genome rearrangement events has been extensively studied, as they play a major role in molecular evolution. However, probabilistic evolutionary models that explicitly imitate the evolutionary dynamics of such events, as we…