Sequence motif
STREME: accurate and versatile sequence motif discovery
Motivation: Sequence motif discovery algorithms can identify novel sequence patterns that perform biological functions in DNA, RNA and protein sequences, for example the binding site motifs of DNA- and RNA-binding proteins. Results: The STRE…
Convolutional neural network architectures for predicting DNA–protein binding
Motivation: Convolutional neural networks (CNN) have outperformed conventional methods in modeling the sequence specificity of DNA–protein binding. Yet inappropriate CNN architectures can yield poorer performance than simpler models. Thus …
i-Motif DNA: structural features and significance to cell biology
The i-motif represents a paradigmatic example of the wide structural versatility of nucleic acids. In remarkable contrast to duplex DNA, i-motifs are four-stranded DNA structures held together by hemi-protonated and intercalated cytosine …
ELM—the eukaryotic linear motif resource in 2020
The eukaryotic linear motif (ELM) resource is a repository of manually curated experimentally validated short linear motifs (SLiMs). Since the initial release almost 20 years ago, ELM has become an indispensable resource for the molecular …
Redefining the structural motifs that determine RNA binding and RNA editing by pentatricopeptide repeat proteins in land plants
Summary: The pentatricopeptide repeat (PPR) proteins form one of the largest protein families in land plants. They are characterised by tandem 30–40 amino acid motifs that form an extended binding surface capable of sequence‐specific reco…
PAM identification by CRISPR-Cas effector complexes: diversified mechanisms and structures
Adaptive immunity of prokaryotes is mediated by CRISPR-Cas systems that employ a large variety of Cas protein effectors to identify and destroy foreign genetic material. The different targeting mechanisms of Cas proteins rely on the proper…
GibbsCluster: unsupervised clustering and alignment of peptide sequences
Receptor interactions with short linear peptide fragments (ligands) are at the base of many biological signaling processes. Conserved and information-rich amino acid patterns, commonly called sequence motifs, shape and regulate these inter…
Identification of multiple genomic DNA sequences which form i-motif structures at neutral pH
i-Motifs are alternative DNA secondary structures formed in cytosine-rich sequences. Particular examples of these structures, traditionally assumed to be stable only at acidic pH, have been found to form under near-physiological conditions…
Evaluation of the Stability of DNA i‐Motifs in the Nuclei of Living Mammalian Cells
C‐rich DNA has the capacity to form a tetra‐stranded structure known as an i‐motif. The i‐motifs within genomic DNA have been proposed to contribute to the regulation of DNA transcription. However, direct experimental evidence for the exis…
Proteome-wide analysis of chaperone-mediated autophagy targeting motifs
Chaperone-mediated autophagy (CMA) contributes to the lysosomal degradation of a selective subset of proteins. Selectivity lies in the chaperone heat shock cognate 71 kDa protein (HSC70) recognizing a pentapeptide motif (KFERQ-like motif) …
The eukaryotic linear motif resource – 2018 update
Short linear motifs (SLiMs) are protein binding modules that play major roles in almost all cellular processes. SLiMs are short, often highly degenerate, difficult to characterize and hard to detect. The eukaryotic linear motif (ELM) resou…
MoMo: discovery of statistically significant post-translational modification motifs
Motivation: Post-translational modifications (PTMs) of proteins are associated with many significant biological functions and can be identified in high throughput using tandem mass spectrometry. Many PTMs are associated with short sequence …
The sequence at Spike S1/S2 site enables cleavage by furin and phospho-regulation in SARS-CoV2 but not in SARS-CoV1 or MERS-CoV
The Spike protein of the novel coronavirus SARS-CoV2 contains an insertion (residues 680–687, SPRRAR↓SV) forming a cleavage motif RxxR for furin-like enzymes at the boundary of S1/S2 subunits. Cleavage at S1/S2 is important for efficient viral entry…
SEA: Simple Enrichment Analysis of motifs
Motif enrichment algorithms can identify known sequence motifs that are present to a statistically significant degree in DNA, RNA and protein sequences. Databases of such known motifs exist for DNA- and RNA-binding proteins, as well as for…
Occupancy maps of 208 chromatin-associated proteins in one human cell type
Transcription factors are DNA-binding proteins that have key roles in gene regulation. Genome-wide occupancy maps of transcriptional regulators are important for understanding gene regulation and its effects on diverse biological proc…
Analysis of Genomic Sequence Motifs for Deciphering Transcription Factor Binding and Transcriptional Regulation in Eukaryotic Cells
Eukaryotic genomes contain a variety of structured patterns: repetitive elements, binding sites of DNA and RNA associated proteins, splice sites, and so on. Often, these structured patterns can be formalized as motifs and described using a…
XSTREME: Comprehensive motif analysis of biological sequence datasets
XSTREME is a web-based tool for performing comprehensive motif discovery and analysis in DNA, RNA or protein sequences, as well as in sequences in user-defined alphabets. It is designed for both very large and very small datasets. XSTREME …
Improved LbCas12a variants with altered PAM specificities further broaden the genome targeting range of Cas12a nucleases
The widespread use of Cas12a (formerly Cpf1) nucleases for genome engineering is limited by their requirement for a rather long TTTV protospacer adjacent motif (PAM) sequence. Here we have aimed to loosen these PAM constraints and have gen…
High-throughput functional analysis of lncRNA core promoters elucidates rules governing tissue specificity
Transcription initiates at both coding and noncoding genomic elements, including mRNA and long noncoding RNA (lncRNA) core promoters and enhancer RNAs (eRNAs). However, each class has a different expression profile with lncRNAs and eRNAs b…
SLiMSearch: a framework for proteome-wide discovery and annotation of functional modules in intrinsically disordered regions
The extensive intrinsically disordered regions of higher eukaryotic proteomes contain vast numbers of functional interaction modules known as short linear motifs (SLiMs). Here, we present SLiMSearch, a motif discovery tool that scans a mot…
sgRNA Sequence Motifs Blocking Efficient CRISPR/Cas9-Mediated Gene Editing
Cas9 nucleases can be programmed with single guide RNAs (sgRNAs) to mediate gene editing. High CRISPR/Cas9-mediated gene knockout efficiencies are essential for genetic screens and critically depend on the properties of the sgRNAs used. Th…
Protein remote homology detection and structural alignment using deep learning
Exploiting sequence–structure–function relationships in biotechnology requires improved methods for aligning proteins that have low sequence similarity to previously annotated proteins. We develop two deep learning methods to address this …
Exploiting sequence-based features for predicting enhancer–promoter interactions
Motivation: A large number of distal enhancers and proximal promoters form enhancer–promoter interactions to regulate target genes in the human genome. Although recent high-throughput genome-wide mapping approaches have allowed us to more c…
Cryptic sequence features within the disordered protein p27Kip1 regulate cell cycle signaling
Significance: Intrinsically disordered regions (IDRs) of proteins are scaffolds for linear motifs that mediate protein–protein interactions and are the sites of posttranslational modifications. Using the cell cycle inhibitory protein p27 as…
ELM—the Eukaryotic Linear Motif resource—2024 update
Short Linear Motifs (SLiMs) are the smallest structural and functional components of modular eukaryotic proteins. They are also the most abundant, especially when considering post-translational modifications. As well as being found through…
Representation learning of genomic sequence motifs with convolutional neural networks
Although convolutional neural networks (CNNs) have been applied to a variety of computational genomics problems, there remains a large gap in our understanding of how they build representations of regulatory genomic sequences. Here we perf…
Deep learning of immune cell differentiation
Significance: Applying artificial intelligence tools to a highly complex question of immunology, we show that a deep neural network can learn to predict the patterns of chromatin opening across 81 stem and differentiated cells across the im…
The RxLR Motif of the Host Targeting Effector AVR3a of Phytophthora infestans Is Cleaved before Secretion
When plant-pathogenic oomycetes infect their hosts, they employ a large arsenal of effector proteins to establish a successful infection. Some effector proteins are secreted and are destined to be translocated and function inside host cell…
HOCOMOCO in 2024: a rebuild of the curated collection of binding models for human and mouse transcription factors
We present a major update of the HOCOMOCO collection that provides DNA binding specificity patterns of 949 human transcription factors and 720 mouse orthologs. To make this release, we performed motif discovery in peak sets that originated…
Comprehensive classification of the PIN domain-like superfamily
PIN-like domains constitute a widespread superfamily of nucleases, diverse in terms of the reaction mechanism, substrate specificity, biological function and taxonomic distribution. Proteins with PIN-like domains are involved in central ce…