Pairwise comparison
Minimap2: pairwise alignment for nucleotide sequences
Motivation: Recent advances in sequencing technologies promise ultra-long reads of ∼100 kb on average, full-length mRNA or cDNA reads in high throughput and genomic contigs over 100 Mb in length. Existing alignment programs are unable or in…
VSEARCH: a versatile open source tool for metagenomics
Background: VSEARCH is an open source and free of charge multithreaded 64-bit tool for processing and preparing metagenomics, genomics and population genomics nucleotide sequence data. It is designed as an alternative to the widely used USE…
AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models
The AlphaFold Protein Structure Database (AlphaFold DB, https://alphafold.ebi.ac.uk) is an openly accessible, extensive database of high-accuracy protein-structure predictions. Powered by AlphaFold v2.0 of DeepMind, it has enabled an unpre…
Unsupervised Learning of Visual Features by Contrasting Cluster Assignments
Unsupervised image representations have significantly reduced the gap with supervised pretraining, notably with the recent achievements of contrastive learning methods. These contrastive methods typically work online and rely on a large nu…
ASAP: assemble species by automatic partitioning
Here, we describe Assemble Species by Automatic Partitioning (ASAP), a new method to build species partitions from single locus sequence alignments (i.e., barcode data sets). ASAP is efficient enough to split data sets as large as 10^4 sequen…
Efficient Multi-Scale Attention Module with Cross-Spatial Learning
The remarkable effectiveness of channel or spatial attention mechanisms for producing more discernible feature representations has been illustrated in various computer vision tasks. However, modeling the cross-channel relationships with channel …
HH-suite3 for fast remote homology detection and deep protein annotation
Background: HH-suite is a widely used open source software suite for sensitive sequence similarity searches and protein fold recognition. It is based on pairwise alignment of profile Hidden Markov models (HMMs), which represent multiple seq…
New strategies to improve minimap2 alignment accuracy
Summary: We present several recent improvements to minimap2, a versatile pairwise aligner for nucleotide sequences. Now minimap2 v2.22 can more accurately map long reads to highly repetitive regions and align through insertions or deletions…
Network analysis of multivariate data in psychological science
In recent years, network analysis has been applied to identify and analyse patterns of statistical association in multivariate psychological data. In these approaches, network nodes represent variables in a data set, and edges represent pa…
A Survey on Learning to Hash
Nearest neighbor search is a problem of finding the data points from the database such that the distances from them to the query point are the smallest. Learning to hash is one of the major solutions to this problem and has been widely stu…
Accurate De Novo Prediction of Protein Contact Map by Ultra-Deep Learning Model
http://raptorx.uchicago.edu/ContactMap/.
Uniclust databases of clustered and deeply annotated protein sequences and alignments
We present three clustered protein sequence databases, Uniclust90, Uniclust50, Uniclust30 and three databases of multiple sequence alignments (MSAs), Uniboost10, Uniboost20 and Uniboost30, as a resource for protein sequence analysis, funct…
A guide to phylogenetic metrics for conservation, community ecology and macroecology
The use of phylogenies in ecology is increasingly common and has broadened our understanding of biological diversity. Ecological sub‐disciplines, particularly conservation, community ecology and macroecology, all recognize the value of evo…
Protein Sequence Analysis Using the MPI Bioinformatics Toolkit
The MPI Bioinformatics Toolkit (https://toolkit.tuebingen.mpg.de) provides interactive access to a wide range of the best‐performing bioinformatics tools and databases, including the state‐of‐the‐art protein sequence comparison methods H…
A New Model for Determining Weight Coefficients of Criteria in MCDM Models: Full Consistency Method (FUCOM)
In this paper, a new multi-criteria problem solving method—the Full Consistency Method (FUCOM)—is proposed. The model implies the definition of two groups of constraints that need to satisfy the optimal values of weight coefficients. The f…
DAMBE7: New and Improved Tools for Data Analysis in Molecular Biology and Evolution
DAMBE is a comprehensive software package for genomic and phylogenetic data analysis on Windows, Linux, and Macintosh computers. New functions include imputing missing distances and phylogeny simultaneously (paving the way to build large p…
Large-scale comparison of bibliographic data sources: Scopus, Web of Science, Dimensions, Crossref, and Microsoft Academic
We present a large-scale comparison of five multidisciplinary bibliographic data sources: Scopus, Web of Science, Dimensions, Crossref, and Microsoft Academic. The comparison considers scientific documents from the period 2008–2017 covered…
Rethinking RGB-D Salient Object Detection: Models, Data Sets, and Large-Scale Benchmarks
The use of RGB-D information for salient object detection (SOD) has been extensively explored in recent years. However, relatively few efforts have been put toward modeling SOD in real-world human activity scenes with RGB-D. In this articl…
Conducting proportional meta-analysis in different types of systematic reviews: a guide for synthesisers of evidence
Background: Single group data present unique challenges for synthesisers of evidence. Proportional meta-analysis is becoming an increasingly common technique employed for the synthesis of single group data. Proportional meta-analysis shares …
Synthetic, Switchable Enzymes
The construction of switchable, radiation-controlled, aptameric enzymes - “swenzymes” - is, in principle, feasible. We propose a strategy to make such catalysts from 2 (or more) aptamers each selected to bind specifically to one of the sub…
Gene Regulatory Network Inference from Single-Cell Data Using Multivariate Information Measures
While single-cell gene expression experiments present new challenges for data processing, the cell-to-cell variability observed also reveals statistical relationships that can be used by information theory. Here, we use multivariate inform…
RRPP: An R package for fitting linear models to high‐dimensional data using residual randomization
Residual randomization in permutation procedures (RRPP) is an appropriate means of generating empirical sampling distributions for ANOVA statistics and linear model coefficients, using ordinary or generalized least‐squares estimation. This…
Personalized Cross-Silo Federated Learning on Non-IID Data
Non-IID data present a tough challenge for federated learning. In this paper, we explore a novel idea of facilitating pairwise collaborations between clients with similar data. We propose FedAMP, a new method employing federated attentive …
KAT: a K-mer analysis toolkit to quality control NGS datasets and genome assemblies
Motivation: De novo assembly of whole genome shotgun (WGS) next-generation sequencing (NGS) data benefits from high-quality input with high coverage. However, in practice, determining the quality and quantity of useful reads quickly and in …
Bacterial Communities: Interactions to Scale
In the environment, bacteria live in complex multispecies communities. These communities span in scale from small, multicellular aggregates to billions or trillions of cells within the gastrointestinal tract of animals. The dynamics of bac…
End-to-End Neural Ad-hoc Ranking with Kernel Pooling
This paper proposes K-NRM, a kernel based neural model for document ranking. Given a query and a set of documents, K-NRM uses a translation matrix that models word-level similarities via word embeddings, a new kernel-pooling technique t…
SynergyFinder: a web application for analyzing drug combination dose–response matrix data
Summary: Rational design of drug combinations has become a promising strategy to tackle the drug sensitivity and resistance problem in cancer treatment. To systematically evaluate the pre-clinical significance of pairwise drug combinations…
Self-Supervised Hypergraph Convolutional Networks for Session-based Recommendation
Session-based recommendation (SBR) focuses on next-item prediction at a certain time point. As user profiles are generally not available in this scenario, capturing the user intent lying in the item transitions plays a pivotal role. Recent…
Simplicial closure and higher-order link prediction
Networks provide a powerful formalism for modeling complex systems by using a model of pairwise interactions. But much of the structure within these systems involves interactions that take place among more than two nodes at once—for exampl…
SymMap: an integrative database of traditional Chinese medicine enhanced by symptom mapping
Recently, the pharmaceutical industry has heavily emphasized phenotypic drug discovery (PDD), which relies primarily on knowledge about phenotype changes associated with diseases. Traditional Chinese medicine (TCM) provides a massive amoun…