Robust, scalable, and informative clustering for diverse biological networks
Chris Gaiteri, David Connell, Faraz Sultan, Artemis Iatrou, Bernard Ng, Bolesław K. Szymański, Ada Zhang, Shinya Tasaki
2023 · Open Access · DOI: https://doi.org/10.1186/s13059-023-03062-0 · OA: W4387578887
Clustering molecular data into informative groups is a primary step in extracting robust conclusions from big data. However, due to foundational issues in how they are defined and detected, such clusters are not always reliable, leading to unstable conclusions. We compare popular clustering algorithms across thousands of synthetic and real biological datasets, including a new consensus clustering algorithm—SpeakEasy2: Champagne. These tests identify trends in performance, show no single method is universally optimal, and allow us to examine factors behind variation in performance. Multiple metrics indicate SpeakEasy2 generally provides robust, scalable, and informative clusters for a range of applications.
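To make the idea of consensus clustering concrete, here is a minimal sketch of one generic approach: count how often each pair of nodes is grouped together across repeated runs (a co-association matrix), then keep the pairs that co-cluster in a majority of runs. This is not the SpeakEasy2 algorithm. The function names, the 0.5 threshold, the connected-components step, and the toy label vectors are all assumptions chosen for illustration.

```python
def co_association(partitions):
    """Fraction of runs in which each pair of nodes shares a cluster."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        if len(labels) != n:
            raise ValueError("all partitions must label the same nodes")
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    runs = len(partitions)
    return [[c / runs for c in row] for row in counts]


def consensus_clusters(partitions, threshold=0.5):
    """Group nodes that co-cluster in more than `threshold` of the runs.

    Consensus groups are the connected components of the thresholded
    co-association graph (a simple, hypothetical choice for illustration).
    """
    coassoc = co_association(partitions)
    n = len(coassoc)
    labels = [-1] * n  # -1 marks a node not yet assigned to a group
    next_label = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:  # depth-first search over strongly co-clustered pairs
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and coassoc[i][j] > threshold:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels


# Three hypothetical runs of a stochastic clustering method on six nodes.
runs = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],  # same split as the first run, different label names
    [0, 0, 1, 1, 2, 2],  # a noisier run that splits the nodes differently
]
consensus = consensus_clusters(runs)
print(consensus)
```

Because the matrix records only whether two nodes share a cluster, the arbitrary label names used by each run do not matter. Entries close to 0 or 1 indicate pairs that are placed consistently across runs, while intermediate values flag unstable assignments.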