RabbitVar: ultra-fast and accurate somatic small-variant calling on multi-core architectures Article Swipe
YOU?
·
· 2023
· Open Access
·
· DOI: https://doi.org/10.1101/2023.01.06.522980
· OA: W4313638349
The continuous development of next-generation sequencing (NGS) technology has led to extensive and frequent use of genomic analysis in cancer research. The associated production of large-scale NGS datasets establishes the need for high-precision somatic variant calling methods that are highly optimized on commonly used hardware platforms. We present RabbitVar ( https://github.com/LeiHaoa/RabbitVar ), a scalable variant caller that can detect small somatic variants from paired tumor/normal NGS data on modern multi-core CPUs. Our approach combines candidate-finding and machine-learning-based filtering strategies with optimized data structures and multi-threading to achieve both high accuracy and efficiency. We have compared the performance of RabbitVar to leading state-of-the-art callers (Strelka2, Mutect2, NeuSomatic, VarDict, VarScan2) on real-world HCC1395 breast cancer datasets under different sequencing conditions and contamination rates. The evaluation results demonstrate that RabbitVar achieves highly competitive F1-scores when calling SNVs. Moreover, when calling the more challenging indel variants, it consistently achieves the highest F1-scores. RabbitVar is able to process a paired tumor and normal whole human genome sequencing datasets with 80x depth in less than 20 minutes on a 48-core workstation outperforming all other tested variant callers in terms of efficiency.