Explanipedia

Minimizing Reference Bias with an Impute-First Approach Open

Kavya Vaddadi, Taher Mun, Ben Langmead · 2023

Pangenome indexes reduce reference bias in sequencing data analysis. However, bias can be reduced further by using a personalized reference, e.g. a diploid human reference constructed to match a donor individual’s alleles. We present a nov…

Minimizing Reference Bias: The Impute-First Approach for Personalized Genome Analysis Open

Kavya Vaddadi, Taher Mun, Ben Langmead · 2023

Computer science Biology

We introduce the Impute-first alignment framework that reduces reference bias in genomics by integrating genotype imputation with pangenome alignment. Beginning with genotyping and genotype imputation using a portion of the input data, a p…

Pangenomic genotyping with the marker array Open

Taher Mun, Kavya Vaddadi, Ben Langmead · 2022

Computer science Biology

We present a new method and software tool called rowbowt that applies a pangenome index to the problem of inferring genotypes from short-read sequencing data. The method uses a novel indexing structure called the marker array. Using the ma…

LevioSAM: fast lift-over of variant-aware reference alignments Open

Taher Mun, Nae-Chyun Chen, Ben Langmead · 2021

Computer science Economics Sociology

Motivation: As more population genetics datasets and population-specific references become available, the task of translating (‘lifting’) read alignments from one reference coordinate system to another is becoming more common. Existing tool…

LevioSAM: Fast lift-over of alternate reference alignments Open

Taher Mun, Nae-Chyun Chen, Ben Langmead · 2021

Computer science Engineering Sociology

Motivation: As more population genetics datasets and population-specific references become available, the task of translating (“lifting”) read alignments from one reference coordinate system to another is becoming more common. Existing tool…

Reference flow VCF for pre-built genomes Open

Nae-Chyun Chen, Brad Solomon, Taher Mun, Sheila Iyer, Ben Langmead · 2020

Computer science Biology Mathematics

Pre-built genomes (in VCF format) for the RandFlow-LD and RandFlow-LD-26 methods in reference flow. The references can be built using the reference flow software (https://github.com/langmead-lab/reference_flow). An archival version of the …

Reference flow VCF for pre-built genomes Open

Nae-Chyun Chen, Brad Solomon, Taher Mun, Sheila Iyer, Ben Langmead · 2020

Computer science Biology

Pre-built genomes (in VCF format) for the RandFlow-LD and RandFlow-LD-26 methods in reference flow. The references can be built using the reference flow software (https://github.com/langmead-lab/reference_flow). An archival version of the …

Raw data for reference flow experiments Open

Nae-Chyun Chen, Brad Solomon, Taher Mun, Sheila Iyer, Ben Langmead · 2020

Computer science Environmental science Mathematics

Raw results data for the reference flow study. Reference flow used public sequence data, as specified in the manuscript, to generate the results. The figures shown in the manuscript were plotted using the provided processed data. Ref…

Raw data for reference flow experiments Open

Nae-Chyun Chen, Brad Solomon, Taher Mun, Sheila Iyer, Ben Langmead · 2020

Computer science Mathematics

Raw results data for the reference flow study. Reference flow used public sequence data, as specified in the manuscript, to generate the results. The figures shown in the manuscript were plotted using the provided processed data. Ref…

Efficient Construction of a Complete Index for Pan-Genomics Read Alignment Open

Alan Kuhnle, Taher Mun, Christina Boucher, Travis Gagie, Ben Langmead, et al. · 2020

Computer science Mathematics Philosophy

While short read aligners, which predominantly use the FM-index, are able to easily index one or a few human genomes, they do not scale well to indexing databases containing thousands of genomes. To understand why, it helps to examine the …

Matching Reads to Many Genomes with the r-Index Open

Taher Mun, Alan Kuhnle, Christina Boucher, Travis Gagie, Ben Langmead, et al. · 2020

Computer science Biology Mathematics

The r-index is a tool for compressed indexing of genomic databases for exact pattern matching, which can be used to completely align reads that perfectly match some part of a genome in the database or to find seeds for reads that do not. T…

Reducing reference bias using multiple population reference genomes Open

Nae-Chyun Chen, Brad Solomon, Taher Mun, Sheila Iyer, Ben Langmead · 2020

Computer science Biology Mathematics

Most sequencing data analyses start by aligning sequencing reads to a linear reference genome. But failure to account for genetic variation causes reference bias and confounding of results downstream. Other approaches replace the linear re…

Matching reads to many genomes with the $r$-index Open

Taher Mun, Alan Kuhnle, Christina Boucher, Travis Gagie, Ben Langmead, et al. · 2019

Computer science Biology Mathematics

The $r$-index is a tool for compressed indexing of genomic databases for exact pattern matching, which can be used to completely align reads that perfectly match some part of a genome in the database or to find seeds for reads that do not.…

Efficient Construction of a Complete Index for Pan-Genomics Read Alignment Open

Alan Kuhnle, Taher Mun, Christina Boucher, Travis Gagie, Ben Langmead, et al. · 2018

Computer science Mathematics Physics

While short read aligners, which predominantly use the FM-index, are able to easily index one or a few human genomes, they do not scale well to indexing databases containing thousands of genomes. To understand why, it helps to examine the …

Taher Mun