1000 Genomes Project
Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program
The Trans-Omics for Precision Medicine (TOPMed) programme seeks to elucidate the genetic architecture and biology of heart, lung, blood and sleep disorders, with the ultimate goal of improving diagnosis, treatment and prevention of these d…
FACETS: allele-specific copy number and clonal heterogeneity analysis tool for high-throughput DNA sequencing
Allele-specific copy number analysis (ASCN) from next generation sequencing (NGS) data can greatly extend the utility of NGS beyond the identification of mutations to precisely annotate the genome for the detection of homozygous/heterozygo…
Multi-platform discovery of haplotype-resolved structural variation in human genomes
The incomplete identification of structural variants (SVs) from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long-read, short-read, strand-specific sequencing tec…
An Expanded Genome-Wide Association Study of Type 2 Diabetes in Europeans
To characterize type 2 diabetes (T2D)-associated variation across the allele frequency spectrum, we conducted a meta-analysis of genome-wide association data from 26,676 T2D case and 132,532 control subjects of European ancestry after impu…
BCFtools/RoH: a hidden Markov model approach for detecting autozygosity from next-generation sequencing data
Summary: Runs of homozygosity (RoHs) are genomic stretches of a diploid genome that show identical alleles on both chromosomes. Longer RoHs are unlikely to have arisen by chance but are likely to denote autozygosity, whereby both copies of…
Haplotype-resolved diverse human genomes and integrated analysis of structural variation
Resolving genomic structural variation: Many human genomes have been reported using short-read technology, but it is difficult to resolve structural variants (SVs) using these data. These genomes thus lack comprehensive comparisons among in…
Accurate, scalable and integrative haplotype estimation
The number of human genomes being genotyped or sequenced increases exponentially and efficient haplotype estimation methods able to handle this amount of data are now required. Here we present a method, SHAPEIT4, which substantially improv…
AnnotSV: an integrated tool for structural variations annotation
Summary: Structural Variations (SV) are a major source of variability in the human genome that shaped its actual structure during evolution. Moreover, many human diseases are caused by SV, highlighting the need to accurately detect those ge…
BCFtools/csq: haplotype-aware variant consequences
Motivation: Prediction of functional variant consequences is an important part of sequencing pipelines, allowing the categorization and prioritization of genetic variants for follow up analysis. However, current predictors analyze variants …
HapCUT2: robust and accurate haplotype assembly for diverse sequencing technologies
Many tools have been developed for haplotype assembly—the reconstruction of individual haplotypes using reads mapped to a reference genome sequence. Due to increasing interest in obtaining haplotype-resolved human genomes, a range of new s…
Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program
Summary paragraph: The Trans-Omics for Precision Medicine (TOPMed) program seeks to elucidate the genetic architecture and disease biology of heart, lung, blood, and sleep disorders, with the ultimate goal of improving diagnosis, treatment,…
FinnGen: Unique genetic insights from combining isolated population and national health register data
Population isolates such as Finland provide benefits in genetic studies because the allelic spectrum of damaging alleles in any gene is often concentrated on a small number of low-frequency variants (0.1% ≤ minor allele frequency < 5%), wh…
A reference data set of 5.4 million phased human variants validated by genetic inheritance from sequencing a three-generation 17-member pedigree
Improvement of variant calling in next-generation sequence data requires a comprehensive, genome-wide catalog of high-confidence variants called in a set of genomes for use as a benchmark. We generated deep, whole-genome sequence data of 1…
Discovery and genotyping of structural variation from long-read haploid genome sequence data
In an effort to more fully understand the full spectrum of human genetic variation, we generated deep single-molecule, real-time (SMRT) sequencing data from two haploid human genomes. By using an assembly-based approach (SMRT-SV), we syste…
The Mobile Element Locator Tool (MELT): population-scale mobile element discovery and biology
Mobile element insertions (MEIs) represent ∼25% of all structural variants in human genomes. Moreover, when they disrupt genes, MEIs can influence human traits and diseases. Therefore, MEIs should be fully discovered along with other forms…
Genome graphs and the evolution of genome inference
The human reference genome is part of the foundation of modern human biology and a monumental scientific achievement. However, because it excludes a great deal of common human variation, it introduces a pervasive reference bias into the fi…
The ChinaMAP analytics of deep whole genome sequences in 10,588 individuals
Metabolic diseases are the most common and rapidly growing health issues worldwide. The massive population-based human genetics is crucial for the precise prevention and intervention of metabolic disorders. The China Metabolic Analytics Pr…
Use of >100,000 NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium whole genome sequences improves imputation quality and detection of rare variant associations in admixed African and Hispanic/Latino populations
Most genome-wide association and fine-mapping studies to date have been conducted in individuals of European descent, and genetic studies of populations of Hispanic/Latino and African ancestry are limited. In addition, these populations ha…
The international Genome sample resource (IGSR): A worldwide collection of genome variation incorporating the 1000 Genomes Project data
The International Genome Sample Resource (IGSR; http://www.internationalgenome.org) expands in data type and population diversity the resources from the 1000 Genomes Project. IGSR represents the largest open collection of human variation d…
Human genetic variation database, a reference database of genetic variations in the Japanese population
This FAIRsharing record describes: The Human Genetic Variation Database (HGVD) aims to provide a central resource to archive and display Japanese genetic variation and association between the variation and transcription level of genes. The…
Resolving the full spectrum of human genome variation using Linked-Reads
Large-scale population analyses coupled with advances in technology have demonstrated that the human genome is more diverse than originally thought. To date, this diversity has largely been uncovered using short-read whole-genome sequencin…
The Korea Biobank Array: Design and Identification of Coding Variants Associated with Blood Biochemical Traits
We introduce the design and implementation of a new array, the Korea Biobank Array (referred to as KoreanChip), optimized for the Korean population and demonstrate findings from GWAS of blood biochemical traits. KoreanChip comprised >833,0…
Longshot enables accurate variant calling in diploid genomes from single-molecule long read sequencing
Whole-genome sequencing using sequencing technologies such as Illumina enables the accurate detection of small-scale variants but provides limited information about haplotypes and variants in repetitive regions of the human genome. Single-…
High coverage whole genome sequencing of the expanded 1000 Genomes Project cohort including 602 trios
Summary: The 1000 Genomes Project (1kGP) is the largest fully open resource of whole genome sequencing (WGS) data consented for public distribution of raw sequence data without access or use restrictions. The final release of the 1kGP inclu…
Improved imputation accuracy of rare and low-frequency variants using population-specific high-coverage WGS-based imputation reference panel
Genetic imputation is a cost-efficient way to improve the power and resolution of genome-wide association (GWA) studies. Current publicly accessible imputation reference panels accurately predict genotypes for common variants with minor al…
International HapMap Project
A collaboration among scientists in Japan, the U.K., Canada, China, Nigeria, and the U.S. to develop a haplotype map of the human genome, the HapMap, which will describe the common patterns of human DNA sequence variation.
The presence and impact of reference bias on population genomic studies of prehistoric human populations
Haploid high quality reference genomes are an important resource in genomic research projects. A consequence is that DNA fragments carrying the reference allele will be more likely to map successfully, or receive higher quality scores. Thi…
Accurate, scalable cohort variant calls using DeepVariant and GLnexus
Motivation: Population-scale sequenced cohorts are foundational resources for genetic analyses, but processing raw reads into analysis-ready cohort-level variants remains challenging. Results: We introduce an open-source cohort-calling metho…
SNPsplit: Allele-specific splitting of alignments between genomes with known SNP genotypes
Sequencing reads overlapping polymorphic sites in diploid mammalian genomes may be assigned to one allele or the other. This holds the potential to detect gene expression, chromatin modifications, DNA methylation or nuclear interactions in…
Dating genomic variants and shared ancestry in population-scale sequencing data
The origin and fate of new mutations within species is the fundamental process underlying evolution. However, while much attention has been focused on characterizing the presence, frequency, and phenotypic impact of genetic variation, the …