Reference genome
Unicycler: Resolving bacterial genome assemblies from short and long sequencing reads
The Illumina DNA sequencing platform generates accurate but short reads, which can be used to produce accurate but fragmented genome assemblies. Pacific Biosciences and Oxford Nanopore Technologies DNA sequencing platforms generate long re…
Introducing EzBioCloud: a taxonomically united database of 16S rRNA gene sequences and whole-genome assemblies
The recent advent of DNA sequencing technologies facilitates the use of genome sequencing data that provide means for more informative and precise classification and identification of members of the Bacteria and Archaea. Because the curren…
GENCODE reference annotation for the human and mouse genomes
The accurate identification and description of the genes in the human and mouse genomes is a fundamental requirement for high quality analysis of data informing both genome biology and clinical genomics. Over the last 15 years, the GENCODE…
Shifting the limits in wheat research and breeding using a fully annotated reference genome
Insights from the annotated wheat genome: Wheat is one of the major sources of food for much of the world. However, because bread wheat's genome is a large hybrid mix of three separate subgenomes, it has been difficult to produce a high-qua…
Fast and accurate de novo genome assembly from long uncorrected reads
The assembly of long reads from Pacific Biosciences and Oxford Nanopore Technologies typically requires resource-intensive error-correction and consensus-generation steps to obtain high-quality assemblies. We show that the error-correction…
MUMmer4: A fast and versatile genome alignment system
The MUMmer system and the genome sequence aligner nucmer included within it are among the most widely used alignment packages in genomics. Since the last major release of MUMmer version 3 in 2004, it has been applied to many types of probl…
GenomeScope: fast reference-free genome profiling from short reads
Summary: GenomeScope is an open-source web tool to rapidly estimate the overall characteristics of a genome, including genome size, heterozygosity rate and repeat content from unprocessed short reads. These features are essential for studyi…
FastQ Screen: A tool for multi-genome mapping and quality control
DNA sequencing analysis typically involves mapping reads to just one reference genome. Mapping against multiple genomes is necessary, however, when the genome of origin requires confirmation. Mapping against multiple genomes is also advisa…
HGVS Recommendations for the Description of Sequence Variants: 2016 Update
The consistent and unambiguous description of sequence variants is essential to report and exchange information on the analysis of a genome. In particular, DNA diagnostics critically depends on accurate and standardized description and sha…
Versatile genome assembly evaluation with QUAST-LG
Motivation: The emergence of high-throughput sequencing technologies revolutionized genomics in the early 2000s. The next revolution came with the era of long-read sequencing. These technological advances along with novel computational approach…
Proksee: in-depth characterization and visualization of bacterial genomes
Proksee (https://proksee.ca) provides users with a powerful, easy-to-use, and feature-rich system for assembling, annotating, analysing, and visualizing bacterial genomes. Proksee accepts Illumina sequence reads as compressed FASTQ files o…
FACETS: allele-specific copy number and clonal heterogeneity analysis tool for high-throughput DNA sequencing
Allele-specific copy number analysis (ASCN) from next generation sequencing (NGS) data can greatly extend the utility of NGS beyond the identification of mutations to precisely annotate the genome for the detection of homozygous/heterozygo…
Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly
The human reference genome assembly plays a central role in nearly all aspects of today's basic and clinical research. GRCh38 is the first coordinate-changing assembly update since 2009; it reflects the resolution of roughly 1000 issues an…
Ensembl 2020
Ensembl (https://www.ensembl.org) is a system for generating and distributing genome annotation such as genes, variation, regulation and comparative genomics across the vertebrate subphylum and key model organisms. The Ensembl annotati…
CPGAVAS2, an integrated plastome sequence annotator and analyzer
We previously developed a web server CPGAVAS for annotation, visualization and GenBank submission of plastome sequences. Here, we upgrade the server into CPGAVAS2 to address the following challenges: (i) inaccurate annotation in the refere…
VarDict: a novel and versatile variant caller for next-generation sequencing in cancer research
Accurate variant calling in next generation sequencing (NGS) is critical to understand cancer genomes better. Here we present VarDict, a novel and versatile variant caller for both DNA- and RNA-sequencing data. VarDict simultaneously calls…
RefSeq: expanding the Prokaryotic Genome Annotation Pipeline reach with protein family model curation
The Reference Sequence (RefSeq) project at the National Center for Biotechnology Information (NCBI) contains nearly 200 000 bacterial and archaeal genomes and 150 million proteins with up-to-date annotation. Changes in the Prokaryotic Geno…
Multi-platform discovery of haplotype-resolved structural variation in human genomes
The incomplete identification of structural variants (SVs) from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long-read, short-read, strand-specific sequencing tec…
Direct determination of diploid genome sequences
Determining the genome sequence of an organism is challenging, yet fundamental to understanding its biology. Over the past decade, thousands of human genomes have been sequenced, contributing deeply to biomedical research. In the vast majo…
RESCRIPt: Reproducible sequence taxonomy reference database management
Nucleotide sequence and taxonomy reference databases are critical resources for widespread applications including marker-gene and metagenome sequencing for microbiome analysis, diet metabarcoding, and environmental DNA (eDNA) surveys. Repr…
rnaSPAdes: a de novo transcriptome assembler and its application to RNA-Seq data
Background: The possibility of generating large RNA-sequencing datasets has led to development of various reference-based and de novo transcriptome assemblers with their own strengths and limitations. While reference-based tools are widely …
De novo assembly of the cattle reference genome with single-molecule sequencing
Background: Major advances in selection progress for cattle have been made following the introduction of genomic tools over the past 10–12 years. These tools depend upon the Bos taurus reference genome (UMD3.1.1), which was created using no…
HiCanu: accurate assembly of segmental duplications, satellites, and allelic variants from high-fidelity long reads
Complete and accurate genome assemblies form the basis of most downstream genomic analyses and are of critical importance. Recent genome assembly projects have relied on a combination of noisy long-read sequencing and accurate short-read s…
Assessing genome assembly quality using the LTR Assembly Index (LAI)
Assembling a plant genome is challenging due to the abundance of repetitive sequences, yet no standard is available to evaluate the assembly of repeat space. LTR retrotransposons (LTR-RTs) are the predominant interspersed repeat that is po…
ABySS 2.0: resource-efficient assembly of large genomes using a Bloom filter
The assembly of DNA sequences de novo is fundamental to genomics research. It is the first of many steps toward elucidating and characterizing whole genomes. Downstream applications, including analysis of genomic variation between species,…
SKESA: strategic k-mer extension for scrupulous assemblies
SKESA is a de Bruijn graph-based de novo assembler designed for assembling reads of microbial genomes sequenced using Illumina. Comparison with SPAdes and MegaHit shows that SKESA produces assemblies that have high sequence quality and cont…
Pan-genome analysis highlights the extent of genomic variation in cultivated and wild rice
The rich genetic diversity in Oryza sativa and Oryza rufipogon serves as the main sources in rice breeding. Large-scale resequencing has been undertaken to discover allelic variants in rice, but much of the information for genetic variatio…
RNA-Seq differential expression analysis: An extended review and a software tool
The correct identification of differentially expressed genes (DEGs) between specific conditions is key to understanding phenotypic variation. High-throughput transcriptome sequencing (RNA-Seq) has become the main option for these stu…
The Sorghum bicolor reference genome: improved assembly, gene annotations, a transcriptome atlas, and signatures of genome organization
Summary: Sorghum bicolor is a drought tolerant C4 grass used for the production of grain, forage, sugar, and lignocellulosic biomass and a genetic model for C4 grasses due to its relatively small genome (approximately 800 Mbp), diploid gene…