Xiaowo Wang
Advancing genetic engineering with active learning: theory, implementations and potential opportunities. Open
Employing machine learning (ML) models to accelerate experimentation and uncover biological mechanisms has been a rising tendency in genetic engineering. However, effectively collecting data to enhance model accuracy and improve design rem…
De novo design of insulated cis-regulatory elements based on deep learning-predicted fitness landscape Open
Precise control of gene activity within a host cell is crucial in bioengineering applications. Despite significant advancements in cis-regulatory sequence activity prediction and reverse engineering, the context-dependent effects of host c…
esMPRA: an easy-to-use systematic pipeline for MPRA experiment quality control and data analysis Open
Motivation: Massively Parallel Reporter Assays (MPRAs) have emerged as pivotal tools for systematically profiling cis-regulatory element activity, playing critical roles in deciphering gene regulation mechanisms and synthetic regulatory ele…
Simulation-guided pan-cancer analysis identifies a novel regulator of CpG island hypermethylation heterogeneity Open
CpG island hypermethylation, a hallmark of cancer, exhibits substantial heterogeneity across tumors, presenting both opportunities and challenges for cancer diagnostics and therapeutics. While this heterogeneity offers potential for patien…
Assessment of anemia recovery using peripheral blood smears by deep semi-supervised learning Open
Monitoring anemia recovery is crucial for clinical intervention. Morphological assessment of red blood cells (RBCs) with peripheral blood smears (PBSs) provides additional information beyond routine blood tests. However, the PBS test is la…
Systematic representation and optimization enable the inverse design of cross-species regulatory sequences in bacteria Open
Regulatory sequences encode crucial gene expression signals, yet the sequence characteristics that determine their functionality across species remain obscure. Deep generative models have demonstrated considerable potential in various inve…
Deconer: An Evaluation Toolkit for Reference-based Deconvolution Methods Using Gene Expression Data Open
In recent years, computational methods for quantifying cell-type proportions from transcription data have gained significant attention, particularly those reference-based methods which have demonstrated high accuracy. However, there is cur…
Foundation models in bioinformatics Open
With the adoption of foundation models (FMs), artificial intelligence (AI) has become increasingly significant in bioinformatics and has successfully addressed many historical challenges, such as pre-training frameworks, model evaluation a…
Space: reconciling multiple spatial domain identification algorithms via consensus clustering Open
Motivation: The rapid development of spatially resolved transcriptomics (SRT) technologies has provided unprecedented opportunities for characterizing and understanding tissue architecture. As this field continues to advance, various method…
Revealing long-range heterogeneous organization of nucleoproteins with N6-methyladenine footprinting Open
A major challenge in epigenetics is uncovering the dynamic distribution of nucleosomes and other DNA-binding proteins, which plays a crucial role in regulating cellular functions. Established approaches such as ATAC-seq, ChIP-seq, and CUT&…
Non-contact Dexterous Micromanipulation with Multiple Optoelectronic Robots Open
Micromanipulation systems leverage automation and robotic technologies to improve the precision, repeatability, and efficiency of various tasks at the microscale. However, current approaches are typically limited to specific objects or tas…
Efficient computation by molecular competition networks Open
Most biomolecular systems exhibit computation abilities, which are often achieved through complex networks such as signal transduction networks. Particularly, molecular competition in these networks can introduce crosstalk and serve as a h…
Unveil cis-acting combinatorial mRNA motifs by interpreting deep neural network Open
Summary: Cis-acting mRNA elements play a key role in the regulation of mRNA stability and translation efficiency. Revealing the interactions of these elements and their impact plays a crucial role in understanding the regulation of the mRNA…
Weakly-supervised causal discovery based on fuzzy knowledge and complex data complementarity Open
Causal discovery based on observational data is important for deciphering the causal mechanism behind complex systems. However, the effectiveness of existing causal discovery methods is limited due to inferior prior knowledge, domain incon…
GPro: generative AI-empowered toolkit for promoter design Open
Motivation: Promoters with desirable properties are crucial in biotechnological applications. Generative AI (GenAI) has demonstrated potential in creating novel synthetic promoters with significantly enhanced functionality. However, these m…
DIProT: A deep learning based interactive toolkit for efficient and effective Protein design Open
The protein inverse folding problem, designing amino acid sequences that fold into desired protein structures, is a critical challenge in biological sciences. Despite numerous data-driven and knowledge-driven methods, there remains a need …
Unified machine learning framework uncovers overlooked bias control in immunogenic neoantigen identification Open
Neoantigens play a crucial role in tumor immune process and precisely identifying them can greatly contribute to tumor immunotherapy design. There are three main steps in the neoantigen immune process, i.e., binding with MHCs, extracellula…
Deconer: A comprehensive and systematic evaluation toolkit for reference-based cell type deconvolution algorithms using gene expression data Open
In recent years, computational methods for quantifying cell type proportions from transcription data have gained significant attention, particularly those reference-based methods which have demonstrated high accuracy. However, there is cur…
Building digital life systems for future biology and medicine Open
The rapid development of biological technology (BT) and information technology (IT) especially of genomics and artificial intelligence (AI) is bringing great potential for revolutionizing future medicine. We propose the concept and framewo…
A generic reference defined by consensus peaks for scATAC-seq data analysis Open
The rapid advancement of transposase-accessible chromatin using sequencing (ATAC-seq) technology, particularly with the emergence of single-cell ATAC-seq (scATAC-seq), has accelerated the studies of gene regulation. However, the absence of…
Deep flanking sequence engineering for efficient promoter design Open
Human experts are good at summarizing explicit strong patterns from small samples, while deep learning models can learn implicit weak patterns from big data. Biologists have traditionally described the sequence patterns of promoters via tr…
NeuronMotif: Deciphering cis-regulatory codes by layer-wise demixing of deep neural networks Open
Discovering DNA regulatory sequence motifs and their relative positions is vital to understanding the mechanisms of gene expression regulation. Although deep convolutional neural networks (CNNs) have achieved great success in predicting ci…