Saptarshi Bej
Dependency-aware synthetic tabular data generation Open
Synthetic tabular data is increasingly used in privacy-sensitive domains such as health care, but existing generative models often fail to preserve inter-attribute relationships. In particular, functional dependencies (FDs) and logical dep…
Detection of pre-ictal epileptic events using a self-attention based neural network from raw Neonatal EEG data Open
Epileptic seizures can occur unpredictably, making real-time monitoring and early warning systems critical, especially in neonatal patients, where timely intervention can significantly improve outcomes. Neonatal seizures are often subtle a…
Multivariate functional linear discriminant analysis for partially-observed time series Open
The more extensive access to time-series data, especially for biomedical purposes, raises new methodological challenges, particularly regarding missing values. Functional linear discriminant analysis (FLDA) extends Linear Discriminant Anal…
Handling Missing Data in Downstream Tasks With Distribution-Preserving Guarantees Open
Missing feature values are a significant hurdle for downstream machine-learning tasks such as classification. However, imputation methods for classification might be time-consuming for high-dimensional data, and offer few theoretical guara…
Identification of key factors for malnutrition diagnosis in chronic gastrointestinal diseases using machine learning underscores the importance of GLIM criteria as well as additional parameters Open
Introduction: Disease-related malnutrition is common but often underdiagnosed in patients with chronic gastrointestinal diseases, such as liver cirrhosis, short bowel and intestinal insufficiency, and chronic pancreatitis. To improve malnut…
Bitter peptide prediction using graph neural networks Open
Bitter taste is an unpleasant taste modality that affects food consumption. Bitter peptides are generated during enzymatic processes that produce functional, bioactive protein hydrolysates or during the aging process of fermented products …
Convex space learning for tabular synthetic data generation Open
Generating synthetic samples from the convex space of the minority class is a popular oversampling approach for imbalanced classification problems. Recently, deep-learning approaches have been successfully applied to modeling the convex sp…
Multivariate Functional Linear Discriminant Analysis for the Classification of Short Time Series with Missing Data Open
Functional linear discriminant analysis (FLDA) is a powerful tool that extends LDA-mediated multiclass classification and dimension reduction to univariate time-series functions. However, in the age of large multivariate and incomplete dat…
ConvGeN: A convex space learning approach for deep-generative oversampling and imbalanced classification of small tabular datasets Open
Oversampling is commonly used to improve classifier performance for small tabular imbalanced datasets. State-of-the-art linear interpolation approaches can be used to generate synthetic samples from the convex space of the minority class. …
Accounting for diverse feature-types improves patient stratification on tabular clinical datasets Open
Tabular Clinical and Biomedical Routine Data (CBRD) contains diverse feature types. Recent research shows that the conventional application of Uniform Manifold Approximation and Projection (UMAP) to extract clusters from the low dimensiona…
Contribution of Synthetic Data Generation towards an Improved Patient Stratification in Palliative Care Open
AI model development for synthetic data generation to improve Machine Learning (ML) methodologies is an integral part of research in Computer Science and is currently being transferred to related medical fields, such as Systems Medicine an…
ConvGeN: Convex space learning improves deep-generative oversampling for tabular imbalanced classification on smaller datasets Open
Data is commonly stored in tabular format. Several fields of research are prone to small imbalanced tabular data. Supervised Machine Learning on such data is often difficult due to class imbalance. Synthetic data generation, i.e., oversamp…
Attention Retrieval Model for Entity Relation Extraction From Biological Literature Open
Natural Language Processing (NLP) has contributed to extracting relationships among biological entities, such as genes, their mutations, proteins, diseases, processes, phenotypes, and drugs, for a comprehensive and concise understanding of…
Automated annotation of rare-cell types from single-cell RNA-sequencing data through synthetic oversampling Open
Background: The research landscape of single-cell and single-nuclei RNA-sequencing is evolving rapidly. In particular, the detection of rare cells has been greatly facilitated by this technology. However, an automated, unbiased, and …
Self-Attention-Based Models for the Extraction of Molecular Interactions from Biological Texts Open
For any molecule, network, or process of interest, keeping up with new publications on these is becoming increasingly difficult. For many cellular processes, the number of molecules and their interactions that need to be considered can be ver…
Self-Attention Based Models for the Extraction of Molecular Interactions from Biological Texts Open
For any molecule, network, or process of interest, keeping up with new publications on these is becoming increasingly difficult. For many cellular processes, the number of molecules and their interactions that need to be considered can be very large. A…
Comprehensive Characterization of Multitissue Expression Landscape, Co-Expression Networks and Positive Selection in Pikeperch Open
Promising efforts are ongoing to extend genomics resources for pikeperch (Sander lucioperca), a species of high interest for the sustainable European aquaculture sector. Although previous work, including reference genome assembly, transcri…
Cross-tissue transcriptome-wide association studies of 885,176 individuals and seven diseases of the gut-brain axis identify susceptibility genes shared between schizophrenia and inflammatory bowel disease Open
Genetic correlations and an increased incidence of psychiatric disorders in inflammatory-bowel disease (IBD) have been reported, but shared molecular mechanisms are unknown. We performed cross-tissue and multiple-gene conditioned transcrip…
A multi-schematic classifier-independent oversampling approach for imbalanced datasets Open
Over 85 oversampling algorithms, mostly extensions of the SMOTE algorithm, have been built over the past two decades, to solve the problem of imbalanced datasets. However, it has been evident from previous studies that different oversam…
Automated annotation of rare-cell types from single-cell RNA-sequencing data through synthetic oversampling Open
The research landscape of single-cell and single-nuclei RNA sequencing is evolving rapidly, and one area enabled by this technology is the detection of rare cells. An automated, unbiased and accurate annotation of rare subpopulati…
A Multi-Schematic Classifier-Independent Oversampling Approach for Imbalanced Datasets Open
Over 85 oversampling algorithms, mostly extensions of the SMOTE algorithm, have been built over the past two decades, to solve the problem of imbalanced datasets. However, it has been evident from previous studies that different oversampli…