Clustering high-dimensional data
K-Profiles: A Nonlinear Clustering Method for Pattern Detection in High Dimensional Data Open
With modern technologies such as microarray, deep sequencing, and liquid chromatography-mass spectrometry (LC-MS), it is possible to measure the expression levels of thousands of genes/proteins simultaneously to unravel important biologica…
Improved Deep Embedded Clustering with Local Structure Preservation Open
Deep clustering learns deep feature representations that favor clustering task using neural networks. Some pioneering work proposes to simultaneously learn embedded features and perform clustering by explicitly defining a clustering orient…
Multi-view Subspace Clustering Open
For many computer vision applications, the data sets distribute on certain low-dimensional subspaces. Subspace clustering is to find such underlying subspaces and cluster the data points correctly. In this paper, we propose a novel multi-v…
Multi-view clustering: A survey Open
In the big data era, the data are generated from different sources or observed from different views. These data are referred to as multi-view data. Unleashing the power of knowledge in multi-view data is very important in big data mining a…
Large-Scale Multi-View Subspace Clustering in Linear Time Open
A plethora of multi-view subspace clustering (MVSC) methods have been proposed over the past few years. Researchers manage to boost clustering accuracy from different points of view. However, many state-of-the-art MVSC algorithms, typicall…
Multiview Spectral Clustering via Structured Low-Rank Matrix Factorization Open
Multiview data clustering attracts more attention than their single-view counterparts due to the fact that leveraging multiple independent and complementary information from multiview feature spaces outperforms the single one. Multiview sp…
Deep learning-based clustering approaches for bioinformatics Open
Clustering is central to many data-driven bioinformatics research and serves a powerful computational method. In particular, clustering helps at analyzing unstructured and high-dimensional data in the form of sequences, expressions, texts …
Structured Sparse Subspace Clustering: A Joint Affinity Learning and Subspace Clustering Framework Open
Subspace clustering refers to the problem of segmenting data drawn from a union of subspaces. State-of-the-art approaches for solving this problem follow a two-stage approach. In the first step, an affinity matrix is learned from the data …
The Application of Unsupervised Clustering Methods to Alzheimer’s Disease Open
Clustering is a powerful machine learning tool for detecting structures in datasets. In the medical field, clustering has been proven to be a powerful tool for discovering patterns and structure in labeled and unlabeled datasets. Unlike su…
Deep Neural Networks for High Dimension, Low Sample Size Data Open
Deep neural networks (DNN) have achieved breakthroughs in applications with large sample size. However, when facing high dimension, low sample size (HDLSS) data, such as the phenotype prediction problem using genetic data in bioinformatics…
Robust continuous clustering Open
Significance: Clustering is a fundamental experimental procedure in data analysis. It is used in virtually all natural and social sciences and has played a central role in biology, astronomy, psychology, medicine, and chemistry. Despite the…
dropClust: efficient clustering of ultra-large scRNA-seq data Open
Droplet based single cell transcriptomics has recently enabled parallel screening of tens of thousands of single cells. Clustering methods that scale for such high dimensional data without compromising accuracy are scarce. We exploit Local…
A Unified Framework for Representation-Based Subspace Clustering of Out-of-Sample and Large-Scale Data Open
Under the framework of spectral clustering, the key of subspace clustering is building a similarity graph, which describes the neighborhood relations among data points. Some recent works build the graph using sparse, low-rank, and l2-norm…
One-Pass Incomplete Multi-View Clustering Open
Real data are often with multiple modalities or from multiple heterogeneous sources, thus forming so-called multi-view data, which receives more and more attentions in machine learning. Multi-view clustering (MVC) becomes its important par…
A Hybrid Semi-Supervised Anomaly Detection Model for High-Dimensional Data Open
Anomaly detection, which aims to identify observations that deviate from a nominal sample, is a challenging task for high-dimensional data. Traditional distance-based anomaly detection methods compute the neighborhood distance between each…
SAFE-clustering: Single-cell Aggregated (from Ensemble) clustering for single-cell RNA-seq data Open
Motivation: Accurately clustering cell types from a mass of heterogeneous cells is a crucial first step for the analysis of single-cell RNA-seq (scRNA-Seq) data. Although several methods have been recently developed, they utilize different …
Clustering Algorithm for a Healthcare Dataset Using Silhouette Score Value Open
The huge amount of healthcare data, coupled with the need for data analysis tools, has made data mining an interesting research area. Data mining tools and techniques help to discover and understand hidden patterns in a dataset which may not b…
Variable selection methods for model-based clustering Open
Model-based clustering is a popular approach for clustering multivariate data which has seen applications in numerous fields. Nowadays, high-dimensional data are more and more common and the model-based clustering approach has adapted to d…
Unsupervised Deep Embedding for Clustering Analysis Open
Clustering is central to many data-driven application domains and has been studied extensively in terms of distance functions and grouping algorithms. Relatively little work has focused on learning representations for clustering. In this p…
Transfer Prototype-Based Fuzzy Clustering Open
The traditional prototype based clustering methods, such as the well-known fuzzy c-mean (FCM) algorithm, usually need sufficient data to find a good clustering partition. If the available data is limited or scarce, most of the existing …
Sliding Window-Based Fault Detection From High-Dimensional Data Streams Open
High-dimensional data streams are becoming increasingly ubiquitous in industrial systems. Efficient detection of system faults from these data can ensure the reliability and safety of the system. The difficulties brought about by high dime…
Deep Temporal Clustering: Fully Unsupervised Learning of Time-Domain Features Open
Unsupervised learning of time series data, also known as temporal clustering, is a challenging problem in machine learning. Here we propose a novel algorithm, Deep Temporal Clustering (DTC), to naturally integrate dimensionality reduction …
Data Clustering: Algorithms and Its Applications Open
Data is useless if information or knowledge that can be used for further reasoning cannot be inferred from it. Cluster analysis, based on some criteria, shares data into important, practical or both categories (clusters) based on share…
Clustering of single-cell multi-omics data with a multimodal deep learning method Open
Single-cell multimodal sequencing technologies are developed to simultaneously profile different modalities of data in the same cell. It provides a unique opportunity to jointly analyze multimodal data at the single-cell level for the iden…
Distance‐based clustering of mixed data Open
Cluster analysis comprises of several unsupervised techniques aiming to identify a subgroup (cluster) structure underlying the observations of a data set. The desired cluster allocation is such that it assigns similar observations to the s…
Joint dimension reduction and clustering analysis of single-cell RNA-seq and spatial transcriptomics data Open
Dimension reduction and (spatial) clustering is usually performed sequentially; however, the low-dimensional embeddings estimated in the dimension-reduction step may not be relevant to the class labels inferred in the clustering step. We t…
Entropy-based consensus clustering for patient stratification Open
Motivation: Patient stratification or disease subtyping is crucial for precision medicine and personalized treatment of complex diseases. The increasing availability of high-throughput molecular data provides a great opportunity for patient…
SHARP: hyperfast and accurate processing of single-cell RNA-seq data via ensemble random projection Open
To process large-scale single-cell RNA-sequencing (scRNA-seq) data effectively without excessive distortion during dimension reduction, we present SHARP, an ensemble random projection-based algorithm that is scalable to clustering 10 milli…
Integrative clustering methods of multi‐omics data for molecule‐based cancer classifications Open
One goal of precise oncology is to re‐classify cancer based on molecular features rather than its tissue origin. Integrative clustering of large‐scale multi‐omics data is an important way for molecule‐based cancer classification. The data …
HDclassif: An R Package for Model-Based Clustering and Discriminant Analysis of High-Dimensional Data Open
This paper presents the R package HDclassif which is devoted to the clustering and the discriminant analysis of high-dimensional data. The classification methods proposed in the package result from a new parametrization of the Gaussian mix…