Joshua Zhexue Huang
Editorial: Special Issue on Recent Advances in Statistical Analytics Theories and Methods for Big Data Open
A Novel Flexible Kernel Density Estimator for Multimodal Probability Density Functions Open
Estimating probability density functions (PDFs) is critical in data analysis, particularly for complex multimodal distributions. Traditional kernel density estimator (KDE) methods often face challenges in accurately capturing multimodal st…
Optimizing Resource Scheduling in Computing Power Networks for Low-Consumption Big Data Analytics Open
A Fast and Accurate Block Compression Solution for Spatiotemporal Kernel Density Visualization Open
Efficient Multi-Sample Approximate Computing for Scalable Analysis of Massive Distributed Datasets on Resource-Constrained Clusters Open
The prolific explosion of data in today's digital sphere by modern AI applications has created new challenges and opportunities for business industries. This has necessitated the development of scalable methods for analyzing massive datase…
Discriminative local affine-hull clustering for high-dimensional data Open
Flexible π-stacked organic frameworks with dynamic electronic interactions for highly efficient photocatalytic hydrogen evolution Open
There is growing interest in using light-responsive soft matter with dynamic noncovalent bonding for solar-driven fuel production. Here, we report a supramolecular organic light absorber, the π-stacked organic framework (πOF-1), formed thr…
Approximate Approach for Frequent Itemsets Mining on Massive Distributed Data beyond Computing Capacity Open
An asset subset-constrained minimax optimization framework for online portfolio selection Open
Effective online portfolio selection necessitates seamless integration of three key properties: diversity, sparsity, and risk control. However, existing algorithms often prioritize one property at the expense of the others due to inherent …
CDFRS: A scalable sampling approach for efficient big data analysis Open
The sampling-based approximation method has demonstrated its potential in various domains such as machine learning, query processing, and data analysis. Most preceding sampling algorithms generate samples at the record level, making it imp…
Graph-Augmented Contrastive Clustering for Time Series Data Open
MapReduce vs Non-MapReduce - Efficiency and Scalability in Big Data Computing Open
MapReduce is a popular distributed computing paradigm for processing big data in a massively parallel fashion. However, when it is used to implement and run highly iterative algorithms for analyzing distributedly stored big data, the MapRed…
Non-MapReduce computing for intelligent analysis of Big Data Open
A scalable and flexible basket analysis system for big transaction data in Spark Open
Basket analysis is a prevailing technique to help retailers uncover patterns and associations of sold products in customer shopping transactions. However, as the size of transaction databases grows, the traditional basket analysis techniqu…
Geographically distributed data management to support large-scale data analysis Open
Nowadays, several companies prefer storing their data on multiple data centers with replication for many reasons. The data that spans various data centers ensures the fastest possible response time for customers and workforces who are geog…
Data quality model for assessing public COVID-19 big datasets Open
An ensemble method for estimating the number of clusters in a big data set using multiple random samples Open
Clustering a big dataset without knowing the number of clusters presents a big challenge to many existing clustering algorithms. In this paper, we propose a Random Sample Partition-based Centers Ensemble (RSPCE) algorithm to identify the n…
A novel observation points‐based positive‐unlabeled learning algorithm Open
In this study, an observation points‐based positive‐unlabeled learning algorithm (hence called OP‐PUL) is proposed to deal with positive‐unlabeled learning (PUL) tasks by judiciously assigning highly credible labels to unlabeled samples. T…
Survey of Distributed Computing Frameworks for Supporting Big Data Analysis Open
Distributed computing frameworks are the fundamental component of distributed computing systems. They provide an essential way to support the efficient processing of big data on clusters or cloud. The size of big data increases at a pace t…
A novel correlation Gaussian process regression-based extreme learning machine Open
Graph-Augmented Contrastive Clustering for Time Series Open
A Novel Adaptive Perturbation-Based Method for Long-Tailed Open Sets Recognition Open
A Dynamic Variational Framework for Open-World Node Classification in Structured Sequences Open
Structured sequences are a popular data representation, used to model complex data such as traffic networks. A key machine learning task for structured sequences is node classification, that is predicting the class labels of unlabeled node…
A Novel Mixed-Attribute Fusion-Based Naive Bayesian Classifier Open
The Naive Bayesian classifier (NBC) is a well-known classification model that has a simple structure, low training complexity, excellent scalability, and good classification performances. However, the NBC has two key limitations: (1) it is…
DSAA 2022 Cover Page Open
Message from the General Chairs DSAA'2022. As the technological foundation of the digital economy, artificial intelligence and other smart and green digital technologies cannot thrive without rich data resources. Data science and analytics pr…
Offloading Dependent Tasks in MEC-enabled IoT Systems: A Preference-based Hybrid Optimization Method Open
The rapid development of IoT-based services has resulted in an exponential increase in the number of connected smart mobile devices (SMDs). Processing the massive data generated by the large number of SMDs is becoming a big problem for m…
A Novel Correlation Gaussian Process Regression-Based Extreme Learning Machine Open
One obvious defect of Extreme Learning Machine (ELM) is that the prediction performance of ELM is sensitive to the random initialization of input-layer weights and hidden-layer biases. GPRELM integrating Gaussian Process Regression (GPR) i…
Observation points classifier ensemble for high‐dimensional imbalanced classification Open
In this paper, an Observation Points Classifier Ensemble (OPCE) algorithm is proposed to deal with High‐Dimensional Imbalanced Classification (HDIC) problems based on data processed using the Multi‐Dimensional Scaling (MDS) feature extract…