Chunyuan Zhang
Genetic association of tertiary lymphoid structure-related gene signatures with HCC based on Mendelian randomization and machine learning and construction of prognosis model
Three TLSGs identified by machine learning and MR can predict the onset, prognosis, and clinical treatment of HCC patients, and have a significant genetic association with HCC.
Can changes in corporate income tax rate affect corporate innovation?
SparGD: A Sparse GEMM Accelerator with Dynamic Dataflow
Deep learning has become a highly popular research field, and previously deep learning algorithms ran primarily on CPUs and GPUs. However, with the rapid development of deep learning, it was discovered that existing processors could not me…
RHS-TRNG: A Resilient High-Speed True Random Number Generator Based on STT-MTJ Device
High-quality random numbers are critical to many fields, such as cryptography, finance, and scientific simulation, which calls for the design of reliable true random number generators (TRNGs). Limited by entropy source, throughput, rel…
Heterogeneous Multi-Chiplets Neural Network Accelerator
Research and establishment of steam dryness detection method
In this paper, by designing the overall idea of the pipeline steam dryness control measurement system, designing and implementing the pipeline steam dryness control measurement system, constructing the pipeline steam dryness control measur…
SAPTM: Towards High-Throughput Per-Flow Traffic Measurement with a Systolic Array-Like Architecture on FPGA
Per-flow traffic measurement has emerged as a critical but challenging task in data centers in recent years in the face of massive network traffic. Many approximate methods have been proposed to resolve the existing resource-accuracy trade…
Efficient Parallel TLD on CPU-GPU Platform for Real-Time Tracking
Trackers, especially long-term (LT) trackers, now have more complex structures and more intensive computation, driven by the endless pursuit of high accuracy and robustness. However, computing efficiency of LT trackers cannot meet the real-…
P4 to FPGA-A Fast Approach for Generating Efficient Network Processors
This paper presents a framework for converting P4 programs to VHDL and then implementing them on Field-Programmable Gate Array (FPGA) platforms. In this framework, a match-action-based hardware architecture is introduced with clearly desig…
A Fast Approach for Generating Efficient Parsers on FPGAs
The development of modern networking requires that high-performance network processors be designed quickly and efficiently to support new protocols. As a very important part of the processor, the parser parses the headers of the packets—th…
Balancing Distributed Key-Value Stores with Efficient In-Network Redirecting
Today’s cloud-based online services are underpinned by distributed key-value stores (KVSs). Keys and values are distributed across back-end servers in such scale-out systems. One primary real-life performance bottleneck occurs when storage…
Programming Protocol-Independent Packet Processors High-Level Programming (P4HLP): Towards Unified High-Level Programming for a Commodity Programmable Switch
Network algorithms are building blocks of network applications. They are inspired by emerging commodity programmable switches and the Programming Protocol-Independent Packet Processors (P4) language. P4 aims to provide target-independent p…
Efficient Implementation of 2D and 3D Sparse Deconvolutional Neural Networks with a Uniform Architecture on FPGAs
Three-dimensional (3D) deconvolution is widely used in many computer vision applications. However, most previous works have only focused on accelerating two-dimensional (2D) deconvolutional neural networks (DCNNs) on Field-Programmable Gat…
Towards a Uniform Architecture for the Efficient Implementation of 2D and 3D Deconvolutional Neural Networks on FPGAs
Three-dimensional deconvolution is widely used in many computer vision applications. However, most previous works have only focused on accelerating 2D deconvolutional neural networks (DCNNs) on FPGAs, while the acceleration of 3D DCNNs has…
Interleaved Sketch: Toward Consistent Network Telemetry for Commodity Programmable Switches
Network telemetry is vital to various network applications, including network anomaly detection, capacity planning, and congestion alleviation. State-of-the-art network telemetry systems are claimed to be scalable, flexible, all-purpose, a…
Application-Oriented Network Scheduling With Metaflow
Distributed applications usually feature a set of correlated flows between two consecutive computation stages. The scheduling of these flows has a crucial influence on job completion time. Coflow improves performance by optimizing the fini…
HPGraph: High-Performance Graph Analytics with Productivity on the GPU
The growing use of graphs in many fields has sparked broad interest in developing high-level graph analytics programs. Existing GPU implementations offer limited performance and compromise on productivity. HPGraph, our high-performance …
Towards a Multi-array Architecture for Accelerating Large-scale Matrix Multiplication on FPGAs
Large-scale floating-point matrix multiplication is a fundamental kernel in many scientific and engineering applications. Most existing work focuses only on accelerating matrix multiplication on FPGAs by adopting a linear systolic array. This…
MALMM: A multi-array architecture for large-scale matrix multiplication on FPGA
Large-scale floating-point matrix multiplication is widely used in many scientific and engineering applications. Most existing works focus on designing a linear array architecture for accelerating matrix multiplication on FPGAs. This paper…
Optimizing OpenCL Implementation of Deep Convolutional Neural Network on FPGA
High Performance Implementation of 3D Convolutional Neural Networks on a GPU
Convolutional neural networks have proven to be highly successful in applications such as image classification, object tracking, and many other tasks based on 2D inputs. Recently, researchers have started to apply convolutional neural netw…
A Highly Parallel and Scalable Motion Estimation Algorithm with GPU for HEVC
We propose a highly parallel and scalable motion estimation algorithm, named multilevel resolution motion estimation (MLRME for short), by combining the advantages of local full search and downsampling. By subsampling a video frame, a lar…
FPGA‐accelerated deep convolutional neural networks for high throughput and energy efficiency
Recent breakthroughs in deep convolutional neural networks (CNNs) have led to great improvements in the accuracy of both vision and auditory systems. Characterized by their deep structures and large numbers of parameters, deep …
The Visual Object Tracking VOT2015 Challenge Results
The Visual Object Tracking challenge 2015, VOT2015, aims at comparing short-term single-object visual trackers that do not apply pre-learned models of object appearance. Results of 62 trackers are presented. The number of tested trackers m…
Soft-Error-Rate Adaptive Intervals for Low-Overhead Checkpoint Mechanism
Soft errors are increasingly important threats to the reliability of integrated circuits. Chips manufactured in advanced technologies show variations in SER caused by variations in the process parameters. Ongoing reduction of feature sizes a…
Enabling a Uniform OpenCL Device View for Heterogeneous Platforms
Aiming to ease the parallel programming for heterogeneous architectures, we propose and implement a high-level OpenCL runtime that conceptually merges multiple heterogeneous hardware devices into one virtual heterogeneous compute device (V…