Explanipedia

Towards Accurate and Efficient Sub-8-Bit Integer Training Open

Wenjin Guo, Donglai Liu, Weiying Xie, Yunsong Li, Xuefei Ning, et al. · 2024

Neural network training is a memory- and compute-intensive task. Quantization, which enables low-bitwidth formats in training, can significantly mitigate the workload. To reduce quantization error, recent methods have developed new data fo…

PASTA: Programming and Automation Support for Scalable Task-Parallel HLS Programs on Modern Multi-Die FPGAs Open

Moazin Khatti, Xingyu Tian, Ahmad Sedigh Baroughi, Akhil Raj Baranwal, Yuze Chi, et al. · 2024

Computer science Economics

In recent years, the adoption of FPGAs in datacenters has increased, with a growing number of users choosing High-Level Synthesis (HLS) as their preferred programming method. While HLS simplifies FPGA programming, one notable challenge ari…

WeConvene: Learned Image Compression with Wavelet-Domain Convolution and Entropy Model Open

Haisheng Fu, Jie Liang, Zhenman Fang, Jingning Han, Feng Liang, et al. · 2024

Computer science Mathematics Physics

Recently, learned image compression (LIC) has achieved great progress and even outperformed the traditional approach using DCT or discrete wavelet transform (DWT). However, LIC mainly reduces spatial redundancy in the autoencoder networks a…

Quasar-ViT: Hardware-Oriented Quantization-Aware Architecture Search for Vision Transformers Open

Zhengang Li, Alec Lu, Yanyue Xie, Zhenglun Kong, Mengshu Sun, et al. · 2024

Computer science

Vision transformers (ViTs) have demonstrated their superior accuracy for computer vision tasks compared to convolutional neural networks (CNNs). However, ViT models are often computation-intensive for efficient deployment on resource-limit…

HiSpMV: Hybrid Row Distribution and Vector Buffering for Imbalanced SpMV Acceleration on FPGAs Open

Manoj B. Rajashekar, Xingyu Tian, Zhenman Fang · 2024

Computer science Mathematics Physics

Sparse matrix-vector multiplication (SpMV) is a fundamental operation in numerous applications such as scientific computing, machine learning, and graph analytics. While recent studies have made great progress in accelerating SpMV on HBM-e…

Learned Image Compression with Dual-Branch Encoder and Conditional Information Coding Open

Haisheng Fu, Feng Liang, Jie Liang, Zhenman Fang, Guohe Zhang, et al. · 2024

Computer science Physics

Recent advancements in deep learning-based image compression are notable. However, prevalent schemes that employ a serial context-adaptive entropy model to enhance rate-distortion (R-D) performance are markedly slow. Furthermore, the compl…

TAPA: A Scalable Task-parallel Dataflow Programming Framework for Modern FPGAs with Co-optimization of HLS and Physical Design Open

Licheng Guo, Yuze Chi, Jason Lau, Linghao Song, Xingyu Tian, et al. · 2023

Computer science Economics

In this article, we propose TAPA, an end-to-end framework that compiles a C++ task-parallel dataflow program into a high-frequency FPGA accelerator. Compared to existing solutions, TAPA has two major advantages. First, TAPA provides a set …

A Cycle-Accurate Soft Error Vulnerability Analysis Framework for FPGA-based Designs Open

Eduardo Luis Rhod, Behnam Ghavami, Zhenman Fang, Lesley Shannon · 2023

Computer science Engineering Physics

Many aerospace and automotive applications use FPGAs in their designs due to their low power and reconfigurability requirements. Meanwhile, such applications also pose a high standard on system reliability, which makes the early-stage reli…

HeatViT: Hardware-Efficient Adaptive Token Pruning for Vision Transformers Open

Peiyan Dong, Mengshu Sun, Alec Lu, Yanyue Xie, Kenneth Liu, et al. · 2022

Computer science

While vision transformers (ViTs) have continuously achieved new milestones in the field of computer vision, their sophisticated network architectures with high computation and memory costs have impeded their deployment on resource-limited …

SuperYOLO: Super Resolution Assisted Object Detection in Multimodal Remote Sensing Imagery Open

Jiaqing Zhang, Jie Lei, Weiying Xie, Zhenman Fang, Yunsong Li, et al. · 2022

Computer science Philosophy

Accurately and timely detecting multiscale small objects that contain tens of pixels from remote sensing images (RSI) remains challenging. Most of the existing solutions primarily design complex deep neural networks to learn strong feature…

TAPA: A Scalable Task-Parallel Dataflow Programming Framework for Modern FPGAs with Co-Optimization of HLS and Physical Design Open

Licheng Guo, Yuze Chi, Jason Lau, Linghao Song, Xingyu Tian, et al. · 2022

Computer science Engineering

In this paper, we propose TAPA, an end-to-end framework that compiles a C++ task-parallel dataflow program into a high-frequency FPGA accelerator. Compared to existing solutions, TAPA has two major advantages. First, TAPA provides a set of…

SASA: A Scalable and Automatic Stencil Acceleration Framework for Optimized Hybrid Spatial and Temporal Parallelism on HBM-based FPGAs Open

Xingyu Tian, Zhifan Ye, Alec Lu, Licheng Guo, Yuze Chi, et al. · 2022

Computer science Biology

Stencil computation is one of the fundamental computing patterns in many application domains such as scientific computing and image processing. While there are promising studies that accelerate stencils on FPGAs, there lacks an automated a…

Auto-ViT-Acc: An FPGA-Aware Automatic Acceleration Framework for Vision Transformer with Mixed-Scheme Quantization Open

Zhengang Li, Mengshu Sun, Alec Lu, Haoyu Ma, Geng Yuan, et al. · 2022

Computer science Engineering

Vision transformers (ViTs) are emerging with significantly improved accuracy in computer vision tasks. However, their complex architecture and enormous computation/storage demand impose urgent needs for new hardware accelerator design meth…

TopSort: A High-Performance Two-Phase Sorting Accelerator Optimized on HBM-based FPGAs Open

Weikang Qiao, Licheng Guo, Zhenman Fang, Mau-Chung Frank Chang, Jason Cong · 2022

Computer science

The emergence of high-bandwidth memory (HBM) brings new opportunities to boost the performance of sorting acceleration on FPGAs, which was conventionally bounded by the available off-chip memory bandwidth. However, it is nontrivial for des…

Introduction to the Special Section on High-level Synthesis for FPGA: Next-generation Technologies and Applications Open

Christian Pilato, Zhenman Fang, Yuko Hara–Azumi, Jihyeon Hwang · 2022

Computer science Engineering

No abstract available.

FitAct: Error Resilient Deep Neural Networks via Fine-Grained Post-Trainable Activation Functions Open

Behnam Ghavami, Mani Sadati, Zhenman Fang, Lesley Shannon · 2021

Computer science Physics

Deep neural networks (DNNs) are increasingly being deployed in safety-critical systems such as personal healthcare devices and self-driving cars. In such DNN-based systems, error resilience is a top priority since faults in DNN inference c…

Stealthy Attack on Algorithmic-Protected DNNs via Smart Bit Flipping Open

Behnam Ghavami, Seyd Movi, Zhenman Fang, Lesley Shannon · 2021

Computer science Chemistry

Recently, deep neural networks (DNNs) have been deployed in safety-critical systems such as autonomous vehicles and medical devices. Shortly after that, the vulnerability of DNNs was revealed by stealthy adversarial examples where crafted…

SeaPlace: Process Variation Aware Placement for Reliable Combinational Circuits against SETs and METs Open

Kiarash Saremi, Hossein Pedram, Behnam Ghavami, Mohsen Raji, Zhenman Fang, et al. · 2021

Computer science Mathematics Engineering

Nowadays nanoscale combinational circuits are facing significant reliability challenges including soft errors and process variations. This paper presents novel process variation-aware placement strategies that include two algorithms to inc…

BDFA: A Blind Data Adversarial Bit-flip Attack on Deep Neural Networks Open

Behnam Ghavami, Mani Sadati, Mohammad Shahidzadeh, Zhenman Fang, Lesley Shannon · 2021

Computer science Sociology

Adversarial bit-flip attack (BFA) on Neural Network weights can result in catastrophic accuracy degradation by flipping a very small number of bits. A major drawback of prior bit flip attack techniques is their reliance on test data. This …

FPGA-based Near Data Processing Platform Selection Using Fast Performance Modeling (WiP Paper) Open

Nazanin Farahpour, Zhenman Fang, Glenn Reinman · 2020

Computer science Physics

With the trend of adopting FPGAs in data centers, various FPGA acceleration platforms have been developed in recent years. Each server could incorporate one or many of these FPGAs at different compute hierarchy levels to match its workload…

Best-Effort FPGA Programming: A Few Steps Can Go a Long Way Open

Jason Cong, Zhenman Fang, Yuchen Hao, Peng Wei, Cody Hao Yu, et al. · 2018

Computer science

FPGA-based heterogeneous architectures provide programmers with the ability to customize their hardware accelerators for flexible acceleration of many workloads. Nonetheless, such advantages come at the cost of sacrificing programmability.…

Revisiting FPGA Acceleration of Molecular Dynamics Simulation with Dynamic Data Flow Behavior in High-Level Synthesis Open

Jason Cong, Zhenman Fang, Hassan Kianinejad, Peng Wei · 2016

Computer science Physics Chemistry

Molecular dynamics (MD) simulation is one of the past decade's most important tools for enabling biology scientists and researchers to explore human health and diseases. However, due to the computation complexity of the MD algorithm, it ta…

ARAPrototyper: Enabling Rapid Prototyping and Evaluation for Accelerator-Rich Architectures Open

Yuting Chen, Jason Cong, Zhenman Fang, Bingjun Xiao, Peipei Zhou · 2016

Computer science Engineering

Compared to conventional general-purpose processors, accelerator-rich architectures (ARAs) can provide orders-of-magnitude performance and energy gains and are emerging as one of the most promising solutions in the age of dark silicon. How…

Programming and Runtime Support to Blaze FPGA Accelerator Deployment at Datacenter Scale Open

Muhuan Huang, Di Wu, Cody Hao Yu, Zhenman Fang, Matteo Interlandi, et al. · 2016

Computer science Engineering

With the end of CPU core scaling due to dark silicon limitations, customized accelerators on FPGAs have gained increased attention in modern datacenters due to their lower power, high performance and energy efficiency. Evidenced by Microso…

A quantitative analysis on microarchitectures of modern CPU-FPGA platforms Open

Young Choi, Jason Cong, Zhenman Fang, Yuchen Hao, Glenn Reinman, et al. · 2016

Computer science

CPU-FPGA heterogeneous acceleration platforms have shown great potential for continued performance and energy efficiency improvement for modern data centers, and have captured great attention from both academia and industry. However, it is…

Zhenman Fang