Tunhou Zhang
AutoRAC: Automated Processing-in-Memory Accelerator Design for Recommender Systems
The performance bottleneck of deep-learning-based recommender systems resides in their backbone Deep Neural Networks. By integrating Processing-In-Memory (PIM) architectures, researchers can reduce data movement and enhance energy efficien…
Towards Automated Model Design on Recommender Systems
The increasing popularity of deep learning models has created new opportunities for developing artificial intelligence–based recommender systems. Designing recommender systems using deep neural networks (DNNs) requires careful architecture…
Enhancing Performance and Scalability of Large-Scale Recommendation Systems with Jagged Flash Attention
The integration of hardware accelerators has significantly advanced the capabilities of modern recommendation systems, enabling the exploration of complex ranking paradigms previously deemed impractical. However, the GPU-based computati…
DistDNAS: Search Efficient Feature Interactions within 2 Hours
Search efficiency and serving efficiency are two major axes in building feature interactions and expediting the model development process in recommender systems. On large-scale benchmarks, searching for the optimal feature interaction desi…
Farthest Greedy Path Sampling for Two-shot Recommender Search
Weight-sharing Neural Architecture Search (WS-NAS) provides an efficient mechanism for developing end-to-end deep recommender models. However, in complex search spaces, distinguishing between superior and inferior architectures (or paths) …
LISSNAS: Locality-based Iterative Search Space Shrinkage for Neural Architecture Search
Search spaces hallmark the advancement of Neural Architecture Search (NAS). Large and complex search spaces with versatile building operators and structures provide more opportunities to brew promising architectures, yet pose severe challe…
PIDS: Joint Point Interaction-Dimension Search for 3D Point Cloud
The interaction and dimension of points are two important axes in designing point operators to serve hierarchical 3D models. Yet, these two axes are heterogeneous and challenging to fully explore. Existing works craft point operators under …
NASRec: Weight Sharing Neural Architecture Search for Recommender Systems
The rise of deep neural networks offers new opportunities in optimizing recommender systems. However, optimizing recommender systems using deep neural networks requires delicate architecture fabrication. We propose NASRec, a paradigm that …
Towards Collaborative Intelligence: Routability Estimation based on Decentralized Private Data
Applying machine learning (ML) in the design flow is a popular trend in EDA, with applications ranging from design quality prediction to optimization. Despite its promise, which has been demonstrated in both academic research and industrial…
NASGEM: Neural Architecture Search via Graph Embedding Method
Neural Architecture Search (NAS) automates and advances the design of neural networks. Estimator-based NAS has been proposed recently to model the relationship between architectures and their performance to enable scalable and flexible sea…
Automatic Routability Predictor Development Using Neural Architecture Search
The rise of machine learning technology inspires a boom of its applications in electronic design automation (EDA) and helps improve the degree of automation in chip designs. However, manually crafted machine learning models require extensi…
AutoShrink: A Topology-Aware NAS for Discovering Efficient Neural Architecture
Resource is an important constraint when deploying Deep Neural Networks (DNNs) on mobile and edge devices. Existing works commonly adopt the cell-based search approach, which limits the flexibility of network patterns in learned cell struc…
SwiftNet: Using Graph Propagation as Meta-knowledge to Search Highly Representative Neural Architectures
Designing neural architectures for edge devices is subject to constraints of accuracy, inference latency, and computational cost. Traditionally, researchers manually craft deep neural networks to meet the needs of mobile devices. Neural Ar…