Shaden Smith
HEAT: A Highly Efficient and Affordable Training System for Collaborative Filtering Based Recommendation on CPUs
Collaborative filtering (CF) has been proven to be one of the most effective techniques for recommendation. Among all CF approaches, SimpleX is the state-of-the-art method that adopts a novel loss function and a proper number of negative s…
DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale
The past several years have witnessed the success of transformer-based models, and their scale and application scenarios continue to grow aggressively. The current landscape of transformer models is increasingly diverse: the model size var…
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model
Pretrained general-purpose language models can achieve state-of-the-art accuracies in various natural language processing domains by adapting to downstream tasks via zero-shot, few-shot and fine-tuning techniques. Because of their success,…
PIUMA: Programmable Integrated Unified Memory Architecture
High performance large scale graph analytics are essential to timely analyze relationships in big data sets. Conventional processor architectures suffer from inefficient resource usage and bad scaling on those workloads. To enable efficien…
Algorithms for Large-Scale Sparse Tensor Factorization
University of Minnesota Ph.D. dissertation. April 2019. Major: Computer Science. Advisor: George Karypis. 1 computer file (PDF); xiv, 153 pages.
Streaming Tensor Factorization for Infinite Data Sources
Sparse tensor factorization is a popular tool in multi-way data analysis and is used in applications such as cybersecurity, recommender systems, and social network analysis. In many of these applications, the tensor is not known a priori a…
Blocking Optimization Techniques for Sparse Tensor Computation
We present a detailed analysis of the sparse matricized tensor times Khatri-Rao product (MTTKRP) kernel that is the key bottleneck in various sparse tensor computations. By using the well-known roofline model and carefully instrumenting th…
Scalable Label Propagation for Multi-relational Learning on the Tensor Product of Graphs
Multi-relational learning on knowledge graphs infers high-order relations among the entities across the graphs. This learning task can be solved by label propagation on the tensor product of the knowledge graphs to learn the high-order rel…
Scalable Label Propagation for Multi-relational Learning on Tensor Product Graph
Label propagation on the tensor product of multiple graphs can infer multi-relations among the entities across the graphs by learning labels in a tensor. However, the tensor formulation is only empirically scalable up to three graphs due t…
A Medium-Grained Algorithm for Distributed Sparse Tensor Factorization
Modeling multi-way data can be accomplished using tensors, which are data structures indexed along three or more dimensions. Tensors are increasingly used to analyze extremely large and sparse multi-way datasets in life sciences, eng…
Big Data and Recommender Systems
Recommender systems are ubiquitous in today's marketplace and have great commercial importance, as evidenced by the large number of companies that sell recommender systems solutions. Successful recommender systems use past product purchase…
DMS: Distributed Sparse Tensor Factorization with Alternating Least Squares
Tensors are data structures indexed along three or more dimensions. Tensors have found increasing use in domains such as data mining and recommender systems where dimensions can have enormous length and are resultingly very sparse. The can…