Praneeth Kacham
PolarQuant: Quantizing KV Caches with Polar Transformation
Large language models (LLMs) require significant memory to store Key-Value (KV) embeddings in their KV cache, especially when handling long-range contexts. Quantization of these KV embeddings is a common technique to reduce memory consumpt…
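The abstract above is truncated, so the exact scheme is not shown here. As an illustration of the general idea of quantizing in polar rather than Cartesian coordinates, the sketch below pairs up consecutive coordinates of an embedding and quantizes each pair's radius and angle; the pairing scheme, bit widths, and uniform quantizers are assumptions for illustration, not the paper's method.

```python
import numpy as np

def polar_quantize(kv, angle_bits=4, radius_bits=4):
    """Illustrative sketch: quantize consecutive coordinate pairs of an
    embedding in polar form (radius, angle). Assumes even dimension.
    Pairing and bit allocation are hypothetical, not PolarQuant's exact scheme."""
    pairs = kv.reshape(-1, 2)
    r = np.linalg.norm(pairs, axis=1)
    theta = np.arctan2(pairs[:, 1], pairs[:, 0])          # angles in (-pi, pi]
    # Uniformly quantize the angle over (-pi, pi].
    levels = 2 ** angle_bits
    q = np.round((theta + np.pi) / (2 * np.pi) * (levels - 1))
    theta_hat = q / (levels - 1) * 2 * np.pi - np.pi
    # Uniformly quantize the radius over [0, max radius].
    r_levels = 2 ** radius_bits
    scale = r.max() / (r_levels - 1) if r.max() > 0 else 1.0
    r_hat = np.round(r / scale) * scale
    out = np.stack([r_hat * np.cos(theta_hat), r_hat * np.sin(theta_hat)], axis=1)
    return out.reshape(kv.shape)
```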
Approximating the Top Eigenvector in Random Order Streams
When rows of an $n \times d$ matrix $A$ are given in a stream, we study algorithms for approximating the top eigenvector of the matrix ${A}^TA$ (equivalently, the top right singular vector of $A$). We consider worst case inputs $A$ but ass…
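As a point of reference for the problem setup, the classical one-pass baseline for this task is Oja's update, which processes streamed rows in any order; this is a generic streaming heuristic, not the paper's algorithm.

```python
import numpy as np

def oja_top_eigvec(rows, eta=0.1):
    """Oja's rule: one pass over the rows of A, approximating the top
    eigenvector of A^T A. A standard baseline, not the paper's method."""
    rng = np.random.default_rng(0)
    v = None
    for a in rows:
        if v is None:
            v = rng.standard_normal(a.shape[0])
            v /= np.linalg.norm(v)
        v = v + eta * np.dot(a, v) * a   # step toward the dominant direction
        v /= np.linalg.norm(v)
    return v
```

On a stream whose rows concentrate along one direction, the iterate aligns with that direction regardless of the order in which the rows arrive.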
LevAttention: Time, Space, and Streaming Efficient Algorithm for Heavy Attentions
A central problem related to transformers can be stated as follows: given two $n \times d$ matrices $Q$ and $K$, and a non-negative function $f$, define the matrix $A$ as follows: (1) apply the function $f$ to each entry of the $n \times n…
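A quantity closely tied to identifying keys that can receive large attention weight is the leverage score of the rows of $K$; the helper below computes these scores, while the way they would be thresholded or used downstream is left out.

```python
import numpy as np

def leverage_scores(K):
    """Row leverage scores of K: tau_i = k_i^T (K^T K)^+ k_i.
    Each tau_i lies in [0, 1] and the scores sum to rank(K)."""
    G = np.linalg.pinv(K.T @ K)
    return np.einsum('ij,jk,ik->i', K, G, K)
```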
Optimal Communication Bounds for Classic Functions in the Coordinator Model and Beyond
In the coordinator model of communication with $s$ servers, given an arbitrary non-negative function $f$, we study the problem of approximating the sum $\sum_{i \in [n]} f(x_i)$ up to a $1 \pm \varepsilon$ factor. Here the vector $x \in \mathbb{R}^n$ is defined to be $x = x^{(1)} + \cdots +$ …
High-Dimensional Geometric Streaming for Nearly Low Rank Data
We study streaming algorithms for the $\ell_p$ subspace approximation problem. Given points $a_1, \ldots, a_n$ as an insertion-only stream and a rank parameter $k$, the $\ell_p$ subspace approximation problem is to find a $k$-dimensional s…
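For orientation, the $p = 2$ special case of $\ell_p$ subspace approximation is solved exactly offline by the top-$k$ right singular vectors (PCA); the snippet below shows that baseline, with the streaming aspect and general $p$ left to the paper.

```python
import numpy as np

def l2_subspace_approx(A, k):
    """p = 2 special case: the best k-dimensional subspace is spanned by
    the top-k right singular vectors. An offline baseline only."""
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    V = Vt[:k].T                                    # orthonormal basis, shape (d, k)
    cost = np.linalg.norm(A - A @ V @ V.T, 'fro') ** 2   # sum of squared distances
    return V, cost
```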
Optimal Communication for Classic Functions in the Coordinator Model and Beyond
In the coordinator model of communication with $s$ servers, given an arbitrary non-negative function $f$, we study the problem of approximating the sum $\sum_{i \in [n]}f(x_i)$ up to a $1 \pm \varepsilon$ factor. Here the vector $x \in R^n…
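To make the model concrete, the snippet below works out the special case $f(x) = x^2$ (the second moment) with an AMS-style random-sign sketch: each server compresses its local vector with shared randomness, and because the sketch is linear, the coordinator can sum the sketches to estimate $\|x\|_2^2$ for the aggregate $x$. This illustrates the communication model, not the paper's protocol for general $f$.

```python
import numpy as np

def f2_coordinator_estimate(servers, reps=2000, seed=1):
    """Estimate sum_i x_i^2 where x = x^(1) + ... + x^(s) is split across
    servers. Each server sends `reps` numbers (its sketched vector);
    linearity means the summed sketches equal the sketch of x."""
    n = servers[0].shape[0]
    rng = np.random.default_rng(seed)
    signs = rng.choice([-1.0, 1.0], size=(reps, n))   # shared randomness
    coord = sum(signs @ v for v in servers)           # = signs @ x, by linearity
    return np.mean(coord ** 2)                        # unbiased for ||x||_2^2
```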
Faster Algorithms for Schatten-p Low Rank Approximation
We study algorithms for the Schatten-p Low Rank Approximation (LRA) problem. First, we show that by using fast rectangular matrix multiplication algorithms and different block sizes, we can improve the running time of the algorithms in the…
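For reference, the Schatten-$p$ norm that defines the approximation error here is simply the $\ell_p$ norm of the singular values; evaluating it directly costs a full SVD, which is what fast LRA algorithms avoid.

```python
import numpy as np

def schatten_norm(A, p):
    """Schatten-p norm: l_p norm of the singular values of A.
    p = 2 recovers the Frobenius norm."""
    s = np.linalg.svd(A, compute_uv=False)
    return np.sum(s ** p) ** (1.0 / p)
```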
Lower Bounds on Adaptive Sensing for Matrix Recovery
We study lower bounds on adaptive sensing algorithms for recovering low rank matrices using linear measurements. Given an $n \times n$ matrix $A$, a general linear measurement $S(A)$, for an $n \times n$ matrix $S$, is just the inner produ…
Pseudorandom Hashing for Space-bounded Computation with Applications in Streaming
We revisit Nisan's classical pseudorandom generator (PRG) for space-bounded computation (STOC 1990) and its applications in streaming algorithms. We describe a new generator, HashPRG, that can be thought of as a symmetric version of Nisan'…
PolySketchFormer: Fast Transformers via Sketching Polynomial Kernels
The quadratic time and memory complexity inherent to self-attention mechanisms, with respect to sequence length, presents a critical computational bottleneck in the training and deployment of large-scale Transformer-based language models. …
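The reason polynomial attention escapes the quadratic bottleneck is that $(q \cdot k)^p$ is an inner product of explicit tensored features, so the $n \times n$ attention matrix never needs to be formed. The sketch below shows the exact degree-2, non-causal version of this linearization; PolySketchFormer's contribution is replacing the $d^2$-dimensional features with small sketches (and handling causal masking), which is not shown here.

```python
import numpy as np

def poly2_attention(Q, K, V):
    """Degree-2 polynomial attention, sim(q, k) = (q.k)^2, computed in
    time linear in sequence length n via features phi(v) = v (x) v,
    using <phi(q), phi(k)> = (q.k)^2. Exact and non-causal."""
    n, d = Q.shape
    phi_q = np.einsum('ni,nj->nij', Q, Q).reshape(n, d * d)
    phi_k = np.einsum('ni,nj->nij', K, K).reshape(n, d * d)
    S = phi_k.T @ V            # (d^2, d_v); no n x n matrix is ever formed
    z = phi_k.sum(axis=0)      # normalizer features
    return (phi_q @ S) / (phi_q @ z)[:, None]
```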
Sub-quadratic Algorithms for Kernel Matrices via Kernel Density Estimation
Kernel matrices, as well as weighted graphs represented by them, are ubiquitous objects in machine learning, statistics and other related fields. The main drawback of using kernel methods (learning and inference using kernel matrices) is e…
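To make the cost concrete, the naive kernel density estimate below takes $O(n)$ time per query, so answering $n$ queries against $n$ points costs $\Theta(n^2)$, which is exactly the quadratic barrier that sub-quadratic KDE-based methods target.

```python
import numpy as np

def gaussian_kde_naive(X, q, bandwidth=1.0):
    """Naive Gaussian KDE at query q over points X (rows): O(n) per query."""
    d2 = np.sum((X - q) ** 2, axis=1)
    return float(np.mean(np.exp(-d2 / (2.0 * bandwidth ** 2))))
```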
Sketching Algorithms and Lower Bounds for Ridge Regression
We give a sketching-based iterative algorithm that computes a $1+\varepsilon$ approximate solution for the ridge regression problem $\min_x \|Ax-b\|_2^2 + \lambda\|x\|_2^2$ where $A \in R^{n \times d}$ with $d \ge n$. Our algorithm, for a constan…
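In the $d \ge n$ regime studied here, the exact solution has a cheap dual (kernel) closed form, $x^* = A^T (A A^T + \lambda I)^{-1} b$, which solves an $n \times n$ system instead of a $d \times d$ one; this is the direct baseline, not the paper's sketching-based iterative algorithm.

```python
import numpy as np

def ridge_dual(A, b, lam):
    """Exact ridge solution via the dual identity
    (A^T A + lam I)^{-1} A^T = A^T (A A^T + lam I)^{-1},
    cheaper than the primal normal equations when d >= n."""
    n = A.shape[0]
    return A.T @ np.linalg.solve(A @ A.T + lam * np.eye(n), b)
```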
Near-Optimal Algorithms for Linear Algebra in the Current Matrix Multiplication Time
In the numerical linear algebra community, it was suggested that to obtain nearly optimal bounds for various problems such as rank computation, finding a maximal linearly independent subset of columns (a basis), regression, or low-rank app…
Reduced-Rank Regression with Operator Norm Error
A common data analysis task is the reduced-rank regression problem: $$\min_{\textrm{rank-}k \ X} \|AX-B\|,$$ where $A \in \mathbb{R}^{n \times c}$ and $B \in \mathbb{R}^{n \times d}$ are given large matrices and $\|\cdot\|$ is some norm. H…
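For contrast with the operator-norm version studied in the paper, the Frobenius-norm variant of this problem has a classical closed form: project $B$ onto the column space of $A$, truncate to rank $k$, and solve. The snippet below implements that textbook baseline.

```python
import numpy as np

def rrr_frobenius(A, B, k):
    """Reduced-rank regression, Frobenius norm: argmin over rank-k X of
    ||AX - B||_F. Classical solution via projection + SVD truncation."""
    pinvA = np.linalg.pinv(A)
    P = A @ pinvA                                   # projector onto col(A)
    U, s, Vt = np.linalg.svd(P @ B, full_matrices=False)
    Bk = (U[:, :k] * s[:k]) @ Vt[:k]                # best rank-k part of P @ B
    return pinvA @ Bk
```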
Dimensionality Reduction for Sum-of-Distances Metric
We give a dimensionality reduction procedure to approximate the sum of distances of a given set of $n$ points in $R^d$ to any "shape" that lies in a $k$-dimensional subspace. Here, by "shape" we mean any set of points in $R^d$. Our algorit…