Mohammad Hossein Sekhavat
Computational Bottlenecks of Training Small-scale Large Language Models
While large language models (LLMs) dominate the AI landscape, small-scale large language models (SLMs) are gaining attention due to consumer demand for cost efficiency. However, there is limited research on the training behavior and…
Duo-LLM: A Framework for Studying Adaptive Computation in Large Language Models
Large language models (LLMs) typically generate outputs token by token using a fixed compute budget, leading to inefficient resource utilization. To address this shortcoming, recent advancements in mixture-of-experts (MoE) models, speculati…
An Efficient and Streaming Audio Visual Active Speaker Detection System
This paper addresses the challenging task of Active Speaker Detection (ASD), where the system must determine in real time whether a person is speaking in a series of video frames. While previous works have made significant str…
CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data
Contrastive learning has emerged as a transformative method for learning effective visual representations by aligning image and text embeddings. However, pairwise similarity computation in the contrastive loss between image and te…
OpenELM: An Efficient Language Model Family with Open Training and Inference Framework
The reproducibility and transparency of large language models are crucial for advancing open research, ensuring the trustworthiness of results, and enabling investigations into data and model biases as well as potential risks. To this end…