Mohammad Hossein Sekhavat
Computational Bottlenecks of Training Small-scale Large Language Models
While large language models (LLMs) dominate the AI landscape, small-scale large language models (SLMs) are gaining attention due to consumer demand for cost efficiency. However, there is limited research on the training behavior and…
Duo-LLM: A Framework for Studying Adaptive Computation in Large Language Models
Large language models (LLMs) typically generate outputs token by token using a fixed compute budget, leading to inefficient resource utilization. To address this shortcoming, recent advancements in mixture-of-experts (MoE) models, speculati…
An Efficient and Streaming Audio Visual Active Speaker Detection System
This paper addresses the challenging task of Active Speaker Detection (ASD), where the system must determine in real time whether a person is speaking in a series of video frames. While previous works have made significant str…
CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data
Contrastive learning has emerged as a transformative method for learning effective visual representations by aligning image and text embeddings. However, pairwise similarity computation in the contrastive loss between image and te…
OpenELM: An Efficient Language Model Family with Open Training and Inference Framework
The reproducibility and transparency of large language models are crucial for advancing open research, ensuring the trustworthiness of results, and enabling investigations into data and model biases as well as potential risks. To this end…