Gholamreza Haffari
Towards Inference-time Scaling for Continuous Space Reasoning
Inference-time scaling through multiple sample generation in combination with Process- or Outcome-Reward Model (PRM or ORM) re-ranking has proven effective for text-based reasoning in large language models. This paper investigates whether …
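As background for this abstract, here is a minimal, self-contained sketch of the general paradigm it refers to (generate multiple candidates, then re-rank them with a reward model); it is not the paper's method, and `generate_candidate` / `score_with_reward_model` are hypothetical stand-ins rather than functions from the paper.

```python
# Best-of-N sampling with reward-model re-ranking: the generic
# inference-time-scaling recipe referenced in the abstract above.
import random
from typing import Callable, List


def best_of_n(
    prompt: str,
    generate_candidate: Callable[[str], str],
    score_with_reward_model: Callable[[str, str], float],
    n: int = 8,
) -> str:
    """Sample n candidate solutions and return the one the reward model scores highest."""
    candidates: List[str] = [generate_candidate(prompt) for _ in range(n)]
    scores = [score_with_reward_model(prompt, c) for c in candidates]
    best_index = max(range(n), key=lambda i: scores[i])
    return candidates[best_index]


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs end to end. In practice the generator is an LLM,
    # and the scorer is an ORM (scores the final answer) or a PRM (scores reasoning steps).
    demo_generate = lambda p: f"answer-{random.randint(0, 100)}"
    demo_score = lambda p, c: random.random()
    print(best_of_n("2 + 2 = ?", demo_generate, demo_score))
```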
Beyond Imitation: Recovering Dense Rewards from Demonstrations
Conventionally, supervised fine-tuning (SFT) is treated as a simple imitation learning process that only trains a policy to imitate expert behavior on demonstration datasets. In this work, we challenge this view by establishing a fundament…
G-reasoner: Foundation Models for Unified Reasoning over Graph-structured Knowledge
Large language models (LLMs) excel at complex reasoning but remain limited by static and incomplete parametric knowledge. Retrieval-augmented generation (RAG) mitigates this by incorporating external knowledge, yet existing RAGs struggle w…
Physics-Grounded Motion Forecasting via Equation Discovery for Trajectory-Guided Image-to-Video Generation
Recent advances in diffusion-based and autoregressive video generation models have achieved remarkable visual realism. However, these models typically lack accurate physical alignment, failing to replicate real-world dynamics in object mot…
Table-r1: Self-Supervised and Reinforcement Learning for Program-Based Table Reasoning in Small Language Models
Table reasoning (TR) requires structured reasoning over semi-structured tabular data and remains challenging, particularly for small language models (SLMs, e.g., LLaMA-8B) due to their limited capacity compared to large LMs (LLMs, e.g., GP…
Continual Speech Learning with Fused Speech Features
Rapid growth in speech data demands adaptive models, as traditional static methods fail to keep pace with dynamic and diverse speech information. We introduce continuous speech learning, a new setup aimed at bridging the adaptation ga…
Reshaping Representation Space to Balance the Safety and Over-rejection in Large Audio Language Models
Large Audio Language Models (LALMs) have extended the capabilities of Large Language Models (LLMs) by enabling audio-based human interactions. However, recent research has revealed that LALMs remain vulnerable to harmful queries due to ins…
RIDE: Enhancing Large Language Model Alignment through Restyled In-Context Learning Demonstration Exemplars
Alignment tuning is crucial for ensuring large language models (LLMs) behave ethically and helpfully. Current alignment approaches require high-quality annotations and significant training resources. This paper proposes a low-cost, tuning-…
ACCESS: A Benchmark for Abstract Causal Event Discovery and Reasoning
Identifying cause-and-effect relationships is critical to understanding real-world dynamics and ultimately causal reasoning. Existing methods for identifying event causality in NLP, including those based on Large Language Models (LLMs), ex…
Unbiased Sliced Wasserstein Kernels for High-Quality Audio Captioning
Teacher-forcing training for audio captioning usually leads to exposure bias due to the mismatch between training and inference. Prior work proposes the contrastive method to deal with caption degeneration. However, the contrastive method ignores the …
GFM-RAG: Graph Foundation Model for Retrieval Augmented Generation
Retrieval-augmented generation (RAG) has proven effective in integrating knowledge into large language models (LLMs). However, conventional RAGs struggle to capture complex relationships between pieces of knowledge, limiting their performa…
SCAR: Data Selection via Style Consistency-Aware Response Ranking for Efficient Instruction-Tuning of Large Language Models
CultureInstruct: Curating Multi-Cultural Instructions at Scale
IRIS: An Iterative and Integrated Framework for Verifiable Causal Discovery in the Absence of Tabular Data
Conversational SimulMT: Efficient Simultaneous Translation with Large Language Models
Continual Learning of Large Language Models
SurveyPilot: an Agentic Framework for Automated Human Opinion Collection from Social Media
Zero-Shot Privacy-Aware Text Rewriting via Iterative Tree Search
NAP2: A Benchmark for Naturalness and Privacy-Preserving Text Rewriting by Learning from Human
SG-FSM: A Self-Guiding Zero-Shot Prompting Paradigm for Multi-Hop Question Answering Based on Finite State Machine
(Perhaps) Beyond Human Translation: Harnessing Multi-Agent Collaboration for Translating Ultra-Long Literary Texts
Literary translation remains one of the most challenging frontiers in machine translation due to the complexity of capturing figurative language, cultural nuances, and unique stylistic elements. In this work, we introduce TransAgents, a n…
Discrete Minds in a Continuous World: Do Language Models Know Time Passes?
Extending LLMs to New Languages: A Case Study of Llama and Persian Adaptation
Large language models (LLMs) have made great progress in classification and text generation tasks. However, they are mainly trained on English data and often struggle with low-resource languages. In this study, we explore adding a new lang…
An Empirical Analysis on Spatial Reasoning Capabilities of Large Multimodal Models
Large Multimodal Models (LMMs) have achieved strong performance across a range of vision and language tasks. However, their spatial reasoning capabilities are under-investigated. In this paper, we construct a novel VQA dataset, Spatial-MM,…
Audio Is the Achilles' Heel: Red Teaming Audio Large Multimodal Models
Large Multimodal Models (LMMs) have demonstrated the ability to interact with humans under real-world conditions by combining Large Language Models (LLMs) and modality encoders to align multimodal information (visual and auditory) with tex…
The Best of Both Worlds: Bridging Quality and Diversity in Data Selection with Bipartite Graph
The performance of large language models (LLMs) is strongly influenced by the quality and diversity of data used during supervised fine-tuning (SFT). However, current data selection methods often prioritize one aspect over the other, resul…
Graph-constrained Reasoning: Faithful Reasoning on Knowledge Graphs with Large Language Models
Large language models (LLMs) have demonstrated impressive reasoning abilities, but they still struggle with faithful reasoning due to knowledge gaps and hallucinations. To address these issues, knowledge graphs (KGs) have been utilized to …