Hiroaki Yamagiwa
Revealing Language Model Trajectories via Kullback-Leibler Divergence
A recently proposed method enables efficient estimation of the KL divergence between language models, including models with different architectures, by assigning coordinates based on log-likelihood vectors. To better understand the behavio…
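A minimal sketch of the coordinate idea, with toy numbers throughout: successive training checkpoints are scored on a fixed text set, and distances between their (centered) log-likelihood vectors trace the trajectory. The scaling of the distance-to-KL relation is taken loosely from the abstract, not from the paper's exact estimator.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical log-likelihood vectors for successive training checkpoints
# of one model on a fixed text set (6 checkpoints x 2000 texts).
ckpts = np.cumsum(rng.normal(scale=0.3, size=(6, 2000)), axis=0)

# Center each checkpoint's vector so only relative text preferences remain.
phi = ckpts - ckpts.mean(axis=1, keepdims=True)

# Step-to-step squared distances trace the model's path in the
# KL-based coordinate space the abstract describes (up to scaling).
steps = np.sum(np.diff(phi, axis=0) ** 2, axis=1) / phi.shape[1]
print("trajectory step lengths:", steps.round(4))
```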
Likelihood Variance as Text Importance for Resampling Texts to Map Language Models
We address the computational cost of constructing a model map, which embeds diverse language models into a common space for comparison via KL divergence. The map relies on log-likelihoods over a large text set, making the cost proportional…
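A rough sketch of variance-based resampling, assuming a hypothetical log-likelihood matrix; the paper's actual importance estimator and any bias correction may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical log-likelihood matrix: 100 models x 10,000 texts.
L = rng.normal(size=(100, 10_000))

# Per-text variance across models as an importance score: texts on which
# models disagree carry most of the information for the map.
var = L.var(axis=0)
p = var / var.sum()

# Resample a much smaller text subset with probability proportional
# to variance, cutting the cost of building the model map.
idx = rng.choice(L.shape[1], size=1_000, replace=False, p=p)
L_small = L[:, idx]
print("kept texts:", L_small.shape[1])
```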
Mapping 1,000+ Language Models via the Log-Likelihood Vector
To compare autoregressive language models at scale, we propose using log-likelihood vectors computed on a predefined text set as model features. This approach has a solid theoretical basis: when treated as model coordinates, their squared …
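A minimal sketch of the feature construction, with made-up data; the double-centering step is an assumption on our part rather than the paper's documented preprocessing, and the squared-distance-to-KL relation is used only as stated in the abstract, up to scaling.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Hypothetical log-likelihood vectors: 1,000 models x 5,000 texts.
L = rng.normal(size=(1_000, 5_000))

# Double-centering (per model, then per text) as an assumed normalization.
L = L - L.mean(axis=1, keepdims=True)
L = L - L.mean(axis=0, keepdims=True)

# Squared Euclidean distance between rows approximates the divergence
# between the corresponding models (per the abstract, up to scaling).
d2 = np.sum((L[0] - L[1]) ** 2) / L.shape[1]
print(f"approx. divergence between models 0 and 1: {d2:.4f}")

# A 2-D "model map" of all 1,000 models for visualization.
coords = PCA(n_components=2).fit_transform(L)
print("map coordinates shape:", coords.shape)
```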
Quantifying Lexical Semantic Shift via Unbalanced Optimal Transport
Lexical semantic change detection aims to identify shifts in word meanings over time. While existing methods using embeddings from a diachronic corpus pair estimate the degree of change for target words, they offer limited insight into cha…
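A sketch of the unbalanced-OT idea using the POT library; the embeddings, cost normalization, and regularization values are illustrative assumptions, not the paper's configuration.

```python
import numpy as np
import ot  # POT: Python Optimal Transport

rng = np.random.default_rng(0)

# Hypothetical contextualized embeddings of one target word, sampled
# from an older corpus and a newer one.
old_embs = rng.normal(size=(40, 16))
new_embs = rng.normal(size=(60, 16))

# Uniform mass on each occurrence.
a = np.full(len(old_embs), 1.0 / len(old_embs))
b = np.full(len(new_embs), 1.0 / len(new_embs))

# Squared Euclidean costs, rescaled for numerical stability.
M = ot.dist(old_embs, new_embs)
M /= M.max()

# Unbalanced OT: the marginal-relaxation strength reg_m lets mass appear
# or vanish, so senses present in only one period need not be matched.
plan = ot.unbalanced.sinkhorn_unbalanced(a, b, M, reg=0.05, reg_m=1.0)

# Transported cost as a degree-of-shift score for this word.
print(f"shift score: {np.sum(plan * M):.4f}")
```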
Understanding Higher-Order Correlations Among Semantic Components in Embeddings
Independent Component Analysis (ICA) offers interpretable semantic components of embeddings. While ICA theory assumes that embeddings can be linearly decomposed into independent components, real-world data often do not satisfy this assumpt…
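A toy illustration of the phenomenon the abstract points at, under an assumed violation of independence (components sharing a volatility factor); the probe via squared-component correlations is one simple choice, not necessarily the paper's measure.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)

# Toy sources violating the ICA assumption: uncorrelated, but sharing a
# common volatility factor, hence not truly independent.
scale = np.sqrt(rng.exponential(size=(5_000, 1)))
S_true = rng.normal(size=(5_000, 50)) * scale
X = S_true @ rng.normal(size=(50, 50))  # mixed "embeddings"

S = FastICA(n_components=50, whiten="unit-variance",
            random_state=0).fit_transform(X)

off = ~np.eye(50, dtype=bool)
# Linear correlations are removed by whitening/ICA ...
lin = np.corrcoef(S.T)
# ... but higher-order dependence survives; correlations between squared
# ("energy") components make it visible.
hio = np.corrcoef((S ** 2).T)
print(f"max |linear corr|: {np.abs(lin[off]).max():.3f}")
print(f"max |energy corr|: {np.abs(hio[off]).max():.3f}")
```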
Norm of Mean Contextualized Embeddings Determines their Variance
Contextualized embeddings vary by context, even for the same token, and form a distribution in the embedding space. To analyze this distribution, we focus on the norm of the mean embedding and the variance of the embeddings. In this study,…
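The quantities in the abstract are tied together by the standard bias-variance-style identity E[||x||^2] = ||E[x]||^2 + Var(x), which a few lines of NumPy verify on synthetic data (the embeddings here are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical contextualized embeddings of one token across 200 contexts.
X = rng.normal(loc=0.5, scale=1.2, size=(200, 64))

mean_emb = X.mean(axis=0)
mean_sq_norm = np.mean(np.sum(X ** 2, axis=1))            # E[||x||^2]
variance = np.mean(np.sum((X - mean_emb) ** 2, axis=1))   # total variance

# Identity: E[||x||^2] = ||E[x]||^2 + Var(x), exactly (up to float error).
lhs = mean_sq_norm
rhs = np.sum(mean_emb ** 2) + variance
print(f"E[||x||^2] = {lhs:.6f},  ||E[x]||^2 + Var = {rhs:.6f}")
```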
Shimo Lab at "Discharge Me!": Discharge Summarization by Prompt-Driven Concatenation of Electronic Health Record Sections
In this paper, we present our approach to the shared task "Discharge Me!" at the BioNLP Workshop 2024. The primary goal of this task is to reduce the time and effort clinicians spend on writing detailed notes in the electronic health recor…
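A minimal sketch of prompt-driven concatenation in the spirit of the title; the section names and prompt wording below are illustrative stand-ins, not the team's actual template.

```python
# Hypothetical EHR sections extracted upstream.
sections = {
    "Chief Complaint": "Chest pain for 2 days.",
    "History of Present Illness": "58-year-old with exertional chest pain...",
    "Hospital Course": "Admitted, treated, and stabilized...",
}

# Concatenate the selected sections behind a task instruction to form
# the prompt given to the summarization model.
instruction = "Write the discharge summary based on the sections below."
prompt = instruction + "\n\n" + "\n\n".join(
    f"## {name}\n{text}" for name, text in sections.items()
)
print(prompt)
```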
Revisiting Cosine Similarity via Normalized ICA-transformed Embeddings
Cosine similarity is widely used to measure the similarity between two embeddings, while interpretations based on angle and correlation coefficient are common. In this study, we focus on the interpretable axes of embeddings transformed by …
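A sketch of the decomposition, with random stand-in embeddings: after ICA and L2 normalization, cosine similarity is a plain dot product and therefore splits into one additive contribution per axis. The data and dimensionality are assumptions.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)

# Hypothetical word embeddings (2,000 words x 64 dims, non-Gaussian).
X = rng.laplace(size=(2_000, 64))

# ICA-transform, then L2-normalize each word vector.
S = FastICA(n_components=64, whiten="unit-variance",
            random_state=0).fit_transform(X)
S /= np.linalg.norm(S, axis=1, keepdims=True)

# For unit vectors, cosine similarity is the dot product, so it
# decomposes into per-axis contributions along the ICA axes.
i, j = 0, 1
per_axis = S[i] * S[j]
print(f"cosine(word {i}, word {j}): {per_axis.sum():.4f}")
print("top contributing axes:", np.argsort(-per_axis)[:5])
```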
Predicting drug–gene relations via analogy tasks with word embeddings
Natural language processing (NLP) is utilized in a wide range of fields, where words in text are typically transformed into feature vectors called embeddings. BioConceptVec is a specific example of embeddings tailored for biology, trained …
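A toy version of the analogy setup, with random vectors standing in for BioConceptVec embeddings (the concept names are hypothetical): given a known drug-gene pair and a query drug, candidate genes are ranked by similarity to the analogy vector.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for pretrained BioConceptVec concept embeddings.
vocab = ["drug_A", "gene_A", "drug_B", "gene_B", "gene_C"]
emb = {w: rng.normal(size=100) for w in vocab}
for w in emb:
    emb[w] /= np.linalg.norm(emb[w])

# Analogy: drug_A : gene_A = drug_B : ?
query = emb["gene_A"] - emb["drug_A"] + emb["drug_B"]
query /= np.linalg.norm(query)

# Rank candidate genes by cosine similarity to the analogy vector.
candidates = ["gene_B", "gene_C"]
scores = {g: float(query @ emb[g]) for g in candidates}
print(sorted(scores.items(), key=lambda kv: -kv[1]))
```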
Axis Tour: Word Tour Determines the Order of Axes in ICA-transformed Embeddings
Word embedding is one of the most important components in natural language processing, but interpreting high-dimensional embeddings remains a challenging problem. To address this problem, Independent Component Analysis (ICA) is identified …
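A loose sketch of ordering axes by a tour, assuming each ICA axis is summarized by one feature vector (e.g., an average over its top words' embeddings) and using a greedy nearest-neighbor pass as a cheap stand-in for the exact TSP formulation of Word Tour.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-axis feature vectors (30 axes x 64 dims), normalized.
axis_vecs = rng.normal(size=(30, 64))
axis_vecs /= np.linalg.norm(axis_vecs, axis=1, keepdims=True)

# Greedy nearest-neighbor tour: each step visits the most similar
# remaining axis, so semantically close axes end up adjacent.
order = [0]
unvisited = set(range(1, len(axis_vecs)))
while unvisited:
    last = axis_vecs[order[-1]]
    nxt = max(unvisited, key=lambda k: float(last @ axis_vecs[k]))
    unvisited.remove(nxt)
    order.append(nxt)
print("axis order:", order)
```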
Zero-Shot Edge Detection with SCESAME: Spectral Clustering-based Ensemble for Segment Anything Model Estimation
This paper proposes a novel zero-shot edge detection method, SCESAME, which stands for Spectral Clustering-based Ensemble for Segment Anything Model Estimation, built on the recently proposed Segment Anything Model (SAM). SAM is a foundation …
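A very loose sketch of the ensemble idea, with random arrays standing in for SAM's automatic mask generator output; the affinity, cluster count, and edge extraction below are illustrative choices, not the paper's pipeline.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

rng = np.random.default_rng(0)

# Hypothetical binary masks for one image (12 masks x 64 x 64 pixels).
masks = rng.random((12, 64, 64)) > 0.7

# Affinity between masks via IoU of their pixel sets.
flat = masks.reshape(len(masks), -1).astype(float)
inter = flat @ flat.T
union = flat.sum(1)[:, None] + flat.sum(1)[None, :] - inter
affinity = inter / np.maximum(union, 1.0)

# Spectral clustering groups overlapping masks into ensembled regions.
labels = SpectralClustering(n_clusters=4, affinity="precomputed",
                            random_state=0).fit_predict(affinity)

# Merge each cluster into a region map and read edges off region borders.
region = np.zeros(masks.shape[1:], dtype=int)
for c in range(4):
    region[masks[labels == c].any(axis=0)] = c + 1
edges = (np.abs(np.diff(region, axis=0, prepend=0)) > 0) | \
        (np.abs(np.diff(region, axis=1, prepend=0)) > 0)
print("edge pixels:", int(edges.sum()))
```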
Discovering Universal Geometry in Embeddings with ICA
This study utilizes Independent Component Analysis (ICA) to unveil a consistent semantic structure within embeddings of words or images. Our approach extracts independent semantic components from the embeddings of a pre-trained model by le…
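A toy reconstruction of the cross-model comparison: two "models" mix the same non-Gaussian sources, ICA is run on each, and cross-correlating the recovered components over a shared vocabulary reveals near-permutation structure. Everything here is synthetic; the paper works with real pretrained embeddings.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)

# Shared non-Gaussian "semantic sources" over a 3,000-word vocabulary.
S = rng.laplace(size=(3_000, 20))
X1 = S @ rng.normal(size=(20, 50))                               # model 1
X2 = S @ rng.normal(size=(20, 50)) + 0.05 * rng.normal(size=(3_000, 50))

ica = lambda X: FastICA(n_components=20, whiten="unit-variance",
                        random_state=0).fit_transform(X)
S1, S2 = ica(X1), ica(X2)

# Cross-correlate components over the shared vocabulary; one strong
# match per row indicates the two models share the same axes.
C = np.corrcoef(S1.T, S2.T)[:20, 20:]
print("best-match |corr| per component:", np.abs(C).max(axis=1).round(2))
```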
Improving word mover’s distance by leveraging self-attention matrix
Measuring the semantic similarity between two sentences is still an important task. The word mover’s distance (WMD) computes the similarity via the optimal alignment between the sets of word embeddings. However, WMD does not utilize word o…
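For reference, a minimal vanilla WMD with the POT library; the sentences and weights are made up, and the paper's contribution (leveraging the self-attention matrix as sentence structure) is not reproduced here.

```python
import numpy as np
import ot  # POT: Python Optimal Transport

rng = np.random.default_rng(0)

# Hypothetical word embeddings for two short sentences.
sent1 = rng.normal(size=(5, 50))   # 5 words
sent2 = rng.normal(size=(7, 50))   # 7 words

# Uniform word weights (variants may weight words differently).
a = np.full(len(sent1), 1.0 / len(sent1))
b = np.full(len(sent2), 1.0 / len(sent2))

# Vanilla WMD: optimal transport cost under Euclidean embedding distances.
M = ot.dist(sent1, sent2, metric="euclidean")
wmd = ot.emd2(a, b, M)
print(f"WMD: {wmd:.4f}")
```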