Kentaro Inui
Correction: Toward mapping pragmatic impairment of autism spectrum disorder individuals through the development of a corpus of spoken Japanese
[This corrects the article DOI: 10.1371/journal.pone.0264204.]
Uncovering the Spectral Bias in Diagonal State Space Models
Current methods for initializing state space model (SSM) parameters mainly rely on the HiPPO framework, which is based on an online approximation of orthogonal polynomials. Recently, diagonal alternatives have been shown to reach a s…
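As background for this abstract: a diagonal SSM replaces the dense state-transition matrix with a diagonal one, so the recurrence x_{t+1} = A x_t + B u_t runs elementwise in O(N) per step, and the entries of A directly determine which frequencies the state retains. A minimal NumPy sketch, assuming a toy real-valued random initialization (not the HiPPO scheme or the paper's proposal):

```python
# Minimal sketch of a diagonal state space model (SSM) recurrence in NumPy.
# Illustrative only: parameter names and the simple real-valued initialization
# below are assumptions, not the paper's HiPPO-based or proposed scheme.
import numpy as np

def diagonal_ssm(u, a_diag, b, c):
    """Run x_{t+1} = A x_t + B u_t, y_t = C x_t with diagonal A.

    u:      (T,) input sequence
    a_diag: (N,) diagonal of the (discretized) state matrix A
    b, c:   (N,) input and output projections
    """
    T, N = len(u), len(a_diag)
    x = np.zeros(N)
    y = np.empty(T)
    for t in range(T):
        x = a_diag * x + b * u[t]   # elementwise update: O(N) per step
        y[t] = c @ x
    return y

rng = np.random.default_rng(0)
N = 16
# Stable toy initialization: eigenvalues inside the unit circle.
a_diag = np.exp(-rng.uniform(0.01, 1.0, N))
b = rng.normal(size=N) / np.sqrt(N)
c = rng.normal(size=N) / np.sqrt(N)
u = rng.normal(size=100)
print(diagonal_ssm(u, a_diag, b, c)[:5])
```

The spectral bias in the title presumably concerns how such eigenvalue choices shape the frequencies the model can represent; the toy initialization here is only for illustration.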
Cross-prompt Pre-finetuning of Language Models for Short Answer Scoring
Automated short answer scoring (SAS) is the task of automatically scoring a given input to a prompt based on rubrics and reference answers. SAS is promising for real-world applications. However, because rubrics and reference answers differ…
TopK Language Models
Sparse autoencoders (SAEs) have become an important tool for analyzing and interpreting the activation space of transformer-based language models (LMs). However, SAEs suffer several shortcomings that diminish their utility and internal val…
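The TopK mechanism referenced in the title is commonly used in the SAE setting the abstract discusses: keep only the k largest latent activations per example and zero the rest, enforcing exact sparsity without an L1 penalty. A minimal PyTorch sketch of a TopK SAE layer (dimensions and class names are illustrative, not the paper's architecture):

```python
# Minimal sketch of a TopK sparse autoencoder (SAE) layer in PyTorch.
# The TopK activation keeps only the k largest latent activations per example,
# zeroing the rest; module and dimension names here are illustrative assumptions.
import torch
import torch.nn as nn

class TopKSAE(nn.Module):
    def __init__(self, d_model: int, d_latent: int, k: int):
        super().__init__()
        self.k = k
        self.encoder = nn.Linear(d_model, d_latent)
        self.decoder = nn.Linear(d_latent, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.encoder(x)
        # Keep the k largest activations per row, zero out the rest.
        topk = torch.topk(z, self.k, dim=-1)
        z_sparse = torch.zeros_like(z).scatter(-1, topk.indices, topk.values)
        return self.decoder(z_sparse)

sae = TopKSAE(d_model=64, d_latent=512, k=8)
x = torch.randn(4, 64)          # e.g., residual-stream activations
print(sae(x).shape)             # torch.Size([4, 64])
```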
Emergence of Primacy and Recency Effect in Mamba: A Mechanistic Point of View
We study memory in state-space language models using primacy and recency effects as behavioral tools to uncover how information is retained and forgotten over time. Applying structured recall tasks to the Mamba architecture, we observe a c…
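One way to operationalize such structured recall tasks: present a list of key-value pairs, query a key at each serial position, and measure accuracy as a function of position; primacy and recency appear as elevated accuracy at the list's ends. A sketch of such a probe harness (the prompt format and the `answer` stub are assumptions; in practice `answer` would query the Mamba model under study):

```python
# Illustrative sketch of a structured key-value recall probe of the kind the
# abstract describes. The prompt format and the `answer` stub are assumptions.
import random

def make_prompt(n_pairs: int, rng: random.Random):
    pairs = [(f"key{i}", f"val{rng.randrange(100)}") for i in range(n_pairs)]
    rng.shuffle(pairs)
    context = " ".join(f"{k} -> {v}." for k, v in pairs)
    return pairs, context

def answer(context: str, key: str) -> str:
    return "val0"  # stub: replace with a call to the model under study

rng = random.Random(0)
n_pairs, n_trials = 10, 200
correct = [0] * n_pairs
for _ in range(n_trials):
    pairs, context = make_prompt(n_pairs, rng)
    for pos, (k, v) in enumerate(pairs):
        correct[pos] += answer(context, k) == v

# Primacy/recency would show up as elevated accuracy at the first/last positions.
print([c / n_trials for c in correct])
```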
Do LLMs Need to Think in One Language? Correlation between Latent Language and Task Performance
Large Language Models (LLMs) are known to consistently process information in a proficient internal language, referred to as the latent language, which may differ from the input or output languages. However, how the discrepancy between the…
Mechanistic Insights into Grokking from the Embedding Layer
Grokking, a delayed generalization in neural networks after perfect training performance, has been observed in Transformers and MLPs, but the components driving it remain underexplored. We show that embeddings are central to grokking: intr…
SPIRIT: Patching Speech Language Models against Jailbreak Attacks
Speech Language Models (SLMs) enable natural interactions via spoken instructions, which more effectively capture user intent by detecting nuances in speech. The richer speech signal introduces new security risks compared to text-based mod…
SmallPlan: Leverage Small Language Models for Sequential Path Planning with Simulation-Powered, LLM-Guided Distillation
Efficient path planning in robotics, particularly within large-scale, complex environments, remains a significant hurdle. While Large Language Models (LLMs) offer strong reasoning capabilities, their high computational cost and limited ada…
Beyond Click to Cognition
How Individual Traits and Language Styles Shape Preferences In Open-ended User-LLM Interaction: A Preliminary Study
What makes an interaction with the LLM more preferable for the user? While it is intuitive to assume that information accuracy in the LLM's responses would be one of the influential variables, recent studies have found that inaccurate LLM'…
Syntactic Learnability of Echo State Neural Language Models at Scale
What is a neural model with minimum architectural complexity that exhibits reasonable language learning capability? To explore such a simple but sufficient neural language model, we revisit a basic reservoir computing (RC) model, Echo Stat…
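An Echo State Network is about as architecturally minimal as recurrent models get: a fixed random reservoir provides the dynamics, and only a linear readout is trained. A minimal NumPy sketch on a toy next-step prediction task (hyperparameters and task are illustrative, not the paper's language-modeling setup):

```python
# Minimal Echo State Network sketch in NumPy: a fixed random reservoir with
# only the linear readout trained (ridge regression). Hyperparameters and the
# toy task are illustrative assumptions, not the paper's setup.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_res = 1, 200
spectral_radius, leak = 0.9, 0.3

W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
W = rng.normal(size=(n_res, n_res))
W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))  # echo state property

def run_reservoir(u):
    """Collect leaky-integrator reservoir states for input sequence u (T, n_in)."""
    x = np.zeros(n_res)
    states = []
    for u_t in u:
        x = (1 - leak) * x + leak * np.tanh(W_in @ u_t + W @ x)
        states.append(x.copy())
    return np.array(states)

# Toy task: next-step prediction on a sine wave.
t = np.linspace(0, 20 * np.pi, 2000)
u = np.sin(t)[:, None]
X = run_reservoir(u[:-1])
y = u[1:, 0]
# Train only the readout with ridge regression; the reservoir stays fixed.
ridge = 1e-6
W_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_res), X.T @ y)
print("train MSE:", np.mean((X @ W_out - y) ** 2))
```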
Number Representations in LLMs: A Computational Parallel to Human Perception
Humans are believed to perceive numbers on a logarithmic mental number line, where smaller values are represented with greater resolution than larger ones. This cognitive bias, supported by neuroscience and behavioral studies, suggests tha…
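The logarithmic-number-line hypothesis can be stated concretely: if numbers are encoded by their logarithms, representational distance depends on the ratio of two values rather than their difference, so resolution shrinks with magnitude. A tiny worked example:

```python
# A small numeric illustration of the logarithmic number-line hypothesis the
# abstract builds on: under a log encoding, representational distance depends
# on the ratio of two numbers, not their absolute difference.
import math

def log_distance(a: float, b: float) -> float:
    return abs(math.log(a) - math.log(b))

# Equal ratios map to equal distances...
print(log_distance(2, 3), log_distance(20, 30))    # identical
# ...so resolution shrinks as magnitudes grow: 101 vs 102 are nearly fused.
print(log_distance(1, 2), log_distance(101, 102))  # large vs tiny
```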
Large Language Models Are Human-Like Internally
Recent cognitive modeling studies have reported that larger language models (LMs) exhibit a poorer fit to human reading behavior (Oh and Schuler, 2023b; Shain et al., 2024; Kuribayashi et al., 2024), leading to claims of their cognitive im…
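The standard pipeline behind such fit comparisons computes per-token surprisal from an LM and regresses it against human reading times. A sketch of the surprisal step (the choice of GPT-2 and the example sentence are illustrative; the paper's exact setup may differ):

```python
# Per-token surprisal from a causal LM, the quantity typically regressed
# against human reading times in this literature. Model choice is an assumption.
import math
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

text = "The horse raced past the barn fell."
ids = tok(text, return_tensors="pt").input_ids           # shape (1, T)
with torch.no_grad():
    logits = model(ids).logits                           # shape (1, T, vocab)

# Surprisal of token t given tokens < t, converted to bits.
logprobs = torch.log_softmax(logits[0, :-1], dim=-1)
targets = ids[0, 1:]
surprisal = -logprobs[torch.arange(targets.size(0)), targets] / math.log(2)
for token, s in zip(tok.convert_ids_to_tokens(targets.tolist()), surprisal.tolist()):
    print(f"{token:>12} {s:6.2f} bits")
```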
FinchGPT: a Transformer based language model for birdsong analysis
Long-range dependencies among tokens, which originate from hierarchical structures, are a defining hallmark of human language. However, whether similar dependencies exist within the sequential vocalization of non-human animals rema…
Weight-based Analysis of Detokenization in Language Models: Understanding the First Stage of Inference Without Inference
According to the stages-of-inference hypothesis, early layers of language models map their subword-tokenized input, which does not necessarily correspond to a linguistically meaningful segmentation, to more meaningful representations that…
RECALL: Library-Like Behavior In Language Models is Enhanced by Self-Referencing Causal Cycles
We introduce the concept of the self-referencing causal cycle (abbreviated RECALL) - a mechanism that enables large language models (LLMs) to bypass the limitations of unidirectional causality, which underlies a phenomenon known as the rev…
Identification of Multiple Logical Interpretations in Counter-Arguments
Repetition Neurons: How Do Language Models Produce Repetitions?
Understanding the Side Effects of Rank-One Knowledge Editing
Annotating Errors in English Learners’ Written Language Production: Advancing Automated Written Feedback Systems
LLMs Can Compensate for Deficiencies in Visual Representations
The Geometry of Numerical Reasoning: Language Models Compare Numeric Properties in Linear Subspaces
Spelling-out is not Straightforward: LLMs’ Capability of Tokenization from Token to Characters
Rectifying Belief Space via Unlearning to Harness LLMs’ Reasoning
Deterministic Compression of Word Embeddings
Word embeddings are an indispensable technology in the field of artificial intelligence, particularly when working with natural language processing models. To further enhance their usability, several studies have tackled the compression of…