Ziwei Ji
Planning with Reasoning using Vision Language World Model
Effective planning requires strong world models, but high-level world models that can understand and reason about actions with semantic and temporal abstraction remain largely underdeveloped. We introduce the Vision Language World Model (V…
HalluLens: LLM Hallucination Benchmark
Large language models (LLMs) often generate responses that deviate from user input or training data, a phenomenon known as "hallucination." These hallucinations undermine user trust and hinder the adoption of generative AI systems. Address…
Conventional versus rubber band traction-assisted endoscopic submucosal dissection for rectal neuroendocrine tumors: a single-center retrospective study (with video)
For rectal neuroendocrine tumors (r-NETs) smaller than 2 cm, the rubber band traction (RBT) method did not significantly shorten the operation time but resulted in faster resection speed, less muscular-layer injury, and earlier postoperative recovery to a liquid diet.
ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models
Large language models (LLMs) exhibit hallucinations in long-form question-answering tasks across various domains and wide applications. Current hallucination detection and mitigation datasets are limited in domains and sizes, which struggl…
LLM Internal States Reveal Hallucination Risk Faced With a Query
The hallucination problem of Large Language Models (LLMs) significantly limits their reliability and trustworthiness. Humans have a self-awareness process that allows us to recognize what we don't know when faced with queries. Inspired by …
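A minimal sketch of the probing recipe, assuming a small stand-in model: read off the hidden state of the query's final token and fit a lightweight classifier that predicts, before any generation, whether the model is likely to hallucinate. The model choice, layer, and labels below are illustrative assumptions, not the paper's setup.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from sklearn.linear_model import LogisticRegression

tok = AutoTokenizer.from_pretrained("gpt2")  # stand-in for the LLM under study
model = AutoModelForCausalLM.from_pretrained("gpt2", output_hidden_states=True)

def query_state(query: str, layer: int = -1) -> torch.Tensor:
    """Hidden state of the final query token at a chosen layer."""
    ids = tok(query, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids)
    return out.hidden_states[layer][0, -1]  # shape: (hidden_dim,)

# Hypothetical training data: queries labeled 1 if the model's eventual
# answer was judged hallucinated, 0 otherwise.
queries = ["Who wrote Hamlet?", "Who won the 2087 World Cup?"]
labels = [0, 1]
X = torch.stack([query_state(q) for q in queries]).numpy()
probe = LogisticRegression(max_iter=1000).fit(X, labels)
print(probe.predict_proba(X)[:, 1])  # estimated hallucination risk per query
```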
Efficient Document Ranking with Learnable Late Interactions
Cross-Encoder (CE) and Dual-Encoder (DE) models are two fundamental approaches for query-document relevance in information retrieval. To predict relevance, CE models use joint query-document embeddings, while DE models maintain factorized …
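To make the contrast concrete: a CE runs one joint forward pass over the concatenated query-document pair (accurate but slow), a DE scores a dot product of independently computed single vectors (fast, and the document side can be precomputed), and late interaction keeps per-token embeddings and aggregates their similarities at scoring time. A toy sketch with deterministic stand-in embeddings, not the paper's learned aggregation:

```python
import re
import zlib
import numpy as np

def embed_tokens(text: str, dim: int = 64) -> np.ndarray:
    """Toy deterministic per-token embeddings, standing in for a trained encoder."""
    return np.stack([
        np.random.default_rng(zlib.crc32(t.encode())).standard_normal(dim)
        for t in re.findall(r"[a-z0-9]+", text.lower())
    ])

def dual_encoder_score(query: str, doc: str) -> float:
    """DE: factorized single-vector embeddings; the doc side can be pre-indexed."""
    q = embed_tokens(query).mean(axis=0)
    d = embed_tokens(doc).mean(axis=0)
    return float(q @ d)

def late_interaction_score(query: str, doc: str) -> float:
    """Late interaction: per-token embeddings, aggregated at scoring time
    (here a ColBERT-style sum of per-query-token max similarities)."""
    Q, D = embed_tokens(query), embed_tokens(doc)
    return float((Q @ D.T).max(axis=1).sum())

doc = "learned late interaction for efficient document ranking"
print(dual_encoder_score("document ranking", doc))
print(late_interaction_score("document ranking", doc))
```

The paper's contribution, per its title, is making that aggregation step learnable rather than fixed as it is in this sketch.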
ANAH: Analytical Annotation of Hallucinations in Large Language Models
Reducing the 'hallucination' problem of Large Language Models (LLMs) is crucial for their wide applications. A comprehensive and fine-grained measurement of the hallucination is the first key step for the governance of this issu…
High-Dimension Human Value Representation in Large Language Models
The widespread application of LLMs across various tasks and fields has necessitated the alignment of these models with human values and preferences. Given the various approaches to human value alignment, there is an urgent need to understand t…
Contrastive Learning for Inference in Dialogue
Inferences, especially those derived from inductive processes, are a crucial component of conversation, complementing the information implicitly or explicitly conveyed by a speaker. While recent large language models show remarkable advan…
Towards Mitigating Hallucination in Large Language Models via Self-Reflection
Large language models (LLMs) have shown promise for generative and knowledge-intensive tasks including question-answering (QA) tasks. However, the practical deployment still faces challenges, notably the issue of "hallucination", where mod…
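A generate-critique-revise loop is one common shape for self-reflection; the sketch below reflects our own assumptions about the mechanics, not the paper's prompts or stopping rules.

```python
# Illustrative generate-critique-revise loop for QA; prompts and the stopping
# rule are hypothetical.
def answer_with_reflection(llm, question: str, max_rounds: int = 3) -> str:
    answer = llm(f"Question: {question}\nAnswer:")
    for _ in range(max_rounds):
        critique = llm(f"Question: {question}\nAnswer: {answer}\n"
                       "List any unsupported or likely-hallucinated claims:")
        if "none" in critique.lower():
            break
        answer = llm(f"Question: {question}\nPrevious answer: {answer}\n"
                     f"Problems: {critique}\nRevised answer:")
    return answer

# Toy stub standing in for a real LLM call.
stub = lambda p: "none" if "List any" in p else "Paris"
print(answer_with_reflection(stub, "What is the capital of France?"))
```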
Negative Object Presence Evaluation (NOPE) to Measure Object Hallucination in Vision-Language Models
Object hallucination poses a significant challenge in vision-language (VL) models, often leading to the generation of nonsensical or unfaithful responses with non-existent objects. However, the absence of a general measurement for evaluati…
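The measurement idea admits a simple recipe, sketched here with hypothetical model and data plumbing: pose questions about objects known to be absent from the image and count how often the model answers with the correct negative pronoun ("none", "nowhere", and the like).

```python
# Illustrative NOPE-style scoring; the model callable and answer normalization
# are assumptions, not the benchmark's official harness.
NEGATIVE_ANSWERS = {"no", "none", "no one", "nobody", "nowhere", "neither"}

def nope_accuracy(model, samples) -> float:
    """samples: (image_path, question) pairs whose correct answer is negative,
    e.g. ("kitchen.jpg", "What color is the dog?") for a dog-free kitchen."""
    correct = sum(
        model(image, question).strip().lower() in NEGATIVE_ANSWERS
        for image, question in samples
    )
    return correct / len(samples)

# A toy model that hallucinates a dog scores 0.0 on this metric.
print(nope_accuracy(lambda img, q: "brown",
                    [("kitchen.jpg", "What color is the dog?")]))
```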
Think before you speak: Training Language Models With Pause Tokens
Language models generate responses by producing a series of tokens in immediate succession: the $(K+1)^{th}$ token is an outcome of manipulating $K$ hidden vectors per layer, one vector per preceding token. What if instead we were to let t…
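The question is answered with pause tokens: append learnable <pause> tokens so the model gets extra hidden-vector computation before committing to the next real token. A minimal inference-side sketch with GPT-2 as a stand-in; note the paper's gains come from training with pauses, so an untrained pause embedding here is only a placeholder:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
tok.add_special_tokens({"additional_special_tokens": ["<pause>"]})
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.resize_token_embeddings(len(tok))  # gives <pause> a learnable embedding

prompt = "Q: What is 17 * 24? A:"
ids = tok(prompt + " <pause>" * 10, return_tensors="pt").input_ids
with torch.no_grad():
    logits = model(ids).logits[0, -1]  # next-token prediction after 10 pause steps
print(tok.decode(logits.argmax().item()))
```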
Improving Query-Focused Meeting Summarization with Query-Relevant Knowledge
Query-Focused Meeting Summarization (QFMS) aims to generate a summary of a given meeting transcript conditioned upon a query. The main challenges for QFMS are the long input text length and sparse query-relevant information in the meeting …
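Those two challenges suggest the standard retrieve-then-summarize remedy, sketched below with word overlap as a toy stand-in for the paper's query-relevant knowledge: rank utterances by relevance to the query and pass only the top ones to the summarizer.

```python
# Illustrative retrieve-then-summarize front end for QFMS; not the paper's model.
import re

def tokens(text: str) -> set:
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def top_relevant_utterances(transcript, query, k=5):
    q = tokens(query)
    return sorted(transcript, key=lambda u: -len(q & tokens(u)))[:k]

transcript = [
    "Alice: the budget review moves to Friday",
    "Bob: lunch options are on the wiki",
    "Carol: the budget overrun came from the cloud migration",
]
selected = top_relevant_utterances(transcript, "What was said about the budget?", k=2)
print(selected)  # only these utterances would be fed to the summarizer
```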
Diverse and Faithful Knowledge-Grounded Dialogue Generation via Sequential Posterior Inference
The capability to generate responses with diversity and faithfulness using factual knowledge is paramount for creating a human-like, trustworthy dialogue system. Common strategies either adopt a two-step paradigm, which optimizes knowledge…
Depth Dependence of $μ$P Learning Rates in ReLU MLPs
In this short note we consider random fully connected ReLU networks of width $n$ and depth $L$ equipped with a mean-field weight initialization. Our purpose is to study the dependence on $n$ and $L$ of the maximal update ($μ$P) learning ra…
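To fix the setup in symbols (notation ours, chosen to match the snippet rather than taken from the note): the network computes $x^{(\ell+1)} = \mathrm{ReLU}(W^{(\ell)} x^{(\ell)})$ for $\ell = 1, \dots, L$ with $W^{(\ell)} \in \mathbb{R}^{n \times n}$ and mean-field initialization $W^{(\ell)}_{ij} \sim \mathcal{N}(0, c/n)$ for a constant $c$; the object of study is then how the largest usable maximal-update ($μ$P) learning rate $\eta^*(n, L)$ depends on the width $n$ and depth $L$.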
A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity
This paper proposes a framework for quantitatively evaluating interactive LLMs such as ChatGPT using publicly available data sets. We carry out an extensive technical evaluation of ChatGPT using 23 data sets covering 8 different common NLP…
Yejin Bang, Samuel Cahyawijaya, Nayeon Lee, Wenliang Dai, Dan Su, Bryan Wilie, Holy Lovenia, Ziwei Ji, Tiezheng Yu, Willy Chung, Quyet V. Do, Yan Xu, Pascale Fung. Proceedings of the 13th International Joint Conference on Natural Language …
RHO ($ρ$): Reducing Hallucination in Open-domain Dialogues with Knowledge Grounding
Dialogue systems can leverage large pretrained language models and knowledge to generate fluent and informative responses. However, these models are still prone to produce hallucinated responses not supported by the input source, which gre…
Plausible May Not Be Faithful: Probing Object Hallucination in Vision-Language Pre-training
Large-scale vision-language pre-trained (VLP) models are prone to hallucinate non-existent visual objects when generating text based on visual information. In this paper, we systematically study the object hallucination problem from three …
NusaCrowd: Open Source Initiative for Indonesian NLP Resources
We present NusaCrowd, a collaborative initiative to collect and unify existing resources for Indonesian languages, including opening access to previously non-public resources. Through this initiative, we have brought together 137 datasets …
Samuel Cahyawijaya, Holy Lovenia, Alham Fikri Aji, Genta Winata, Bryan Wilie, Fajri Koto, Rahmad Mahendra, Christian Wibisono, Ade Romadhony, Karissa Vincentio, Jennifer Santoso, David Moeljadi, Cahya Wirawan, Frederikus Hudi, Muhammad Sat…