Anh Tuan Luu
Optimal Design of Ground Geodetic - GNSS Networks Based on Baseline Configuration and Robustness Analysis Open
The paper presents a study on the optimal design of ground geodetic networks using GNSS technology, focusing on baseline configuration and robustness analysis of the network under Vietnam’s field conditions. The main objective is to propos…
P2P: A Poison-to-Poison Remedy for Reliable Backdoor Defense in LLMs Open
During fine-tuning, large language models (LLMs) are increasingly vulnerable to data-poisoning backdoor attacks, which compromise their reliability and trustworthiness. However, existing defense strategies suffer from limited generalizatio…
GeoPQA: Bridging the Visual Perception Gap in MLLMs for Geometric Reasoning Open
Recent advancements in reinforcement learning (RL) have enhanced the reasoning abilities of large language models (LLMs), yet the impact on multimodal LLMs (MLLMs) is limited. Particularly in vision-intensive tasks like geometric reasoning…
Unsupervised Hallucination Detection by Inspecting Reasoning Processes Open
Unsupervised hallucination detection aims to identify hallucinated content generated by large language models (LLMs) without relying on labeled data. While unsupervised methods have gained popularity by eliminating labor-intensive human an…
Affective-ROPTester: Capability and Bias Analysis of LLMs in Predicting Retinopathy of Prematurity Open
Despite the remarkable progress of large language models (LLMs) across various domains, their capacity to predict retinopathy of prematurity (ROP) risk remains largely unexplored. To address this gap, we introduce a novel Chinese benchmark…
Detecting Harmful Memes with Decoupled Understanding and Guided CoT Reasoning Open
Detecting harmful memes is essential for maintaining the integrity of online environments. However, current approaches often struggle with resource efficiency, flexibility, or explainability, limiting their practical deployment in content …
ClozeMath: Improving Mathematical Reasoning in Language Models by Learning to Fill Equations Open
The capabilities of large language models (LLMs) have been enhanced by training on data that reflects human thought processes, such as the Chain-of-Thought format. However, evidence suggests that the conventional scheme of next-word predic…
SCOPE: Compress Mathematical Reasoning Steps for Efficient Automated Process Annotation Open
Process Reward Models (PRMs) have demonstrated promising results in mathematical reasoning, but existing process annotation approaches, whether through human annotations or Monte Carlo simulations, remain computationally expensive. In this…
EFFECTS OF SURFACTANT ON THE MORPHOLOGY AND NIR-SHIELDING PERFORMANCE OF CsxWO3 NANOPARTICLES SYNTHESIZED BY SOLVOTHERMAL METHOD Open
Cesium tungsten bronze (CsxWO3) is a material that strongly absorbs near-infrared radiation and has broad potential for application in energy saving. In this study, CsxWO3 was synthesized by the hydro…
Multi-Scale Contrastive Learning for Video Temporal Grounding Open
Temporal grounding, which localizes video moments related to a natural language query, is a core problem of vision-language learning and video understanding. To encode video moments of varying lengths, recent methods employ a multi-level s…
Motion-aware Contrastive Learning for Temporal Panoptic Scene Graph Generation Open
To equip artificial intelligence with a comprehensive understanding towards a temporal world, video and 4D panoptic scene graph generation abstracts visual data into nodes to represent entities and edges to capture temporal relations. Exis…
Full-Step-DPO: Self-Supervised Preference Optimization with Step-wise Rewards for Mathematical Reasoning Open
Direct Preference Optimization (DPO) often struggles with long-chain mathematical reasoning. Existing approaches, such as Step-DPO, typically improve this by focusing on the first erroneous step in the reasoning chain. However, they overlo…
CutPaste&Find: Efficient Multimodal Hallucination Detector with Visual-aid Knowledge Base Open
Large Vision-Language Models (LVLMs) have demonstrated impressive multimodal reasoning capabilities, but they remain susceptible to hallucination, particularly object hallucination where non-existent objects or incorrect attributes are fab…
SeaExam and SeaBench: Benchmarking LLMs with Local Multilingual Questions in Southeast Asia Open
This study introduces two novel benchmarks, SeaExam and SeaBench, designed to evaluate the capabilities of Large Language Models (LLMs) in Southeast Asian (SEA) application scenarios. Unlike existing multilingual datasets primarily derived…
Enhancing Multimodal Entity Linking with Jaccard Distance-based Conditional Contrastive Learning and Contextual Visual Augmentation Open
Previous research on multimodal entity linking (MEL) has primarily employed contrastive learning as the primary objective. However, using the rest of the batch as negative samples without careful consideration, these studies risk leveragin…
Massively Multilingual Instruction-Following Information Extraction Open
Discrete Diffusion Language Model for Efficient Text Summarization Open
Unlearning Backdoor Attacks for LLMs with Weak-to-Strong Knowledge Distillation Open
Beyond In-Context Learning: Aligning Long-form Generation of Large Language Models via Task-Inherent Attribute Guidelines Open
Is Translation All You Need? A Study on Solving Multilingual Tasks with Large Language Models Open
MRAG: A Modular Retrieval Framework for Time-Sensitive Question Answering Open
CodeArena: A Collective Evaluation Platform for LLM Code Generation Open
FineReason: Evaluating and Improving LLMs’ Deliberate Reasoning through Reflective Puzzle Solving Open
AntiLeakBench: Preventing Data Contamination by Automatically Constructing Benchmarks with Updated Real-World Knowledge Open