Bryan Hooi
Echoless Label-Based Pre-computation for Memory-Efficient Heterogeneous Graph Learning
Heterogeneous Graph Neural Networks (HGNNs) are widely used for deep learning on heterogeneous graphs. Typical end-to-end HGNNs require repetitive message passing during training, limiting efficiency for large-scale real-world graphs. Pre-…
Backdoor-Powered Prompt Injection Attacks Nullify Defense Methods
Large language models (LLMs) now dominate downstream natural language processing (NLP) tasks. However, because of LLMs' instruction-following abilities and inability to distinguish the instruct…
How to Make Large Language Models Generate 100% Valid Molecules?
Molecule generation is key to drug discovery and materials science, enabling the design of novel compounds with specific properties. Large language models (LLMs) can learn to perform a wide range of tasks from just a few examples. However,…
RNA-FrameFlow: Flow Matching for de novo 3D RNA Backbone Design
We introduce RNA-FrameFlow, the first generative model for 3D RNA backbone design. We build upon flow matching for protein backbone generation and establish protocols for data preparation and evaluation to address unique challenges p…
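The RNA-FrameFlow abstract builds on flow matching. As a general illustration only (not the paper's actual model or code, which operates on 3D backbone frames), the core training targets in rectified-flow-style flow matching can be sketched as follows; all names here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def flow_matching_targets(x0, x1, t):
    """Linear (rectified-flow) interpolation used in many flow-matching
    setups: x_t = (1 - t) * x0 + t * x1, with target velocity x1 - x0."""
    t = np.asarray(t).reshape(-1, 1)   # broadcast time over feature dims
    x_t = (1.0 - t) * x0 + t * x1      # point on the noise-to-data path
    v_target = x1 - x0                 # constant velocity along the path
    return x_t, v_target

# Toy example: 4 samples with 6 features each.
x0 = rng.normal(size=(4, 6))           # noise sample
x1 = rng.normal(size=(4, 6))           # data sample
t = rng.uniform(size=4)                # per-sample time in [0, 1]
x_t, v = flow_matching_targets(x0, x1, t)
# A model v_theta(x_t, t) would then be regressed onto v with an MSE loss.
```

At generation time, one integrates the learned velocity field from t = 0 to t = 1 starting from noise.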
Enabling Self-Improving Agents to Learn at Test Time With Human-In-The-Loop Guidance
Large language model (LLM) agents often struggle in environments where rules and required domain knowledge frequently change, such as regulatory compliance and user risk screening. Current approaches, like offline fine-tuning and standard …
Campolina: A Deep Neural Framework for Accurate Segmentation of Nanopore Signals
Nanopore sequencing enables real-time, long-read analysis by processing raw signals as they are produced. A key step, segmentation of signals into events, is typically handled by algorithmic methods that struggle in noisy regions. We present Campol…
NTSFormer: A Self-Teaching Graph Transformer for Multimodal Isolated Cold-Start Node Classification
Isolated cold-start node classification on multimodal graphs is challenging because such nodes have no edges and often have missing modalities (e.g., absent text or image features). Existing methods address structural isolation by degradin…
VPI-Bench: Visual Prompt Injection Attacks for Computer-Use Agents
Computer-Use Agents (CUAs) with full system access enable powerful task automation but pose significant security and privacy risks due to their ability to manipulate files, access user data, and execute arbitrary commands. While prior work…
MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research
Recent advancements in AI agents have demonstrated their growing potential to drive and support scientific discovery. In this work, we introduce MLR-Bench, a comprehensive benchmark for evaluating AI agents on open-ended machine learning r…
Efficient Reasoning via Chain of Unconscious Thought
Large Reasoning Models (LRMs) achieve promising performance but compromise token efficiency due to verbose reasoning processes. Unconscious Thought Theory (UTT) posits that complex problems can be solved more efficiently through internaliz…
Seeing Through Deception: Uncovering Misleading Creator Intent in Multimodal News with Vision-Language Models
The impact of misinformation arises not only from factual inaccuracies but also from the misleading narratives that creators deliberately embed. Interpreting such creator intent is therefore essential for multimodal misinformation detectio…
PhishIntel: Toward Practical Deployment of Reference-Based Phishing Detection
Safety in Large Reasoning Models: A Survey
Large Reasoning Models (LRMs) have exhibited extraordinary prowess in tasks like mathematics and coding, leveraging their advanced reasoning capabilities. Nevertheless, as these capabilities progress, significant concerns regarding their v…
Guiding VLM Agents with Process Rewards at Inference Time for GUI Navigation
Recent advancements in visual language models (VLMs) have notably enhanced their capabilities in handling complex Graphical User Interface (GUI) interaction tasks. Despite these improvements, current frameworks often struggle to generate c…
PhishAgent: A Robust Multimodal Agent for Phishing Webpage Detection
Phishing attacks are a major threat to online security, exploiting user vulnerabilities to steal sensitive information. Various methods have been developed to counteract phishing, each with varying levels of accuracy, but they also face no…
Modality-Independent Graph Neural Networks with Global Transformers for Multimodal Recommendation
Multimodal recommendation systems can learn users' preferences from existing user-item interactions as well as the semantics of multimodal data associated with items. Many existing methods model this through a multimodal user-item graph, a…
Geneshift: Impact of different scenario shift on Jailbreaking LLM
Jailbreak attacks, which aim to cause LLMs to perform unrestricted behaviors, have become a critical and challenging direction in AI safety. Despite achieving promising attack success rates under dictionary-based evaluation, existing ja…
UniGraph: Learning a Unified Cross-Domain Foundation Model for Text-Attributed Graphs
NodeImport: Imbalanced Node Classification with Node Importance Assessment
Words or Vision: Do Vision-Language Models Have Blind Faith in Text?
Vision-Language Models (VLMs) excel in integrating visual and textual information for vision-centric tasks, but their handling of inconsistencies between modalities is underexplored. We investigate VLMs' modality preferences when faced wit…
Fact or Guesswork? Evaluating Large Language Model's Medical Knowledge with Structured One-Hop Judgment
Large language models (LLMs) have been widely adopted in various downstream task domains. However, their ability to directly recall and apply factual medical knowledge remains under-explored. Most existing medical QA benchmarks assess comp…
Balancing Truthfulness and Informativeness with Uncertainty-Aware Instruction Fine-Tuning
Instruction fine-tuning (IFT) can increase the informativeness of large language models (LLMs), but may reduce their truthfulness. This trade-off arises because IFT steers LLMs to generate responses containing long-tail knowledge that was …
ReLearn: Unlearning via Learning for Large Language Models
Current unlearning methods for large language models usually rely on reverse optimization to reduce target token probabilities. However, this paradigm disrupts subsequent token prediction, degrading model performance and linguistic co…
UniGraph2: Learning a Unified Embedding Space to Bind Multimodal Graphs
Existing foundation models, such as CLIP, aim to learn a unified embedding space for multimodal data, enabling a wide range of downstream web-based applications like search, recommendation, and content classification. However, these models…
GuardReasoner: Towards Reasoning-based LLM Safeguards
As LLMs increasingly impact safety-critical applications, ensuring their safety using guardrails remains a key challenge. This paper proposes GuardReasoner, a new safeguard for LLMs, by guiding the guard model to learn to reason. Concretel…
CHiP: Cross-modal Hierarchical Direct Preference Optimization for Multimodal LLMs
Multimodal Large Language Models (MLLMs) still struggle with hallucinations despite their impressive capabilities. Recent studies have attempted to mitigate this by applying Direct Preference Optimization (DPO) to multimodal scenarios usin…
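The CHiP abstract builds on Direct Preference Optimization (DPO). As a general illustration only (not CHiP's cross-modal hierarchical objective), the standard per-pair DPO loss compares policy and reference log-probabilities of a chosen versus a rejected response; the function below is an illustrative sketch:

```python
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Standard DPO loss for one preference pair:
    -log sigmoid(beta * margin), where the margin compares policy vs.
    reference log-probs of the chosen (w) and rejected (l) responses."""
    margin = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# If the policy already prefers the chosen response more strongly than
# the reference does, the margin is positive and the loss drops below
# log(2), its value at a zero margin.
loss = dpo_loss(logp_w=-5.0, logp_l=-9.0, ref_logp_w=-6.0, ref_logp_l=-8.0)
```

In practice the log-probabilities come from summing token log-probs of each full response under the policy and a frozen reference model.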
Spatio-Temporal Foundation Models: Vision, Challenges, and Opportunities
Foundation models have revolutionized artificial intelligence, setting new benchmarks in performance and enabling transformative capabilities across a wide range of vision and language tasks. However, despite the prevalence of spatio-tempo…
Monte Carlo Tree Search for Comprehensive Exploration in LLM-Based Automatic Heuristic Design
Handcrafting heuristics for solving complex optimization tasks (e.g., route planning and task allocation) is a common practice but requires extensive domain knowledge. Recently, Large Language Model (LLM)-based automatic heuristic design (…
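This entry applies Monte Carlo Tree Search (MCTS). As a general illustration only (not the paper's LLM-based heuristic-design procedure), the classic UCT rule that MCTS uses to balance exploiting high-value children against exploring rarely visited ones can be sketched as follows; the function name and tuple layout are illustrative:

```python
import math

def uct_select(children, c=1.4142):
    """Pick the index of the child maximizing the UCT score
    mean_value + c * sqrt(ln(parent_visits) / child_visits).
    `children` is a list of (total_value, visits) tuples; unvisited
    children are selected first."""
    parent_visits = sum(v for _, v in children)
    best_i, best_score = None, -math.inf
    for i, (total, visits) in enumerate(children):
        if visits == 0:
            return i                    # explore unvisited children first
        score = total / visits + c * math.sqrt(
            math.log(parent_visits) / visits)
        if score > best_score:
            best_i, best_score = i, score
    return best_i

# Child 1 has a lower mean value (0.8 vs 0.9) but far fewer visits,
# so the exploration bonus makes it the selected child here.
idx = uct_select([(9.0, 10), (0.8, 1)])
```

The exploration constant `c` trades off between the two terms; sqrt(2) is a common default.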
Can Knowledge Graphs Make Large Language Models More Trustworthy? An Empirical Study Over Open-ended Question Answering