Explanipedia

Descriminative-Generative Custom Tokens for Vision-Language Models Open

Pramuditha Perera, Matthew Trager, Luca Zancato, Alessandro Achille, Stefano Soatto · 2025

Computer science Philosophy

This paper explores the possibility of learning custom tokens for representing new concepts in Vision-Language Models (VLMs). Our aim is to learn tokens that can be effective for both discriminative and generative tasks while composing wel…

Compositional Structures in Neural Embedding and Interaction Decompositions Open

Matthew Trager, Alessandro Achille, Pramuditha Perera, Luca Zancato, Stefano Soatto · 2024

Computer science Mathematics

We describe a basic correspondence between linear algebraic structures within vector embeddings in artificial neural networks and conditional independence constraints on the probability distributions modeled by these networks. Our framewor…

Multi-Modal Hallucination Control by Visual Information Grounding Open

Alessandro Favero, Luca Zancato, Matthew Trager, Siddharth Choudhary, Pramuditha Perera, et al. · 2024

Computer science Psychology Engineering

Generative Vision-Language Models (VLMs) are prone to generate plausible-sounding textual answers that, however, are not always grounded in the input image. We investigate this phenomenon, usually referred to as "hallucination" and show th…

Meaning Representations from Trajectories in Autoregressive Models Open

Tian Yu Liu, Matthew Trager, Alessandro Achille, Pramuditha Perera, Luca Zancato, et al. · 2023

Computer science Mathematics Psychology

We propose to extract meaning representations from autoregressive language models by considering the distribution of all possible trajectories extending an input text. This strategy is prompt-free, does not require fine-tuning, and is appl…

Prompt Algebra for Task Composition Open

Pramuditha Perera, Matthew Trager, Luca Zancato, Alessandro Achille, Stefano Soatto · 2023

Computer science Mathematics Philosophy

We investigate whether prompts learned independently for different tasks can be later combined through prompt algebra to obtain a model that supports composition of tasks. We consider Visual Language Models (VLM) with prompt tuning as our …

À-la-carte Prompt Tuning (APT): Combining Distinct Data Via Composable Prompting Open

Benjamin Bowman, Alessandro Achille, Luca Zancato, Matthew Trager, Pramuditha Perera, et al. · 2023

Computer science Engineering Mathematics

We introduce À-la-carte Prompt Tuning (APT), a transformer-based scheme to tune prompts on distinct data so that they can be arbitrarily composed at inference time. The individual prompts can be trained in isolation, possibly on different …

Generate then Select: Open-ended Visual Question Answering Guided by World Knowledge Open

Xingyu Fu, Sheng Zhang, Gukyeong Kwon, Pramuditha Perera, Henghui Zhu, et al. · 2023

Computer science Political science Mathematics

The open-ended Visual Question Answering (VQA) task requires AI models to jointly reason over visual and natural language inputs using world knowledge. Recently, pre-trained Language Models (PLM) such as GPT-3 have been applied to the task…

Benchmarking Diverse-Modal Entity Linking with Generative Models Open

Sijia Wang, Alexander Hanbo Li, Henry Zhu, Sheng Zhang, Chung-Wei Hang, et al. · 2023

Computer science Engineering Business

Entities can be expressed in diverse formats, such as texts, images, or column names and cell values in tables. While existing entity linking (EL) models work well on per modality configuration, such as text-only EL, visual grounding, or s…

Train/Test-Time Adaptation with Retrieval Open

Luca Zancato, Alessandro Achille, Tian Yu Liu, Matthew Trager, Pramuditha Perera, et al. · 2023

Computer science Physics Economics

We introduce Train/Test-Time Adaptation with Retrieval (${\rm T^3AR}$), a method to adapt models both at train and test time by means of a retrieval module and a searchable pool of external samples. Before inference, ${\rm T^3AR}$ adapts a…

Linear Spaces of Meanings: Compositional Structures in Vision-Language Models Open

Matthew Trager, Pramuditha Perera, Luca Zancato, Alessandro Achille, Parminder Bhatia, et al. · 2023

Computer science Mathematics Philosophy

We investigate compositional structures in data embeddings from pre-trained vision-language models (VLMs). Traditionally, compositionality has been associated with algebraic operations on embeddings of words from a pre-existing vocabulary.…

Benchmarking Diverse-Modal Entity Linking with Generative Models Open

Sijia Wang, Alexander Hanbo Li, Henghui Zhu, Sheng Zhang, Pramuditha Perera, et al. · 2023

Computer science Philosophy History

Sijia Wang, Alexander Hanbo Li, Henghui Zhu, Sheng Zhang, Pramuditha Perera, Chung-Wei Hang, Jie Ma, William Yang Wang, Zhiguo Wang, Vittorio Castelli, Bing Xiang, Patrick Ng. Findings of the Association for Computational Linguistics: ACL …

Generate then Select: Open-ended Visual Question Answering Guided by World Knowledge Open

Xingyu Fu, Sheng Zhang, Gukyeong Kwon, Pramuditha Perera, Henghui Zhu, et al. · 2023

Computer science Psychology Philosophy

Xingyu Fu, Sheng Zhang, Gukyeong Kwon, Pramuditha Perera, Henghui Zhu, Yuhao Zhang, Alexander Hanbo Li, William Yang Wang, Zhiguo Wang, Vittorio Castelli, Patrick Ng, Dan Roth, Bing Xiang. Findings of the Association for Computational Ling…

Open-set Adversarial Defense with Clean-Adversarial Mutual Learning Open

Rui Shao, Pramuditha Perera, Pong C. Yuen, Vishal M. Patel · 2022

Computer science Mathematics

Open-set recognition and adversarial defense study two key aspects of deep learning that are vital for real-world deployment. The objective of open-set recognition is to identify samples from open-set classes during testing, while adversar…

Federated Generalized Face Presentation Attack Detection Open

Rui Shao, Pramuditha Perera, Pong C. Yuen, Vishal M. Patel · 2021

Computer science Mathematics Physics

Face presentation attack detection plays a critical role in the modern face recognition pipeline. A face presentation attack detection model with good generalization can be obtained when it is trained with face images from different input …

A Joint Representation Learning and Feature Modeling Approach for One-class Recognition Open

Pramuditha Perera, Vishal M. Patel · 2021

Computer science Philosophy Political science

One-class recognition is traditionally approached either as a representation learning problem or a feature modeling problem. In this work, we argue that both of these approaches have their own limitations; and a more effective solution can…

One-Class Classification: A Survey Open

Pramuditha Perera, Poojan Oza, Vishal M. Patel · 2021

Computer science Mathematics

One-Class Classification (OCC) is a special case of multi-class classification, where data observed during training is from a single positive class. The goal of OCC is to learn a representation and/or a classifier that enables recognition …

Quickest Intruder Detection For Multiple User Active Authentication Open

Pramuditha Perera, Julián Fiérrez, Vishal M. Patel · 2020

Computer science

In this paper, we investigate how to detect intruders with low latency for Active Authentication (AA) systems with multiple-users. We extend the Quickest Change Detection (QCD) framework to the multiple-user case and formulate the Multiple…

Anomaly Detection-Based Unknown Face Presentation Attack Detection Open

Yashasvi Baweja, Poojan Oza, Pramuditha Perera, Vishal M. Patel · 2020

Computer science Philosophy

Anomaly detection-based spoof attack detection is a recent development in face Presentation Attack Detection (fPAD), where a spoof detector is learned using only non-attacked images of users. These detectors are of practical importance as …

Open-set Adversarial Defense Open

Rui Shao, Pramuditha Perera, Pong C. Yuen, Vishal M. Patel · 2020

Computer science Mathematics Philosophy

Open-set recognition and adversarial defense study two key aspects of deep learning that are vital for real-world deployment. The objective of open-set recognition is to identify samples from open-set classes during testing, while adversar…

Federated Face Presentation Attack Detection Open

Rui Shao, Pramuditha Perera, Pong C. Yuen, Vishal M. Patel · 2020

Business Computer science Political science

Face presentation attack detection (fPAD) plays a critical role in the modern face recognition pipeline. A face presentation attack detection model with good generalization can be obtained when it is trained with face images from different…

Federated Face Anti-spoofing. Open

Rui Shao, Pramuditha Perera, Pong C. Yuen, Vishal M. Patel · 2020

Computer science Sociology Mathematics

Face presentation attack detection plays a critical role in the modern face recognition pipeline. A face anti-spoofing (FAS) model with good generalization can be obtained when it is trained with face images from different input distributi…

Deep Transfer Learning for Multiple Class Novelty Detection Open

Pramuditha Perera, Vishal M. Patel · 2019

Computer science Philosophy Physics

We propose a transfer learning-based solution for the problem of multiple class novelty detection. In particular, we propose an end-to-end deep-learning based approach in which we investigate how the knowledge contained in an external, out…

OCGAN: One-class Novelty Detection Using GANs with Constrained Latent Representations Open

Pramuditha Perera, Ramesh Nallapati, Bing Xiang · 2019

Computer science Mathematics Philosophy

We present a novel model called OCGAN for the classical problem of one-class novelty detection, where, given a set of examples from a particular class, the goal is to determine if a query example is from the same class. Our solution is bas…

In2I: Unsupervised Multi-Image-to-Image Translation Using Generative Adversarial Networks Open

Pramuditha Perera, Mahdi Abavisani, Vishal M. Patel · 2018

Computer science Physics Chemistry

In unsupervised image-to-image translation, the goal is to learn the mapping between an input image and an output image using a set of unpaired training images. In this paper, we propose an extension of the unsupervised image-to-image tran…

Non-intrusive load monitoring based on low frequency active power measurements Open

Chinthaka Dinesh, Pramuditha Perera, Roshan Godaliyadda, M. P. B. Ekanayake, Janaka Ekanayake · 2016

Computer science Mathematics Physics

A Non-Intrusive Load Monitoring (NILM) method for residential appliances based on active power signal is presented. This method works effectively with a single active power measurement taken at a low sampling rate (1 s). The propose…

Pramuditha Perera