Explanipedia

Efficient Training Corpus Retrieval for Large Language Model Fine Tuning: A Case Study in Cancer Open

Avisha Das, Chiamaka S Diala, Guocai Chen, Zhao Li, Rongbin Li, et al. · 2025

Computer science Physics

The objective is to create an automated knowledge extraction tool for cancer research that builds high-quality academic corpora for LLM fine-tuning while investigating its effectiveness in interleukin-6 and bladder cancer domains. To addre…

Weakly supervised language models for automated extraction of critical findings from radiology reports Open

Avisha Das, Ish Talati, Juan Manuel Zambrano Chaves, Daniel Rubin, Imon Banerjee · 2025

Computer science Medicine Chemistry

Critical findings in radiology reports are life threatening conditions that need to be communicated promptly to physicians for timely management of patients. Although challenging, advancements in natural language processing (NLP), particul…

Weakly Supervised Language Models for Automated Extraction of Critical Findings from Radiology Reports Open

Avisha Das, Ish Talati, Juan Manuel Zambrano Chaves, Daniel Rubin, Imon Banerjee · 2024

Computer science Medicine Chemistry

Critical findings in radiology reports are life threatening conditions that need to be communicated promptly to physicians (“critical findings”) for timely management of patients. Flagging radiology reports of such incidents could facilit…

Framework for Exposing Vulnerabilities of Clinical Large Language Model: A Case Study in Breast Cancer Open

Avisha Das, Amara Tariq, Felipe Batalini, Bodhisattwa Dhara, Imon Banerjee · 2024

Computer science Medicine

Large language models (LLMs) with billions of parameters and trained on massive amounts of crowdsourced public data have made a dramatic impact on natural language processing (NLP) tasks. Domain specific 'finetuning' of LLMs has further im…

Ensemble pretrained language models to extract biomedical knowledge from literature Open

Zhao Li, Qiang Wei, Liang‐Chin Huang, Jianfu Li, Yan Hu, et al. · 2024

Computer science Business Philosophy

Objectives The rapid expansion of biomedical literature necessitates automated techniques to discern relationships between biomedical concepts from extensive free text. Such techniques facilitate the development of detailed knowledge bases…

Exposing Vulnerabilities in Clinical LLMs Through Data Poisoning Attacks: Case Study in Breast Cancer Open

Avisha Das, Amara Tariq, Felipe Batalini, Boddhisattwa Dhara, Imon Banerjee · 2024

Medicine Computer science

Training Large Language Models (LLMs) with billions of parameters on a dataset and publishing the model for public access is the standard practice currently. Despite their transformative impact on natural language processing, public LLMs p…

Domain-specific LLM Development and Evaluation – A Case-study for Prostate Cancer Open

Amara Tariq, Man Luo, Aisha Urooj, Avisha Das, Jiwoong Jeong, et al. · 2024

Medicine Mathematics

In this work, we present our strategy for developing domain-specific large language models which cover the vocabulary of the target domain and train on reliable sources of clinical information. Prostate cancer was chosen as a use-case for …

Extracting Drug-Protein Relation from Literature Using Ensembles of Biomedical Transformers Open

Avisha Das, Li Zhao, Wei Qiang, Jianfu Li, Liang-chin Huang, et al. · 2024

Computer science Engineering Biology

Automatic extraction of relations between drugs/chemicals and proteins from ever-growing biomedical literature is required to build up-to-date knowledge bases in biomedicine. To promote the development of automated methods, BioCreative-VII…

Representation Learning of Biological Concepts: A Systematic Review Open

Yuntao Yang, Xu Zuo, Avisha Das, Hua Xu, W. Jim Zheng · 2023

Computer science Biology Political science

Objective: Representation learning in the context of biological concepts involves acquiring their numerical representations through various sources of biological information, such as sequences, interactions, and literature. This study has …

Conversational Bots for Psychotherapy: A Study of Generative Transformer Models Using Domain-specific Dialogues Open

Avisha Das, Salih Selek, Alia Warner, Xu Zuo, Yan Hu, et al. · 2022

Computer science Mathematics Physics

Conversational bots have become non-traditional methods for therapy among individuals suffering from psychological illnesses. Leveraging deep neural generative language models, we propose a deep trainable neural conversational model for th…

Experiments in Extractive Summarization: Integer Linear Programming, Term/Sentence Scoring, and Title-driven Models Open

Daniel Lee, Rakesh Verma, Avisha Das, Arjun Mukherjee · 2020

Computer science Mathematics Sociology

In this paper, we revisit the challenging problem of unsupervised single-document summarization and study the following aspects: Integer linear programming (ILP) based algorithms, Parameterized normalization of term and sentence scores, an…

Modeling Coherency in Generated Emails by Leveraging Deep Neural Learners Open

Avisha Das, Rakesh Verma · 2020

Computer science Biology Political science

Advanced machine learning and natural language techniques enable attackers to launch sophisticated and targeted social engineering-based attacks. To counter the active attacker issue, researchers have since resorted to proactive methods of…

Diverse Datasets and a Customizable Benchmarking Framework for Phishing Open

Victor Zeng, Shahryar Baki, Ayman El Aassal, Rakesh Verma, Luis F. T. Moraes, et al. · 2020

Computer science Business

Phishing is a challenging problem that has been addressed by many researchers in several papers using many different datasets and techniques \cite{das2019sok}. Researchers usually test their proposed methods with limited metrics, datasets, …

An In-Depth Benchmarking and Evaluation of Phishing Detection Research for Security Needs Open

Ayman El Aassal, Shahryar Baki, Avisha Das, Rakesh Verma · 2020

Computer science Business

We perform an in-depth, systematic benchmarking study and evaluation of phishing features on diverse and extensive datasets. We propose a new taxonomy of features based on the interpretation and purpose of each feature. Next, we propose a …

Can Machines Tell Stories? A Comparative Study of Deep Neural Language Models and Metrics Open

Avisha Das, Rakesh Verma · 2020

Computer science Physics Economics

Massive textual content has enabled rapid advances in natural language modeling. The use of pre-trained deep neural language models has significantly improved natural language understanding tasks. However, the extent to which these systems…

SoK: A Comprehensive Reexamination of Phishing Research From the Security Perspective Open

Avisha Das, Shahryar Baki, Ayman El Aassal, Rakesh Verma, Arthur Dunbar · 2019

Computer science Philosophy Mathematics

Phishing and spear-phishing are typical examples of masquerade attacks since trust is built up through impersonation for the attack to succeed. Given the prevalence of these attacks, considerable research has been conducted on these proble…

Automated email Generation for Targeted Attacks using Natural Language Open

Avisha Das, Rakesh Verma · 2019

Computer science Physics

With an increasing number of malicious attacks, the number of people and organizations falling prey to social engineering attacks is proliferating. Despite considerable research in mitigation systems, attackers continually improve their mo…

Avisha Das