Avisha Das
Efficient Training Corpus Retrieval for Large Language Model Fine Tuning: A Case Study in Cancer Open
The objective is to create an automated knowledge extraction tool for cancer research that builds high-quality academic corpora for LLM fine-tuning while investigating its effectiveness in interleukin-6 and bladder cancer domains. To addre…
Weakly supervised language models for automated extraction of critical findings from radiology reports Open
Critical findings in radiology reports are life threatening conditions that need to be communicated promptly to physicians for timely management of patients. Although challenging, advancements in natural language processing (NLP), particul…
Weakly Supervised Language Models for Automated Extraction of Critical Findings from Radiology Reports Open
Critical findings in radiology reports are life threatening conditions that need to be communicated promptly to physicians (“critical findings”) for timely management of patients. Flagging radiology reports of such incidents could facilit…
Framework for Exposing Vulnerabilities of Clinical Large Language Model: A Case Study in Breast Cancer Open
Large language models (LLMs) with billions of parameters and trained on massive amounts of crowdsourced public data have made a dramatic impact on natural language processing (NLP) tasks. Domain specific 'finetuning' of LLMs has further im…
Ensemble pretrained language models to extract biomedical knowledge from literature Open
Objectives The rapid expansion of biomedical literature necessitates automated techniques to discern relationships between biomedical concepts from extensive free text. Such techniques facilitate the development of detailed knowledge bases…
Exposing Vulnerabilities in Clinical LLMs Through Data Poisoning Attacks: Case Study in Breast Cancer Open
Training Large Language Models (LLMs) with billions of parameters on a dataset and publishing the model for public access is the standard practice currently. Despite their transformative impact on natural language processing, public LLMs p…
Domain-specific LLM Development and Evaluation – A Case-study for Prostate Cancer Open
In this work, we present our strategy for developing domain-specific large language models which cover the vocabulary of the target domain and train on reliable sources of clinical information. Prostate cancer was chosen as a use-case for …
Extracting Drug-Protein Relation from Literature Using Ensembles of Biomedical Transformers Open
Automatic extraction of relations between drugs/chemicals and proteins from ever-growing biomedical literature is required to build up-to-date knowledge bases in biomedicine. To promote the development of automated methods, BioCreative-VII…
Representation Learning of Biological Concepts: A Systematic Review Open
Objective: Representation learning in the context of biological concepts involves acquiring their numerical representations through various sources of biological information, such as sequences, interactions, and literature. This study has …
Conversational Bots for Psychotherapy: A Study of Generative Transformer Models Using Domain-specific Dialogues Open
Conversational bots have become non-traditional methods for therapy among individuals suffering from psychological illnesses. Leveraging deep neural generative language models, we propose a deep trainable neural conversational model for th…
Experiments in Extractive Summarization: Integer Linear Programming, Term/Sentence Scoring, and Title-driven Models Open
In this paper, we revisit the challenging problem of unsupervised single-document summarization and study the following aspects: Integer linear programming (ILP) based algorithms, Parameterized normalization of term and sentence scores, an…
Experiments in Extractive Summarization: Integer Linear Programming, Term/Sentence Scoring, and Title-driven Models Open
In this paper, we revisit the challenging problem of unsupervised single-document summarization and study the following aspects: Integer linear programming (ILP) based algorithms, Parameterized normalization of term and sentence scores,…
Modeling Coherency in Generated Emails by Leveraging Deep Neural Learners Open
Advanced machine learning and natural language techniques enable attackers to launch sophisticated and targeted social engineering-based attacks. To counter the active attacker issue, researchers have since resorted to proactive methods of…
Diverse Datasets and a Customizable Benchmarking Framework for Phishing Open
Phishing is a challenging problem that has been addressed by many researchers in several papers using many different datasets and techniques~\cite{das2019sok}. Researchers usually test their proposed methods with limited metrics, datasets, …
An In-Depth Benchmarking and Evaluation of Phishing Detection Research for Security Needs Open
We perform an in-depth, systematic benchmarking study and evaluation of phishing features on diverse and extensive datasets. We propose a new taxonomy of features based on the interpretation and purpose of each feature. Next, we propose a …
Can Machines Tell Stories? A Comparative Study of Deep Neural Language Models and Metrics Open
Massive textual content has enabled rapid advances in natural language modeling. The use of pre-trained deep neural language models has significantly improved natural language understanding tasks. However, the extent to which these systems…
SoK: A Comprehensive Reexamination of Phishing Research From the Security Perspective Open
Phishing and spear-phishing are typical examples of masquerade attacks since trust is built up through impersonation for the attack to succeed. Given the prevalence of these attacks, considerable research has been conducted on these proble…
Automated email Generation for Targeted Attacks using Natural Language Open
With an increasing number of malicious attacks, the number of people and organizations falling prey to social engineering attacks is proliferating. Despite considerable research in mitigation systems, attackers continually improve their mo…