Abhilash Nandy
$\left|\,\circlearrowright\,\boxed{\text{BUS}}\,\right|$: A Large and Diverse Multimodal Benchmark for evaluating the ability of Vision-Language Models to understand Rebus Puzzles
Understanding Rebus Puzzles (Rebus Puzzles use pictures, symbols, and letters to represent words or phrases creatively) requires a variety of skills such as image recognition, cognitive skills, commonsense reasoning, multi-step reasoning, …
Leveraging Large Language Models for Predictive Analysis of Human Misery
This study investigates the use of Large Language Models (LLMs) for predicting human-perceived misery scores from natural language descriptions of real-world scenarios. The task is framed as a regression problem, where the model assigns a …
Language Models of Code Are Few-Shot Planners and Reasoners for Multi-Document Summarization with Attribution
Document summarization has greatly benefited from advances in large language models (LLMs). In real-world situations, summaries often need to be generated from multiple documents with diverse sources and authors, lacking a clear informatio…
Leveraging Self-Attention for Input-Dependent Soft Prompting in LLMs
A Pointer Network-based Approach for Joint Extraction and Detection of Multi-Label Multi-Class Intents
In task-oriented dialogue systems, intent detection is crucial for interpreting user queries and providing appropriate responses. Existing research primarily addresses simple queries with a single intent, lacking effective systems for hand…
YesBut: A High-Quality Annotated Multimodal Dataset for evaluating Satire Comprehension capability of Vision-Language Models
Understanding satire and humor is a challenging task for even current Vision-Language models. In this paper, we propose the challenging tasks of Satirical Image Detection (detecting whether an image is satirical), Understanding (generating…
SERPENT-VLM : Self-Refining Radiology Report Generation Using Vision Language Models
Radiology Report Generation (R2Gen) demonstrates how Multi-modal Large Language Models (MLLMs) can automate the creation of accurate and coherent radiological reports. Existing methods often hallucinate details in text-based reports that d…
Order-Based Pre-training Strategies for Procedural Text Understanding
In this paper, we propose sequence-based pretraining methods to enhance procedural understanding in natural language processing. Procedural text, containing sequential instructions to accomplish a task, is difficult to understand due to th…
CLMSM: A Multi-Task Learning Framework for Pre-training on Procedural Text
In this paper, we propose CLMSM, a domain-specific, continual pre-training framework, that learns from a large set of procedural recipes. CLMSM uses a Multi-Task Learning Framework to optimize two objectives - a) Contrastive Learning using…
$FastDoc$: Domain-Specific Fast Continual Pre-training Technique using Document-Level Metadata and Taxonomy
In this paper, we propose $FastDoc$ (Fast Continual Pre-training Technique using Document Level Metadata and Taxonomy), a novel, compute-efficient framework that utilizes Document metadata and Domain-Specific Taxonomy as supervision signal…
An Evaluation Framework for Legal Document Summarization
A law practitioner has to go through numerous lengthy legal case proceedings for their practices of various categories, such as land dispute, corruption, etc. Hence, it is important to summarize these documents, and ensure that summaries c…
Fine-grained Intent Classification in the Legal Domain
A law practitioner has to go through a lot of long legal case proceedings. To understand the motivation behind the actions of different parties/individuals in a legal case, it is essential that the parts of the document that express an int…
Team Enigma at ArgMining-EMNLP 2021: Leveraging Pre-trained Language Models for Key Point Matching
We present the system description for our submission towards the Key Point Analysis Shared Task at ArgMining 2021. Track 1 of the shared task requires participants to develop methods to predict the match score between each pair of argum…
Question Answering over Electronic Devices: A New Benchmark Dataset and a Multi-Task Learning based QA Framework
Answering questions asked from instructional corpora such as E-manuals, recipe books, etc., has been far less studied than open-domain factoid context-based question answering. This can be primarily attributed to the absence of standard…
cs60075_team2 at SemEval-2021 Task 1 : Lexical Complexity Prediction using Transformer-based Language Models pre-trained on various text corpora
This paper describes the performance of the team cs60075_team2 at SemEval 2021 Task 1 - Lexical Complexity Prediction. The main contribution of this paper is to fine-tune transformer-based language models pre-trained on several text cor…
indicnlp@kgp at DravidianLangTech-EACL2021: Offensive Language Identification in Dravidian Languages
The paper presents the submission of the team indicnlp@kgp to the EACL 2021 shared task "Offensive Language Identification in Dravidian Languages." The task aimed to classify different offensive content types in 3 code-mixed Dravidian lang…
Bayesian Optimization -- Multi-Armed Bandit Problem
In this report, we survey Bayesian Optimization methods focussed on the Multi-Armed Bandit Problem. We take the help of the paper "Portfolio Allocation for Bayesian Optimization". We report a small literature survey on the acquisition func…
A Novel Multimodal Music Genre Classifier using Hierarchical Attention and Convolutional Neural Network
Music genre classification is one of the trending topics in current Music Information Retrieval (MIR) research. Since genre depends on more than the audio profile, we also make use of textual content pro…
Identification of Cervical Pathology using Adversarial Neural Networks
Various screening and diagnostic methods have led to a large reduction of cervical cancer death rates in developed countries. However, cervical cancer is the leading cause of cancer-related deaths in women in India and other low and middle…
KarNet: An Efficient Boolean Function Simplifier
Many approaches such as Quine-McCluskey algorithm, Karnaugh map solving, Petrick's method and McBoole's method have been devised to simplify Boolean expressions in order to optimize hardware implementation of digital circuits. However, the…