Ashutosh Modi
MM-Telco: Benchmarks and Multimodal Large Language Models for Telecom Applications
Large Language Models (LLMs) have emerged as powerful tools for automating complex reasoning and decision-making tasks. In telecommunications, they hold the potential to transform network optimization, automate troubleshooting, enhance cus…
IL-PCSR: Legal Corpus for Prior Case and Statute Retrieval
Identifying/retrieving relevant statutes and prior cases/precedents for a given legal situation are common tasks exercised by law practitioners. Researchers to date have addressed the two tasks independently, thus developing completely dif…
POSESTITCH-SLT: Linguistically Inspired Pose-Stitching for End-to-End Sign Language Translation
Sign language translation remains a challenging task due to the scarcity of large-scale, sentence-aligned datasets. Prior work has focused on various feature extraction and architectural changes to support neural machine translation for s…
Calibration Across Layers: Understanding Calibration Evolution in LLMs
Large Language Models (LLMs) have demonstrated inherent calibration capabilities, where predicted probabilities align well with correctness, despite prior findings that deep neural networks are often overconfident. Recent studies have link…
CoMuMDR: Code-mixed Multi-modal Multi-domain corpus for Discourse paRsing in conversations
Discourse parsing is an important task useful for NLU applications such as summarization, machine comprehension, and emotion recognition. The current discourse parsing datasets based on conversations consist of written English dialogues r…
Towards Quantifying Commonsense Reasoning with Mechanistic Insights
Commonsense reasoning deals with the implicit knowledge that is well understood by humans and typically acquired via interactions with the world. In recent times, commonsense reasoning and understanding of various LLMs have been evaluated …
LoRMA: Low-Rank Multiplicative Adaptation for LLMs
COLD: Causal reasOning in cLosed Daily activities
Large Language Models (LLMs) have shown state-of-the-art performance in a variety of tasks, including arithmetic and reasoning; however, to gauge the intellectual capabilities of LLMs, causal reasoning has become a reliable proxy for valid…
Towards Robust Evaluation of Unlearning in LLMs via Data Transformations
Large Language Models (LLMs) have proven highly successful in a wide range of applications, from regular NLP-based use cases to AI agents. LLMs have been trained on a vast corpus of texts from various sources; despite the best ef…
Generation and De-Identification of Indian Clinical Discharge Summaries using LLMs
The consequences of a healthcare data breach can be devastating for the patients, providers, and payers. The average financial impact of a data breach in recent months has been estimated to be close to USD 10 million. This is especially si…
IL-TUR: Benchmark for Indian Legal Text Understanding and Reasoning
Legal systems worldwide are inundated with exponential growth in cases and documents. There is an imminent need to develop NLP and ML techniques for automatically processing and understanding legal documents to streamline the legal system.…
iSign: A Benchmark for Indian Sign Language Processing
Indian Sign Language has limited resources for developing machine learning and data-driven approaches for automated language processing. Though text/audio-based language processing techniques have shown colossal research interest and treme…
BookSQL: A Large Scale Text-to-SQL Dataset for Accounting Domain
Several large-scale datasets (e.g., WikiSQL, Spider) for developing natural language interfaces to databases have recently been proposed. These datasets cover a wide breadth of domains but fall short on some essential domains, such as fina…
IITK at SemEval-2024 Task 4: Hierarchical Embeddings for Detection of Persuasion Techniques in Memes
Memes are one of the most popular types of content used in an online disinformation campaign. They are primarily effective on social media platforms since they can easily reach many users. Memes in a disinformation campaign achieve their g…
IITK at SemEval-2024 Task 2: Exploring the Capabilities of LLMs for Safe Biomedical Natural Language Inference for Clinical Trials
Large Language models (LLMs) have demonstrated state-of-the-art performance in various natural language processing (NLP) tasks across multiple domains, yet they are prone to shortcut learning and factual inconsistencies. This research inve…
IITK at SemEval-2024 Task 10: Who is the speaker? Improving Emotion Recognition and Flip Reasoning in Conversations via Speaker Embeddings
This paper presents our approach for the SemEval-2024 Task 10: Emotion Discovery and Reasoning its Flip in Conversations. For the Emotion Recognition in Conversations (ERC) task, we utilize a masked-memory network along with speaker partic…
IITK at SemEval-2024 Task 1: Contrastive Learning and Autoencoders for Semantic Textual Relatedness in Multilingual Texts
This paper describes our system developed for the SemEval-2024 Task 1: Semantic Textual Relatedness. The challenge is focused on automatically detecting the degree of relatedness between pairs of sentences for 14 languages including both h…
Towards Measuring and Modeling "Culture" in LLMs: A Survey
We present a survey of more than 90 recent papers that aim to study cultural representation and inclusion in large language models (LLMs). We observe that none of the studies explicitly define "culture", which is a complex, multifaceted con…
ScriptWorld: Text Based Environment for Learning Procedural Knowledge
Text-based games provide a framework for developing natural language understanding and commonsense knowledge about the world in reinforcement learning based agents. Existing text-based environments often rely on fictional situations and ch…
ISLTranslate: Dataset for Translating Indian Sign Language
Sign languages are the primary means of communication for many hard-of-hearing people worldwide. Recently, to bridge the communication gap between the hard-of-hearing community and the rest of the population, several sign language translat…
U-CREAT: Unsupervised Case Retrieval using Events extrAcTion
The task of Prior Case Retrieval (PCR) in the legal domain is about automatically citing relevant (based on facts and precedence) prior legal cases in a given query case. To further promote research in PCR, in this paper, we propose a new …
SemEval 2023 Task 6: LegalEval - Understanding Legal Texts
In populous countries, pending legal cases have been growing exponentially. There is a need for developing NLP-based techniques for processing and automatically understanding legal documents. To promote research in the area of Legal NLP we…
SemEval-2023 Task 6: LegalEval - Understanding Legal Texts
Ashutosh Modi, Prathamesh Kalamkar, Saurabh Karn, Aman Tiwari, Abhinav Joshi, Sai Kiran Tanikella, Shouvik Kumar Guha, Sachin Malhan, Vivek Raghavan. Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023).…