Sameep Mehta
Automated Creation and Enrichment Framework for Improved Invocation of Enterprise APIs as Tools
Recent advancements in Large Language Models (LLMs) have led to the development of agents capable of complex reasoning and interaction with external tools. In enterprise contexts, the effective use of such tools that are often enabled by a…
Quality Assessment of Tabular Data using Large Language Models and Code Generation
Reliable data quality is crucial for downstream analysis of tabular datasets, yet rule-based validation often struggles with inefficiency, human intervention, and high computational costs. We present a three-stage framework that combines s…
A Framework for Testing and Adapting REST APIs as LLM Tools
Large Language Models (LLMs) are increasingly used to build autonomous agents that perform complex tasks with external tools, often exposed through APIs in enterprise systems. Direct use of these APIs is difficult due to the complex input …
Question-guided Insights Generation for Automated Exploratory Data Analysis
Exploratory Data Analysis (EDA) derives meaningful insights from extensive and complex datasets. This process typically involves a series of analytical operations to identify the patterns within the data. However, the effectiveness of EDA …
LLMGuard: Guarding against Unsafe LLM Behavior
Although the rise of Large Language Models (LLMs) in enterprise settings brings new opportunities and capabilities, it also brings challenges, such as the risk of generating inappropriate, biased, or misleading content that violates regula…
xLP: Explainable Link Prediction for Master Data Management
Explaining neural model predictions to users requires creativity, especially in enterprise applications, where there are costs associated with users' time and their trust in the model predictions is critical for adoption. For link predict…
"Beware of deception": Detecting Half-Truth and Debunking it through Controlled Claim Editing
The prevalence of half-truths, which are statements containing some truth but that are ultimately deceptive, has risen with the increasing use of the internet. To help combat this problem, we have created a comprehensive pipeline consistin…
CFL: Causally Fair Language Models Through Token-level Attribute Controlled Generation
We propose a method to control the attributes of Language Models (LMs) for the text generation task using Causal Average Treatment Effect (ATE) scores and counterfactual augmentation. We explore this method in the context of LM detoxifica…
Workshop on Data Fabric for Hybrid Clouds (WDFHC)
A number of organizations have adopted the hybrid-cloud paradigm to optimize business processes. Hybrid clouds span public and private clouds, different public cloud providers as well as edge and cloud resources. A hybrid cloud architectur…
Toward Scientific Workflows in a Serverless World
Serverless computing and FaaS have gained popularity due to their ease of design, deployment, scaling and billing on clouds. However, when used to compose and orchestrate scientific workflows, they pose limitations due to cold starts, mess…
Data Readiness Report
Data exploration and quality analysis are important yet tedious processes in the AI pipeline. Current practices of data cleaning and data readiness assessment for machine learning tasks are mostly conducted in an arbitrary manner which lim…
Data Quality Toolkit: Automatic assessment of data quality and remediation for machine learning datasets
The quality of training data has a huge impact on the efficiency, accuracy and complexity of machine learning tasks. Various tools and techniques are available that assess data quality with respect to general cleaning and profiling checks.…
Explainable Link Prediction for Privacy-Preserving Contact Tracing
Contact tracing has been used to identify people who were in close proximity to those infected with the SARS-CoV-2 coronavirus. A number of digital contact tracing applications have been introduced to facilitate or complement physical contact …
Multidimensional Analysis of Trust in News Articles (Student Abstract)
The advancements in the field of Information Communication Technology have engendered revolutionary changes in the journalism industry, not only on the part of the journalists and the media personnel, but also on the people consuming these…
Fair Transfer of Multiple Style Attributes in Text
To preserve anonymity and obfuscate their identity on online platforms, users may morph their text and portray themselves as a different gender or demographic. Similarly, a chatbot may need to customize its communication style to improve en…
Hardening Deep Neural Networks via Adversarial Model Cascades
Deep neural networks (DNNs) are vulnerable to malicious inputs crafted by an adversary to produce erroneous outputs. Works on securing neural networks against adversarial examples achieve high empirical robustness on simple datasets such a…
FactSheets: Increasing trust in AI services through supplier's declarations of conformity
Accuracy is an important concern for suppliers of artificial intelligence (AI) services, but considerations beyond accuracy, such as safety (which includes fairness and explainability), security, and provenance, are also critical elements …
Model Extraction Warning in MLaaS Paradigm
Cloud vendors are increasingly offering machine learning services as part of their platform and services portfolios. These services enable the deployment of machine learning models on the cloud that are offered on a pay-per-query basis to …
What is my data worth? From data properties to data value
Data today fuels both the economy and advances in machine learning and AI. All aspects of decision making, at the personal and enterprise level and in governments, are increasingly data-driven. In this context, however, there are still some…
AI Fairness 360: An Extensible Toolkit for Detecting, Understanding, and Mitigating Unwanted Algorithmic Bias
Fairness is an increasingly important concern as machine learning models are used to support decision making in high-stakes applications such as mortgage lending, hiring, and prison sentencing. This paper introduces a new open source Pytho…
Extracting Fairness Policies from Legal Documents
The machine learning community has recently been exploring the implications of bias and fairness with respect to AI applications. The definition of fairness for such applications varies based on their domain of application. The policies governin…
Efficiently Processing Workflow Provenance Queries on SPARK
In this paper, we investigate how we can leverage the Spark platform to efficiently process provenance queries on large volumes of workflow provenance data. We focus on processing provenance queries at the attribute-value level, which is the fi…