Michael Elhadad
Semantic Parsing for Complex Data Retrieval: Targeting Query Plans vs. SQL for No-Code Access to Relational Databases Open
Large Language Models (LLMs) have spurred progress in text-to-SQL, the task of generating SQL queries from natural language questions based on a given database schema. Despite the declarative nature of SQL, it continues to be a complex pro…
Semantic Decomposition of Question and SQL for Text-to-SQL Parsing Open
Text-to-SQL semantic parsing faces challenges in generalizing to cross-domain and complex queries. Recent research has employed a question decomposition strategy to enhance the parsing of complex SQL queries. However, this strategy encount…
Emptying the Ocean with a Spoon: Should We Edit Models? Open
We call into question the recently popularized method of direct model editing as a means of correcting factual errors in LLM generations. We contrast model editing with three similar but distinct approaches that pursue better defined objec…
Mining Eye-Tracking Data for Text Summarization Open
In this study, we introduce and evaluate a novel extractive text summarization methodology, "SummarEyes," based on the visual interaction of the user with the text, using eye-tracking data, as opposed to the traditional approaches based on…
Cross-Lingual UMLS Named Entity Linking using UMLS Dictionary Fine-Tuning Open
We study cross-lingual UMLS named entity linking, where mentions in a given source language are mapped to UMLS concepts, most of which are labeled in English. Our cross-lingual framework includes an offline unsupervised construction of a t…
Data Efficient Masked Language Modeling for Vision and Language Open
Masked language modeling (MLM) is one of the key sub-tasks in vision-language pretraining. In the cross-modal setting, tokens in the sentence are masked at random, and the model predicts the masked tokens given the image and the text. In t…
Automatic Generation of Contrast Sets from Scene Graphs: Probing the Compositional Consistency of GQA Open
Recent works have shown that supervised models often exploit data artifacts to achieve good test scores while their performance severely degrades on samples outside their training distribution. Contrast sets (Gardner et al., 2020) quanti…
Evaluation Guidelines to Deal with Implicit Phenomena to Assess Factuality in Data-to-Text Generation Open
Data-to-text generation systems are trained on large datasets, such as WebNLG, RotoWire, E2E or DART. Beyond traditional token-overlap evaluation metrics (BLEU or METEOR), a key concern faced by recent generators is to control the factual…
Cross-lingual Unified Medical Language System entity linking in online health communities Open
Objective In Hebrew online health communities, participants commonly write medical terms that appear as transliterated forms of a source term in English. Such transliterations introduce high variability in text and challenge text-analytics…
Building a Hebrew Semantic Role Labeling Lexical Resource from Parallel Movie Subtitles Open
We present a semantic role labeling resource for Hebrew built semi-automatically through annotation projection from English. This corpus is derived from the multilingual OpenSubtitles dataset and includes short informal sentences, for whic…
Sideways Transliteration: How to Transliterate Multicultural Person Names? Open
In a global setting, texts contain transliterated names from many cultural origins. Correct transliteration depends not only on target and source languages but also on the source language of the name. We introduce a novel methodology for …
Question Answering as an Automatic Evaluation Metric for News Article Summarization Open
Recent work in the field of automatic summarization and headline generation focuses on maximizing ROUGE scores for various news datasets. We present an alternative, extrinsic, evaluation metric for this task, Answering Performance for Eval…
Question Answering as an Automatic Evaluation Metric for News Article Summarization Open
Matan Eyal, Tal Baumel, Michael Elhadad. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019.
Computational Text Analysis of a Scientific Resilience Management Corpus: Environmental Insights and Implications Open
Resilience is a multifaceted concept describing the ability to cope with change or disruption. Its importance in the era of emergency preparedness and response, combined with its multidisciplinary attributes, has led researchers to study s…
Query Focused Abstractive Summarization: Incorporating Query Relevance, Multi-Document Coverage, and Summary Length Constraints into seq2seq Models Open
Query Focused Summarization (QFS) has been addressed mostly using extractive methods. Such methods, however, produce text which suffers from low coherence. We investigate how abstractive methods can be applied to QFS, to overcome such limi…
Multi-Label Classification of Patient Notes a Case Study on ICD Code Assignment Open
In the context of the Electronic Health Record, automated diagnosis coding of patient notes is a useful task, but a challenging one due to the large number of codes and the length of patient notes. We investigate four models for assigning …
Topic Concentration in Query Focused Summarization Datasets Open
Query-Focused Summarization (QFS) summarizes a document cluster in response to a specific input query. QFS algorithms must combine query relevance assessment, central content identification, and redundancy avoidance. Frustratingly, state o…
Sentence Embedding Evaluation Using Pyramid Annotation Open
Word embedding vectors are used as input for a variety of tasks. Choosing the right model and features for producing such vectors is not a trivial task, and different embedding methods can greatly affect results. In this paper we repurpose th…