Ramy Eskander
Revisiting Funnel Transformers for Modern LLM Architectures with Comprehensive Ablations in Training and Inference Configurations
Transformer-based Large Language Models, which suffer from high computational costs, advance so quickly that techniques proposed to streamline earlier iterations are not guaranteed to benefit more modern models. Building upon the Funnel Tr…
ACR: A Benchmark for Automatic Cohort Retrieval
Identifying patient cohorts is fundamental to numerous healthcare tasks, including clinical trial recruitment and retrospective studies. Current cohort retrieval methods in healthcare organizations rely on automated queries of structured d…
Coupling Symbolic Reasoning with Language Modeling for Efficient Longitudinal Understanding of Unstructured Electronic Medical Records
The application of Artificial Intelligence (AI) in healthcare has been revolutionary, especially with the recent advancements in transformer-based Large Language Models (LLMs). However, the task of understanding unstructured electronic med…
TwHIN: Embedding the Twitter Heterogeneous Information Network for Personalized Recommendation
Social networks, such as Twitter, form a heterogeneous information network (HIN) where nodes represent domain entities (e.g., user, content, advertiser, etc.) and edges represent one of many entity interactions (e.g., a user re-sharing cont…
Unsupervised Stem-based Cross-lingual Part-of-Speech Tagging for Morphologically Rich Low-Resource Languages
Ramy Eskander, Cass Lowry, Sujay Khandagale, Judith Klavans, Maria Polinsky, Smaranda Muresan. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. …
Towards Improved Distantly Supervised Multilingual Named-Entity Recognition for Tweets
Recent low-resource named-entity recognition (NER) work has shown impressive gains by leveraging a single multilingual model trained using distantly supervised data derived from cross-lingual knowledge bases. In this work, we investigate s…
Minimally-Supervised Morphological Segmentation using Adaptor Grammars with Linguistic Priors
With the increasing interest in low-resource languages, unsupervised morphological segmentation has become an active area of research, where approaches based on Adaptor Grammars achieve state-of-the-art results. We demonstrate the power of …
Unsupervised Morphological Segmentation and Part-of-Speech Tagging for Low-Resource Scenarios
With the high cost of manually labeling data and the increasing interest in low-resource languages, for which human annotators might not be even available, unsupervised approaches have become essential for processing a typologically divers…
Multilingual Named Entity Recognition in Tweets using Wikidata
This paper describes a simple way to improve performance of Named Entity Recognition systems across languages using knowledge from Wikidata on Social Media Corpus. We use dictionary based and zero-shot multilingual transfer. Abstract: We p…
Unsupervised Cross-Lingual Part-of-Speech Tagging for Truly Low-Resource Scenarios
We describe a fully unsupervised cross-lingual transfer approach for part-of-speech (POS) tagging under a truly low resource scenario. We assume access to parallel translations between the target language and one or more source languages f…
An Evaluation of Subword Segmentation Strategies for Neural Machine Translation of Morphologically Rich Languages
Byte-Pair Encoding (BPE) (Sennrich et al., 2016) has become a standard pre-processing step when building neural machine translation systems. However, it is not clear whether this is an optimal strategy in all settings. We conduct a control…
Surprise Languages: Rapid-Response Cross-Language IR
Sixteen years ago, the first "surprise language exercise" was conducted, in Cebuano. The evaluation goal of a surprise language exercise is to learn how well systems for a new language can be quickly built. This paper briefly reviews the h…
Unsupervised Morphological Segmentation for Low-Resource Polysynthetic Languages
Polysynthetic languages pose a challenge for morphological analysis due to the root-morpheme complexity and to the word class “squish”. In addition, many of these polysynthetic languages are low-resource. We propose unsupervised approaches…
Automatically Tailoring Unsupervised Morphological Segmentation to the Language
Morphological segmentation is beneficial for several natural language processing tasks dealing with large vocabularies. Unsupervised methods for morphological segmentation are essential for handling a diverse set of languages, including lo…
Morphologically Annotated Corpora and Morphological Analyzers for Moroccan and Sanaani Yemeni Arabic
The Columbia University - New York University Abu Dhabi SIGMORPHON 2016 Morphological Reinflection Shared Task Submission
We present a high-level description and error analysis of the Columbia-NYUAD system for morphological reinflection, which builds on previous work on supervised morphological paradigm completion. Our system improved over the shared task base…