Miriam Cha
Measuring and Mitigating Hallucinations in Vision-Language Dataset Generation for Remote Sensing
Vision language models have achieved impressive results across various fields. However, adoption in remote sensing remains limited, largely due to the scarcity of paired image-text data. To bridge this gap, synthetic caption generation has…
Improving Medical Visual Representations via Radiology Report Generation
Vision-language pretraining has been shown to produce high-quality visual encoders which transfer efficiently to downstream computer vision tasks. Contrastive learning approaches have increasingly been adopted for medical vision language p…
MultiEarth 2023 -- Multimodal Learning for Earth and Environment Workshop and Challenge
The Multimodal Learning for Earth and Environment Workshop (MultiEarth 2023) is the second annual CVPR workshop aimed at the monitoring and analysis of the health of Earth ecosystems by leveraging the vast amount of remote sensing data tha…
RadTex: Learning Efficient Radiograph Representations from Text Reports
Automated analysis of chest radiography using deep learning has tremendous potential to enhance the clinical diagnosis of diseases in patients. However, deep learning models typically require large amounts of annotated data to achieve high…
SAR-to-EO Image Translation with Multi-Conditional Adversarial Networks
This paper explores the use of multi-conditional adversarial networks for SAR-to-EO image translation. Previous methods condition adversarial networks only on the input SAR. We show that incorporating multiple complementary modalities such…
Developing a Series of AI Challenges for the United States Department of the Air Force
Through a series of federal initiatives and orders, the U.S. Government has been making a concerted effort to ensure American leadership in AI. These broad strategy documents have influenced organizations such as the United States Departme…
MultiEarth 2022 -- Multimodal Learning for Earth and Environment Workshop and Challenge
The Multimodal Learning for Earth and Environment Challenge (MultiEarth 2022) will be the first competition aimed at the monitoring and analysis of deforestation in the Amazon rainforest at any time and in any weather conditions. The goal …
Twitter Geolocation and Regional Classification via Sparse Coding
We present a data-driven approach for Twitter geolocation and regional classification. Our method is based on sparse coding and dictionary learning, an unsupervised method popular in computer vision and pattern recognition. Through a serie…
Multimodal Representation Learning via Maximization of Local Mutual Information
We propose and demonstrate a representation learning approach by maximizing the mutual information between local features of images and text. The goal of this approach is to learn useful image representations by taking advantage of the …
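The snippet above stops short of the training objective, but mutual information between local features is commonly lower-bounded with an InfoNCE-style contrastive loss. The sketch below is illustrative only, not the paper's implementation: feature dimensions, the batch, and the plain dot-product critic are all assumptions for the example.

```python
# Hedged sketch: an InfoNCE-style lower bound on mutual information between
# local image-region features and text features. All values are toy data.
import math
import random

random.seed(0)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def info_nce(pairs):
    """pairs: list of (region_feature, text_feature) positive pairs.
    For each positive pair, every other text feature in the batch acts as a
    negative; the loss is the mean -log softmax score of the true pair."""
    loss = 0.0
    for i, (region, _) in enumerate(pairs):
        scores = [math.exp(dot(region, text)) for _, text in pairs]
        loss += -math.log(scores[i] / sum(scores))
    return loss / len(pairs)

# Toy batch: 4 paired local features in an 8-dimensional space.
batch = [([random.gauss(0, 1) for _ in range(8)],
          [random.gauss(0, 1) for _ in range(8)]) for _ in range(4)]
print(round(info_nce(batch), 4))
```

Minimizing this loss pushes each region's representation to score its paired text above the in-batch negatives, which is how the contrastive bound ties representation quality to mutual information.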
Adversarial Learning of Semantic Relevance in Text to Image Synthesis
We describe a new approach that improves the training of generative adversarial nets (GANs) for synthesizing diverse images from a text input. Our approach is based on the conditional version of GANs and expands on previous work leveraging…
Multimodal Sparse Representation Learning and Cross-Modal Synthesis
Humans have a natural ability to process and relate concurrent sensations in different sensory modalities such as vision, hearing, smell, and taste. In order for artificial intelligence to be more human-like in their capabilities, it needs…
Language Modeling by Clustering with Word Embeddings for Text Readability Assessment
We present a clustering-based language model using word embeddings for text readability prediction. Presumably, a Euclidean semantic space hypothesis holds true for word embeddings whose training is done by observing word co-occurrences. …
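The abstract's premise is that embeddings trained on co-occurrence place related words near each other in Euclidean space, so clustering them yields semantic groupings a readability model can use. Here is a minimal sketch of that idea with toy 2-D "embeddings" and a naive k-means; the real embeddings, cluster count, and downstream readability model are not specified in the snippet and are assumptions of this example.

```python
# Hedged sketch: cluster word embeddings, then represent a document as a
# histogram over clusters. Embeddings and k are illustrative toy values.
import random

random.seed(1)

def kmeans(points, k, iters=20):
    """Naive Lloyd's k-means on lists of coordinates."""
    centers = random.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[j].append(p)
        for j, cl in enumerate(clusters):
            if cl:  # keep the old center if a cluster empties out
                centers[j] = [sum(xs) / len(cl) for xs in zip(*cl)]
    return centers

def assign(p, centers):
    return min(range(len(centers)),
               key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))

# Toy vocabulary: two loose groups of 2-D embeddings.
emb = {w: [random.gauss(mu, 0.3), random.gauss(mu, 0.3)]
       for mu, ws in [(0.0, "cat dog bird".split()),
                      (3.0, "theorem lemma proof".split())]
       for w in ws}
centers = kmeans(list(emb.values()), k=2)

# A document becomes a histogram of cluster assignments.
doc = "cat dog theorem".split()
hist = [0, 0]
for w in doc:
    hist[assign(emb[w], centers)] += 1
print(hist)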
Adversarial nets with perceptual losses for text-to-image synthesis
Recent approaches in generative adversarial networks (GANs) can automatically synthesize realistic images from descriptive text. Despite the overall fair quality, the generated images often expose visible flaws that lack structural definit…
Deep Sparse-coded Network (DSN)
We introduce Deep Sparse-coded Network (DSN), a deep architecture based on sparse coding and dictionary learning. The key advantage of our approach is twofold. By interlacing max pooling with sparse coding layers, we achieve nonlinear activati…
Multimodal Sparse Coding for Event Detection
Unsupervised feature learning methods have proven effective for classification tasks based on a single modality. We present multimodal sparse coding for learning feature representations shared across multiple modalities. The shared represe…
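A shared sparse code across modalities is typically obtained by stacking the modality features into one observation and solving a joint L1-regularized reconstruction. The sketch below uses ISTA with a fixed random dictionary purely for illustration; the dictionary, feature dimensions, step size, and sparsity weight are assumptions, not the paper's learned model.

```python
# Hedged sketch: a joint sparse code z shared by two stacked modalities,
# found by ISTA for  min_z 0.5*||x - D z||^2 + lam*||z||_1.
# The dictionary D here is random; in practice it would be learned.
import random

random.seed(2)

def soft(v, t):
    """Elementwise soft-thresholding, the proximal operator of the L1 norm."""
    return [max(abs(a) - t, 0.0) * (1.0 if a > 0 else -1.0) for a in v]

def ista(x, D, lam=0.1, step=0.05, iters=200):
    m, n = len(D), len(D[0])
    z = [0.0] * n
    for _ in range(iters):
        # residual r = D z - x
        r = [sum(D[i][j] * z[j] for j in range(n)) - x[i] for i in range(m)]
        # gradient of the quadratic term: D^T r
        g = [sum(D[i][j] * r[i] for i in range(m)) for j in range(n)]
        z = soft([z[j] - step * g[j] for j in range(n)], step * lam)
    return z

# Stack two toy modality feature vectors (e.g. audio + video, 3 dims each)
# into one observation that shares a single sparse code.
x = [1.0, 0.5, 0.0, 0.8, 0.2, 0.1]
D = [[random.gauss(0, 0.5) for _ in range(8)] for _ in range(6)]
z = ista(x, D)
print(sum(1 for a in z if abs(a) > 1e-6), "active atoms out of", len(z))
```

Because both modalities reconstruct from the same code `z`, atoms that activate tend to capture structure correlated across modalities, which is what makes the shared representation useful for cross-modal tasks like event detection.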
Multimodal sparse representation learning and applications
Unsupervised methods have proven effective for discriminative tasks in a single-modality scenario. In this paper, we present a multimodal framework for learning sparse representations that can capture semantic correlation between modalitie…