Ankit Ramchandani
Imagine yourself: Tuning-Free Personalized Image Generation
Diffusion models have demonstrated remarkable efficacy across various image-to-image tasks. In this research, we introduce Imagine yourself, a state-of-the-art model designed for personalized image generation. Unlike conventional tuning-ba…
Lumos: Empowering Multimodal LLMs with Scene Text Recognition
We introduce Lumos, the first end-to-end multimodal question-answering system with text understanding capabilities. At the core of Lumos is a Scene Text Recognition (STR) component that extracts text from first person point-of-view images,…
Layout Agnostic Scene Text Image Synthesis with Diffusion Models
While diffusion models have significantly advanced the quality of image generation, their capability to accurately and coherently render text within these images remains a substantial challenge. Conventional diffusion-based methods for scen…
Animated Stickers: Bringing Stickers to Life with Video Diffusion
We introduce animated stickers, a video diffusion model which generates an animation conditioned on a text prompt and static sticker image. Our model is built on top of the state-of-the-art Emu text-to-image model, with the addition of tem…
Text-to-Sticker: Style Tailoring Latent Diffusion Models for Human Expression
We introduce Style Tailoring, a recipe to finetune Latent Diffusion Models (LDMs) in a distinct domain with high visual quality, prompt alignment and scene diversity. We choose sticker image generation as the target domain, as the images s…
DISGO: Automatic End-to-End Evaluation for Scene Text OCR
This paper discusses the challenges of optical character recognition (OCR) on natural scenes, which is harder than OCR on documents due to the wild content and various image backgrounds. We propose to uniformly use word error rates (WER) a…
DeepCOVIDNet: An Interpretable Deep Learning Model for Predictive Surveillance of COVID-19 Using Heterogeneous Features and Their Interactions
In this paper, we propose a deep learning model to forecast the range of increase in COVID-19 infected cases in future days and we present a novel method to compute equidimensional representations of multivariate time series and multivaria…