Amit H. Bermano
PractiLight: Practical Light Control Using Foundational Diffusion Models
Light control in generated images is a difficult task, posing specific challenges, spanning over the entire image and frequency spectrum. Most approaches tackle this problem by training on extensive yet domain-specific datasets, limiting t…
Express4D: Expressive, Friendly, and Extensible 4D Facial Motion Generation Benchmark
Dynamic facial expression generation from natural language is a crucial task in Computer Graphics, with applications in Animation, Virtual Avatars, and Human-Computer Interaction. However, current generative models suffer from datasets tha…
Attention (as Discrete-Time Markov) Chains
We introduce a new interpretation of the attention matrix as a discrete-time Markov chain. Our interpretation sheds light on common operations involving attention scores such as selection, summation, and averaging in a unified framework. I…
HOIDiNi: Human-Object Interaction through Diffusion Noise Optimization
We present HOIDiNi, a text-driven diffusion framework for synthesizing realistic and plausible human-object interaction (HOI). HOI generation is extremely challenging since it induces strict contact accuracies alongside a diverse motion ma…
AnyTop: Character Animation Diffusion with Any Topology
Generating motion for arbitrary skeletons is a longstanding challenge in computer graphics, remaining largely unexplored due to the scarcity of diverse datasets and the irregular nature of the data. In this work, we introduce AnyTop, a dif…
ImageRAG: Dynamic Image Retrieval for Reference-Guided Image Generation
Diffusion models enable high-quality and diverse visual content synthesis. However, they struggle to generate rare or unseen concepts. To address this challenge, we explore the usage of Retrieval-Augmented Generation (RAG) with image gener…
Data Efficient Molecular Image Representation Learning using Foundation Models
Deep learning (DL) in chemistry has made significant progress, yet its applicability is limited by the scarcity of large, labeled datasets and the difficulty of extracting meaningful molecular features. Recently, molecular representation l…
Instant3dit: Multiview Inpainting for Fast Editing of 3D Objects
We propose a generative technique to edit 3D shapes, represented as meshes, NeRFs, or Gaussian Splats, in approximately 3 seconds, without the need for running an SDS type of optimization. Our key insight is to cast 3D editing as a multivi…
CLoSD: Closing the Loop between Simulation and Diffusion for multi-task character control
Motion diffusion models and Reinforcement Learning (RL) based control for physics-based simulations have complementary strengths for human motion generation. The former is capable of generating a wide variety of motions, adhering to intuit…
ComfyGen: Prompt-Adaptive Workflows for Text-to-Image Generation
The practical use of text-to-image generation has evolved from simple, monolithic models to complex workflows that combine multiple specialized components. While workflow-based approaches can lead to improved image quality, crafting effect…
Casper DPM: Cascaded Perceptual Dynamic Projection Mapping onto Hands
We present a technique for dynamically projecting 3D content onto human hands with short perceived motion-to-photon latency. Computing the pose and shape of human hands accurately and quickly is a challenging task due to their articulated …
Not Every Image is Worth a Thousand Words: Quantifying Originality in Stable Diffusion
This work addresses the challenge of quantifying originality in text-to-image (T2I) generative diffusion models, with a focus on copyright originality. We begin by evaluating T2I models' ability to innovate and generalize through controlle…
Masked Extended Attention for Zero-Shot Virtual Try-On In The Wild
Virtual Try-On (VTON) is a highly active line of research with increasing demand. It aims to replace a piece of garment in an image with a garment from another image, while preserving person and garment characteristics as well as image fidelity. Curr…
V-LASIK: Consistent Glasses-Removal from Videos Using Synthetic Data
Diffusion-based generative models have recently shown remarkable image and video editing capabilities. However, local video editing, particularly removal of small attributes like glasses, remains a challenge. Existing methods either alter …
Monkey See, Monkey Do: Harnessing Self-attention in Motion Diffusion for Zero-shot Motion Transfer
Given the remarkable results of motion synthesis with diffusion models, a natural question arises: how can we effectively leverage these models for motion editing? Existing diffusion-based motion editing methods overlook the profound poten…
LCM-Lookahead for Encoder-based Text-to-Image Personalization
Recent advancements in diffusion models have introduced fast sampling methods that can effectively produce high-quality images in just one or a few denoising steps. Interestingly, when these are distilled from existing diffusion models, th…
Not All Similarities Are Created Equal: Leveraging Data-Driven Biases to Inform GenAI Copyright Disputes
The advent of Generative Artificial Intelligence (GenAI) models, including GitHub Copilot, OpenAI GPT, and Stable Diffusion, has revolutionized content creation, enabling non-professionals to produce high-quality content across various dom…
MagicClay: Sculpting Meshes With Generative Neural Fields
Recent developments in neural fields have brought phenomenal capabilities to the field of shape generation, but they lack crucial properties, such as incremental control - a fundamental requirement for artistic work. Triangular meshes,…
Breathing Life Into Sketches Using Text-to-Video Priors
A sketch is one of the most intuitive and versatile tools humans use to convey their ideas visually. An animated sketch opens another dimension to the expression of ideas and is widely used by designers for a variety of purposes. Animating…
MAS: Multi-view Ancestral Sampling for 3D motion generation using 2D diffusion
We introduce Multi-view Ancestral Sampling (MAS), a method for 3D motion generation that uses 2D diffusion models trained on motions obtained from in-the-wild videos. As such, MAS opens opportunities to exciting and diverse fields o…
State of the Art on Diffusion Models for Visual Computing
The field of visual computing is rapidly advancing due to the emergence of generative artificial intelligence (AI), which unlocks unprecedented capabilities for the generation, editing, and reconstruction of images, videos, and 3D scenes. …
OMG-ATTACK: Self-Supervised On-Manifold Generation of Transferable Evasion Attacks
Evasion Attacks (EA) test the robustness of trained neural networks by distorting input data to misguide the model into incorrect classifications. Creating these attacks is a challenging task, especially with the ever-increasin…