Tero Karras
A Comprehensive Study of Decoder-Only LLMs for Text-to-Image Generation
Both text-to-image generation and large language models (LLMs) have made significant advancements. However, many text-to-image models still employ the somewhat outdated T5 and CLIP as their text encoders. In this work, we investigate the e…
Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models
We introduce Edify Image, a family of diffusion models capable of generating photorealistic image content with pixel-perfect accuracy. Edify Image utilizes cascaded pixel-space diffusion models trained using a novel Laplacian diffusion pro…
Guiding a Diffusion Model with a Bad Version of Itself
The primary axes of interest in image-generating diffusion models are image quality, the amount of variation in the results, and how well the results align with a given condition, e.g., a class label or a text prompt. The popular classifie…
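The guidance scheme the abstract hints at can be sketched as an extrapolation away from a degraded version of the model. A minimal illustration, with toy callables standing in for trained denoisers (names and the guidance form shown are a sketch, not the paper's full method):

```python
import numpy as np

def autoguidance(d_main, d_weak, x, sigma, w):
    """Extrapolate the main denoiser away from a weaker version of itself:
        D(x) = D_weak(x) + w * (D_main(x) - D_weak(x))
    w = 1 recovers the main model unchanged; w > 1 strengthens guidance.
    """
    main = d_main(x, sigma)
    weak = d_weak(x, sigma)
    return weak + w * (main - weak)

# Toy denoisers for illustration only (not trained networks).
d_main = lambda x, sigma: 0.9 * x
d_weak = lambda x, sigma: 0.5 * x
x = np.ones(4)
guided = autoguidance(d_main, d_weak, x, sigma=1.0, w=1.0)  # equals d_main(x)
```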
Applying Guidance in a Limited Interval Improves Sample and Distribution Quality in Diffusion Models
Guidance is a crucial technique for extracting the best performance out of image-generating diffusion models. Traditionally, a constant guidance weight has been applied throughout the sampling chain of an image. We show that guidance is cl…
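The idea of restricting guidance to part of the sampling chain can be sketched as classifier-free guidance gated on the noise level. The interval endpoints and toy denoisers below are placeholders, not the paper's tuned values:

```python
import numpy as np

def guided_denoise(d_cond, d_uncond, x, sigma, w, lo=0.3, hi=3.0):
    """Classifier-free guidance restricted to a noise-level interval.

    Inside [lo, hi] the usual extrapolation
        D = D_uncond + w * (D_cond - D_uncond)
    is applied; outside it, the conditional model is used unguided.
    """
    c = d_cond(x, sigma)
    if lo <= sigma <= hi:
        u = d_uncond(x, sigma)
        return u + w * (c - u)
    return c

# Toy denoisers standing in for trained networks.
d_cond = lambda x, sigma: 0.8 * x
d_uncond = lambda x, sigma: 0.2 * x
x = np.ones(2)
```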
Analyzing and Improving the Training Dynamics of Diffusion Models
Diffusion models currently dominate the field of data-driven image synthesis with their unparalleled scaling to large datasets. In this paper, we identify and rectify several causes for uneven and ineffective training in the popular ADM di…
Generative Novel View Synthesis with 3D-Aware Diffusion Models
We present a diffusion-based model for 3D-aware generative novel view synthesis from as few as a single input image. Our model samples from the distribution of possible renderings consistent with the input and, even in the presence of ambi…
StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis
Text-to-image synthesis has recently seen significant progress thanks to large pretrained language models, large-scale training data, and the introduction of scalable model families such as diffusion and autoregressive models. However, the…
Simulator-Based Self-Supervision for Learned 3D Tomography Reconstruction
We propose a deep learning method for 3D volumetric reconstruction in low-dose helical cone-beam computed tomography. Prior machine learning approaches require reference reconstructions computed by another algorithm for training. In contra…
eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers
Large-scale diffusion-based generative models have led to breakthroughs in text-conditioned high-resolution image synthesis. Starting from random noise, such text-to-image diffusion models gradually synthesize images in an iterative fashio…
Generating Long Videos of Dynamic Scenes
We present a video generation model that accurately reproduces object motion, changes in camera viewpoint, and new content that arises over time. Existing video generation methods often fail to produce new content as a function of time whi…
Elucidating the Design Space of Diffusion-Based Generative Models
We argue that the theory and practice of diffusion-based generative models are currently unnecessarily convoluted and seek to remedy the situation by presenting a design space that clearly separates the concrete design choices. This lets u…
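The design-space framing admits a compact reference implementation. A sketch of the noise-level schedule and a deterministic second-order (Heun) sampler in the style of this paper, with a placeholder denoiser standing in for the trained network:

```python
import numpy as np

def edm_sigmas(n=18, sigma_min=0.002, sigma_max=80.0, rho=7.0):
    """Noise-level schedule: polynomial interpolation in sigma^(1/rho)."""
    i = np.arange(n)
    sig = (sigma_max ** (1 / rho)
           + i / (n - 1) * (sigma_min ** (1 / rho) - sigma_max ** (1 / rho))) ** rho
    return np.append(sig, 0.0)  # final step lands exactly at sigma = 0

def heun_sample(denoise, x, sigmas):
    """Deterministic Heun sampler; denoise(x, sigma) approximates D(x; sigma)."""
    for s, s_next in zip(sigmas[:-1], sigmas[1:]):
        d = (x - denoise(x, s)) / s                 # ODE derivative dx/dsigma
        x_euler = x + (s_next - s) * d
        if s_next > 0:                              # 2nd-order correction,
            d_next = (x_euler - denoise(x_euler, s_next)) / s_next
            x = x + (s_next - s) * 0.5 * (d + d_next)
        else:                                       # except at the last step
            x = x_euler
    return x
```

With a perfect denoiser for data concentrated at zero, `denoise = lambda x, s: np.zeros_like(x)`, the sampler contracts any starting noise exactly to zero.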
The Role of ImageNet Classes in Fréchet Inception Distance
Fréchet Inception Distance (FID) is the primary metric for ranking models in data-driven generative modeling. While remarkably successful, the metric is known to sometimes disagree with human judgement. We investigate a root cause of these…
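For reference, the distance the metric is built on is the Fréchet distance between two Gaussians fitted to Inception features. A sketch using the eigenvalue form of the trace term (real implementations feed in feature statistics from many images):

```python
import numpy as np

def frechet_distance(mu1, cov1, mu2, cov2):
    """Fréchet distance between Gaussians (mu1, C1) and (mu2, C2):
        ||mu1 - mu2||^2 + Tr(C1 + C2 - 2 (C1 C2)^{1/2})
    Tr((C1 C2)^{1/2}) equals the sum of square roots of the eigenvalues of
    C1 C2, which are real and non-negative for valid covariances
    (clipped here to guard against numerical noise).
    """
    diff = mu1 - mu2
    eigvals = np.linalg.eigvals(cov1 @ cov2)
    covmean_trace = np.sqrt(np.clip(eigvals.real, 0, None)).sum()
    return diff @ diff + np.trace(cov1) + np.trace(cov2) - 2 * covmean_trace
```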
Efficient Geometry-aware 3D Generative Adversarial Networks
Unsupervised generation of high-quality multi-view-consistent images and 3D shapes using only collections of single-view 2D photographs has been a long-standing challenge. Existing 3D GANs are either compute-intensive or make approximation…
Alias-Free Generative Adversarial Networks
We observe that despite their hierarchical convolutional nature, the synthesis process of typical generative adversarial networks depends on absolute pixel coordinates in an unhealthy manner. This manifests itself as, e.g., detail appearin…
Modular primitives for high-performance differentiable rendering
We present a modular differentiable renderer design that yields performance superior to previous methods by leveraging existing, highly optimized hardware graphics pipelines. Our design supports all crucial operations in a modern graphics …
Training Generative Adversarial Networks with Limited Data
Training generative adversarial networks (GAN) using too little data typically leads to discriminator overfitting, causing training to diverge. We propose an adaptive discriminator augmentation mechanism that significantly stabilizes train…
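The adaptive part can be sketched as a simple feedback controller on the augmentation probability `p`, driven by an overfitting heuristic `r` (the step size and target below are illustrative stand-ins, not the paper's exact update schedule):

```python
def update_ada_p(p, r, target=0.6, speed=0.01):
    """Adjust augmentation probability p from an overfitting heuristic r.

    r in [-1, 1] estimates discriminator overfitting (e.g. the mean sign
    of its outputs on real images). If r exceeds the target, augment
    more; otherwise, less. p is kept in [0, 1].
    """
    p += speed if r > target else -speed
    return min(max(p, 0.0), 1.0)
```

In a training loop this would be called every few minibatches, so `p` tracks how strongly the discriminator is overfitting the (limited) real data.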
Analyzing and Improving the Image Quality of StyleGAN
The style-based GAN architecture (StyleGAN) yields state-of-the-art results in data-driven unconditional generative image modeling. We expose and analyze several of its characteristic artifacts, and propose changes in both model architectu…
Semi-Supervised StyleGAN for Disentanglement Learning
Disentanglement learning is crucial for obtaining disentangled representations and controllable generation. Current disentanglement methods face several inherent limitations: difficulty with high-resolution images, primarily focusing on le…
Few-Shot Unsupervised Image-to-Image Translation
Unsupervised image-to-image translation methods learn to map images in a given class to an analogous image in a different class, drawing on unstructured (non-registered) datasets of images. While remarkably successful, current methods requ…
A Style-Based Generator Architecture for Generative Adversarial Networks
We propose an alternative generator architecture for generative adversarial networks, borrowing from style transfer literature. The new architecture leads to an automatically learned, unsupervised separation of high-level attributes (e.g.,…
Improved Precision and Recall Metric for Assessing Generative Models
The ability to automatically estimate the quality and coverage of the samples produced by a generative model is a vital requirement for driving algorithm research. We present an evaluation metric that can separately and reliably measure bo…
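The two sides of the metric can be sketched with explicit k-NN hyperspheres in feature space. Brute-force distances for clarity; a practical implementation works on deep-network features with batched nearest-neighbor search:

```python
import numpy as np

def knn_radii(feats, k=3):
    """Distance from each point to its k-th nearest neighbor (excluding itself)."""
    d = np.linalg.norm(feats[:, None] - feats[None, :], axis=-1)
    return np.sort(d, axis=1)[:, k]  # column 0 is the zero self-distance

def manifold_fraction(query, ref, k=3):
    """Fraction of query points inside the k-NN hypersphere of any ref point.

    precision = manifold_fraction(generated, real)   # quality
    recall    = manifold_fraction(real, generated)   # coverage
    """
    radii = knn_radii(ref, k)
    d = np.linalg.norm(query[:, None] - ref[None, :], axis=-1)
    return np.mean((d <= radii[None, :]).any(axis=1))
```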
High-Quality Self-Supervised Deep Image Denoising
We describe a novel method for training high-quality image denoising models based on unorganized collections of corrupted images. The training does not need access to clean reference images, or explicit pairs of corrupted images, and can t…
Texture Level of Detail Strategies for Real-Time Ray Tracing
Unlike rasterization, where one can rely on pixel quad partial derivatives, an alternative approach must be taken for filtered texturing during ray tracing. We describe two methods for computing texture level of detail for ray tracing. The…
Noise2Noise: Learning Image Restoration without Clean Data
We apply basic statistical reasoning to signal reconstruction by machine learning -- learning to map corrupted observations to clean signals -- with a simple and powerful conclusion: it is possible to learn to restore images by only lookin…
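The statistical core is that the L2 minimizer is a mean, and zero-mean corruption does not shift a mean — so noisy targets suffice. A toy one-parameter demonstration (signal value and noise model chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
clean = 3.0                                        # unknown clean value to recover
n = 100_000
noisy_targets = clean + rng.normal(0.0, 1.0, n)    # corrupted observations only

# Minimizing sum((theta - noisy_targets)**2) over a scalar theta gives the
# sample mean of the targets -- which converges to the clean value because
# the corruption is zero-mean, even though no clean data was ever used.
theta = noisy_targets.mean()
```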
Progressive Growing of GANs for Improved Quality, Stability, and Variation
We describe a new training methodology for generative adversarial networks. The key idea is to grow both the generator and discriminator progressively: starting from a low resolution, we add new layers that model increasingly fine details …
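During each resolution transition, the newly added layers are blended in smoothly rather than switched on at once. A sketch, assuming `coarse` is the upsampled output of the previous stage and `fine` the new layer's output:

```python
def fade_in(coarse, fine, alpha):
    """Linear fade-in used when a new resolution is added: alpha ramps
    from 0 (old stage only) to 1 (new layers fully active) over the
    transition, keeping training stable as model capacity grows."""
    return (1.0 - alpha) * coarse + alpha * fine
```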
Pruning Convolutional Neural Networks for Resource Efficient Inference
We propose a new formulation for pruning convolutional kernels in neural networks to enable efficient inference. We interleave greedy criteria-based pruning with fine-tuning by backpropagation - a computationally efficient procedure that m…
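One common variant of the first-order Taylor criterion behind such greedy pruning can be sketched as follows; the score estimates the loss change if a channel were removed, and the example data is purely illustrative:

```python
import numpy as np

def taylor_criterion(activations, gradients):
    """First-order Taylor criterion for channel pruning (one variant).

    Scores each channel by the absolute value of the mean of
    activation * gradient over batch and spatial positions; channels
    with the smallest score are pruned first.
    Shapes assumed: (batch, channels, height, width).
    """
    return np.abs((activations * gradients).mean(axis=(0, 2, 3)))

# Illustrative ranking: the zero-gradient channel has no loss impact.
acts = np.ones((2, 3, 4, 4))
grads = np.zeros((2, 3, 4, 4))
grads[:, 0] = 1.0
grads[:, 2] = 0.5
scores = taylor_criterion(acts, grads)
to_prune = scores.argmin()   # the channel with the smallest estimated impact
```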