Adam Finkelstein
Temporally Smooth Mesh Extraction for Procedural Scenes with Long-Range Camera Trajectories using Spacetime Octrees Open
The procedural occupancy function is a flexible and compact representation for creating 3D scenes. For rasterization and other tasks, it is often necessary to extract a mesh that represents the shape. Unbounded scenes with long-range camer…
CORN: Co-Trained Full- And No-Reference Speech Quality Assessment Open
Perceptual evaluation constitutes a crucial aspect of various audio-processing tasks. Full reference (FR) or similarity-based metrics rely on high-quality reference recordings, to which lower-quality or corrupted versions of the recording …
Aδ Open
Over the last decade, automatic differentiation (AD) has profoundly impacted graphics and vision applications, both broadly via deep learning and specifically for inverse rendering. Traditional AD methods ignore gradients at discontinui…
Audio Similarity is Unreliable as a Proxy for Audio Quality Open
Many audio processing tasks require perceptual assessment. However, the time and expense of obtaining "gold standard" human judgments limit the availability of such data. Most applications incorporate full reference or other similarity-b…
CDPAM: Contrastive Learning for Perceptual Audio Similarity Open
Many speech processing methods based on deep learning require an automatic and differentiable audio metric for the loss function. The DPAM approach of Manocha et al. learns a full-reference metric trained directly on human judgments, and t…
Learning from Shader Program Traces Open
Deep learning for image processing typically treats input imagery as pixels in some color space. This paper proposes instead to learn from program traces of procedural fragment shaders (programs that generate images). At each pixel, we co…
HiFi-GAN: High-Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks Open
Real-world audio recordings are often degraded by factors such as noise, reverberation, and equalization distortion. This paper introduces HiFi-GAN, a deep learning method to transform recorded speech to sound as though it had been recorde…
A Differentiable Perceptual Audio Metric Learned from Just Noticeable Differences Open
Many audio processing tasks require perceptual assessment. The "gold standard" of obtaining human judgments is time-consuming, expensive, and cannot be used as an optimization criterion. On the other hand, automated metrics are efficient…
Text-based editing of talking-head video Open
Editing talking-head video to change the speech content or to remove filler words is challenging. We propose a novel method to edit talking-head video based on its transcript to produce a realistic output video in which the dialogue of the…
A New Massive Open Online Course on Natural Disasters Open
Two professors put their college course online. Enrollment jumped more than 20-fold, and a forum for exchanging ideas with a multigenerational international community was born.
High-Precision Localization Using Ground Texture Open
Location-aware applications play an increasingly critical role in everyday life. However, satellite-based localization (e.g., GPS) has limited accuracy and can be unusable in dense urban areas and indoors. We introduce an image-based globa…