Explanipedia

Generative Data Mining with Longtail-Guided Diffusion

D. Hayden, Mao Ye, Timur Garipov, Gregory P. Meyer, Carl Vondrick, et al. · 2025

It is difficult to anticipate the myriad challenges that a predictive model will encounter once deployed. Common practice entails a reactive, cyclical approach: model deployment, data mining, and retraining. We instead develop a proactive …

DriveGPT: Scaling Autoregressive Behavior Models for Driving

Xin Huang, Eric M. Wolff, Paul Vernaza, Tung Phan-Minh, Hongge Chen, et al. · 2024

We present DriveGPT, a scalable behavior model for autonomous driving. We model driving as a sequential decision-making task, and learn a transformer model to predict future agent states as tokens in an autoregressive fashion. We scale up …

PROFIT: A Specialized Optimizer for Deep Fine Tuning

Anirudh Chakravarthy, Shuai Zheng, Xin Huang, Sachithra Hemachandra, Xiao Zhang, et al. · 2024

The fine-tuning of pre-trained models has become ubiquitous in generative AI, computer vision, and robotics. Although much attention has been paid to improving the efficiency of fine-tuning models, there has been less scholarship around fin…

VLMine: Long-Tail Data Mining with Vision Language Models

Mao Ye, Gregory P. Meyer, Zaiwei Zhang, Dennis Park, Siva Karthik Mustikovela, et al. · 2024

Ensuring robust performance on long-tail examples is an important problem for many real-world applications of machine learning, such as autonomous driving. This work focuses on the problem of identifying rare examples within a corpus of un…

Cohere3D: Exploiting Temporal Coherence for Unsupervised Representation Learning of Vision-based Autonomous Driving

Yichen Xie, Hongge Chen, Gregory P. Meyer, Yong Jae Lee, Eric M. Wolff, et al. · 2024

Due to the lack of depth cues in images, multi-frame inputs are important for the success of vision-based perception, prediction, and planning in autonomous driving. Observations from different angles enable the recovery of 3D object state…

ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts

Mu Cai, Haotian Liu, Siva Karthik Mustikovela, Gregory P. Meyer, Yuning Chai, et al. · 2023

While existing large vision-language multimodal models focus on whole image understanding, there is a prominent gap in achieving region-specific comprehension. Current approaches that use textual coordinates or spatial encodings often fail…

SHIFT3D: Synthesizing Hard Inputs For Tricking 3D Detectors

Hongge Chen, Chen Zhao, Gregory P. Meyer, Dennis Park, Carl Vondrick, et al. · 2023

We present SHIFT3D, a differentiable pipeline for generating 3D shapes that are structurally plausible yet challenging to 3D object detectors. In safety-critical applications like autonomous driving, discovering such novel challenging obje…

NOVA: NOvel View Augmentation for Neural Composition of Dynamic Objects

Dakshit Agrawal, Jiajie Xu, Siva Karthik Mustikovela, Ioannis Gkioulekas, Ashish Shrivastava, et al. · 2023

We propose a novel-view augmentation (NOVA) strategy to train NeRFs for photo-realistic 3D composition of dynamic objects in a static scene. Compared to prior work, our framework significantly reduces blending artifacts when inserting mult…

Efficient Transformer-based 3D Object Detection with Dynamic Token Halting

Mao Ye, Gregory P. Meyer, Yuning Chai, Qiang Liu · 2023

Balancing efficiency and accuracy is a long-standing problem for deploying deep learning models. The trade-off is even more important for real-time safety-critical systems like autonomous vehicles. In this paper, we propose an effective ap…

HDMapGen: A Hierarchical Graph Generative Model of High Definition Maps

Mi Lu, Hang Zhao, Charlie Nash, Xiaohan Jin, Jiyang Gao, et al. · 2022

High Definition (HD) maps are maps with precise definitions of road lanes with rich semantics of the traffic rules. They are critical for several key stages in an autonomous driving system, including motion forecasting and planning. Howeve…

Occupancy Flow Fields for Motion Forecasting in Autonomous Driving

Reza Mahjourian, Jinkyu Kim, Yuning Chai, Mingxing Tan, Ben Sapp, et al. · 2022

We propose Occupancy Flow Fields, a new representation for motion forecasting of multiple agents, an important task in autonomous driving. Our representation is a spatio-temporal grid with each grid cell containing both the probability of …

HDMapGen: A Hierarchical Graph Generative Model of High Definition Maps

Mi Lu, Hang Zhao, Charlie Nash, Xiaohan Jin, Jiyang Gao, et al. · 2021

High Definition (HD) maps are maps with precise definitions of road lanes with rich semantics of the traffic rules. They are critical for several key stages in an autonomous driving system, including motion forecasting and planning. Howeve…

To the Point: Efficient 3D Object Detection in the Range Image with Graph Convolution Kernels

Yuning Chai, Pei Sun, Jiquan Ngiam, Weiyue Wang, Benjamin Caine, et al. · 2021

3D object detection is vital for many robotics applications. For tasks where a 2D perspective range image exists, we propose to learn a 3D representation directly from this range image view. To this end, we designed a 2D convolutional netw…

RSN: Range Sparse Net for Efficient, Accurate LiDAR 3D Object Detection

Pei Sun, Weiyue Wang, Yuning Chai, Gamaleldin F. Elsayed, Alex Bewley, et al. · 2021

The detection of 3D objects from LiDAR data is a critical component in most autonomous driving systems. Safe, high speed driving needs larger detection ranges, which are enabled by new LiDARs. These larger detection ranges require more eff…

Large Scale Interactive Motion Forecasting for Autonomous Driving: The Waymo Open Motion Dataset

Scott Ettinger, Shuyang Cheng, Benjamin Caine, Chenxi Liu, Hang Zhao, et al. · 2021

As autonomous driving systems mature, motion forecasting has received increasing attention as a critical requirement for planning. Of particular importance are interactive situations such as merges, unprotected turns, etc., where predictin…

Pseudo-labeling for Scalable 3D Object Detection

Benjamin Caine, Rebecca Roelofs, Vijay Vasudevan, Jiquan Ngiam, Yuning Chai, et al. · 2021

To safely deploy autonomous vehicles, onboard perception systems must work reliably at high accuracy across a diverse set of environments and geographies. One of the most common techniques to improve the efficacy of such systems in new dom…

Just Pick a Sign: Optimizing Deep Multitask Models with Gradient Sign Dropout

Chen Zhao, Jiquan Ngiam, Yanping Huang, Thang Luong, Henrik Kretzschmar, et al. · 2020

The vast majority of deep models use multiple gradient signals, typically corresponding to a sum of multiple loss terms, to update a shared set of trainable weights. However, these multiple updates can impede optimal training by pulling th…

TNT: Target-driveN Trajectory Prediction

Hang Zhao, Jiyang Gao, Tian Lan, Chen Sun, Benjamin Sapp, et al. · 2020

Predicting the future behavior of moving agents is essential for real world applications. It is challenging as the intent of the agent and the corresponding behavior is unknown and intrinsically multimodal. Our key insight is that for pred…

SoDA: Multi-Object Tracking with Soft Data Association

Wei-Chih Hung, Henrik Kretzschmar, Tsung-Yi Lin, Yuning Chai, Ruichi Yu, et al. · 2020

Robust multi-object tracking (MOT) is a prerequisite for a safe deployment of self-driving cars. Tracking objects, however, remains a highly challenging problem, especially in cluttered autonomous driving scenes in which objects tend to int…

SurfelGAN: Synthesizing Realistic Sensor Data for Autonomous Driving

Zhenpei Yang, Yuning Chai, Dragomir Anguelov, Yin Zhou, Pei Sun, et al. · 2020

Autonomous driving system development is critically dependent on the ability to replay complex and diverse traffic scenarios in simulation. In such scenarios, the ability to accurately simulate the vehicle sensors such as cameras, lidar or…

Scalability in Perception for Autonomous Driving: Waymo Open Dataset

Pei Sun, Henrik Kretzschmar, Xerxes Dotiwalla, Aurélien Chouard, Vijaysai Patnaik, et al. · 2019

The research community has increasing interest in autonomous driving research, despite the resource intensity of obtaining representative real world data. Existing self-driving datasets are limited in the scale and variation of the environ…

Scalability in Perception for Autonomous Driving: An Open Dataset Benchmark

Pei Sun, Henrik Kretzschmar, Xerxes Dotiwalla, Aurélien Chouard, Vijaysai Patnaik, et al. · 2019

The research community has increasing interest in autonomous driving research, despite the resource intensity of obtaining representative real world data. Existing self-driving datasets are limited in the scale and variation of the environ…

MultiPath: Multiple Probabilistic Anchor Trajectory Hypotheses for Behavior Prediction

Yuning Chai, Benjamin Sapp, Mayank Bansal, Dragomir Anguelov · 2019

Predicting human behavior is a difficult and crucial task required for motion planning. It is challenging in large part due to the highly uncertain and multi-modal set of possible outcomes in real-world domains such as autonomous driving. …

StarNet: Targeted Computation for Object Detection in Point Clouds

Jiquan Ngiam, Benjamin Caine, Wei Han, Brandon Yang, Yuning Chai, et al. · 2019

Detecting objects from LiDAR point clouds is an important component of self-driving car technology as LiDAR provides high resolution spatial information. Previous work on point-cloud 3D object detection has re-purposed convolutional approa…

Patchwork: A Patch-wise Attention Network for Efficient Object Detection and Segmentation in Video Streams

Yuning Chai · 2019

Recent advances in single-frame object detection and segmentation techniques have motivated a wide range of works to extend these methods to process video streams. In this paper, we explore the idea of hard attention aimed for latency-sens…

FEELVOS: Fast End-to-End Embedding Learning for Video Object Segmentation

Paul Voigtlaender, Yuning Chai, Florian Schroff, Hartwig Adam, Bastian Leibe, et al. · 2019

Many of the recent successful methods for video object segmentation (VOS) are overly complicated, heavily rely on fine-tuning on the first frame, and/or are slow, and are hence of limited practical use. In this work, we propose FEELVOS as …

Advances in fine-grained visual categorization

Yuning Chai · 2015

The objective of this work is to improve performance in fine-grained visual categorization (FGVC). In particular, we are interested in the large-scale classification between hundreds of different flower, bird, dog species. FGVC is challeng…

Yuning Chai