K. Madhava Krishna
DAGDiff: Guiding Dual-Arm Grasp Diffusion to Stable and Collision-Free Grasps
Reliable dual-arm grasping is essential for manipulating large and complex objects but remains a challenging problem due to stability, collision, and generalization requirements. Prior methods typically decompose the task into two independ…
MonoMPC: Monocular Vision Based Navigation with Learned Collision Model and Risk-Aware Model Predictive Control
Navigating unknown environments with a single RGB camera is challenging, as the lack of depth information prevents reliable collision-checking. While some methods use estimated depth to build collision maps, we found that depth estimates f…
DG16M: A Large-Scale Dataset for Dual-Arm Grasping with Force-Optimized Grasps
Dual-arm robotic grasping is crucial for handling large objects that require stable and coordinated manipulation. While single-arm grasping has been extensively studied, datasets tailored for dual-arm settings remain scarce. We introduce a…
Dynamic safety cases for frontier AI
Frontier artificial intelligence (AI) systems present both benefits and risks to society. Safety cases - structured arguments supported by evidence - are one way to help ensure the safe development and deployment of these systems. Yet the …
MetricGold: Leveraging Text-To-Image Latent Diffusion Models for Metric Depth Estimation
Recovering metric depth from a single image remains a fundamental challenge in computer vision, requiring both scene understanding and accurate scaling. While deep learning has advanced monocular depth estimation, current models often stru…
Imagine-2-Drive: Leveraging High-Fidelity World Models via Multi-Modal Diffusion Policies
World Model-based Reinforcement Learning (WMRL) enables sample efficient policy learning by reducing the need for online interactions which can potentially be costly and unsafe, especially for autonomous driving. However, existing world mo…
DA-VIL: Adaptive Dual-Arm Manipulation with Reinforcement Learning and Variable Impedance Control
Dual-arm manipulation is an area of growing interest in the robotics community. Enabling robots to perform tasks that require the coordinated use of two arms is essential for complex manipulation tasks such as handling large objects, asse…
Towards Global Localization using Multi-Modal Object-Instance Re-Identification
Re-identification (ReID) is a critical challenge in computer vision, predominantly studied in the context of pedestrians and vehicles. However, robust object-instance ReID, which has significant implications for tasks such as autonomous ex…
Leveraging Cycle-Consistent Anchor Points for Self-Supervised RGB-D Registration
With the rise in consumer depth cameras, a wealth of unlabeled RGB-D data has become available. This prompts the question of how to utilize this data for geometric reasoning of scenes. While many RGB-D registration methods rely on geomet…
Constrained 6-DoF Grasp Generation on Complex Shapes for Improved Dual-Arm Manipulation
Efficiently generating grasp poses tailored to specific regions of an object is vital for various robotic manipulation tasks, especially in a dual-arm setup. This scenario presents a significant challenge due to the complex geometries invo…
Bi-level Trajectory Optimization on Uneven Terrains with Differentiable Wheel-Terrain Interaction Model
Navigation of wheeled vehicles on uneven terrain necessitates going beyond the 2D approaches for trajectory planning. Specifically, it is essential to incorporate the full 6dof variation of vehicle pose and its associated stability cost in…
LeGo-Drive: Language-enhanced Goal-oriented Closed-Loop End-to-End Autonomous Driving
Existing Vision-Language models (VLMs) estimate either long-term trajectory waypoints or a set of control actions as a reactive solution for closed-loop planning based on their rich scene comprehension. However, these estimations are coars…
ATPPNet: Attention based Temporal Point cloud Prediction Network
Point cloud prediction is an important yet challenging task in the field of autonomous driving. The goal is to predict future point cloud sequences that maintain object structures while accurately representing their temporal motion. These …
LIP-Loc: LiDAR Image Pretraining for Cross-Modal Localization
Global visual localization in LiDAR-maps, crucial for autonomous driving applications, remains largely unexplored due to the challenging issue of bridging the cross-modal heterogeneity gap. Popular multi-modal learning approach Contrastive…
Automated Detection and Counting of Windows using UAV Imagery based Remote Sensing
Despite the technological advancements in the construction and surveying sector, the inspection of salient features like windows in an under-construction or existing building is predominantly a manual process. Moreover, the number of windo…
Hilbert Space Embedding-based Trajectory Optimization for Multi-Modal Uncertain Obstacle Trajectory Prediction
Safe autonomous driving critically depends on how well the ego-vehicle can predict the trajectories of neighboring vehicles. To this end, several trajectory prediction algorithms have been presented in the existing literature. Many of thes…
DiffPrompter: Differentiable Implicit Visual Prompts for Semantic-Segmentation in Adverse Conditions
Semantic segmentation in adverse weather scenarios is a critical task for autonomous driving systems. While foundation models have shown promise, the need for specialized adaptors becomes evident for handling more challenging scenarios. We…
NeuroSMPC: A Neural Network Guided Sampling Based MPC for On-Road Autonomous Driving
In this paper we show an effective means of integrating data driven frameworks to sampling based optimal control to vastly reduce the compute time for easy adoption and adaptation to real time applications such as on-road autonomous dri…
HyP-NeRF: Learning Improved NeRF Priors using a HyperNetwork
Neural Radiance Fields (NeRF) have become an increasingly popular representation to capture high-quality appearance and shape of scenes and objects. However, learning generalizable NeRF priors over categories of scenes or objects has been …
GDIP: Gated Differentiable Image Processing for Object-Detection in Adverse Conditions
Detecting objects under adverse weather and lighting conditions is crucial for the safe and continuous operation of an autonomous vehicle, and remains an unsolved problem. We present a Gated Differentiable Image Processing (GDIP) block, a …
Fast Joint Multi-Robot Trajectory Optimization by GPU Accelerated Batch Solution of Distributed Sub-Problems
We present a joint multi-robot trajectory optimizer that can compute trajectories for tens of robots in aerial swarms within a small fraction of a second. The computational efficiency of our approach is built on breaking the per-iteration …
Approaches and Challenges in Robotic Perception for Table-top Rearrangement and Planning
Table-top Rearrangement and Planning is a challenging problem that relies heavily on an excellent perception stack. The perception stack involves observing and registering the 3D scene on the table, detecting what objects are on the table,…
UrbanFly: Uncertainty-Aware Planning for Navigation Amongst High-Rises with Monocular Visual-Inertial SLAM Maps
We present UrbanFly: an uncertainty-aware real-time planning framework for quadrotor navigation in urban high-rise environments. A core aspect of UrbanFly is its ability to robustly plan directly on the sparse point clouds generated by a M…
Design And Analysis Of Three-Output Open Differential with 3-DOF
This paper presents a novel passive three-output differential with three degrees of freedom (3DOF), that translates motion and torque from a single input to three outputs. The proposed Three-Output Open Differential is designed such that i…
Monocular multi-layer layout estimation for warehouse racks
Given a monocular colour image of a warehouse rack, we aim to predict the bird's-eye view layout for each shelf in the rack, which we term as multi-layer layout prediction. To this end, we present RackLay, a deep neural network for real…
Modular Pipe Climber III with Three-Output Open Differential
The paper introduces the novel Modular Pipe Climber III with a Three-Output Open Differential (3-OOD) mechanism to eliminate slipping of the tracks due to the changing cross-sections of the pipe. This will be achieved in any orientation of…
Learning Actions for Drift-Free Navigation in Highly Dynamic Scenes
We embark on a hitherto unreported problem of an autonomous robot (self-driving car) navigating in dynamic scenes in a manner that reduces its localization error and eventual cumulative drift or Absolute Trajectory Error, which is pronounc…
CCO-VOXEL: Chance Constrained Optimization over Uncertain Voxel-Grid Representation for Safe Trajectory Planning
We present CCO-VOXEL: the very first chance-constrained optimization (CCO) algorithm that can compute trajectory plans with probabilistic safety guarantees in real-time directly on the voxel-grid representation of the world. CCO-VOXEL maps…