Explanipedia

YAP+: Pad-Layout-Aware Yield Modeling and Simulation for Hybrid Bonding

Zhichao Chen, Puneet Gupta · 2025

Near-energy-free photonic Fourier transformation for convolution operation acceleration

Hangbo Yang, Nicola Peserico, Shurui Li, Xiaoxuan Ma, Russell L. T. Schwartz, et al. · 2025

ChipletPart: Cost-Aware Partitioning for 2.5D Systems

Alexander Graening, Puneet Gupta, Andrew B. Kahng, Bodhisatta Pramanik · 2025

Industry adoption of chiplets has been increasing as a cost-effective option for making larger high-performance systems. Consequently, partitioning large systems into chiplets is increasingly important. In this work, we introduce ChipletPa…

FRED: A Wafer-scale Fabric for 3D Parallel DNN Training

Saeed Rashidi, William Won, Sudarshan Srinivasan, Puneet Gupta, Tushar Krishna · 2025

Optimizing Base Layer Design Rule Checks in Chip Physical Design

Puneet Gupta · 2025

This article presents a comprehensive analysis of abstract modeling approaches for base layer design rule checks in advanced semiconductor design. As semiconductor technology continues to advance toward smaller nodes, the complexity of bas…

CATCH: a Cost Analysis Tool for Co-optimization of chiplet-based Heterogeneous systems

Alexander Graening, Jonti Talukdar, Saptadeep Pal, Krishnendu Chakrabarty, Puneet Gupta · 2025

With the increasing prevalence of chiplet systems in high-performance computing applications, the number of design options has increased dramatically. Instead of chips defaulting to a single die design, now there are options for 2.5D and 3…

Machine Learning-Enhanced Greedy Algorithm for Optimizing Hold Time Violations in Advanced Node SoC Designs

Puneet Gupta · 2025

This article presents an innovative approach to resolving hold time violations in advanced technology nodes using a greedy algorithm methodology. The article addresses critical challenges in modern System-on-Chip (SoC) designs, particularl…

Experimental validation of a novel characterization procedure based on fast sweep measurements for linear resonators with a large time constant

Alexis Brenes, Jérôme Juillard, Jorge Cuevas Ayala, Lucca Reinehr, Erwan Libessart, et al. · 2024

FRED: Flexible REduction-Distribution Interconnect and Communication Implementation for Wafer-Scale Distributed Training of DNN Models

Saeed Rashidi, William Won, Sudarshan K. Srinivasan, Puneet Gupta, Tushar Krishna · 2024

Distributed Deep Neural Network (DNN) training is a technique to reduce the training overhead by distributing the training tasks into multiple accelerators, according to a parallelization strategy. However, high-performance compute and int…

Smoothing Disruption Across the Stack: Tales of Memory, Heterogeneity, & Compilers

Michael Niemier, Zephan M. Enciso, Monir Sharifi, Xiaobo Sharon Hu, Ian O’Connor, et al. · 2024

Experimental Validation of a Novel Characterization Technique for Linear Resonators with a Large Time Constant

Alexis Brenes, Jérôme Juillard, Jorge Cuevas Ayala, Lucca Reinehr, Erwan Libessart, et al. · 2024

DeepFlow: A Cross-Stack Pathfinding Framework for Distributed AI Systems

Newsha Ardalani, Saptadeep Pal, Puneet Gupta · 2023

Over the past decade, machine learning model complexity has grown at an extraordinary rate, as has the scale of the systems training such large models. However, there is an alarmingly low hardware utilization (5–20%) in large scale AI syst…

ReFOCUS: Reusing Light for Efficient Fourier Optics-Based Photonic Neural Network Accelerator

Shurui Li, Hangbo Yang, Chee Wei Wong, Volker J. Sorger, Puneet Gupta · 2023

In recent years, there has been a significant focus on achieving low-latency and high-throughput convolutional neural network (CNN) inference. Integrated photonics offers the potential to substantially expedite neural networks due to its i…

Cost-Driven Hardware-Software Co-Optimization of Machine Learning Pipelines

Ravit Sharma, Wojciech Romaszkan, Feiqian Zhu, Puneet Gupta · 2023

Researchers have long touted a vision of the future enabled by a proliferation of internet-of-things devices, including smart sensors, homes, and cities. Increasingly, embedding intelligence in such devices involves the use of deep neural …

End-to-end differentiability and tensor processing unit computing to accelerate materials’ inverse design

Han Liu, Yuhan Liu, Kevin Li, Zhangji Zhao, Samuel S. Schoenholz, et al. · 2023

Numerical simulations have revolutionized material design. However, although simulations excel at mapping an input material to its output property, their direct application to inverse design has traditionally been limited by their high com…

Training Neural Networks for Execution on Approximate Hardware

Tianmu Li, Shurui Li, Puneet Gupta · 2023

Approximate computing methods have shown great potential for deep learning. Due to the reduced hardware costs, these methods are especially suitable for inference tasks on battery-operated devices that are constrained by their power budget…

A Nonvolatile Compute-in-Memory Macro Using Voltage-Controlled MRAM and In Situ Magnetic-to-Digital Converter

Vinod Kurian Jacob, Jiyue Yang, Haoran He, Puneet Gupta, Kang L. Wang, et al. · 2023

Compute-in-memory (CIM) accelerators have become a popular solution to achieve high energy efficiency for deep learning applications in edge devices. Recent works have demonstrated CIM macros using nonvolatile memories [spin transfer torque …

PhotoFourier: A Photonic Joint Transform Correlator-Based Neural Network Accelerator

Shurui Li, Hangbo Yang, Chee Wei Wong, Volker J. Sorger, Puneet Gupta · 2022

The last few years have seen a lot of work to address the challenge of low-latency and high-throughput convolutional neural network inference. Integrated photonics has the potential to dramatically accelerate neural networks because of its…

DeepFlow: A Cross-Stack Pathfinding Framework for Distributed AI Systems

Newsha Ardalani, Saptadeep Pal, Puneet Gupta · 2022

Over the past decade, machine learning model complexity has grown at an extraordinary rate, as has the scale of the systems training such large models. However, there is an alarmingly low hardware utilization (5–20%) in large scale AI syste…

High‐Throughput Multichannel Parallelized Diffraction Convolutional Neural Network Accelerator

Zibo Hu, Shurui Li, Russell L. T. Schwartz, Maria Solyanik‐Gorgone, Mario Miscuglio, et al. · 2022

Convolutional neural networks are paramount in image and signal processing, and are responsible for the majority of image recognition power consumption today, concentrated mainly in convolution computations. With convolution operations bei…

Bit-serial Weight Pools: Compression and Arbitrary Precision Execution of Neural Networks on Resource Constrained Processors

Shurui Li, Puneet Gupta · 2022

Applications of neural networks on edge systems have proliferated in recent years, but ever-increasing model sizes prevent neural networks from being deployed efficiently on resource-constrained microcontrollers. We propose bit-serial weigh…

High Throughput Multi-Channel Parallelized Diffraction Convolutional Neural Network Accelerator

Zibo Hu, Shurui Li, Russell L. T. Schwartz, Maria Solyanik‐Gorgone, Mario Miscuglio, et al. · 2021

Convolutional neural networks are paramount in image and signal processing, including the relevant classification and training tasks, and account for the majority of machine learning compute demand today. With convolution operations…

Lightweight Software-Defined Error Correction for Memories

Irina Alam, Lara Dolecek, Puneet Gupta · 2020

Reliability of the memory subsystem is a growing concern in computer architecture and system design. From on-chip embedded memories in Internet-of-Things (IoT) devices and on-chip caches to off-chip main memories, the memory subsystems hav…

Massively parallel amplitude-only Fourier neural network

Mario Miscuglio, Zibo Hu, Shurui Li, Jonathan George, Roberto Capanna, et al. · 2020

Machine intelligence has become a driving factor in modern society. However, its demand outpaces the underlying electronic technology due to limitations given by fundamental physics, such as capacitive charging of wires, but also by system…

Channel Tiling for Improved Performance and Accuracy of Optical Neural Network Accelerators

Shurui Li, Mario Miscuglio, Volker J. Sorger, Puneet Gupta · 2020

Low latency, high throughput inference on Convolution Neural Networks (CNNs) remains a challenge, especially for applications requiring large input or large kernel sizes. 4F optics provides a solution to accelerate CNNs by converting convo…

Pathfinding for 2.5D interconnect technologies

Saptadeep Pal, Puneet Gupta · 2020

As conventional technology scaling becomes harder, 2.5D integration provides a viable pathway to building larger systems at lower cost. Consequently, there has recently been a proliferation of 2.5D integration technologies that offer…

Perceived sources of stress amongst Indian dental students in Bareilly city

Adeeba Saleem, Puneet Gupta, KK Shivalingesh, Henna Mir, Divya Srivastav, et al. · 2020

Introduction: In addition to the stresses pertaining to dentistry as a profession, dental students have to face the additional stress of their studies. This stress can also contribute to decreased student performance. The aim of this st…

Smart Hoover with Mower

Puneet Gupta, K Sudha, Pratik Kumar, Chirag Garg, Payal · 2020

This paper presents the advancement in the design and development of a vacuum cleaner with a lawn mower. It focuses on developing a handy automated vacuum cleaner with lawn mower robot, which operates on Arduino programming and ca…

Implant Surface Microtopography – A Review

Sunny Sharma, Gyan P. Bharti, Ramandeep Singh, Puneet Gupta, Basu Dev Basnet, et al. · 2020

Osseointegration is the direct contact between the living bone and the implant surface without interposed soft tissue at the microscopic level, and it is a critical process for implant stability and consequent short- and long-term clinical s…

MOMBAT: Heart Rate Monitoring from Face Video using Pulse Modeling and Bayesian Tracking

Puneet Gupta, Brojeshwar Bhowmick, Arpan Pal · 2020

A non-invasive yet inexpensive method for heart rate (HR) monitoring is of great importance in many real-world applications including healthcare, psychology understanding, affective computing and biometrics. Face videos are currently ut…
