A V Subramanyam
Scaling Laws Revisited: Modeling the Role of Data Quality in Language Model Pretraining Open
Scaling laws for language model training traditionally characterize how performance scales with model size and dataset volume. Prior work has explored architecture variants and data treatments such as dataset filtering and noise injection …
Nearest Neighbor Projection Removal Adversarial Training Open
Deep neural networks have exhibited impressive performance in image classification tasks but remain vulnerable to adversarial examples. Standard adversarial training enhances robustness but typically fails to explicitly address inter-class…
Boosting Weak Positives for Text Based Person Search Open
Large vision-language models have revolutionized cross-modal object retrieval, but text-based person search (TBPS) remains a challenging task due to limited data and the fine-grained nature of the task. Existing methods primarily focus on alig…
LaB-RAG: Label Boosted Retrieval Augmented Generation for Radiology Report Generation Open
In the current paradigm of image captioning, deep learning models are trained to generate text from image embeddings of latent features. We challenge the assumption that fine-tuning of large, bespoke models is required to improve model gen…
Keypoint Aware Masked Image Modelling Open
SimMIM is a widely used method for pretraining vision transformers using masked image modeling. However, despite its success in fine-tuning performance, it has been shown to perform sub-optimally when used for linear probing. We propose an…
Resource Efficient Perception for Vision Systems Open
Despite the rapid advancement in the field of image recognition, the processing of high-resolution imagery remains a computational challenge. However, this processing is pivotal for extracting detailed object insights in areas ranging from…
Image Synthesis with Graph Conditioning: CLIP-Guided Diffusion Models for Scene Graphs Open
Advancements in generative models have sparked significant interest in generating images while adhering to specific structural guidelines. Scene graph to image generation is one such task of generating images which are consistent with the …
Language Guided Adversarial Purification Open
Adversarial purification using generative models demonstrates strong adversarial defense performance. These methods are classifier and attack-agnostic, making them versatile but often computationally intensive. Recent strides in diffusion …
IIITD-20K: Dense captioning for Text-Image ReID Open
Text-to-Image (T2I) ReID has attracted a lot of attention in the recent past. CUHK-PEDES, RSTPReid and ICFG-PEDES are the three available benchmarks to evaluate T2I ReID methods. RSTPReid and ICFG-PEDES comprise identities from MSMT17 b…
Certified Zeroth-order Black-Box Defense with Robust UNet Denoiser Open
Certified defense methods against adversarial perturbations have been recently investigated in the black-box setting with a zeroth-order (ZO) perspective. However, these methods suffer from high model variance with low performance on high-…
Meta Generative Attack on Person Reidentification Open
Adversarial attacks have been recently investigated in person re-identification. These attacks perform well under cross-dataset or cross-model settings. However, the challenges present in the cross-dataset cross-model scenario do not allow th…
Antenna Design Optimization using GAN-based Surrogate Model Open
Deep neural network (DNN) based surrogate models have been used to rapidly create antenna characteristics as a function of design geometry parameters in place of computationally expensive electromagnetic (EM) solvers. The limitation of sing…
Novel deep learning framework for wideband spectrum characterization at sub-Nyquist rate Open
OneStopTuner: An End to End Architecture for JVM Tuning of Spark Applications Open
Java is the backbone of widely used big data frameworks, such as Apache Spark, due to its productivity, portability from JVM-based execution, and support for a rich set of libraries. However, the performance of these applications can widel…
Novel Deep Learning Framework for Wideband Spectrum Characterization at Sub-Nyquist Rate Open
Introduction of spectrum-sharing in 5G and subsequent generation networks demand base-station(s) with the capability to characterize the wideband spectrum spanned over licensed, shared and unlicensed non-contiguous frequency bands. Spec…
SenseNet: Deep Learning based Wideband spectrum sensing and modulation classification network. Open
Next generation networks are expected to operate in licensed, shared as well as unlicensed spectrum to support spectrum demands of a wide variety of services. Due to shortage of radio spectrum, the need for communication systems (like cognit…
Attentional Road Safety Networks Open
Road safety mapping using satellite images is a cost-effective but challenging problem for smart city planning. The scarcity of labeled data, misalignment and ambiguity make it hard for supervised deep networks to learn efficient embedd…
CARF-Net: CNN attention and RNN fusion network for video-based person reidentification Open
Video-based person reidentification is a challenging and important task in surveillance-based applications. Toward this, several shallow and deep networks have been proposed. However, the performance of existing shallow networks does not g…
Hdrnet: Person Re-Identification Using Hybrid Sampling in Deep Reconstruction Network Open
Person re-identification (re-id) is the task of identifying a person across non-overlapping cameras. Most of the current techniques apply deep learning and achieve a significant accuracy. However, learning a deep model that can generalize …
Joint Spatial and Discrete Cosine Transform Domain-Based Counter Forensics for Adaptive Contrast Enhancement Open
Contrast enhancement (CE) is a common post-processing step in image forgery to create visually convincing tampered images. However, the artifacts embedded during this process can be captured to determine the presence of CE. To overcome the…
Framework on Quality of Service in Mobile Ad Hoc Networks Open
A Mobile Ad Hoc Network (MANET) is a continuously self-configuring network composed of a set of mobile devices that communicate with one another wirelessly, without fixed infrastructure. With the expanding scope of MANET applications t…
PicHunt: Social Media Image Retrieval for Improved Law Enforcement Open
First responders are increasingly using social media to identify and reduce crime for well-being and safety of the society. Images shared on social media hurting religious, political, communal and other sentiments of people, often instigat…
Prevalence of hypothyroidism in common bile duct stone patients Open
Background: Hypothyroidism affects bile content, bile flow, and functions of the sphincter of Oddi, thereby increasing the formation of common bile duct (CBD) stones. The exact prevalence of hypothyroidism in CBD stone patients is not known…