Richa Singh
Agent-Omni: Test-Time Multimodal Reasoning via Model Coordination for Understanding Anything Open
Multimodal large language models (MLLMs) have shown strong capabilities but remain limited to fixed modality pairs and require costly fine-tuning with large aligned datasets. Building fully omni-capable models that can integrate text, imag…
Multimodal dual-stage feature refinement for robust skin lesion classification Open
Skin cancer continues to pose a formidable global health challenge, where expedient detection is paramount to diminishing mortality. However, the inherent heterogeneity of skin lesions, exacerbated by class imbalance, frequently undermines…
Silicodata: An Annotated Benchmark CXR Dataset for Silicosis Detection Open
TAIGen: Training-Free Adversarial Image Generation via Diffusion Models Open
Adversarial attacks from generative models often produce low-quality images and require substantial computational resources. Diffusion models, though capable of high-quality generation, typically need hundreds of sampling steps for adversa…
How To Impact AI In Traffic Solution Open
Introduction: Traffic congestion has been a city issue long enough, and the existing systems have not succeeded in managing it. AI-based models facilitate real-time optimization with enhanced analytics. AI application to traffic management…
Artificial Intelligence: A Paradigm Shift in General Dental Practice Open
AI is rapidly transforming the landscape of general dental practice by enhancing diagnostic accuracy, offering innovative treatment strategies, and streamlining administrative workflows. This descriptive review explores the latest advancem…
AQUAFace: Age-Invariant Quality Adaptive Face Recognition for Unconstrained Selfie vs ID Verification Open
Face recognition in the presence of age and quality variations poses a formidable challenge. While recent margin-based loss functions have shown promise in addressing these variations individually, real-world scenarios such as selfie versu…
3D virtual reality and image processing in landscape design Open
HyperSpaceX: Radial and Angular Exploration of HyperSpherical Dimensions Open
Traditional deep learning models rely on methods such as softmax cross-entropy and ArcFace loss for tasks like classification and face recognition. These methods mainly explore angular features in a hyperspherical space, often resulting in…
Learn "No" to Say "Yes" Better: Improving Vision-Language Models via Negations Open
Existing vision-language models (VLMs) treat text descriptions as a unit, confusing individual concepts in a prompt and impairing visual semantic matching and reasoning. An important aspect of reasoning in logic and language is negations. …
BirdCollect: A Comprehensive Benchmark for Analyzing Dense Bird Flock Attributes Open
Automatic recognition of bird behavior from long-term, uncontrolled outdoor imagery can contribute to conservation efforts by enabling large-scale monitoring of bird populations. Current techniques in AI-based wildlife monitoring have foc…
Adventures of Trustworthy Vision-Language Models: A Survey Open
Recently, transformers have become incredibly popular in computer vision and vision-language tasks. This notable rise in their usage can be primarily attributed to the capabilities offered by attention mechanisms and the outstanding abilit…
Optimizing Skin Lesion Classification via Multimodal Data and Auxiliary Task Integration Open
The rising global prevalence of skin conditions, some of which can escalate to life-threatening stages if not timely diagnosed and treated, presents a significant healthcare challenge. This issue is particularly acute in remote areas where…
On Responsible Machine Learning Datasets with Fairness, Privacy, and Regulatory Norms Open
Artificial Intelligence (AI) has made its way into various scientific fields, providing astonishing improvements over existing algorithms for a wide variety of tasks. In recent years, there have been severe concerns over the trustworthines…
Multi-task Explainable Skin Lesion Classification Open
Skin cancer is one of the deadliest diseases and has a high mortality rate if left untreated. The diagnosis generally starts with visual screening and is followed by a biopsy or histopathological examination. Early detection can aid in low…
On AI-Assisted Pneumoconiosis Detection from Chest X-rays Open
According to the World Health Organization, Pneumoconiosis affects millions of workers globally, with an estimated 260,000 deaths annually. The burden of Pneumoconiosis is particularly high in low-income countries, where occupational safety…
Long-term Monitoring of Bird Flocks in the Wild Open
Monitoring and analysis of wildlife are key to conservation planning and conflict management. The widespread use of camera traps coupled with AI-based analysis tools serves as an excellent example of successful and non-invasive use of tech…
Uncovering the Deceptions: An Analysis on Audio Spoofing Detection and Future Prospects Open
Audio has become an increasingly crucial biometric modality due to its ability to provide an intuitive way for humans to interact with machines. It is currently being used for a range of applications including person authentication to bank…
NutriAI: AI-Powered Child Malnutrition Assessment in Low-Resource Environments Open
Malnutrition among infants and young children is a pervasive public health concern, particularly in developing countries where resources are limited. Millions of children globally suffer from malnourishment and its complications. Despite …
Uncovering the Deceptions: An Analysis on Audio Spoofing Detection and Future Prospects Open
Audio has become an increasingly crucial biometric modality due to its ability to provide an intuitive way for humans to interact with machines. It is currently being used for a range of applications, including person authentication to ban…
Best Paper Section IEEE International Conference on Automatic Face and Gesture Recognition 2021 Open
The IEEE International Conference on Automatic Face and Gesture Recognition (FG) is the premier international conference on vision-based automatic face and body behavior analysis and applications. Since the first meeting in Zurich in 1994,…
IdProv: Identity-Based Provenance for Synthetic Image Generation (Student Abstract) Open
Recent advancements in Generative Adversarial Networks (GANs) have made it possible to obtain high-quality face images of synthetic identities. These networks see large amounts of real faces in order to learn to generate realistic looking …
AI-based radiodiagnosis using chest X-rays: A review Open
Chest Radiograph or Chest X-ray (CXR) is a common, fast, non-invasive, relatively cheap radiological examination method in medical sciences. CXRs can aid in diagnosing many lung ailments such as Pneumonia, Tuberculosis, Pneumoconiosis, COV…
Corruption Depth: Analysis of DNN Depth for Misclassification Open
A novel abnormality annotation database for COVID-19 affected frontal lung X-rays Open
Consistent clinical observations of characteristic findings of COVID-19 pneumonia on chest X-rays have attracted the research community to strive to provide a fast and reliable method for screening suspected patients. Several machine learn…
On AI Approaches for Promoting Maternal and Neonatal Health in Low Resource Settings: A Review Open
A significant challenge for hospitals and medical practitioners in low- and middle-income nations is the lack of sufficient health care facilities for timely medical diagnosis of chronic and deadly diseases. Particularly, maternal and neon…
Boosting Face Presentation Attack Detection in Multi-Spectral Videos Through Score Fusion of Wavelet Partition Images Open
Presentation attack detection (PAD) algorithms have become an integral requirement for the secure usage of face recognition systems. As face recognition algorithms and applications increase from constrained to unconstrained environments an…
Facial Retouching and Alteration Detection Open
On the social media platforms, the filters for digital retouching and face beautification have become a common trend. With the availability of easy-to-use image editing tools, the generation of altered images has become an effortless task.…
MagNet: Detecting Digital Presentation Attacks on Face Recognition Open
Presentation attacks on face recognition systems are classified into two categories: physical and digital. While much research has focused on physical attacks such as photo, replay, and mask attacks, digital attacks such as morphing have r…