Silvan Mertes
GestaltGAN: synthetic photorealistic portraits of individuals with rare genetic disorders
The facial gestalt (overall facial morphology) is a characteristic clinical feature in many genetic disorders that is often essential for suspecting and establishing a specific diagnosis. Therefore, publishing images of individuals affecte…
The ForDigitStress Dataset: A Multi-Modal Dataset for Automatic Stress Recognition
We present a multi-modal stress dataset that uses digital job interviews to induce stress. The dataset provides multi-modal data of 40 participants including audio, video (motion capturing, facial landmarks, eye tracking) as well as physio…
Towards Automated Annotation of Infant-Caregiver Engagement Phases with Multimodal Foundation Models
Caregiver mental health disorders increase the risk of insecure infant attachment and can negatively impact multiple aspects of child development, including cognitive, emotional, and social growth. Infant-caregiver interactions contain sub…
XR Composition in the Wild: The Impact of User Environments on Creativity, UX and Flow during Music Production in Augmented Reality
With the advent of HMD-based Mixed Reality, or “Spatial Computing” as framed by Apple, creativity- and productivity-related use-cases in XR, such as music production, are rising in popularity. However, even though the importance of environ…
A Machine Learning-Driven Interactive Training System for Extreme Vocal Techniques
The scarcity of vocal instructors proficient in extreme vocal techniques and the lack of individualized feedback present challenges for novices learning these techniques. Therefore, this work explores the use of neural networks to provide …
GestaltGAN: Synthetic photorealistic portraits of individuals with rare genetic disorders
The facial gestalt (overall facial morphology) is a characteristic clinical feature in many genetic disorders that is often essential for suspecting and establishing a specific diagnosis. For that reason, publishing images of individuals a…
VoiceX: A Text-To-Speech Framework for Custom Voices
Modern TTS systems are capable of creating highly realistic and natural-sounding speech. Despite these developments, the process of customizing TTS voices remains a complex task, mostly requiring the expertise of specialists within the fie…
The STOIC2021 COVID-19 AI challenge: Applying reusable training methodologies to private data
Challenges drive the state-of-the-art of automated medical image analysis. The quantity of public training data that they provide can limit the performance of their solutions. Public access to the training methodology for these solutions r…
Anonymization of Faces
This paper explores face anonymization techniques in the context of the General Data Protection Regulation (GDPR) amidst growing privacy concerns due to the widespread use of personal data in machine learning. We focus on u…
Giving Robots a Voice: Human-in-the-Loop Voice Creation and open-ended Labeling
Speech is a natural interface for humans to interact with robots. Yet, aligning a robot's voice to its appearance is challenging due to the rich vocabulary of both modalities. Previous research has explored a few labels to describe robots …
Relevant Irrelevance: Generating Alterfactual Explanations for Image Classifiers
In this paper, we demonstrate the feasibility of alterfactual explanations for black box image classifiers. Traditional explanation mechanisms from the field of Counterfactual Thinking are a widely-used paradigm for Explainable Artificial …
The feeling of being classified: raising empathy and awareness for AI bias through perspective-taking in VR
In a world increasingly driven by AI systems, controversial use cases for AI that significantly affect people’s lives become more likely scenarios. Hence, increasing awareness of AI bias that might affect underprivileged groups becomes an …
The AffectToolbox: Affect Analysis for Everyone
In the field of affective computing, where research continually advances at a rapid pace, the demand for user-friendly tools has become increasingly apparent. In this paper, we present the AffectToolbox, a novel software system that aims t…
GANonymization: A GAN-Based Face Anonymization Framework for Preserving Emotional Expressions
In recent years, the increasing availability of personal data has raised concerns regarding privacy and security. One of the critical processes to address these concerns is data anonymization, which aims to protect individual privacy and p…
ASMRcade: Interactive Audio Triggers for an Autonomous Sensory Meridian Response
Autonomous Sensory Meridian Response (ASMR) is a sensory phenomenon involving pleasurable tingling sensations in response to stimuli such as whispering, tapping, and hair brushing. It is increasingly used to promote health and well-being, …
Multimodal Irony for Virtual Characters
Humor is an important communicative skill in human interactions. Intelligent virtual agents can leverage it to increase their believability and overall interaction experience. In this paper, we focus on transferring and implementing existi…
The Affective Bar Piano
Music is a great way of supporting a story. It adds a new layer of affective information and as such substantially increases the listening experience in storytelling scenarios. However, in real-time settings, creating emotionally fitting m…
The STOIC2021 COVID-19 AI challenge: applying reusable training methodologies to private data
Challenges drive the state-of-the-art of automated medical image analysis. The quantity of public training data that they provide can limit the performance of their solutions. Public access to the training methodology for these solutions r…
Towards Automated COVID-19 Presence and Severity Classification
COVID-19 presence classification and severity prediction via (3D) thorax computed tomography scans have become important tasks in recent times. Especially for capacity planning of intensive care units, predicting the future severity of a C…
GANonymization: A GAN-based Face Anonymization Framework for Preserving Emotional Expressions
In recent years, the increasing availability of personal data has raised concerns regarding privacy and security. One of the critical processes to address these concerns is data anonymization, which aims to protect individual privacy and p…
Wish You Were Here: Mental and Physiological Effects of Remote Music Collaboration in Mixed Reality
With face-to-face music collaboration being severely limited during the recent pandemic, mixed reality technologies and their potential to provide musicians a feeling of "being there" with their musical partner can offer tremendous opportu…
ForDigitStress: A multi-modal stress dataset employing a digital job interview scenario
We present a multi-modal stress dataset that uses digital job interviews to induce stress. The dataset provides multi-modal data of 40 participants including audio, video (motion capturing, facial recognition, eye tracking) as well as phys…
An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era
Speech is the fundamental mode of human communication, and its synthesis has long been a core priority in human-computer interaction research. In recent years, machines have managed to master the art of generating speech that is understand…
GANterfactual-RL: Understanding Reinforcement Learning Agents' Strategies through Visual Counterfactual Explanations
Counterfactual explanations are a common tool to explain artificial intelligence models. For Reinforcement Learning (RL) agents, they answer "Why not?" or "What if?" questions by illustrating what minimal change to a state is needed such t…
Alterfactual Explanations -- The Relevance of Irrelevance for Explaining AI Systems
Explanation mechanisms from the field of Counterfactual Thinking are a widely-used paradigm for Explainable Artificial Intelligence (XAI), as they follow a natural way of reasoning that humans are familiar with. However, all common approac…
"GAN I hire you?" -- A System for Personalized Virtual Job Interview Training
Job interviews are usually high-stakes social situations where professional and behavioral skills are required for a satisfactory outcome. Professional job interview trainers give educative feedback about the shown behavior according to co…