Towards an Automated Multimodal Approach for Video Summarization: Building a Bridge Between Text, Audio and Facial Cue-Based Summarization Open

Md. Moinul Islam, Sofoklis Kakouros, Janne Heikkilä, Mourad Oussalah · 2025

The increasing volume of video content in educational, professional, and social domains necessitates effective summarization techniques that go beyond traditional unimodal approaches. This paper proposes a behaviour-aware multimodal video …

Investigating the Impact of Word Informativeness on Speech Emotion Recognition Open

Sofoklis Kakouros · 2025

In emotion recognition from speech, a key challenge lies in identifying speech signal segments that carry the most relevant acoustic variations for discerning specific emotions. Traditional approaches compute functionals for features such …

Sounding Like a Winner? Prosodic Differences in Post-Match Interviews Open

Sofoklis Kakouros, Haoyu Chen · 2025

This study examines the prosodic characteristics associated with winning and losing in post-match tennis interviews. Additionally, this research explores the potential to classify match outcomes solely based on post-match interview recordi…

Investigating the Utility of Surprisal from Large Language Models for Speech Synthesis Prosody Open

Sofoklis Kakouros, Juraj Šimko, Martti Vainio, Antti Suni · 2023

This paper investigates the use of word surprisal, a measure of the predictability of a word in a given context, as a feature to aid speech synthesis prosody. We explore how word surprisal extracted from large language models (LLMs) correl…

The Power of Prosody and Prosody of Power: An Acoustic Analysis of Finnish Parliamentary Speech Open

Martti Vainio, Antti Suni, Juraj Šimko, Sofoklis Kakouros · 2023

Political science Psychology Computer science

Parliamentary recordings provide a rich source of data for studying how politicians use speech to convey their messages and influence their audience. This provides a unique context for studying how politicians use speech, especially prosod…

North Sámi Dialect Identification with Self-supervised Speech Models Open

Sofoklis Kakouros, Katri Hiovain-Asikainen · 2023

Computer science Philosophy

The North Sámi (NS) language encapsulates four primary dialectal variants that are related but that also have differences in their phonology, morphology, and vocabulary. The unique geopolitical location of NS speakers means that in many ca…

What does BERT learn about prosody? Open

Sofoklis Kakouros, Johannah O’Mahony · 2023

Computer science

Language models have become nearly ubiquitous in natural language processing applications achieving state-of-the-art results in many tasks including prosody. As the model design does not define predetermined linguistic targets during train…

Speech-based emotion recognition with self-supervised models using attentive channel-wise correlations and label smoothing Open

Sofoklis Kakouros, Themos Stafylakis, Ladislav Mošner, Lukáš Burget · 2022

Computer science

When recognizing emotions from speech, we encounter two common problems: how to optimally capture emotion-relevant information from the speech signal and how to best quantify or categorize the noisy subjective emotion labels. Self-supervis…

Extracting speaker and emotion information from self-supervised speech models via channel-wise correlations Open

Themos Stafylakis, Ladislav Mošner, Sofoklis Kakouros, Oldřich Plchot, Lukáš Burget, et al. · 2022

Computer science Mathematics Political science

Self-supervised learning of speech representations from large amounts of unlabeled data has enabled state-of-the-art results in several speech processing tasks. Aggregating these speech representations across time is typically approached b…

The Effects of a Digital Articulatory Game on the Ability to Perceive Speech-Sound Contrasts in Another Language Open

Sari Ylinen, Anna-Riikka Smolander, Reima Karhila, Sofoklis Kakouros, Jari Lipsanen, et al. · 2021

Computer science Psychology Philosophy

Digital and mobile devices enable easy access to applications for the learning of foreign languages. However, experimental studies on the effectiveness of these applications are scarce. Moreover, it is not understood whether the effects of…

Comparative Analysis of Majority Language Influence on North Sámi Prosody Using WaveNet-Based Modeling Open

Katri Hiovain, Antti Suni, Sofoklis Kakouros, Juraj Šimko · 2020

Computer science Psychology Biology

The Finnmark North Sámi is a variety of North Sámi language, an indigenous, endangered minority language spoken in the northernmost parts of Norway and Finland. The speakers of this language are bilingual, and regularly speak the majority …

Prosodic Representations of Prominence Classification Neural Networks and Autoencoders Using Bottleneck Features Open

Sofoklis Kakouros, Antti Suni, Juraj Šimko, Martti Vainio · 2019

Computer science

Prominence perception has been known to correlate with a complex interplay of the acoustic features of energy, fundamental frequency, spectral tilt, and duration. The contribution and importance of each of these features in distinguishing …

Predicting Prosodic Prominence from Text with Pre-trained Contextualized Word Representations Open

Aarne Talman, Antti Suni, Hande Çelikkanat, Sofoklis Kakouros, Jörg Tiedemann, et al. · 2019

Computer science Mathematics Philosophy

In this paper we introduce a new natural language processing dataset and benchmark for predicting prosodic prominence from written text. To our knowledge this will be the largest publicly available dataset with prosodic labels. We descr…

Predicting Prosodic Prominence from Text with Pre-trained Contextualized Word Representations Open

Aarne Talman, Antti Suni, Hande Çelikkanat, Sofoklis Kakouros, Jörg Tiedemann, et al. · 2019

Computer science Mathematics Geography

In this paper we introduce a new natural language processing dataset and benchmark for predicting prosodic prominence from written text. To our knowledge this will be the largest publicly available dataset with prosodic labels. We describe…

Cross-linguistic Influences on Sentence Accent Detection in Background Noise Open

Odette Scharenborg, Sofoklis Kakouros, Brechtje Post, Fanny Meunier · 2019

Psychology Computer science Philosophy

This paper investigates whether sentence accent detection in a non-native language is dependent on (relative) similarity between prosodic cues to accent between the non-native and the native language, and whether cross-linguistic differenc…

The Effect of Noise on Emotion Perception in an Unknown Language Open

Odette Scharenborg, Sofoklis Kakouros, Jiska Koemans · 2018

Psychology Computer science

This is the first study investigating the influence of “realistic” noise on verbal emotion perception in an unknown language. We do so by linking emotion perception to acoustic characteristics known to be correlated with emotion perception…

Sentence Accent Perception in Noise by French Non-Native Listeners of English Open

Odette Scharenborg, Fanny Meunier, Sofoklis Kakouros, Brechtje Post · 2018

Computer science Psychology Physics

Evaluation of Spectral Tilt Measures for Sentence Prominence Under Different Noise Conditions Open

Sofoklis Kakouros, Okko Räsänen, Paavo Alku · 2017

Computer science Mathematics Physics

Spectral tilt has been suggested to be a correlate of prominence in speech, although several studies have not replicated this empirically. This may be partially due to the lack of a standard method for tilt estimation from speech, renderin…

Cognitive and probabilistic basis of prominence perception in speech Open

Sofoklis Kakouros · 2017

Psychology Computer science Mathematics

The research in this thesis examines the topic of the cognitive and probabilistic nature of prominence perception in speech. In recent years, there has been an accumulating number of studies from linguistics, phonetics, and neuroscience pr…

Perception of Sentence Stress in Speech Correlates With the Temporal Unpredictability of Prosodic Features Open

Sofoklis Kakouros, Okko Räsänen · 2015

Psychology Computer science Mathematics

Numerous studies have examined the acoustic correlates of sentential stress and its underlying linguistic functionality. However, the mechanism that connects stress cues to the listener's attentional processing has remained unclear. Also, …

Sofoklis Kakouros