Explanipedia

Global Average Feature Augmentation for Robust Semantic Segmentation with Transformers Open

António J. Salgado, Maying Schen, Philipp Harzig, Péter Mayer, José Manuel González y Fernández Valles · 2024

Robustness to out-of-distribution data is crucial for deploying modern neural networks. Recently, Vision Transformers, such as SegFormer for semantic segmentation, have shown impressive robustness to visual corruptions like blur or noise a…

Extended Self-Critical Pipeline for Transforming Videos to Text (TRECVID-VTT Task 2021) -- Team: MMCUniAugsburg Open

Philipp Harzig, Moritz Einfalt, Katja Ludwig, Rainer Lienhart · 2021

Computer science Engineering Art

The Multimedia and Computer Vision Lab of the University of Augsburg participated in the VTT task only. We use the VATEX and TRECVID-VTT datasets for training our VTT models. We base our model on the Transformer approach for both of our su…

Synchronized Audio-Visual Frames with Fractional Positional Encoding for Transformers in Video-to-Text Translation Open

Philipp Harzig, Moritz Einfalt, Rainer Lienhart · 2021

Computer science Physics

Video-to-Text (VTT) is the task of automatically generating descriptions for short audio-visual video clips, which can, for instance, help visually impaired people understand scenes of a YouTube video. Transformer architectures have sh…

A comprehensive analysis of classification methods in gastrointestinal endoscopy imaging Open

Debesh Jha, Sharib Ali, Steven A. Hicks, Vajira Thambawita, Hanna Borgli, et al. · 2021

Medicine Computer science Economics

Gastrointestinal (GI) endoscopy has been an active field of research motivated by the large number of highly lethal GI cancers. Early GI cancer precursors are often missed during the endoscopic surveillance. The high missed rate of such ab…

Addressing Data Bias Problems for Chest X-ray Image Report Generation Open

Philipp Harzig, Yanying Chen, Francine Chen, Rainer Lienhart · 2019

Computer science Psychology Physics

Automatic medical report generation from chest X-ray images is one possibility for assisting doctors to reduce their workload. However, the different patterns and data distribution of normal and abnormal cases can bias machine learning mod…

Image Captioning with Clause-Focused Metrics in a Multi-modal Setting for Marketing Open

Philipp Harzig, Dan Zecha, Rainer Lienhart, Carolin Kaiser, René Schallner · 2019

Computer science Physics Philosophy

Automatically generating descriptive captions for images is a well-researched area in computer vision. However, existing evaluation approaches focus on measuring the similarity between two sentences disregarding fine-grained semantics o…

Visual Question Answering With a Hybrid Convolution Recurrent Model Open

Philipp Harzig, Christian Eggert, Rainer Lienhart · 2018

Computer science Economics

Visual Question Answering (VQA) is a relatively new task, which tries to infer answer sentences for an input image coupled with a corresponding question. Instead of dynamically generating answers, they are usually inferred by finding the m…

Multimodal Image Captioning for Marketing Analysis Open

Philipp Harzig, Stephan Brehm, Rainer Lienhart, Carolin Kaiser, René Schallner · 2018

Computer science Mathematics Philosophy

Automatically captioning images with natural language sentences is an important research topic. State of the art models are able to produce human-like sentences. These models typically describe the depicted scene as a whole and do not t…

Philipp Harzig