Manuel Brack
Measuring and Guiding Monosemanticity
There is growing interest in leveraging mechanistic interpretability and controllability to better understand and influence the internal dynamics of large language models (LLMs). However, current methods face fundamental challenges in reli…
LLMs Lost in Translation: M-ALERT uncovers Cross-Linguistic Safety Inconsistencies
Building safe Large Language Models (LLMs) across multiple languages is essential to ensuring both safe access and linguistic diversity. To this end, we conduct a large-scale, comprehensive safety evaluation of the current LLM landscape. F…
SCAR: Sparse Conditioned Autoencoders for Concept Detection and Steering in LLMs
Large Language Models (LLMs) have demonstrated remarkable capabilities in generating human-like text, but their output may not be aligned with the user or even produce harmful content. This paper presents a novel approach to detect and ste…
Core Tokensets for Data-efficient Sequential Training of Transformers
Deep networks are frequently tuned to novel tasks and continue learning from ongoing data streams. Such sequential training requires consolidation of new and past information, a challenge predominantly addressed by retaining the most impor…
Does CLIP Know My Face?
With the rise of deep learning in various applications, privacy concerns around the protection of training data have become a critical area of research. Whereas prior studies have focused on privacy risks in single-modal models, we introdu…
T-FREE: Subword Tokenizer-Free Generative LLMs via Sparse Representations for Memory-Efficient Embeddings
Tokenizers are crucial for encoding information in Large Language Models, but their development has recently stagnated, and they contain inherent weaknesses. Major limitations include computational overhead, ineffective vocabulary use, and…
LlavaGuard: An Open VLM-based Framework for Safeguarding Vision Datasets and Models
This paper introduces LlavaGuard, a suite of VLM-based vision safeguards that address the critical need for reliable guardrails in the era of large-scale data and models. To this end, we establish a novel open framework, describing a custo…
DeiSAM: Segment Anything with Deictic Prompting
Large-scale, pre-trained neural networks have demonstrated strong capabilities in various tasks, including zero-shot image segmentation. To identify concrete objects in complex scenes, humans instinctively rely on deictic descriptions in n…
LEDITS++: Limitless Image Editing using Text-to-Image Models
Text-to-image diffusion models have recently received increasing interest for their astonishing ability to produce high-fidelity images from solely text inputs. Subsequent research efforts aim to exploit and apply their capabilities to rea…
Distilling Adversarial Prompts from Safety Benchmarks: Report for the Adversarial Nibbler Challenge
Text-conditioned image generation models have recently achieved astonishing image quality and alignment results. Consequently, they are employed in a fast-growing number of applications. Since they are highly data-driven, relying on billio…
MultiFusion: Fusing Pre-Trained Models for Multi-Lingual, Multi-Modal Image Generation
The recent popularity of text-to-image diffusion models (DM) can largely be attributed to the intuitive interface they provide to users. The intended generation can be expressed in natural language, with the model producing faithful interp…
Class Attribute Inference Attacks: Inferring Sensitive Class Information by Diffusion-Based Attribute Manipulations
Neural network-based image classifiers are powerful tools for computer vision tasks, but they inadvertently reveal sensitive attribute information about their classes, raising concerns about their privacy. To investigate this privacy leaka…
Fair Diffusion: Instructing Text-to-Image Generation Models on Fairness
Generative AI models have recently achieved astonishing results in quality and are consequently employed in a fast-growing number of applications. However, since they are highly data-driven, relying on billion-sized datasets randomly scrap…
SEGA: Instructing Text-to-Image Models using Semantic Guidance
Text-to-image diffusion models have recently received a lot of interest for their astonishing ability to produce high-fidelity images from text only. However, achieving one-shot generation that aligns with the user's intent is nearly impos…
The Stable Artist: Steering Semantics in Diffusion Latent Space
Large, text-conditioned generative diffusion models have recently gained a lot of attention for their impressive performance in generating high-fidelity images from text alone. However, achieving high-quality results is almost unfeasible i…
Exploiting Cultural Biases via Homoglyphs in Text-to-Image Synthesis
Models for text-to-image synthesis, such as DALL-E~2 and Stable Diffusion, have recently drawn a lot of interest from academia and the general public. These models are capable of producing high-quality images that depict a variety of conce…
I2G Benchmark
Inappropriate Image Prompts (I2G): a benchmark for text-to-image diffusion models.