Zakir Durumeric
Censys: A Map of Internet Hosts and Services
Formalizing Dependence of Web Infrastructure
Hierarchical Level-Wise News Article Clustering via Multilingual Matryoshka Embeddings
Contextual large language model embeddings are increasingly utilized for topic modeling and clustering. However, current methods often scale poorly, rely on opaque similarity metrics, and struggle in multilingual settings. In this work, we…
User Profiles: The Achilles' Heel of Web Browsers
Web browsers provide the security foundation for our online experiences. Significant research has been done into the security of browsers themselves, but relatively little investigation has been done into how they interact with the operati…
Tracking the Takes and Trajectories of English-Language News Narratives across Trustworthy and Worrisome Websites
Understanding how misleading and outright false information enters news ecosystems remains a difficult challenge that requires tracking how narratives spread across thousands of fringe and mainstream news websites. To do this, we introduce…
PressProtect: Helping Journalists Navigate Social Media in the Face of Online Harassment
Social media has become a critical tool for journalists to disseminate their work, engage with their audience, and connect with sources. Unfortunately, journalists also regularly endure significant online harassment on social media platfor…
Characterizing the MrDeepFakes Sexual Deepfake Marketplace
The prevalence of sexual deepfake material has exploded over the past several years. Attackers create and utilize deepfakes for many reasons: to seek sexual gratification, to harass and humiliate targets, or to exert power over an intimate…
On the Centralization and Regionalization of the Web
Over the past decade, Internet centralization and its implications for both people and the resilience of the Internet have become a topic of active debate. While the networking community informally agrees on the definition of centralization…
Ten Years of ZMap
Since ZMap's debut in 2013, networking and security researchers have used the open-source scanner to write hundreds of research papers that study Internet behavior. In addition, ZMap has been adopted by the security industry to build new c…
Machine-Made Media: Monitoring the Mobilization of Machine-Generated Articles on Misinformation and Mainstream News Websites
As large language models (LLMs) like ChatGPT have gained traction, an increasing number of news websites have begun utilizing them to generate articles. However, not only can these language models produce factually inaccurate articles on r…
Watch Your Language: Investigating Content Moderation with Large Language Models
Large language models (LLMs) have exploded in popularity due to their ability to perform a wide array of natural language tasks. Text-based content moderation is one LLM use case that has received recent enthusiasm; however, there is littl…
Partial Mobilization: Tracking Multilingual Information Flows amongst Russian Media Outlets and Telegram
In response to disinformation and propaganda from Russian online media following the invasion of Ukraine, Russian media outlets such as Russia Today and Sputnik News were banned throughout Europe. To maintain viewership, many of these Russ…
The Code the World Depends On: A First Look at Technology Makers' Open Source Software Dependencies
Open-source software (OSS) supply chain security has become a topic of concern for organizations. Patching an OSS vulnerability can require updating other dependent software products in addition to the original package. However, the landsc…
CATO: End-to-End Optimization of ML-Based Traffic Analysis Pipelines
Machine learning has shown tremendous potential for improving the capabilities of network traffic analysis applications, often outperforming simpler rule-based heuristics. However, ML-based solutions remain difficult to deploy in practice.…
Cloud Watching: Understanding Attacks Against Cloud-Hosted Services
Cloud computing has dramatically changed service deployment patterns. In this work, we analyze how attackers identify and target cloud services in contrast to traditional enterprise networks and network telescopes. Using a diverse set of c…
Stale TLS Certificates: Investigating Precarious Third-Party Access to Valid TLS Keys
Certificate authorities enable TLS server authentication by generating certificates that attest to the mapping between a domain name and a cryptographic keypair, for up to 398 days. This static, name-to-key caching mechanism belies a compl…
TATA: Stance Detection via Topic-Agnostic and Topic-Aware Embeddings
Stance detection is important for understanding different attitudes and beliefs on the Internet. However, given that a passage's stance toward a given topic is often highly dependent on that topic, building a stance detection model that ge…
Specious Sites: Tracking the Spread and Sway of Spurious News Stories at Scale
Misinformation, propaganda, and outright lies proliferate on the web, with some narratives having dangerous real-world consequences on public health, elections, and individual safety. However, despite the impact of misinformation, the rese…
Twits, Toxic Tweets, and Tribal Tendencies: Trends in Politically Polarized Posts on Twitter
Social media platforms are often blamed for exacerbating political polarization and worsening public dialogue. Many claim that hyperpartisan users post pernicious content, slanted to their political views, inciting contentious and toxic co…
Democratizing LEO Satellite Network Measurement
Low Earth Orbit (LEO) satellite networks are quickly gaining traction with promises of impressively low latency, high bandwidth, and global reach. However, the research community knows relatively little about their operation and performanc…
"A Special Operation": A Quantitative Approach to Dissecting and Comparing Different Media Ecosystems’ Coverage of the Russo-Ukrainian War
The coverage of the Russian invasion of Ukraine has varied widely between Western, Russian, and Chinese media ecosystems with propaganda, disinformation, and narrative spins present in all three. By utilizing the normalized pointwise mutua…
Happenstance: Utilizing Semantic Search to Track Russian State Media Narratives about the Russo-Ukrainian War on Reddit
In the buildup to and in the weeks following the Russian Federation’s invasion of Ukraine, Russian state media outlets output torrents of misleading and outright false information. In this work, we study this coordinated information campai…
Understanding the Behaviors of Toxic Accounts on Reddit
Toxic comments are the top form of hate and harassment experienced online. While many studies have investigated the types of toxic comments posted online, the effects that such content has on people, and the impact of potential defenses, n…
Quantifying the Systematic Bias in the Accessibility and Inaccessibility of Web Scraping Content from URL-Logged Web-Browsing Digital Trace Data
Social scientists and computer scientists are increasingly using observational digital trace data and analyzing these data post hoc to understand the content people are exposed to online. However, these content collection efforts may be sy…
Sub-Standards and Mal-Practices: Misinformation's Role in Insular, Polarized, and Toxic Interactions on Reddit
In this work, we examine the influence of unreliable information on political incivility and toxicity on the social media platform Reddit. We show that comments on articles from unreliable news websites are posted more often in right-leani…
A Golden Age: Conspiracy Theories' Relationship with Misinformation Outlets, News Media, and the Wider Internet
Do we live in a "Golden Age of Conspiracy Theories?" In the last few decades, conspiracy theories have proliferated on the Internet with some having dangerous real-world consequences. A large contingent of those who participated in the Jan…