A Public and Reproducible Assessment of the Topics API on Real Data
2024 · Open Access · DOI: https://doi.org/10.48550/arxiv.2403.19577
The Topics API for the web is Google's privacy-enhancing alternative to third-party cookies. Results of prior work have led to an ongoing discussion between Google and research communities about the capability of Topics to trade off utility and privacy. The central point of contention is largely the realism of the datasets used in these analyses and their reproducibility: researchers have used data collected on a small sample of users or generated synthetic datasets, while Google's results are inferred from a private dataset. In this paper, we complement prior research by performing a reproducible assessment of the latest version of the Topics API on the largest publicly available dataset of real browsing histories. First, we measure how unique and stable real users' interests are over time. Then, we evaluate whether Topics can be used to fingerprint the users from these real browsing traces by adapting methodologies from prior privacy studies. Finally, we call on web actors to perform and enable reproducible evaluations by releasing anonymized distributions. We find that for the 1207 real users in this dataset, the probability of being re-identified across websites is 2%, 3%, and 4% after 1, 2, and 3 observations of their topics by advertisers, respectively. This paper shows on real data that Topics does not provide the same privacy guarantees to all users and that the information leakage worsens over time, further highlighting the need for public and reproducible evaluations of the claims made by new web proposals.
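
To make the notion of uniqueness over successive observations concrete, here is a minimal sketch in Python. It is not the paper's methodology: the function name `unique_fraction_after`, the toy users, and the topic labels are all hypothetical, and the model ignores any noise or filtering the real API applies. It only illustrates why accumulating more observations of a user's topics can make that user easier to single out.

```python
from collections import Counter


def unique_fraction_after(observations, k):
    """Fraction of users whose accumulated topic set after k epochs
    is shared with no other user (hypothetical, simplified model)."""
    profiles = {}
    for user, epochs in observations.items():
        # Union of all topics observed for this user in the first k epochs.
        seen = set()
        for topics in epochs[:k]:
            seen.update(topics)
        profiles[user] = frozenset(seen)
    counts = Counter(profiles.values())
    unique_users = sum(1 for p in profiles.values() if counts[p] == 1)
    return unique_users / len(profiles)


# Hypothetical toy data: one list of topics per epoch, per user.
toy = {
    "u1": [["news"], ["sports"], ["travel"]],
    "u2": [["news"], ["sports"], ["cooking"]],
    "u3": [["news"], ["music"], ["travel"]],
    "u4": [["arts"], ["sports"], ["travel"]],
}

fractions = [unique_fraction_after(toy, k) for k in (1, 2, 3)]
print(fractions)  # [0.25, 0.5, 1.0]
```

In this toy example the share of uniquely identifiable users grows with each additional observation, which mirrors the direction of the trend reported in the abstract, though the numbers themselves are made up.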
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2403.19577, https://arxiv.org/pdf/2403.19577
- OA Status: green
- Cited By: 2
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4393336310
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4393336310 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2403.19577 (Digital Object Identifier)
- Title: A Public and Reproducible Assessment of the Topics API on Real Data (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2024 (year of publication)
- Publication date: 2024-03-28 (full publication date if available)
- Authors: Yohan Beugin, Patrick McDaniel (list of authors in order)
- Landing page: https://arxiv.org/abs/2403.19577 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2403.19577 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2403.19577 (direct OA link when available)
- Concepts: Computer science, Data science, Information retrieval (top concepts (fields/topics) attached by OpenAlex)
- Cited by: 2 (total citation count in OpenAlex)
- Citations by year (recent): 2025: 1, 2024: 1 (per-year citation counts, last 5 years)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4393336310 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2403.19577 |
| ids.doi | https://doi.org/10.48550/arxiv.2403.19577 |
| ids.openalex | https://openalex.org/W4393336310 |
| fwci | |
| type | preprint |
| title | A Public and Reproducible Assessment of the Topics API on Real Data |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11986 |
| topics[0].field.id | https://openalex.org/fields/18 |
| topics[0].field.display_name | Decision Sciences |
| topics[0].score | 0.934499979019165 |
| topics[0].domain.id | https://openalex.org/domains/2 |
| topics[0].domain.display_name | Social Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1802 |
| topics[0].subfield.display_name | Information Systems and Management |
| topics[0].display_name | Scientific Computing and Data Management |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C41008148 |
| concepts[0].level | 0 |
| concepts[0].score | 0.5236356258392334 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[0].display_name | Computer science |
| concepts[1].id | https://openalex.org/C2522767166 |
| concepts[1].level | 1 |
| concepts[1].score | 0.41676488518714905 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q2374463 |
| concepts[1].display_name | Data science |
| concepts[2].id | https://openalex.org/C23123220 |
| concepts[2].level | 1 |
| concepts[2].score | 0.33503592014312744 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q816826 |
| concepts[2].display_name | Information retrieval |
| keywords[0].id | https://openalex.org/keywords/computer-science |
| keywords[0].score | 0.5236356258392334 |
| keywords[0].display_name | Computer science |
| keywords[1].id | https://openalex.org/keywords/data-science |
| keywords[1].score | 0.41676488518714905 |
| keywords[1].display_name | Data science |
| keywords[2].id | https://openalex.org/keywords/information-retrieval |
| keywords[2].score | 0.33503592014312744 |
| keywords[2].display_name | Information retrieval |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2403.19577 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | cc-by |
| locations[0].pdf_url | https://arxiv.org/pdf/2403.19577 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | https://openalex.org/licenses/cc-by |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2403.19577 |
| locations[1].id | doi:10.48550/arxiv.2403.19577 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2403.19577 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5007771274 |
| authorships[0].author.orcid | https://orcid.org/0000-0003-0991-7926 |
| authorships[0].author.display_name | Yohan Beugin |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Beugin, Yohan |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5055368149 |
| authorships[1].author.orcid | https://orcid.org/0000-0003-2091-7484 |
| authorships[1].author.display_name | Patrick McDaniel |
| authorships[1].author_position | last |
| authorships[1].raw_author_name | McDaniel, Patrick |
| authorships[1].is_corresponding | False |
| has_content.pdf | True |
| has_content.grobid_xml | True |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2403.19577 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | A Public and Reproducible Assessment of the Topics API on Real Data |
| has_fulltext | True |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11986 |
| primary_topic.field.id | https://openalex.org/fields/18 |
| primary_topic.field.display_name | Decision Sciences |
| primary_topic.score | 0.934499979019165 |
| primary_topic.domain.id | https://openalex.org/domains/2 |
| primary_topic.domain.display_name | Social Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1802 |
| primary_topic.subfield.display_name | Information Systems and Management |
| primary_topic.display_name | Scientific Computing and Data Management |
| related_works | https://openalex.org/W2748952813, https://openalex.org/W2390279801, https://openalex.org/W2358668433, https://openalex.org/W2376932109, https://openalex.org/W2001405890, https://openalex.org/W2382290278, https://openalex.org/W2478288626, https://openalex.org/W4391913857, https://openalex.org/W2350741829, https://openalex.org/W2530322880 |
| cited_by_count | 2 |
| counts_by_year[0].year | 2025 |
| counts_by_year[0].cited_by_count | 1 |
| counts_by_year[1].year | 2024 |
| counts_by_year[1].cited_by_count | 1 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2403.19577 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | cc-by |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2403.19577 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | https://openalex.org/licenses/cc-by |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2403.19577 |
| primary_location.id | pmh:oai:arXiv.org:2403.19577 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | cc-by |
| primary_location.pdf_url | https://arxiv.org/pdf/2403.19577 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | https://openalex.org/licenses/cc-by |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2403.19577 |
| publication_date | 2024-03-28 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.3 | 196 |
| abstract_inverted_index.a | 66, 81, 93 |
| abstract_inverted_index.1, | 193 |
| abstract_inverted_index.2, | 194 |
| abstract_inverted_index.4% | 191 |
| abstract_inverted_index.In | 84 |
| abstract_inverted_index.We | 168 |
| abstract_inverted_index.an | 21 |
| abstract_inverted_index.be | 134 |
| abstract_inverted_index.by | 91, 145, 164, 201, 243 |
| abstract_inverted_index.if | 131 |
| abstract_inverted_index.in | 55, 176 |
| abstract_inverted_index.is | 6, 46, 186 |
| abstract_inverted_index.of | 15, 32, 44, 51, 69, 96, 100, 111, 181, 187, 198, 239 |
| abstract_inverted_index.on | 65, 104, 155, 207 |
| abstract_inverted_index.or | 71 |
| abstract_inverted_index.to | 10, 20, 34, 136, 158, 219 |
| abstract_inverted_index.we | 87, 116, 129, 153 |
| abstract_inverted_index.2%, | 188 |
| abstract_inverted_index.3%, | 189 |
| abstract_inverted_index.API | 2, 103 |
| abstract_inverted_index.The | 0, 41 |
| abstract_inverted_index.all | 220 |
| abstract_inverted_index.and | 26, 39, 58, 107, 120, 160, 190, 195, 222, 236 |
| abstract_inverted_index.are | 78, 125 |
| abstract_inverted_index.can | 133 |
| abstract_inverted_index.for | 3, 171, 234 |
| abstract_inverted_index.how | 118 |
| abstract_inverted_index.led | 19 |
| abstract_inverted_index.new | 244 |
| abstract_inverted_index.not | 213 |
| abstract_inverted_index.off | 36 |
| abstract_inverted_index.the | 4, 30, 49, 52, 97, 101, 105, 138, 172, 179, 215, 224, 232, 240 |
| abstract_inverted_index.web | 5, 156, 245 |
| abstract_inverted_index.1207 | 173 |
| abstract_inverted_index.This | 204 |
| abstract_inverted_index.both | 37 |
| abstract_inverted_index.call | 154 |
| abstract_inverted_index.data | 63, 209 |
| abstract_inverted_index.does | 212 |
| abstract_inverted_index.find | 169 |
| abstract_inverted_index.from | 80, 140, 148 |
| abstract_inverted_index.have | 18 |
| abstract_inverted_index.made | 242 |
| abstract_inverted_index.need | 233 |
| abstract_inverted_index.over | 126, 228 |
| abstract_inverted_index.real | 112, 122, 142, 174, 208 |
| abstract_inverted_index.same | 216 |
| abstract_inverted_index.that | 170, 210, 223 |
| abstract_inverted_index.this | 85, 177 |
| abstract_inverted_index.used | 54, 135 |
| abstract_inverted_index.work | 17 |
| abstract_inverted_index.Then, | 128 |
| abstract_inverted_index.about | 29 |
| abstract_inverted_index.after | 192 |
| abstract_inverted_index.being | 182 |
| abstract_inverted_index.paper | 205 |
| abstract_inverted_index.point | 43 |
| abstract_inverted_index.prior | 16, 89, 149 |
| abstract_inverted_index.shows | 206 |
| abstract_inverted_index.small | 67 |
| abstract_inverted_index.their | 59, 199 |
| abstract_inverted_index.these | 56, 141 |
| abstract_inverted_index.time, | 229 |
| abstract_inverted_index.time. | 127 |
| abstract_inverted_index.trade | 35 |
| abstract_inverted_index.users | 70, 139, 175, 221 |
| abstract_inverted_index.using | 62 |
| abstract_inverted_index.while | 75 |
| abstract_inverted_index.First, | 115 |
| abstract_inverted_index.Google | 25 |
| abstract_inverted_index.Topics | 1, 33, 102, 132, 211 |
| abstract_inverted_index.across | 184 |
| abstract_inverted_index.actors | 157 |
| abstract_inverted_index.around | 48 |
| abstract_inverted_index.claims | 241 |
| abstract_inverted_index.enable | 161 |
| abstract_inverted_index.latest | 98 |
| abstract_inverted_index.paper, | 86 |
| abstract_inverted_index.public | 235 |
| abstract_inverted_index.sample | 68 |
| abstract_inverted_index.stable | 121 |
| abstract_inverted_index.topics | 200 |
| abstract_inverted_index.traces | 144 |
| abstract_inverted_index.unique | 119 |
| abstract_inverted_index.users' | 123 |
| abstract_inverted_index.Results | 14 |
| abstract_inverted_index.between | 24 |
| abstract_inverted_index.central | 42 |
| abstract_inverted_index.dataset | 110 |
| abstract_inverted_index.further | 230 |
| abstract_inverted_index.largely | 47 |
| abstract_inverted_index.largest | 106 |
| abstract_inverted_index.leakage | 226 |
| abstract_inverted_index.measure | 117 |
| abstract_inverted_index.ongoing | 22 |
| abstract_inverted_index.perform | 159 |
| abstract_inverted_index.privacy | 150, 217 |
| abstract_inverted_index.private | 82 |
| abstract_inverted_index.provide | 214 |
| abstract_inverted_index.realism | 50 |
| abstract_inverted_index.replace | 11 |
| abstract_inverted_index.results | 77 |
| abstract_inverted_index.utility | 38 |
| abstract_inverted_index.version | 99 |
| abstract_inverted_index.worsens | 227 |
| abstract_inverted_index.Finally, | 152 |
| abstract_inverted_index.Google's | 7, 76 |
| abstract_inverted_index.adapting | 146 |
| abstract_inverted_index.analyses | 57 |
| abstract_inverted_index.browsing | 113, 143 |
| abstract_inverted_index.cookies. | 13 |
| abstract_inverted_index.dataset, | 178 |
| abstract_inverted_index.dataset. | 83 |
| abstract_inverted_index.datasets | 53 |
| abstract_inverted_index.evaluate | 130 |
| abstract_inverted_index.inferred | 79 |
| abstract_inverted_index.privacy. | 40 |
| abstract_inverted_index.publicly | 108 |
| abstract_inverted_index.research | 27, 90 |
| abstract_inverted_index.studies. | 151 |
| abstract_inverted_index.websites | 185 |
| abstract_inverted_index.available | 109 |
| abstract_inverted_index.collected | 64 |
| abstract_inverted_index.datasets, | 74 |
| abstract_inverted_index.interests | 124 |
| abstract_inverted_index.releasing | 165 |
| abstract_inverted_index.synthetic | 73 |
| abstract_inverted_index.anonymized | 166 |
| abstract_inverted_index.assessment | 95 |
| abstract_inverted_index.capability | 31 |
| abstract_inverted_index.complement | 88 |
| abstract_inverted_index.contention | 45 |
| abstract_inverted_index.discussion | 23 |
| abstract_inverted_index.generating | 72 |
| abstract_inverted_index.guarantees | 218 |
| abstract_inverted_index.histories. | 114 |
| abstract_inverted_index.performing | 92 |
| abstract_inverted_index.proposals. | 246 |
| abstract_inverted_index.alternative | 9 |
| abstract_inverted_index.communities | 28 |
| abstract_inverted_index.evaluations | 163, 238 |
| abstract_inverted_index.fingerprint | 137 |
| abstract_inverted_index.information | 225 |
| abstract_inverted_index.probability | 180 |
| abstract_inverted_index.researchers | 61 |
| abstract_inverted_index.third-party | 12 |
| abstract_inverted_index.advertisers, | 202 |
| abstract_inverted_index.highlighting | 231 |
| abstract_inverted_index.observations | 197 |
| abstract_inverted_index.reproducible | 94, 162, 237 |
| abstract_inverted_index.methodologies | 147 |
| abstract_inverted_index.re-identified | 183 |
| abstract_inverted_index.respectively. | 203 |
| abstract_inverted_index.distributions. | 167 |
| abstract_inverted_index.reproducibility; | 60 |
| abstract_inverted_index.privacy-enhancing | 8 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 2 |
| citation_normalized_percentile | |
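
The `abstract_inverted_index` rows above map each word to the positions at which it occurs in the abstract. As a minimal sketch, assuming the index is available as a Python dictionary from word to list of integer positions, the text can be rebuilt by placing each word at its positions and joining in order. The helper name `rebuild_abstract` is hypothetical, and `sample` contains only the first eight positions taken from the rows above.

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a word -> list-of-positions mapping."""
    by_position = {}
    for word, positions in inverted_index.items():
        for position in positions:
            by_position[position] = word
    # Join the words in ascending position order.
    return " ".join(by_position[i] for i in sorted(by_position))


# Subset of the table above: only positions 0 through 7.
sample = {
    "The": [0],
    "Topics": [1],
    "API": [2],
    "for": [3],
    "the": [4],
    "web": [5],
    "is": [6],
    "Google's": [7],
}

text = rebuild_abstract(sample)
print(text)  # The Topics API for the web is Google's
```

In the table, positions appear as comma-separated strings (for example `66, 81, 93`), so they would need to be split and converted to integers before being passed to a function like this one.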