Automating Website Registration for Studying GDPR Compliance
Karel Kubíček, Jakob Merane, Ahmed Bouhoula, David Basin
2024 · Open Access · DOI: https://doi.org/10.1145/3589334.3645709
Figure 1: Overview of steps of our study and results. Steps shown: automated crawl, automated registration, ML-based violation detection. Figures shown: 660k websites from Tranco 1M (25.7% found registration form, 23.6% errors, 50.7% no form); 5.2% of forms are insecure; 22.8% of forms submitted successfully; 33.9k websites send us emails; 12 605 (37.2%) potentially non-compliant senders.
Metadata
- Type: article
- Language: en
- Landing Page: https://doi.org/10.1145/3589334.3645709
- OA Status: green
- Cited By: 1
- References: 27
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4396758672
All OpenAlex metadata
- OpenAlex ID: https://openalex.org/W4396758672 (Canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.1145/3589334.3645709 (Digital Object Identifier)
- Title: Automating Website Registration for Studying GDPR Compliance (Work title)
- Type: article (OpenAlex work type)
- Language: en (Primary language)
- Publication year: 2024 (Year of publication)
- Publication date: 2024-05-08 (Full publication date if available)
- Authors: Karel Kubíček, Jakob Merane, Ahmed Bouhoula, David Basin (List of authors in order)
- Landing page: https://doi.org/10.1145/3589334.3645709 (Publisher landing page)
- Open access: Yes (Whether a free full text is available)
- OA status: green (Open access status per OpenAlex)
- OA URL: https://hdl.handle.net/20.500.11850/674024 (Direct OA link when available)
- Concepts: Compliance (psychology), Computer science, Psychology, Social psychology (Top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 1 (Total citation count in OpenAlex)
- Citations by year (recent): 2025: 1 (Per-year citation counts, last 5 years)
- References (count): 27 (Number of works referenced by this work)
- Related works (count): 10 (Other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4396758672 |
|---|---|
| doi | https://doi.org/10.1145/3589334.3645709 |
| ids.doi | https://doi.org/10.1145/3589334.3645709 |
| ids.openalex | https://openalex.org/W4396758672 |
| fwci | 2.09598175 |
| type | article |
| title | Automating Website Registration for Studying GDPR Compliance |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | 1306 |
| biblio.first_page | 1295 |
| topics[0].id | https://openalex.org/T11045 |
| topics[0].field.id | https://openalex.org/fields/33 |
| topics[0].field.display_name | Social Sciences |
| topics[0].score | 0.996999979019165 |
| topics[0].domain.id | https://openalex.org/domains/2 |
| topics[0].domain.display_name | Social Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/3312 |
| topics[0].subfield.display_name | Sociology and Political Science |
| topics[0].display_name | Privacy, Security, and Data Protection |
| topics[1].id | https://openalex.org/T10764 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9944999814033508 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Privacy-Preserving Technologies in Data |
| topics[2].id | https://openalex.org/T12034 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9732999801635742 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1710 |
| topics[2].subfield.display_name | Information Systems |
| topics[2].display_name | Digital and Cyber Forensics |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C2781460075 |
| concepts[0].level | 2 |
| concepts[0].score | 0.6529706716537476 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q1399332 |
| concepts[0].display_name | Compliance (psychology) |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.573118269443512 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C15744967 |
| concepts[2].level | 0 |
| concepts[2].score | 0.07664716243743896 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q9418 |
| concepts[2].display_name | Psychology |
| concepts[3].id | https://openalex.org/C77805123 |
| concepts[3].level | 1 |
| concepts[3].score | 0.0 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q161272 |
| concepts[3].display_name | Social psychology |
| keywords[0].id | https://openalex.org/keywords/compliance |
| keywords[0].score | 0.6529706716537476 |
| keywords[0].display_name | Compliance (psychology) |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.573118269443512 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/psychology |
| keywords[2].score | 0.07664716243743896 |
| keywords[2].display_name | Psychology |
| language | en |
| locations[0].id | doi:10.1145/3589334.3645709 |
| locations[0].is_oa | False |
| locations[0].source | |
| locations[0].license | |
| locations[0].pdf_url | |
| locations[0].version | publishedVersion |
| locations[0].raw_type | proceedings-article |
| locations[0].license_id | |
| locations[0].is_accepted | True |
| locations[0].is_published | True |
| locations[0].raw_source_name | Proceedings of the ACM Web Conference 2024 |
| locations[0].landing_page_url | https://doi.org/10.1145/3589334.3645709 |
| locations[1].id | pmh:oai:www.research-collection.ethz.ch:20.500.11850/674024 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306402302 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | False |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | Repository for Publications and Research Data (ETH Zurich) |
| locations[1].source.host_organization | https://openalex.org/I35440088 |
| locations[1].source.host_organization_name | ETH Zurich |
| locations[1].source.host_organization_lineage | https://openalex.org/I35440088 |
| locations[1].license | other-oa |
| locations[1].pdf_url | |
| locations[1].version | acceptedVersion |
| locations[1].raw_type | info:eu-repo/semantics/acceptedVersion |
| locations[1].license_id | https://openalex.org/licenses/other-oa |
| locations[1].is_accepted | True |
| locations[1].is_published | False |
| locations[1].raw_source_name | WWW '24: Proceedings of the ACM on Web Conference 2024 |
| locations[1].landing_page_url | http://hdl.handle.net/20.500.11850/674024 |
| locations[2].id | doi:10.3929/ethz-b-000674024 |
| locations[2].is_oa | True |
| locations[2].source.id | https://openalex.org/S7407051236 |
| locations[2].source.type | repository |
| locations[2].source.is_oa | False |
| locations[2].source.issn_l | |
| locations[2].source.is_core | False |
| locations[2].source.is_in_doaj | False |
| locations[2].source.display_name | ETH Zürich Research Collection |
| locations[2].source.host_organization | |
| locations[2].source.host_organization_name | |
| locations[2].license | |
| locations[2].pdf_url | |
| locations[2].version | |
| locations[2].raw_type | article-journal |
| locations[2].license_id | |
| locations[2].is_accepted | False |
| locations[2].is_published | |
| locations[2].raw_source_name | |
| locations[2].landing_page_url | https://doi.org/10.3929/ethz-b-000674024 |
| indexed_in | crossref, datacite |
| authorships[0].author.id | https://openalex.org/A5102989278 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-7419-2784 |
| authorships[0].author.display_name | Karel Kubíček |
| authorships[0].countries | CH |
| authorships[0].affiliations[0].institution_ids | https://openalex.org/I35440088 |
| authorships[0].affiliations[0].raw_affiliation_string | ETH Zurich, Zurich, Switzerland |
| authorships[0].institutions[0].id | https://openalex.org/I35440088 |
| authorships[0].institutions[0].ror | https://ror.org/05a28rw58 |
| authorships[0].institutions[0].type | education |
| authorships[0].institutions[0].lineage | https://openalex.org/I2799323385, https://openalex.org/I35440088 |
| authorships[0].institutions[0].country_code | CH |
| authorships[0].institutions[0].display_name | ETH Zurich |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Karel Kubicek |
| authorships[0].is_corresponding | False |
| authorships[0].raw_affiliation_strings | ETH Zurich, Zurich, Switzerland |
| authorships[1].author.id | https://openalex.org/A5092893130 |
| authorships[1].author.orcid | https://orcid.org/0009-0008-5841-7091 |
| authorships[1].author.display_name | Jakob Merane |
| authorships[1].countries | CH |
| authorships[1].affiliations[0].institution_ids | https://openalex.org/I35440088 |
| authorships[1].affiliations[0].raw_affiliation_string | ETH Zurich, Zurich, Switzerland |
| authorships[1].institutions[0].id | https://openalex.org/I35440088 |
| authorships[1].institutions[0].ror | https://ror.org/05a28rw58 |
| authorships[1].institutions[0].type | education |
| authorships[1].institutions[0].lineage | https://openalex.org/I2799323385, https://openalex.org/I35440088 |
| authorships[1].institutions[0].country_code | CH |
| authorships[1].institutions[0].display_name | ETH Zurich |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Jakob Merane |
| authorships[1].is_corresponding | False |
| authorships[1].raw_affiliation_strings | ETH Zurich, Zurich, Switzerland |
| authorships[2].author.id | https://openalex.org/A5096972545 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-5549-4568 |
| authorships[2].author.display_name | Ahmed Bouhoula |
| authorships[2].countries | CH |
| authorships[2].affiliations[0].institution_ids | https://openalex.org/I35440088 |
| authorships[2].affiliations[0].raw_affiliation_string | ETH Zurich, Zurich, Switzerland |
| authorships[2].institutions[0].id | https://openalex.org/I35440088 |
| authorships[2].institutions[0].ror | https://ror.org/05a28rw58 |
| authorships[2].institutions[0].type | education |
| authorships[2].institutions[0].lineage | https://openalex.org/I2799323385, https://openalex.org/I35440088 |
| authorships[2].institutions[0].country_code | CH |
| authorships[2].institutions[0].display_name | ETH Zurich |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Ahmed Bouhoula |
| authorships[2].is_corresponding | False |
| authorships[2].raw_affiliation_strings | ETH Zurich, Zurich, Switzerland |
| authorships[3].author.id | https://openalex.org/A5025344654 |
| authorships[3].author.orcid | https://orcid.org/0000-0003-2952-939X |
| authorships[3].author.display_name | David Basin |
| authorships[3].countries | CH |
| authorships[3].affiliations[0].institution_ids | https://openalex.org/I35440088 |
| authorships[3].affiliations[0].raw_affiliation_string | ETH Zurich, Zurich, Switzerland |
| authorships[3].institutions[0].id | https://openalex.org/I35440088 |
| authorships[3].institutions[0].ror | https://ror.org/05a28rw58 |
| authorships[3].institutions[0].type | education |
| authorships[3].institutions[0].lineage | https://openalex.org/I2799323385, https://openalex.org/I35440088 |
| authorships[3].institutions[0].country_code | CH |
| authorships[3].institutions[0].display_name | ETH Zurich |
| authorships[3].author_position | last |
| authorships[3].raw_author_name | David Basin |
| authorships[3].is_corresponding | False |
| authorships[3].raw_affiliation_strings | ETH Zurich, Zurich, Switzerland |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | http://hdl.handle.net/20.500.11850/674024 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Automating Website Registration for Studying GDPR Compliance |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T03:46:38.306776 |
| primary_topic.id | https://openalex.org/T11045 |
| primary_topic.field.id | https://openalex.org/fields/33 |
| primary_topic.field.display_name | Social Sciences |
| primary_topic.score | 0.996999979019165 |
| primary_topic.domain.id | https://openalex.org/domains/2 |
| primary_topic.domain.display_name | Social Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/3312 |
| primary_topic.subfield.display_name | Sociology and Political Science |
| primary_topic.display_name | Privacy, Security, and Data Protection |
| related_works | https://openalex.org/W4391375266, https://openalex.org/W2899084033, https://openalex.org/W2748952813, https://openalex.org/W2410395228, https://openalex.org/W2390279801, https://openalex.org/W3125941065, https://openalex.org/W4391913857, https://openalex.org/W2358668433, https://openalex.org/W4396701345, https://openalex.org/W2484615095 |
| cited_by_count | 1 |
| counts_by_year[0].year | 2025 |
| counts_by_year[0].cited_by_count | 1 |
| locations_count | 3 |
| best_oa_location.id | pmh:oai:www.research-collection.ethz.ch:20.500.11850/674024 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306402302 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | False |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | Repository for Publications and Research Data (ETH Zurich) |
| best_oa_location.source.host_organization | https://openalex.org/I35440088 |
| best_oa_location.source.host_organization_name | ETH Zurich |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I35440088 |
| best_oa_location.license | other-oa |
| best_oa_location.pdf_url | |
| best_oa_location.version | acceptedVersion |
| best_oa_location.raw_type | info:eu-repo/semantics/acceptedVersion |
| best_oa_location.license_id | https://openalex.org/licenses/other-oa |
| best_oa_location.is_accepted | True |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | WWW '24: Proceedings of the ACM on Web Conference 2024 |
| best_oa_location.landing_page_url | http://hdl.handle.net/20.500.11850/674024 |
| primary_location.id | doi:10.1145/3589334.3645709 |
| primary_location.is_oa | False |
| primary_location.source | |
| primary_location.license | |
| primary_location.pdf_url | |
| primary_location.version | publishedVersion |
| primary_location.raw_type | proceedings-article |
| primary_location.license_id | |
| primary_location.is_accepted | True |
| primary_location.is_published | True |
| primary_location.raw_source_name | Proceedings of the ACM Web Conference 2024 |
| primary_location.landing_page_url | https://doi.org/10.1145/3589334.3645709 |
| publication_date | 2024-05-08 |
| publication_year | 2024 |
| referenced_works | https://openalex.org/W3174086521, https://openalex.org/W3121736041, https://openalex.org/W2295598076, https://openalex.org/W4224319565, https://openalex.org/W4385517833, https://openalex.org/W3107473573, https://openalex.org/W2782867239, https://openalex.org/W2535603283, https://openalex.org/W3205400134, https://openalex.org/W4384302739, https://openalex.org/W6969061235, https://openalex.org/W3006928035, https://openalex.org/W2169511760, https://openalex.org/W3179937365, https://openalex.org/W4220680677, https://openalex.org/W2904027722, https://openalex.org/W2999830435, https://openalex.org/W6891412766, https://openalex.org/W4385522472, https://openalex.org/W3164790661, https://openalex.org/W4307020316, https://openalex.org/W3046708287, https://openalex.org/W3003683799, https://openalex.org/W4376626867, https://openalex.org/W3100328669, https://openalex.org/W2962940036, https://openalex.org/W2588383695 |
| referenced_works_count | 27 |
| abstract_inverted_index.12 | 29 |
| abstract_inverted_index.1: | 42 |
| abstract_inverted_index.1M | 4 |
| abstract_inverted_index.no | 12 |
| abstract_inverted_index.of | 15, 20, 44, 46 |
| abstract_inverted_index.us | 27 |
| abstract_inverted_index.605 | 30 |
| abstract_inverted_index.and | 49 |
| abstract_inverted_index.are | 17 |
| abstract_inverted_index.our | 47 |
| abstract_inverted_index.5.2% | 14 |
| abstract_inverted_index.660k | 0 |
| abstract_inverted_index.form | 8, 13 |
| abstract_inverted_index.from | 2 |
| abstract_inverted_index.send | 26 |
| abstract_inverted_index.22.8% | 19 |
| abstract_inverted_index.23.6% | 9 |
| abstract_inverted_index.25.7% | 5 |
| abstract_inverted_index.33.9k | 24 |
| abstract_inverted_index.50.7% | 11 |
| abstract_inverted_index.crawl | 36 |
| abstract_inverted_index.forms | 16, 21 |
| abstract_inverted_index.found | 6 |
| abstract_inverted_index.steps | 45 |
| abstract_inverted_index.study | 48 |
| abstract_inverted_index.Tranco | 3 |
| abstract_inverted_index.emails | 28 |
| abstract_inverted_index.errors | 10 |
| abstract_inverted_index.(37.2%) | 31 |
| abstract_inverted_index.senders | 34 |
| abstract_inverted_index.ML-based | 39 |
| abstract_inverted_index.Overview | 43 |
| abstract_inverted_index.insecure | 18 |
| abstract_inverted_index.results. | 50 |
| abstract_inverted_index.websites | 1, 25 |
| abstract_inverted_index.Automated | 35, 37 |
| abstract_inverted_index.submitted | 22 |
| abstract_inverted_index.violation | 40 |
| abstract_inverted_index.potentially | 32 |
| abstract_inverted_index.registration | 7, 38 |
| abstract_inverted_index.successfully | 23 |
| abstract_inverted_index.non-compliant | 33 |
| abstract_inverted_index.detetectionFigure | 41 |
| cited_by_percentile_year.max | 95 |
| cited_by_percentile_year.min | 91 |
| countries_distinct_count | 1 |
| institutions_distinct_count | 4 |
| sustainable_development_goals[0].id | https://metadata.un.org/sdg/8 |
| sustainable_development_goals[0].score | 0.6800000071525574 |
| sustainable_development_goals[0].display_name | Decent work and economic growth |
| citation_normalized_percentile.value | 0.8168877 |
| citation_normalized_percentile.is_in_top_1_percent | False |
| citation_normalized_percentile.is_in_top_10_percent | True |
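
The `abstract_inverted_index.*` rows in the payload above map each token to the positions where it occurs. As a minimal sketch of how the running text could be recovered from such a mapping, the Python below places each token at its listed positions and joins them in order. The function name `rebuild_text` and the `sample` dictionary are illustrative assumptions, not part of OpenAlex. The sample copies only a few rows from the table, so the missing positions 5 to 23 are simply skipped.

```python
def rebuild_text(inverted_index: dict[str, list[int]]) -> str:
    """Rebuild a token sequence from a token -> positions mapping."""
    by_position: dict[int, str] = {}
    for token, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = token
    # Join tokens in ascending position order; absent positions are skipped.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Hypothetical sample: a handful of rows copied from the payload table.
sample = {
    "660k": [0],
    "websites": [1, 25],
    "from": [2],
    "Tranco": [3],
    "1M": [4],
    "33.9k": [24],
    "send": [26],
    "us": [27],
    "emails": [28],
}

print(rebuild_text(sample))
```

Applied to the full set of rows, the same approach would reproduce the figure text stored in this record, including the fused token `detetectionFigure` at position 41, because the index stores tokens exactly as extracted.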