SDFD: Building a Versatile Synthetic Face Image Dataset with Diverse Attributes
2024 · Open Access · DOI: https://doi.org/10.5281/zenodo.11173874
AI systems rely on extensive training on large datasets to address various tasks. However, image-based systems, particularly those used for demographic attribute prediction, face significant challenges. Many current face image datasets focus primarily on demographic factors such as age, gender, and skin tone, overlooking other crucial facial attributes such as hairstyle and accessories. This narrow focus limits the diversity of the data and, consequently, the robustness of AI systems trained on it. This work aims to address this limitation by proposing a methodology for generating synthetic face image datasets that capture a broader spectrum of facial diversity. Specifically, our approach integrates a systematic prompt formulation strategy that encompasses not only demographics and biometrics but also non-permanent traits such as make-up, hairstyle, and accessories. These prompts guide a state-of-the-art text-to-image model in generating a comprehensive dataset of high-quality realistic images, which can be used as an evaluation set for face analysis systems. Compared to existing datasets, our proposed dataset proves equally or more challenging in image classification tasks while being much smaller in size.
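
The systematic prompt formulation described in the abstract can be pictured with a minimal sketch that enumerates attribute combinations and turns each into a text prompt. The attribute values, the prompt template, and the function name `build_prompts` are assumptions made for illustration; they are not the authors' actual attribute lists or wording, and no text-to-image model is called here.

```python
from itertools import product

# Hypothetical attribute values (illustration only; not the paper's lists).
ATTRIBUTES = {
    "age": ["young", "middle-aged", "elderly"],
    "gender": ["woman", "man"],
    "hairstyle": ["short hair", "long curly hair"],
    "accessory": ["glasses", "a hat"],
}


def build_prompts(attributes):
    """Return one prompt string per combination of attribute values."""
    keys = list(attributes)
    prompts = []
    # product() yields every combination, one value per attribute.
    for combo in product(*(attributes[k] for k in keys)):
        v = dict(zip(keys, combo))
        prompts.append(
            f"Portrait photo of one person: {v['age']} {v['gender']}, "
            f"{v['hairstyle']}, wearing {v['accessory']}"
        )
    return prompts


prompts = build_prompts(ATTRIBUTES)
print(len(prompts))   # 3 * 2 * 2 * 2 = 24 combinations
print(prompts[0])
```

Enumerating the full Cartesian product keeps every attribute combination equally represented, which is one simple way to avoid skewing a generated set toward a few demographic or accessory categories.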
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2404.17255
- PDF: https://arxiv.org/pdf/2404.17255
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4396818387
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4396818387
- DOI: https://doi.org/10.5281/zenodo.11173874
- Title: SDFD: Building a Versatile Synthetic Face Image Dataset with Diverse Attributes
- Type: preprint
- Language: en
- Publication year: 2024
- Publication date: 2024-05-10
- Authors: Georgia Baltsou, Ioannis Sarridis, Christos Koutlis, Symeon Papadopoulos
- Landing page: https://arxiv.org/abs/2404.17255
- PDF URL: https://arxiv.org/pdf/2404.17255
- Open access: Yes
- OA status: green
- OA URL: https://arxiv.org/pdf/2404.17255
- Concepts: Face (sociological concept), Image (mathematics), Computer science, Artificial intelligence, Pattern recognition (psychology), Data science, Sociology, Social science
- Cited by: 0
- Related works (count): 10
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4396818387 |
| doi | https://doi.org/10.5281/zenodo.11173874 |
| ids.doi | https://doi.org/10.5281/zenodo.11173874 |
| ids.openalex | https://openalex.org/W4396818387 |
| fwci | |
| type | preprint |
| title | SDFD: Building a Versatile Synthetic Face Image Dataset with Diverse Attributes |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11448 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9769999980926514 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1707 |
| topics[0].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[0].display_name | Face recognition and analysis |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C2779304628 |
| concepts[0].level | 2 |
| concepts[0].score | 0.6824175119400024 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q3503480 |
| concepts[0].display_name | Face (sociological concept) |
| concepts[1].id | https://openalex.org/C115961682 |
| concepts[1].level | 2 |
| concepts[1].score | 0.5375131964683533 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q860623 |
| concepts[1].display_name | Image (mathematics) |
| concepts[2].id | https://openalex.org/C41008148 |
| concepts[2].level | 0 |
| concepts[2].score | 0.5175904035568237 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[2].display_name | Computer science |
| concepts[3].id | https://openalex.org/C154945302 |
| concepts[3].level | 1 |
| concepts[3].score | 0.41849544644355774 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[3].display_name | Artificial intelligence |
| concepts[4].id | https://openalex.org/C153180895 |
| concepts[4].level | 2 |
| concepts[4].score | 0.3533876836299896 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q7148389 |
| concepts[4].display_name | Pattern recognition (psychology) |
| concepts[5].id | https://openalex.org/C2522767166 |
| concepts[5].level | 1 |
| concepts[5].score | 0.33058303594589233 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q2374463 |
| concepts[5].display_name | Data science |
| concepts[6].id | https://openalex.org/C144024400 |
| concepts[6].level | 0 |
| concepts[6].score | 0.0911131203174591 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q21201 |
| concepts[6].display_name | Sociology |
| concepts[7].id | https://openalex.org/C36289849 |
| concepts[7].level | 1 |
| concepts[7].score | 0.0 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q34749 |
| concepts[7].display_name | Social science |
| keywords[0].id | https://openalex.org/keywords/face |
| keywords[0].score | 0.6824175119400024 |
| keywords[0].display_name | Face (sociological concept) |
| keywords[1].id | https://openalex.org/keywords/image |
| keywords[1].score | 0.5375131964683533 |
| keywords[1].display_name | Image (mathematics) |
| keywords[2].id | https://openalex.org/keywords/computer-science |
| keywords[2].score | 0.5175904035568237 |
| keywords[2].display_name | Computer science |
| keywords[3].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[3].score | 0.41849544644355774 |
| keywords[3].display_name | Artificial intelligence |
| keywords[4].id | https://openalex.org/keywords/pattern-recognition |
| keywords[4].score | 0.3533876836299896 |
| keywords[4].display_name | Pattern recognition (psychology) |
| keywords[5].id | https://openalex.org/keywords/data-science |
| keywords[5].score | 0.33058303594589233 |
| keywords[5].display_name | Data science |
| keywords[6].id | https://openalex.org/keywords/sociology |
| keywords[6].score | 0.0911131203174591 |
| keywords[6].display_name | Sociology |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2404.17255 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | cc-by-sa |
| locations[0].pdf_url | https://arxiv.org/pdf/2404.17255 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | |
| locations[0].license_id | https://openalex.org/licenses/cc-by-sa |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2404.17255 |
| locations[1].id | doi:10.48550/arxiv.2404.17255 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2404.17255 |
| locations[2].id | doi:10.5281/zenodo.11173874 |
| locations[2].is_oa | True |
| locations[2].source.id | https://openalex.org/S4306400562 |
| locations[2].source.issn | |
| locations[2].source.type | repository |
| locations[2].source.is_oa | True |
| locations[2].source.issn_l | |
| locations[2].source.is_core | False |
| locations[2].source.is_in_doaj | False |
| locations[2].source.display_name | Zenodo (CERN European Organization for Nuclear Research) |
| locations[2].source.host_organization | https://openalex.org/I67311998 |
| locations[2].source.host_organization_name | European Organization for Nuclear Research |
| locations[2].source.host_organization_lineage | https://openalex.org/I67311998 |
| locations[2].license | cc-by |
| locations[2].pdf_url | |
| locations[2].version | |
| locations[2].raw_type | article |
| locations[2].license_id | https://openalex.org/licenses/cc-by |
| locations[2].is_accepted | False |
| locations[2].is_published | |
| locations[2].raw_source_name | |
| locations[2].landing_page_url | https://doi.org/10.5281/zenodo.11173874 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5033391308 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-7042-8876 |
| authorships[0].author.display_name | Georgia Baltsou |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Baltsou, Georgia |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5005706607 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Ioannis Sarridis |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Sarridis, Ioannis |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5024874767 |
| authorships[2].author.orcid | https://orcid.org/0000-0003-3682-408X |
| authorships[2].author.display_name | Christos Koutlis |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Koutlis, Christos |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5013616365 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-5441-7341 |
| authorships[3].author.display_name | Symeon Papadopoulos |
| authorships[3].author_position | last |
| authorships[3].raw_author_name | Papadopoulos, Symeon |
| authorships[3].is_corresponding | False |
| has_content.pdf | True |
| has_content.grobid_xml | True |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2404.17255 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2024-05-11T00:00:00 |
| display_name | SDFD: Building a Versatile Synthetic Face Image Dataset with Diverse Attributes |
| has_fulltext | True |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11448 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9769999980926514 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1707 |
| primary_topic.subfield.display_name | Computer Vision and Pattern Recognition |
| primary_topic.display_name | Face recognition and analysis |
| related_works | https://openalex.org/W3188962172, https://openalex.org/W2772917594, https://openalex.org/W4306742369, https://openalex.org/W4303457083, https://openalex.org/W2131146434, https://openalex.org/W2951359407, https://openalex.org/W4376623224, https://openalex.org/W2033914206, https://openalex.org/W2042327336, https://openalex.org/W4311360467 |
| cited_by_count | 0 |
| locations_count | 3 |
| best_oa_location.id | pmh:oai:arXiv.org:2404.17255 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | cc-by-sa |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2404.17255 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | |
| best_oa_location.license_id | https://openalex.org/licenses/cc-by-sa |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2404.17255 |
| primary_location.id | pmh:oai:arXiv.org:2404.17255 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | cc-by-sa |
| primary_location.pdf_url | https://arxiv.org/pdf/2404.17255 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | |
| primary_location.license_id | https://openalex.org/licenses/cc-by-sa |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2404.17255 |
| publication_date | 2024-05-10 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 80, 90, 100, 123, 129 |
| abstract_inverted_index.AI | 0, 66 |
| abstract_inverted_index.an | 141 |
| abstract_inverted_index.as | 37, 140 |
| abstract_inverted_index.be | 138 |
| abstract_inverted_index.by | 78 |
| abstract_inverted_index.in | 127, 144, 160, 168 |
| abstract_inverted_index.of | 58, 65, 93, 132 |
| abstract_inverted_index.on | 3, 6, 33, 69 |
| abstract_inverted_index.or | 157 |
| abstract_inverted_index.to | 9, 74, 149 |
| abstract_inverted_index.and | 40, 50, 61, 109, 118, 136 |
| abstract_inverted_index.but | 111 |
| abstract_inverted_index.can | 137 |
| abstract_inverted_index.for | 19, 82 |
| abstract_inverted_index.not | 106 |
| abstract_inverted_index.our | 97, 152 |
| abstract_inverted_index.set | 143 |
| abstract_inverted_index.the | 56, 59, 63 |
| abstract_inverted_index.Many | 26 |
| abstract_inverted_index.This | 52, 71 |
| abstract_inverted_index.age, | 38 |
| abstract_inverted_index.aims | 73 |
| abstract_inverted_index.also | 112 |
| abstract_inverted_index.data | 60 |
| abstract_inverted_index.face | 23, 28, 85, 145 |
| abstract_inverted_index.like | 48, 115 |
| abstract_inverted_index.more | 158 |
| abstract_inverted_index.much | 166 |
| abstract_inverted_index.only | 107 |
| abstract_inverted_index.rely | 2 |
| abstract_inverted_index.skin | 41 |
| abstract_inverted_index.such | 36 |
| abstract_inverted_index.that | 88 |
| abstract_inverted_index.this | 76 |
| abstract_inverted_index.used | 18, 139 |
| abstract_inverted_index.work | 72 |
| abstract_inverted_index.These | 120 |
| abstract_inverted_index.being | 165 |
| abstract_inverted_index.focus | 32, 54 |
| abstract_inverted_index.guide | 122 |
| abstract_inverted_index.image | 29, 86, 161 |
| abstract_inverted_index.large | 7 |
| abstract_inverted_index.model | 126 |
| abstract_inverted_index.other | 44 |
| abstract_inverted_index.size. | 169 |
| abstract_inverted_index.tasks | 163 |
| abstract_inverted_index.them. | 70 |
| abstract_inverted_index.those | 17 |
| abstract_inverted_index.tone, | 42 |
| abstract_inverted_index.while | 164 |
| abstract_inverted_index.facial | 46, 94 |
| abstract_inverted_index.images | 135 |
| abstract_inverted_index.limits | 55 |
| abstract_inverted_index.narrow | 53 |
| abstract_inverted_index.prompt | 102 |
| abstract_inverted_index.proves | 155 |
| abstract_inverted_index.tasks. | 12 |
| abstract_inverted_index.traits | 114 |
| abstract_inverted_index.address | 10, 75 |
| abstract_inverted_index.broader | 91 |
| abstract_inverted_index.capture | 89 |
| abstract_inverted_index.crucial | 45 |
| abstract_inverted_index.current | 27 |
| abstract_inverted_index.dataset | 131, 154 |
| abstract_inverted_index.equally | 156 |
| abstract_inverted_index.factors | 35 |
| abstract_inverted_index.gender, | 39 |
| abstract_inverted_index.prompts | 121 |
| abstract_inverted_index.smaller | 167 |
| abstract_inverted_index.systems | 1, 67 |
| abstract_inverted_index.trained | 68 |
| abstract_inverted_index.various | 11 |
| abstract_inverted_index.Compared | 148 |
| abstract_inverted_index.However, | 13 |
| abstract_inverted_index.analysis | 146 |
| abstract_inverted_index.approach | 98 |
| abstract_inverted_index.datasets | 8, 30, 87 |
| abstract_inverted_index.existing | 150 |
| abstract_inverted_index.make-up, | 116 |
| abstract_inverted_index.proposed | 153 |
| abstract_inverted_index.spectrum | 92 |
| abstract_inverted_index.systems, | 15 |
| abstract_inverted_index.systems. | 147 |
| abstract_inverted_index.training | 5 |
| abstract_inverted_index.attribute | 21 |
| abstract_inverted_index.datasets, | 151 |
| abstract_inverted_index.diversity | 57 |
| abstract_inverted_index.extensive | 4 |
| abstract_inverted_index.hairstyle | 49 |
| abstract_inverted_index.primarily | 31 |
| abstract_inverted_index.proposing | 79 |
| abstract_inverted_index.realistic | 134 |
| abstract_inverted_index.strategy, | 104 |
| abstract_inverted_index.synthetic | 84 |
| abstract_inverted_index.attributes | 47 |
| abstract_inverted_index.biometrics | 110 |
| abstract_inverted_index.diversity. | 95 |
| abstract_inverted_index.evaluation | 142 |
| abstract_inverted_index.generating | 83, 128 |
| abstract_inverted_index.hairstyle, | 117 |
| abstract_inverted_index.integrates | 99 |
| abstract_inverted_index.limitation | 77 |
| abstract_inverted_index.robustness | 64 |
| abstract_inverted_index.systematic | 101 |
| abstract_inverted_index.challenges. | 25 |
| abstract_inverted_index.challenging | 159 |
| abstract_inverted_index.demographic | 20, 34 |
| abstract_inverted_index.formulation | 103 |
| abstract_inverted_index.image-based | 14 |
| abstract_inverted_index.methodology | 81 |
| abstract_inverted_index.overlooking | 43 |
| abstract_inverted_index.prediction, | 22 |
| abstract_inverted_index.significant | 24 |
| abstract_inverted_index.accessories. | 51, 119 |
| abstract_inverted_index.consequently | 62 |
| abstract_inverted_index.demographics | 108 |
| abstract_inverted_index.encompassing | 105 |
| abstract_inverted_index.high-quality | 133 |
| abstract_inverted_index.particularly | 16 |
| abstract_inverted_index.Specifically, | 96 |
| abstract_inverted_index.comprehensive | 130 |
| abstract_inverted_index.non-permanent | 113 |
| abstract_inverted_index.text-to-image | 125 |
| abstract_inverted_index.classification | 162 |
| abstract_inverted_index.state-of-the-art | 124 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 4 |
| citation_normalized_percentile | |
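
The `abstract_inverted_index` rows in the table above map each token to the zero-based positions at which it occurs in the abstract. The following is a minimal sketch of how the plain-text abstract can be rebuilt from such a mapping. The function name `rebuild_abstract` is hypothetical and not part of any OpenAlex client library, and the sample dictionary contains only the tokens for positions 0 to 12 from the table.

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a {token: [positions]} mapping."""
    # Expand the mapping into (position, token) pairs.
    positioned = [
        (pos, token)
        for token, positions in inverted_index.items()
        for pos in positions
    ]
    # Sorting by position restores the original word order.
    return " ".join(token for _, token in sorted(positioned))


# First sentence only, using positions 0-12 from the payload table.
sample = {
    "AI": [0], "systems": [1], "rely": [2], "on": [3, 6],
    "extensive": [4], "training": [5], "large": [7],
    "datasets": [8], "to": [9], "address": [10],
    "various": [11], "tasks.": [12],
}

print(rebuild_abstract(sample))
# prints: AI systems rely on extensive training on large datasets to address various tasks.
```

Because punctuation stays attached to its token (for example `tasks.`), joining on single spaces is enough to recover readable text from the index.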