Transformer-based models for hate speech classification
2024 · Open Access · DOI: https://doi.org/10.1063/5.0198822
This research paper explores the application of text classification and natural language processing techniques for enhancing hate speech detection. The study employs machine learning (ML) and deep learning models, including transformer models such as BERT, RoBERTa, and DistilBERT, to improve the accuracy of hate speech classifiers. Through a comprehensive empirical analysis on three diverse datasets (Data-ICWSM, Data-ALW2, Data-OLID), the study demonstrates the effectiveness of these models in accurately identifying hate speech. Compared to traditional baselines, the BERT models exhibit a significant performance boost in macro and weighted F1-scores. Additionally, the study addresses the challenge of imbalanced class distributions in the datasets by employing sampling techniques during training. Overall, the research highlights the potential of transformer models for hate speech detection and provides insights for future exploration in this domain.
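The abstract reports macro and weighted F1-scores and mentions sampling during training to counter class imbalance. The sketch below shows how the two averages differ, together with one possible sampling scheme. It is an illustration only: the abstract does not say which sampling technique was used, so random oversampling is an assumption, and the function names `f1_scores` and `random_oversample` are hypothetical.

```python
import random
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (macro F1, weighted F1) for two equal-length label sequences."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true examples per class
    per_class = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        per_class[label] = 2 * tp / denom if denom else 0.0
    # Macro: every class counts equally, so minority classes matter as much.
    macro = sum(per_class.values()) / len(labels)
    # Weighted: each class contributes in proportion to its support.
    weighted = sum(per_class[c] * support[c] for c in labels) / len(y_true)
    return macro, weighted


def random_oversample(texts, labels, seed=0):
    """Assumed scheme: duplicate minority examples until classes are equal."""
    rng = random.Random(seed)
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)
    target = max(len(items) for items in by_class.values())
    out_texts, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


if __name__ == "__main__":
    y_true = [0, 0, 0, 0, 1, 1]
    y_pred = [0, 0, 0, 1, 1, 0]
    print(f1_scores(y_true, y_pred))
```

On an imbalanced label set the macro average is pulled down by a weak minority class more than the weighted average is, which is why reporting both is informative for hate speech data.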
- Type: article
- Language: en
- Landing Page: https://doi.org/10.1063/5.0198822, https://pubs.aip.org/aip/acp/article-pdf/doi/10.1063/5.0198822/19833345/020017_1_5.0198822.pdf
- OA Status: bronze
- Cited By: 1
- References: 23
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4392947462
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4392947462 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.1063/5.0198822 (Digital Object Identifier)
- Title: Transformer-based models for hate speech classification (work title)
- Type: article (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2024 (year of publication)
- Publication date: 2024-01-01 (full publication date if available)
- Authors: Deepti Jain, Sandhya Arora, C. K. Jha, Garima Malik (list of authors in order)
- Landing page: https://doi.org/10.1063/5.0198822 (publisher landing page)
- PDF URL: https://pubs.aip.org/aip/acp/article-pdf/doi/10.1063/5.0198822/19833345/020017_1_5.0198822.pdf (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: bronze (open access status per OpenAlex)
- OA URL: https://pubs.aip.org/aip/acp/article-pdf/doi/10.1063/5.0198822/19833345/020017_1_5.0198822.pdf (direct OA link when available)
- Concepts: Computer science, Transformer, Speech recognition, Artificial intelligence, Natural language processing, Engineering, Electrical engineering, Voltage (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 1 (total citation count in OpenAlex)
- Citations by year (recent): 2024: 1 (per-year citation counts, last 5 years)
- References (count): 23 (number of works referenced by this work)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4392947462 |
|---|---|
| doi | https://doi.org/10.1063/5.0198822 |
| ids.doi | https://doi.org/10.1063/5.0198822 |
| ids.openalex | https://openalex.org/W4392947462 |
| fwci | 0.63877855 |
| type | article |
| title | Transformer-based models for hate speech classification |
| biblio.issue | |
| biblio.volume | 3072 |
| biblio.last_page | 020017 |
| biblio.first_page | 020017 |
| topics[0].id | https://openalex.org/T12262 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 1.0 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Hate Speech and Cyberbullying Detection |
| topics[1].id | https://openalex.org/T10485 |
| topics[1].field.id | https://openalex.org/fields/32 |
| topics[1].field.display_name | Psychology |
| topics[1].score | 0.9520999789237976 |
| topics[1].domain.id | https://openalex.org/domains/2 |
| topics[1].domain.display_name | Social Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/3207 |
| topics[1].subfield.display_name | Social Psychology |
| topics[1].display_name | Bullying, Victimization, and Aggression |
| topics[2].id | https://openalex.org/T11241 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9118000268936157 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1711 |
| topics[2].subfield.display_name | Signal Processing |
| topics[2].display_name | Advanced Malware Detection Techniques |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C41008148 |
| concepts[0].level | 0 |
| concepts[0].score | 0.682299017906189 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[0].display_name | Computer science |
| concepts[1].id | https://openalex.org/C66322947 |
| concepts[1].level | 3 |
| concepts[1].score | 0.5117864012718201 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q11658 |
| concepts[1].display_name | Transformer |
| concepts[2].id | https://openalex.org/C28490314 |
| concepts[2].level | 1 |
| concepts[2].score | 0.47940894961357117 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q189436 |
| concepts[2].display_name | Speech recognition |
| concepts[3].id | https://openalex.org/C154945302 |
| concepts[3].level | 1 |
| concepts[3].score | 0.35628044605255127 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[3].display_name | Artificial intelligence |
| concepts[4].id | https://openalex.org/C204321447 |
| concepts[4].level | 1 |
| concepts[4].score | 0.353563129901886 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q30642 |
| concepts[4].display_name | Natural language processing |
| concepts[5].id | https://openalex.org/C127413603 |
| concepts[5].level | 0 |
| concepts[5].score | 0.1607958972454071 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q11023 |
| concepts[5].display_name | Engineering |
| concepts[6].id | https://openalex.org/C119599485 |
| concepts[6].level | 1 |
| concepts[6].score | 0.10060051083564758 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q43035 |
| concepts[6].display_name | Electrical engineering |
| concepts[7].id | https://openalex.org/C165801399 |
| concepts[7].level | 2 |
| concepts[7].score | 0.0 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q25428 |
| concepts[7].display_name | Voltage |
| keywords[0].id | https://openalex.org/keywords/computer-science |
| keywords[0].score | 0.682299017906189 |
| keywords[0].display_name | Computer science |
| keywords[1].id | https://openalex.org/keywords/transformer |
| keywords[1].score | 0.5117864012718201 |
| keywords[1].display_name | Transformer |
| keywords[2].id | https://openalex.org/keywords/speech-recognition |
| keywords[2].score | 0.47940894961357117 |
| keywords[2].display_name | Speech recognition |
| keywords[3].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[3].score | 0.35628044605255127 |
| keywords[3].display_name | Artificial intelligence |
| keywords[4].id | https://openalex.org/keywords/natural-language-processing |
| keywords[4].score | 0.353563129901886 |
| keywords[4].display_name | Natural language processing |
| keywords[5].id | https://openalex.org/keywords/engineering |
| keywords[5].score | 0.1607958972454071 |
| keywords[5].display_name | Engineering |
| keywords[6].id | https://openalex.org/keywords/electrical-engineering |
| keywords[6].score | 0.10060051083564758 |
| keywords[6].display_name | Electrical engineering |
| language | en |
| locations[0].id | doi:10.1063/5.0198822 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S2764696622 |
| locations[0].source.issn | 0094-243X, 1551-7616, 1935-0465 |
| locations[0].source.type | journal |
| locations[0].source.is_oa | False |
| locations[0].source.issn_l | 0094-243X |
| locations[0].source.is_core | True |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | AIP conference proceedings |
| locations[0].source.host_organization | https://openalex.org/P4310320257 |
| locations[0].source.host_organization_name | American Institute of Physics |
| locations[0].source.host_organization_lineage | https://openalex.org/P4310320257 |
| locations[0].source.host_organization_lineage_names | American Institute of Physics |
| locations[0].license | |
| locations[0].pdf_url | https://pubs.aip.org/aip/acp/article-pdf/doi/10.1063/5.0198822/19833345/020017_1_5.0198822.pdf |
| locations[0].version | publishedVersion |
| locations[0].raw_type | proceedings-article |
| locations[0].license_id | |
| locations[0].is_accepted | True |
| locations[0].is_published | True |
| locations[0].raw_source_name | AIP Conference Proceedings |
| locations[0].landing_page_url | https://doi.org/10.1063/5.0198822 |
| indexed_in | crossref |
| authorships[0].author.id | https://openalex.org/A5037280953 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-8631-7230 |
| authorships[0].author.display_name | Deepti Jain |
| authorships[0].countries | IN |
| authorships[0].affiliations[0].institution_ids | https://openalex.org/I102117144 |
| authorships[0].affiliations[0].raw_affiliation_string | Department of Computer Science, AIM and ACT, Banasthali Vidyapith, Banasthali-304022 (Rajasthan), India |
| authorships[0].institutions[0].id | https://openalex.org/I102117144 |
| authorships[0].institutions[0].ror | https://ror.org/05ycegt40 |
| authorships[0].institutions[0].type | education |
| authorships[0].institutions[0].lineage | https://openalex.org/I102117144 |
| authorships[0].institutions[0].country_code | IN |
| authorships[0].institutions[0].display_name | Banasthali University |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Deepti Jain |
| authorships[0].is_corresponding | False |
| authorships[0].raw_affiliation_strings | Department of Computer Science, AIM and ACT, Banasthali Vidyapith, Banasthali-304022 (Rajasthan), India |
| authorships[1].author.id | https://openalex.org/A5083559291 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-6412-7831 |
| authorships[1].author.display_name | Sandhya Arora |
| authorships[1].countries | IN |
| authorships[1].affiliations[0].institution_ids | https://openalex.org/I878213199 |
| authorships[1].affiliations[0].raw_affiliation_string | Department of Computer Engineering, Cummins College of Engineering, Pune-411052, India |
| authorships[1].institutions[0].id | https://openalex.org/I878213199 |
| authorships[1].institutions[0].ror | https://ror.org/044g6d731 |
| authorships[1].institutions[0].type | education |
| authorships[1].institutions[0].lineage | https://openalex.org/I878213199 |
| authorships[1].institutions[0].country_code | IN |
| authorships[1].institutions[0].display_name | Savitribai Phule Pune University |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Sandhya Arora |
| authorships[1].is_corresponding | False |
| authorships[1].raw_affiliation_strings | Department of Computer Engineering, Cummins College of Engineering, Pune-411052, India |
| authorships[2].author.id | https://openalex.org/A5029350886 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-5002-3134 |
| authorships[2].author.display_name | C. K. Jha |
| authorships[2].countries | IN |
| authorships[2].affiliations[0].institution_ids | https://openalex.org/I102117144 |
| authorships[2].affiliations[0].raw_affiliation_string | Department of Computer Science, AIM and ACT, Banasthali Vidyapith, Banasthali-304022 (Rajasthan), India |
| authorships[2].institutions[0].id | https://openalex.org/I102117144 |
| authorships[2].institutions[0].ror | https://ror.org/05ycegt40 |
| authorships[2].institutions[0].type | education |
| authorships[2].institutions[0].lineage | https://openalex.org/I102117144 |
| authorships[2].institutions[0].country_code | IN |
| authorships[2].institutions[0].display_name | Banasthali University |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | C. K. Jha |
| authorships[2].is_corresponding | False |
| authorships[2].raw_affiliation_strings | Department of Computer Science, AIM and ACT, Banasthali Vidyapith, Banasthali-304022 (Rajasthan), India |
| authorships[3].author.id | https://openalex.org/A5090292639 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-3892-8299 |
| authorships[3].author.display_name | Garima Malik |
| authorships[3].countries | CA |
| authorships[3].affiliations[0].institution_ids | https://openalex.org/I530967 |
| authorships[3].affiliations[0].raw_affiliation_string | Department of Mechanical and Industrial Engineering, Toronto Metropolitan University, 350 Victoria st, Toronto, M4C2C3, Canada |
| authorships[3].institutions[0].id | https://openalex.org/I530967 |
| authorships[3].institutions[0].ror | https://ror.org/05g13zd79 |
| authorships[3].institutions[0].type | education |
| authorships[3].institutions[0].lineage | https://openalex.org/I530967 |
| authorships[3].institutions[0].country_code | CA |
| authorships[3].institutions[0].display_name | Toronto Metropolitan University |
| authorships[3].author_position | last |
| authorships[3].raw_author_name | Garima Malik |
| authorships[3].is_corresponding | False |
| authorships[3].raw_affiliation_strings | Department of Mechanical and Industrial Engineering, Toronto Metropolitan University, 350 Victoria st, Toronto, M4C2C3, Canada |
| has_content.pdf | True |
| has_content.grobid_xml | True |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://pubs.aip.org/aip/acp/article-pdf/doi/10.1063/5.0198822/19833345/020017_1_5.0198822.pdf |
| open_access.oa_status | bronze |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Transformer-based models for hate speech classification |
| has_fulltext | True |
| is_retracted | False |
| updated_date | 2025-11-06T03:46:38.306776 |
| primary_topic.id | https://openalex.org/T12262 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 1.0 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Hate Speech and Cyberbullying Detection |
| related_works | https://openalex.org/W2748952813, https://openalex.org/W2390279801, https://openalex.org/W2358668433, https://openalex.org/W2376932109, https://openalex.org/W2001405890, https://openalex.org/W2382290278, https://openalex.org/W2478288626, https://openalex.org/W4391913857, https://openalex.org/W2350741829, https://openalex.org/W3204019825 |
| cited_by_count | 1 |
| counts_by_year[0].year | 2024 |
| counts_by_year[0].cited_by_count | 1 |
| locations_count | 1 |
| best_oa_location.id | doi:10.1063/5.0198822 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S2764696622 |
| best_oa_location.source.issn | 0094-243X, 1551-7616, 1935-0465 |
| best_oa_location.source.type | journal |
| best_oa_location.source.is_oa | False |
| best_oa_location.source.issn_l | 0094-243X |
| best_oa_location.source.is_core | True |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | AIP conference proceedings |
| best_oa_location.source.host_organization | https://openalex.org/P4310320257 |
| best_oa_location.source.host_organization_name | American Institute of Physics |
| best_oa_location.source.host_organization_lineage | https://openalex.org/P4310320257 |
| best_oa_location.source.host_organization_lineage_names | American Institute of Physics |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://pubs.aip.org/aip/acp/article-pdf/doi/10.1063/5.0198822/19833345/020017_1_5.0198822.pdf |
| best_oa_location.version | publishedVersion |
| best_oa_location.raw_type | proceedings-article |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | True |
| best_oa_location.is_published | True |
| best_oa_location.raw_source_name | AIP Conference Proceedings |
| best_oa_location.landing_page_url | https://doi.org/10.1063/5.0198822 |
| primary_location.id | doi:10.1063/5.0198822 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S2764696622 |
| primary_location.source.issn | 0094-243X, 1551-7616, 1935-0465 |
| primary_location.source.type | journal |
| primary_location.source.is_oa | False |
| primary_location.source.issn_l | 0094-243X |
| primary_location.source.is_core | True |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | AIP conference proceedings |
| primary_location.source.host_organization | https://openalex.org/P4310320257 |
| primary_location.source.host_organization_name | American Institute of Physics |
| primary_location.source.host_organization_lineage | https://openalex.org/P4310320257 |
| primary_location.source.host_organization_lineage_names | American Institute of Physics |
| primary_location.license | |
| primary_location.pdf_url | https://pubs.aip.org/aip/acp/article-pdf/doi/10.1063/5.0198822/19833345/020017_1_5.0198822.pdf |
| primary_location.version | publishedVersion |
| primary_location.raw_type | proceedings-article |
| primary_location.license_id | |
| primary_location.is_accepted | True |
| primary_location.is_published | True |
| primary_location.raw_source_name | AIP Conference Proceedings |
| primary_location.landing_page_url | https://doi.org/10.1063/5.0198822 |
| publication_date | 2024-01-01 |
| publication_year | 2024 |
| referenced_works | https://openalex.org/W2613977835, https://openalex.org/W3122907670, https://openalex.org/W4213377284, https://openalex.org/W4361732328, https://openalex.org/W2595653137, https://openalex.org/W2887782043, https://openalex.org/W3210610489, https://openalex.org/W2989817717, https://openalex.org/W2948937448, https://openalex.org/W4213016227, https://openalex.org/W3125162565, https://openalex.org/W3175462521, https://openalex.org/W2971050273, https://openalex.org/W2954709811, https://openalex.org/W2797630160, https://openalex.org/W2903893278, https://openalex.org/W2805705517, https://openalex.org/W2807883874, https://openalex.org/W2729412353, https://openalex.org/W2956090150, https://openalex.org/W2962977603, https://openalex.org/W2896457183, https://openalex.org/W2965373594 |
| referenced_works_count | 23 |
| abstract_inverted_index.a | 47, 79 |
| abstract_inverted_index.as | 33 |
| abstract_inverted_index.by | 101 |
| abstract_inverted_index.in | 66, 83, 98, 126 |
| abstract_inverted_index.of | 6, 42, 63, 94, 113 |
| abstract_inverted_index.on | 51 |
| abstract_inverted_index.to | 38, 72 |
| abstract_inverted_index.The | 19 |
| abstract_inverted_index.and | 9, 25, 36, 85, 120 |
| abstract_inverted_index.for | 14, 116, 123 |
| abstract_inverted_index.the | 4, 40, 58, 61, 75, 89, 92, 99, 108, 111 |
| abstract_inverted_index.(ML) | 24 |
| abstract_inverted_index.BERT | 76 |
| abstract_inverted_index.This | 0 |
| abstract_inverted_index.deep | 26 |
| abstract_inverted_index.hate | 16, 43, 69, 117 |
| abstract_inverted_index.such | 32 |
| abstract_inverted_index.text | 7 |
| abstract_inverted_index.this | 127 |
| abstract_inverted_index.BERT, | 34 |
| abstract_inverted_index.boost | 82 |
| abstract_inverted_index.class | 96 |
| abstract_inverted_index.macro | 84 |
| abstract_inverted_index.paper | 2 |
| abstract_inverted_index.study | 20, 59, 90 |
| abstract_inverted_index.these | 64 |
| abstract_inverted_index.three | 52 |
| abstract_inverted_index.during | 105 |
| abstract_inverted_index.future | 124 |
| abstract_inverted_index.models | 31, 65, 77, 115 |
| abstract_inverted_index.speech | 17, 44, 118 |
| abstract_inverted_index.Through | 46 |
| abstract_inverted_index.diverse | 53 |
| abstract_inverted_index.domain. | 128 |
| abstract_inverted_index.employs | 21 |
| abstract_inverted_index.exhibit | 78 |
| abstract_inverted_index.improve | 39 |
| abstract_inverted_index.machine | 22 |
| abstract_inverted_index.models, | 28 |
| abstract_inverted_index.natural | 10 |
| abstract_inverted_index.speech. | 70 |
| abstract_inverted_index.Compared | 71 |
| abstract_inverted_index.Overall, | 107 |
| abstract_inverted_index.RoBERTa, | 35 |
| abstract_inverted_index.accuracy | 41 |
| abstract_inverted_index.analysis | 50 |
| abstract_inverted_index.datasets | 54, 100 |
| abstract_inverted_index.explores | 3 |
| abstract_inverted_index.insights | 122 |
| abstract_inverted_index.language | 11 |
| abstract_inverted_index.learning | 23, 27 |
| abstract_inverted_index.provides | 121 |
| abstract_inverted_index.research | 1, 109 |
| abstract_inverted_index.sampling | 103 |
| abstract_inverted_index.weighted | 86 |
| abstract_inverted_index.addresses | 91 |
| abstract_inverted_index.challenge | 93 |
| abstract_inverted_index.detection | 119 |
| abstract_inverted_index.empirical | 49 |
| abstract_inverted_index.employing | 102 |
| abstract_inverted_index.enhancing | 15 |
| abstract_inverted_index.including | 29 |
| abstract_inverted_index.potential | 112 |
| abstract_inverted_index.training. | 106 |
| abstract_inverted_index.Data-ALW2, | 56 |
| abstract_inverted_index.F1-scores. | 87 |
| abstract_inverted_index.accurately | 67 |
| abstract_inverted_index.baselines, | 74 |
| abstract_inverted_index.detection. | 18 |
| abstract_inverted_index.highlights | 110 |
| abstract_inverted_index.imbalanced | 95 |
| abstract_inverted_index.processing | 12 |
| abstract_inverted_index.techniques | 13, 104 |
| abstract_inverted_index.Data-OLID), | 57 |
| abstract_inverted_index.DistilBERT, | 37 |
| abstract_inverted_index.application | 5 |
| abstract_inverted_index.exploration | 125 |
| abstract_inverted_index.identifying | 68 |
| abstract_inverted_index.performance | 81 |
| abstract_inverted_index.significant | 80 |
| abstract_inverted_index.traditional | 73 |
| abstract_inverted_index.transformer | 30, 114 |
| abstract_inverted_index.(Data-ICWSM, | 55 |
| abstract_inverted_index.classifiers. | 45 |
| abstract_inverted_index.demonstrates | 60 |
| abstract_inverted_index.Additionally, | 88 |
| abstract_inverted_index.comprehensive | 48 |
| abstract_inverted_index.distributions | 97 |
| abstract_inverted_index.effectiveness | 62 |
| abstract_inverted_index.classification | 8 |
| cited_by_percentile_year.max | 94 |
| cited_by_percentile_year.min | 90 |
| countries_distinct_count | 2 |
| institutions_distinct_count | 4 |
| sustainable_development_goals[0].id | https://metadata.un.org/sdg/5 |
| sustainable_development_goals[0].score | 0.6600000262260437 |
| sustainable_development_goals[0].display_name | Gender equality |
| citation_normalized_percentile.value | 0.65485184 |
| citation_normalized_percentile.is_in_top_1_percent | False |
| citation_normalized_percentile.is_in_top_10_percent | False |
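
The `abstract_inverted_index.*` rows in the payload map each token to the word positions where it occurs in the abstract. A minimal sketch of turning such an index back into running text follows. The helper name `rebuild_abstract` is hypothetical, and the sample dictionary is a truncated excerpt covering only positions 0 to 8 of the rows above.

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a {token: [positions]} mapping."""
    positions = {}
    for token, indexes in inverted_index.items():
        for i in indexes:
            positions[i] = token
    # Join tokens in position order; gaps in the index are simply skipped.
    return " ".join(positions[i] for i in sorted(positions))


# Excerpt of the payload, limited to the first nine word positions.
sample_index = {
    "This": [0],
    "research": [1],
    "paper": [2],
    "explores": [3],
    "the": [4],
    "application": [5],
    "of": [6],
    "text": [7],
    "classification": [8],
}

if __name__ == "__main__":
    print(rebuild_abstract(sample_index))
```

Applied to the full set of rows, the same function should reproduce the abstract shown at the top of the record, since punctuation is stored as part of each token.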