Distilling Knowledge from Large Language Models: A Concept Bottleneck Model for Hate and Counter Speech Recognition
2025 · Open Access · DOI: https://doi.org/10.48550/arxiv.2508.08274
The rapid rise of hate speech on social media has had an unprecedented impact on society, making automated methods for detecting such content essential. Unlike prior black-box models, we propose a transparent method for automated hate and counter speech recognition, the "Speech Concept Bottleneck Model" (SCBM), which uses adjectives as human-interpretable bottleneck concepts. SCBM leverages large language models (LLMs) to map input texts to an abstract adjective-based representation, which is then passed to a lightweight classifier for downstream tasks. Across five benchmark datasets spanning multiple languages and platforms (e.g., Twitter, Reddit, YouTube), SCBM achieves an average macro-F1 score of 0.69, outperforming the most recently reported results in the literature on four of the five datasets. Beyond high recognition accuracy, SCBM provides a high degree of both local and global interpretability. Furthermore, fusing our adjective-based concept representation with transformer embeddings yields a 1.8% average performance increase across all datasets, showing that the proposed representation captures complementary information. Our results demonstrate that adjective-based concept representations can serve as compact, interpretable, and effective encodings for hate and counter speech recognition. With an adapted adjective set, the method can also be applied to other NLP tasks.
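The pipeline the abstract describes (LLM scores input text against a small adjective vocabulary, a lightweight classifier operates on that bottleneck vector) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the adjective list, cue words, and training data are all hypothetical, and a keyword heuristic stands in for the LLM relevance judgment so the sketch runs offline.

```python
import math

# Hypothetical adjective vocabulary; the paper's actual bottleneck set may differ.
ADJECTIVES = ["hostile", "insulting", "threatening", "supportive", "respectful"]

# Toy cue words per adjective -- a stand-in for the LLM relevance judgment.
CUES = {
    "hostile": {"hate", "despise"},
    "insulting": {"idiot", "stupid"},
    "threatening": {"hurt", "attack"},
    "supportive": {"agree", "welcome"},
    "respectful": {"please", "thanks"},
}

def score_adjectives(text):
    """Map text to an adjective-relevance vector in [0, 1] (the bottleneck)."""
    tokens = {w.strip(".,!?") for w in text.lower().split()}
    return [1.0 if tokens & CUES[a] else 0.0 for a in ADJECTIVES]

def train_logistic(X, y, lr=0.5, epochs=200):
    """Lightweight logistic-regression classifier over the bottleneck features."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            g = 1.0 / (1.0 + math.exp(-z)) - yi  # prediction error
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def predict(w, b, text):
    z = sum(wj * xj for wj, xj in zip(w, score_adjectives(text))) + b
    return 1 if z > 0 else 0  # 1 = hate, 0 = counter/other

# Tiny illustrative training set (1 = hate, 0 = counter speech).
texts = ["you are a stupid idiot", "i hate you all",
         "thanks, i agree with you", "please stay, you are welcome here"]
labels = [1, 1, 0, 0]
w, b = train_logistic([score_adjectives(t) for t in texts], labels)

# Global interpretability: one learned weight per human-readable adjective.
for adj, wj in zip(ADJECTIVES, w):
    print(f"{adj:>12}: {wj:+.2f}")
```

The interpretability claim falls out of the structure: every classifier weight attaches to a named adjective (global explanation), and each input's bottleneck vector shows which adjectives fired (local explanation).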
Record details
- Type: preprint
- Language: en
- Landing page: http://arxiv.org/abs/2508.08274
- PDF: https://arxiv.org/pdf/2508.08274
- OA status: green
- OpenAlex ID: https://openalex.org/W4416854721
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4416854721 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2508.08274
- Title: Distilling Knowledge from Large Language Models: A Concept Bottleneck Model for Hate and Counter Speech Recognition
- Type: preprint
- Language: en
- Publication year: 2025
- Publication date: 2025-07-30
- Authors: Djordje Slijepčević, Xihui Chen, Adrian Jaques Böck, Andreas Babic
- Landing page: https://arxiv.org/abs/2508.08274
- PDF URL: https://arxiv.org/pdf/2508.08274
- Open access: yes
- OA status: green
- OA URL: https://arxiv.org/pdf/2508.08274
- Cited by: 0
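The open-access fields above are nested objects in the OpenAlex work record. A short sketch of how they are typically read, using a hand-copied excerpt of this record's payload (the `oa_pdf` helper is illustrative, not part of any OpenAlex client library):

```python
# Excerpt of the OpenAlex work payload shown on this page.
work = {
    "open_access": {"is_oa": True, "oa_status": "green",
                    "oa_url": "https://arxiv.org/pdf/2508.08274"},
    "best_oa_location": {"pdf_url": "https://arxiv.org/pdf/2508.08274",
                         "version": "submittedVersion"},
}

def oa_pdf(work):
    """Prefer the best OA location's PDF; fall back to open_access.oa_url."""
    loc = work.get("best_oa_location") or {}
    return loc.get("pdf_url") or work.get("open_access", {}).get("oa_url")

print(oa_pdf(work))  # https://arxiv.org/pdf/2508.08274
```

`best_oa_location` can be null for closed works, hence the `or {}` guard before the lookup.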
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4416854721 |
| doi | https://doi.org/10.48550/arxiv.2508.08274 |
| ids.doi | https://doi.org/10.48550/arxiv.2508.08274 |
| ids.openalex | https://openalex.org/W4416854721 |
| fwci | |
| type | preprint |
| title | Distilling Knowledge from Large Language Models: A Concept Bottleneck Model for Hate and Counter Speech Recognition |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2508.08274 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2508.08274 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2508.08274 |
| locations[1].id | doi:10.48550/arxiv.2508.08274 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2508.08274 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5055814440 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-2295-7466 |
| authorships[0].author.display_name | Djordje Slijepčević |
| authorships[0].author_position | middle |
| authorships[0].raw_author_name | Slijepčević, Djordje |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5101950313 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-8131-5092 |
| authorships[1].author.display_name | Xihui Chen |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Chen, Xihui |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5109570079 |
| authorships[2].author.orcid | https://orcid.org/0000-0003-1972-0473 |
| authorships[2].author.display_name | Adrian Jaques Böck |
| authorships[2].author_position | last |
| authorships[2].raw_author_name | Böck, Adrian Jaques |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5110476333 |
| authorships[3].author.orcid | |
| authorships[3].author.display_name | Andreas Babic |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Babic, Andreas |
| authorships[3].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2508.08274 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Distilling Knowledge from Large Language Models: A Concept Bottleneck Model for Hate and Counter Speech Recognition |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-12-01T12:55:20.459004 |
| primary_topic | |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2508.08274 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2508.08274 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2508.08274 |
| primary_location.id | pmh:oai:arXiv.org:2508.08274 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2508.08274 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2508.08274 |
| publication_date | 2025-07-30 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index | (inverted-index encoding of the abstract; full text given above) |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 4 |
| citation_normalized_percentile | |
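OpenAlex ships abstracts as an `abstract_inverted_index`: a map from each token to the list of word positions where it occurs. Reconstructing the plain-text abstract is a matter of placing every token at each of its positions and joining in order. A minimal sketch, using a tiny excerpt of this record's index:

```python
def reconstruct_abstract(inverted_index):
    """Rebuild the abstract text from an OpenAlex abstract_inverted_index."""
    positions = {}
    for token, idxs in inverted_index.items():
        for i in idxs:
            positions[i] = token
    return " ".join(positions[i] for i in sorted(positions))

# Excerpt of the index for this work (first nine word positions).
sample = {"The": [0], "rapid": [1], "increase": [2], "in": [3],
          "hate": [4], "speech": [5], "on": [6], "social": [7], "media": [8]}
print(reconstruct_abstract(sample))
# The rapid increase in hate speech on social media
```

Tokens that occur more than once (e.g. `hate` at positions 4, 36, and 176 in the full index) simply contribute several entries to `positions`.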