English text topic classification using BERT-based model
The rapid development of big data and artificial intelligence has made text topic classification an important part of natural language processing research and has driven improvements in pre-trained model performance. To promote the application of pre-trained models and improve text topic classification, this paper applies the BERT (Bidirectional Encoder Representations from Transformers) model to English text topic classification. The English text dataset is first preprocessed by denoising, converting to lowercase, and removing stop words, and is then augmented through synonym substitution. Next, the BERT model is pre-trained and optimized, a BERT-based model structure is designed, and a topic classifier is constructed. Finally, the paper evaluates the practical effectiveness of the BERT-based model on English text topic classification. The results show that when the classification number is 5, the BERT-based model achieves its highest accuracy, 96.49%; when the number of tests is 50 and the classification number is 5, its recall and F1 score are 96.10% and 91.66%, respectively. These results indicate that applying the BERT-based model to English text topic classification is feasible: it improves accuracy and recall, reduces classification time, and makes text classification more efficient.
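The preprocessing and augmentation steps named in the abstract (denoising, lowercasing, stop-word removal, synonym substitution) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the authors' denoising rules, stop-word list, or synonym source, so `STOP_WORDS`, `SYNONYMS`, `preprocess`, `augment`, and the `rate` parameter are all hypothetical placeholders.

```python
import random
import re

# Hypothetical stop-word list and synonym table; the abstract does not specify either.
STOP_WORDS = {"a", "an", "the", "of", "and", "to", "in", "is", "are"}
SYNONYMS = {"rapid": ["fast", "quick"], "important": ["significant", "key"]}


def preprocess(text):
    """Denoise, lowercase, and drop stop words; returns a list of tokens."""
    text = re.sub(r"<[^>]+>", " ", text)       # assumed notion of noise: HTML-like tags
    text = re.sub(r"https?://\S+", " ", text)  # assumed notion of noise: URLs
    text = text.lower()
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text)  # keep alphabetic words only
    return [t for t in tokens if t not in STOP_WORDS]


def augment(tokens, rng, rate=0.3):
    """Return a copy of tokens where some words are swapped for a listed synonym."""
    out = []
    for tok in tokens:
        options = SYNONYMS.get(tok)
        if options and rng.random() < rate:
            out.append(rng.choice(options))
        else:
            out.append(tok)
    return out


tokens = preprocess("<b>The</b> rapid development of AI: see https://example.com")
print(tokens)
print(augment(tokens, random.Random(0)))
```

In a real pipeline the augmented token lists would be joined back into strings and passed to a BERT tokenizer; that step is omitted here because the abstract does not describe it.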
Related Topics
- Type: article
- Language: en
- Landing Page: https://doi.org/10.1177/14727978251321982, https://journals.sagepub.com/doi/pdf/10.1177/14727978251321982
- OA Status: bronze
- References: 23
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4408128886
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4408128886 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.1177/14727978251321982 (Digital Object Identifier)
- Title: English text topic classification using BERT-based model (work title)
- Type: article (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2025 (year of publication)
- Publication date: 2025-01-01 (full publication date if available)
- Authors: Xi Li, Lili Jia (list of authors in order)
- Landing page: https://doi.org/10.1177/14727978251321982 (publisher landing page)
- PDF URL: https://journals.sagepub.com/doi/pdf/10.1177/14727978251321982 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: bronze (open access status per OpenAlex)
- OA URL: https://journals.sagepub.com/doi/pdf/10.1177/14727978251321982 (direct OA link when available)
- Concepts: Computer science, Natural language processing, Artificial intelligence, Information retrieval (top concepts (fields/topics) attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
- References (count): 23 (number of works referenced by this work)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4408128886 |
| doi | https://doi.org/10.1177/14727978251321982 |
| ids.doi | https://doi.org/10.1177/14727978251321982 |
| ids.openalex | https://openalex.org/W4408128886 |
| fwci | 0.0 |
| type | article |
| title | English text topic classification using BERT-based model |
| biblio.issue | 1 |
| biblio.volume | 25 |
| biblio.last_page | 684 |
| biblio.first_page | 669 |
| topics[0].id | https://openalex.org/T13083 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.991100013256073 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Advanced Text Analysis Techniques |
| topics[1].id | https://openalex.org/T10028 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9902999997138977 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Topic Modeling |
| topics[2].id | https://openalex.org/T11550 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9901000261306763 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1702 |
| topics[2].subfield.display_name | Artificial Intelligence |
| topics[2].display_name | Text and Document Classification Technologies |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C41008148 |
| concepts[0].level | 0 |
| concepts[0].score | 0.674817681312561 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[0].display_name | Computer science |
| concepts[1].id | https://openalex.org/C204321447 |
| concepts[1].level | 1 |
| concepts[1].score | 0.6637598276138306 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q30642 |
| concepts[1].display_name | Natural language processing |
| concepts[2].id | https://openalex.org/C154945302 |
| concepts[2].level | 1 |
| concepts[2].score | 0.5644122362136841 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[2].display_name | Artificial intelligence |
| concepts[3].id | https://openalex.org/C23123220 |
| concepts[3].level | 1 |
| concepts[3].score | 0.32303550839424133 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q816826 |
| concepts[3].display_name | Information retrieval |
| keywords[0].id | https://openalex.org/keywords/computer-science |
| keywords[0].score | 0.674817681312561 |
| keywords[0].display_name | Computer science |
| keywords[1].id | https://openalex.org/keywords/natural-language-processing |
| keywords[1].score | 0.6637598276138306 |
| keywords[1].display_name | Natural language processing |
| keywords[2].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[2].score | 0.5644122362136841 |
| keywords[2].display_name | Artificial intelligence |
| keywords[3].id | https://openalex.org/keywords/information-retrieval |
| keywords[3].score | 0.32303550839424133 |
| keywords[3].display_name | Information retrieval |
| language | en |
| locations[0].id | doi:10.1177/14727978251321982 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S2765058733 |
| locations[0].source.issn | 1472-7978, 1875-8983 |
| locations[0].source.type | journal |
| locations[0].source.is_oa | False |
| locations[0].source.issn_l | 1472-7978 |
| locations[0].source.is_core | True |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | Journal of Computational Methods in Sciences and Engineering |
| locations[0].source.host_organization | https://openalex.org/P4310318577 |
| locations[0].source.host_organization_name | IOS Press |
| locations[0].source.host_organization_lineage | https://openalex.org/P4310318577 |
| locations[0].source.host_organization_lineage_names | IOS Press |
| locations[0].license | |
| locations[0].pdf_url | https://journals.sagepub.com/doi/pdf/10.1177/14727978251321982 |
| locations[0].version | publishedVersion |
| locations[0].raw_type | journal-article |
| locations[0].license_id | |
| locations[0].is_accepted | True |
| locations[0].is_published | True |
| locations[0].raw_source_name | Journal of Computational Methods in Sciences and Engineering |
| locations[0].landing_page_url | https://doi.org/10.1177/14727978251321982 |
| indexed_in | crossref |
| authorships[0].author.id | https://openalex.org/A5100407758 |
| authorships[0].author.orcid | https://orcid.org/0000-0003-3023-1662 |
| authorships[0].author.display_name | Xi Li |
| authorships[0].countries | CN |
| authorships[0].affiliations[0].institution_ids | https://openalex.org/I206324368 |
| authorships[0].affiliations[0].raw_affiliation_string | School of Foreign Languages, Shaoyang University, Shaoyang, China |
| authorships[0].institutions[0].id | https://openalex.org/I206324368 |
| authorships[0].institutions[0].ror | https://ror.org/03fx09x73 |
| authorships[0].institutions[0].type | education |
| authorships[0].institutions[0].lineage | https://openalex.org/I206324368 |
| authorships[0].institutions[0].country_code | CN |
| authorships[0].institutions[0].display_name | Shaoyang University |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Xi Li |
| authorships[0].is_corresponding | False |
| authorships[0].raw_affiliation_strings | School of Foreign Languages, Shaoyang University, Shaoyang, China |
| authorships[1].author.id | https://openalex.org/A5101599363 |
| authorships[1].author.orcid | https://orcid.org/0000-0003-2336-6446 |
| authorships[1].author.display_name | Lili Jia |
| authorships[1].countries | CN |
| authorships[1].affiliations[0].institution_ids | https://openalex.org/I4210131125 |
| authorships[1].affiliations[0].raw_affiliation_string | School of Foreign Languages, Huanghuai University, Zhumadian, China |
| authorships[1].institutions[0].id | https://openalex.org/I4210131125 |
| authorships[1].institutions[0].ror | https://ror.org/02k92ks68 |
| authorships[1].institutions[0].type | education |
| authorships[1].institutions[0].lineage | https://openalex.org/I4210131125 |
| authorships[1].institutions[0].country_code | CN |
| authorships[1].institutions[0].display_name | Huanghuai University |
| authorships[1].author_position | last |
| authorships[1].raw_author_name | Lili Jia |
| authorships[1].is_corresponding | False |
| authorships[1].raw_affiliation_strings | School of Foreign Languages, Huanghuai University, Zhumadian, China |
| has_content.pdf | True |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://journals.sagepub.com/doi/pdf/10.1177/14727978251321982 |
| open_access.oa_status | bronze |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | English text topic classification using BERT-based model |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T03:46:38.306776 |
| primary_topic.id | https://openalex.org/T13083 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.991100013256073 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Advanced Text Analysis Techniques |
| related_works | https://openalex.org/W4391375266, https://openalex.org/W2899084033, https://openalex.org/W2748952813, https://openalex.org/W2390279801, https://openalex.org/W4391913857, https://openalex.org/W2358668433, https://openalex.org/W4396701345, https://openalex.org/W2376932109, https://openalex.org/W2001405890, https://openalex.org/W3204019825 |
| cited_by_count | 0 |
| locations_count | 1 |
| best_oa_location.id | doi:10.1177/14727978251321982 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S2765058733 |
| best_oa_location.source.issn | 1472-7978, 1875-8983 |
| best_oa_location.source.type | journal |
| best_oa_location.source.is_oa | False |
| best_oa_location.source.issn_l | 1472-7978 |
| best_oa_location.source.is_core | True |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | Journal of Computational Methods in Sciences and Engineering |
| best_oa_location.source.host_organization | https://openalex.org/P4310318577 |
| best_oa_location.source.host_organization_name | IOS Press |
| best_oa_location.source.host_organization_lineage | https://openalex.org/P4310318577 |
| best_oa_location.source.host_organization_lineage_names | IOS Press |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://journals.sagepub.com/doi/pdf/10.1177/14727978251321982 |
| best_oa_location.version | publishedVersion |
| best_oa_location.raw_type | journal-article |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | True |
| best_oa_location.is_published | True |
| best_oa_location.raw_source_name | Journal of Computational Methods in Sciences and Engineering |
| best_oa_location.landing_page_url | https://doi.org/10.1177/14727978251321982 |
| primary_location.id | doi:10.1177/14727978251321982 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S2765058733 |
| primary_location.source.issn | 1472-7978, 1875-8983 |
| primary_location.source.type | journal |
| primary_location.source.is_oa | False |
| primary_location.source.issn_l | 1472-7978 |
| primary_location.source.is_core | True |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | Journal of Computational Methods in Sciences and Engineering |
| primary_location.source.host_organization | https://openalex.org/P4310318577 |
| primary_location.source.host_organization_name | IOS Press |
| primary_location.source.host_organization_lineage | https://openalex.org/P4310318577 |
| primary_location.source.host_organization_lineage_names | IOS Press |
| primary_location.license | |
| primary_location.pdf_url | https://journals.sagepub.com/doi/pdf/10.1177/14727978251321982 |
| primary_location.version | publishedVersion |
| primary_location.raw_type | journal-article |
| primary_location.license_id | |
| primary_location.is_accepted | True |
| primary_location.is_published | True |
| primary_location.raw_source_name | Journal of Computational Methods in Sciences and Engineering |
| primary_location.landing_page_url | https://doi.org/10.1177/14727978251321982 |
| publication_date | 2025-01-01 |
| publication_year | 2025 |
| referenced_works | https://openalex.org/W4226418765, https://openalex.org/W3021053579, https://openalex.org/W4309217541, https://openalex.org/W3093573261, https://openalex.org/W4225139287, https://openalex.org/W3157815101, https://openalex.org/W3156333129, https://openalex.org/W3168278108, https://openalex.org/W2909182718, https://openalex.org/W2898339904, https://openalex.org/W4224135931, https://openalex.org/W4281756552, https://openalex.org/W4366698663, https://openalex.org/W3109416014, https://openalex.org/W4282591925, https://openalex.org/W4317664001, https://openalex.org/W4239564559, https://openalex.org/W3168170345, https://openalex.org/W4224957132, https://openalex.org/W3207838037, https://openalex.org/W3048338659, https://openalex.org/W3128513378, https://openalex.org/W4281673713 |
| referenced_works_count | 23 |
| abstract_inverted_index.a | 113, 124 |
| abstract_inverted_index.5, | 154 |
| abstract_inverted_index.5. | 192 |
| abstract_inverted_index.F1 | 176 |
| abstract_inverted_index.In | 33 |
| abstract_inverted_index.It | 210 |
| abstract_inverted_index.an | 14, 64 |
| abstract_inverted_index.as | 82 |
| abstract_inverted_index.by | 120 |
| abstract_inverted_index.in | 139 |
| abstract_inverted_index.is | 153, 170, 191, 207 |
| abstract_inverted_index.it | 23, 225 |
| abstract_inverted_index.of | 3, 17, 29, 40, 47, 67, 123, 135, 163, 168, 178, 234 |
| abstract_inverted_index.to | 35, 62, 85, 95, 202, 226 |
| abstract_inverted_index.50, | 171 |
| abstract_inverted_index.The | 0, 72, 144, 193 |
| abstract_inverted_index.and | 6, 22, 43, 87, 90, 107, 112, 175, 184, 215, 220 |
| abstract_inverted_index.are | 182 |
| abstract_inverted_index.big | 4 |
| abstract_inverted_index.can | 158, 211, 229 |
| abstract_inverted_index.has | 9, 24 |
| abstract_inverted_index.its | 213 |
| abstract_inverted_index.the | 27, 38, 45, 54, 75, 97, 102, 108, 121, 132, 136, 150, 155, 160, 166, 172, 179, 188, 199, 232 |
| abstract_inverted_index.was | 105, 110, 117 |
| abstract_inverted_index.BERT | 55, 103 |
| abstract_inverted_index.also | 25, 130 |
| abstract_inverted_index.data | 5 |
| abstract_inverted_index.from | 59 |
| abstract_inverted_index.made | 10 |
| abstract_inverted_index.part | 16 |
| abstract_inverted_index.rate | 174 |
| abstract_inverted_index.show | 147 |
| abstract_inverted_index.such | 81 |
| abstract_inverted_index.text | 11, 48, 69, 73, 77, 99, 141, 204, 227, 235 |
| abstract_inverted_index.that | 148, 197 |
| abstract_inverted_index.then | 91 |
| abstract_inverted_index.this | 51, 128 |
| abstract_inverted_index.uses | 92 |
| abstract_inverted_index.when | 149, 165, 187 |
| abstract_inverted_index.data. | 100 |
| abstract_inverted_index.model | 31, 61, 104, 109, 115, 138, 157, 181, 201 |
| abstract_inverted_index.order | 34 |
| abstract_inverted_index.paper | 52 |
| abstract_inverted_index.rapid | 1 |
| abstract_inverted_index.tests | 169 |
| abstract_inverted_index.time, | 219 |
| abstract_inverted_index.topic | 12, 49, 70, 125, 142, 205 |
| abstract_inverted_index.value | 177 |
| abstract_inverted_index.96.10% | 183 |
| abstract_inverted_index.better | 36, 230 |
| abstract_inverted_index.effect | 46 |
| abstract_inverted_index.models | 42 |
| abstract_inverted_index.number | 152, 167, 190 |
| abstract_inverted_index.recall | 173 |
| abstract_inverted_index.reduce | 217 |
| abstract_inverted_index.stops, | 89 |
| abstract_inverted_index.91.66%, | 185 |
| abstract_inverted_index.96.49%; | 164 |
| abstract_inverted_index.Encoder | 57 |
| abstract_inverted_index.English | 68, 76, 98, 140, 203 |
| abstract_inverted_index.achieve | 159 |
| abstract_inverted_index.article | 129 |
| abstract_inverted_index.conduct | 63 |
| abstract_inverted_index.dataset | 78 |
| abstract_inverted_index.enhance | 96 |
| abstract_inverted_index.highest | 161 |
| abstract_inverted_index.improve | 44, 212, 221, 231 |
| abstract_inverted_index.natural | 18 |
| abstract_inverted_index.promote | 37 |
| abstract_inverted_index.recall, | 216 |
| abstract_inverted_index.results | 146, 195 |
| abstract_inverted_index.through | 79 |
| abstract_inverted_index.Applying | 224 |
| abstract_inverted_index.Finally, | 127 |
| abstract_inverted_index.accuracy | 162, 214 |
| abstract_inverted_index.applying | 198 |
| abstract_inverted_index.followed | 119 |
| abstract_inverted_index.in-depth | 65 |
| abstract_inverted_index.indicate | 196 |
| abstract_inverted_index.language | 19 |
| abstract_inverted_index.promoted | 26 |
| abstract_inverted_index.removing | 88 |
| abstract_inverted_index.research | 145, 194 |
| abstract_inverted_index.designed, | 118 |
| abstract_inverted_index.evaluated | 131 |
| abstract_inverted_index.feasible. | 209 |
| abstract_inverted_index.important | 15 |
| abstract_inverted_index.optimized | 111 |
| abstract_inverted_index.practical | 133 |
| abstract_inverted_index.research, | 21 |
| abstract_inverted_index.structure | 116 |
| abstract_inverted_index.BERT-based | 114, 137, 156, 180, 200 |
| abstract_inverted_index.artificial | 7 |
| abstract_inverted_index.completely | 208 |
| abstract_inverted_index.converting | 84 |
| abstract_inverted_index.denoising, | 83 |
| abstract_inverted_index.efficiency | 233 |
| abstract_inverted_index.introduces | 53 |
| abstract_inverted_index.lowercase, | 86 |
| abstract_inverted_index.operations | 80 |
| abstract_inverted_index.processing | 20 |
| abstract_inverted_index.synonymous | 93 |
| abstract_inverted_index.application | 39 |
| abstract_inverted_index.classifier. | 126 |
| abstract_inverted_index.development | 2 |
| abstract_inverted_index.exploration | 66 |
| abstract_inverted_index.pre-trained | 30, 41 |
| abstract_inverted_index.Transformer) | 60 |
| abstract_inverted_index.construction | 122 |
| abstract_inverted_index.intelligence | 8 |
| abstract_inverted_index.optimization | 28 |
| abstract_inverted_index.performance. | 32, 223 |
| abstract_inverted_index.pre-trained, | 106 |
| abstract_inverted_index.preprocesses | 74 |
| abstract_inverted_index.substitution | 94 |
| abstract_inverted_index.Subsequently, | 101 |
| abstract_inverted_index.effectiveness | 134 |
| abstract_inverted_index.respectively, | 186 |
| abstract_inverted_index.(Bidirectional | 56 |
| abstract_inverted_index.classification | 13, 151, 189, 206, 218, 222, 228 |
| abstract_inverted_index.Representations | 58 |
| abstract_inverted_index.classification, | 50 |
| abstract_inverted_index.classification. | 71, 143, 236 |
| cited_by_percentile_year | |
| countries_distinct_count | 1 |
| institutions_distinct_count | 2 |
| citation_normalized_percentile.value | 0.02410784 |
| citation_normalized_percentile.is_in_top_1_percent | False |
| citation_normalized_percentile.is_in_top_10_percent | True |
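
The `abstract_inverted_index.*` rows in the table above store the abstract as a map from each token to the positions where it occurs. A minimal sketch of turning such a map back into running text is shown below. It assumes the positions have already been parsed from the comma-separated strings in the table into lists of integers; `rebuild_abstract` and `sample` are illustrative names, and `sample` holds only the entries for positions 0 to 5.

```python
def rebuild_abstract(inverted_index):
    """Turn {token: [positions]} into the token sequence joined by spaces."""
    positioned = []
    for token, positions in inverted_index.items():
        for pos in positions:
            positioned.append((pos, token))
    positioned.sort()  # order tokens by their position in the abstract
    return " ".join(token for _, token in positioned)


# Subset of the table rows, restricted to positions 0-5 of the abstract.
sample = {
    "The": [0],
    "rapid": [1],
    "development": [2],
    "of": [3],
    "big": [4],
    "data": [5],
}
print(rebuild_abstract(sample))  # prints: The rapid development of big data
```

Because punctuation is stored as part of each token (for example `5,` and `5.` are separate keys), joining on single spaces is enough to recover the original sentence boundaries.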