Entity Boundary Detection in Social Texts Using BiLSTM-CRF with Integrated Social Features
2025 · Open Access · DOI: https://doi.org/10.20944/preprints202505.1069.v1
This study addresses the challenges of unstructured expressions, semantic ambiguity, and noise interference in named entity recognition tasks on social texts. A recognition method is proposed that integrates the BiLSTM-CRF model with multi-source social features. The method uses a bidirectional long short-term memory network to extract contextual semantic information and applies a conditional random field for globally optimal sequence labeling. On this basis, social semantic features such as user interaction relations and topic labels are incorporated through feature concatenation. This enhances the model's ability to distinguish entity boundaries and categories. Experiments are conducted on the Twitter NER dataset. A systematic comparison is performed across different word embedding strategies, multi-source feature fusion settings, and input sequence lengths. The results show that the proposed method outperforms the baseline models in accuracy, precision, recall, and F1 score. In particular, it demonstrates stronger robustness and recognition ability when dealing with non-standard social texts. The model framework and experimental analysis presented in this paper offer effective technical support and methodological reference for named entity recognition in social text environments.
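The two mechanisms named in the abstract, feature concatenation and globally optimal CRF decoding, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names `concat_features` and `viterbi`, the three-tag scheme (O, B, I), and every score below are assumptions made up for illustration. In the described model the emission scores would come from a BiLSTM run over the concatenated vectors; here they are hand-picked.

```python
from typing import List, Sequence


def concat_features(word_vecs: Sequence[Sequence[float]],
                    social_vecs: Sequence[Sequence[float]]) -> List[List[float]]:
    """Append each token's social feature vector to its word vector."""
    if len(word_vecs) != len(social_vecs):
        raise ValueError("need one social feature vector per token")
    return [list(w) + list(s) for w, s in zip(word_vecs, social_vecs)]


def viterbi(emissions: Sequence[Sequence[float]],
            transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring tag sequence under a linear-chain CRF.

    emissions[t][j]   : score of tag j at position t
    transitions[i][j] : score of moving from tag i to tag j
    """
    n_tags = len(transitions)
    score = list(emissions[0])      # best score of any path ending in each tag
    backptrs = []                   # backptrs[t-1][j] = best previous tag for j
    for t in range(1, len(emissions)):
        new_score, ptr = [], []
        for j in range(n_tags):
            best_i = max(range(n_tags),
                         key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j]
                             + emissions[t][j])
            ptr.append(best_i)
        score = new_score
        backptrs.append(ptr)
    # Follow the back-pointers from the best final tag.
    path = [max(range(n_tags), key=lambda j: score[j])]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


# Hypothetical tag ids: 0 = O, 1 = B, 2 = I.
# The only non-zero transition penalizes the invalid move O -> I.
transitions = [[0.0, 0.0, -10.0],
               [0.0, 0.0, 0.0],
               [0.0, 0.0, 0.0]]
# Made-up emission scores for a three-token input.
emissions = [[1.0, 0.5, 0.0],
             [0.0, 0.0, 1.0],
             [1.0, 0.0, 0.0]]

fused = concat_features([[0.1, 0.2]], [[1.0]])
best_path = viterbi(emissions, transitions)
```

Picking the best tag per token independently would give O, I, O here, which starts an entity with an inside tag. Decoding over the whole sequence lets the transition penalty override the locally best first tag, which is the sense in which CRF labeling is "globally optimal".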
Related Topics
- Type: preprint
- Language: en
- Landing Page:
  - https://doi.org/10.20944/preprints202505.1069.v1
  - https://www.preprints.org/frontend/manuscript/704c0a8609ba0d3f43929efebfa5c2b5/download_pub
- OA Status: green
- Cited By: 3
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4410454738
Raw OpenAlex JSON
| Field | Value | Description |
|---|---|---|
| OpenAlex ID | https://openalex.org/W4410454738 | Canonical identifier for this work in OpenAlex |
| DOI | https://doi.org/10.20944/preprints202505.1069.v1 | Digital Object Identifier |
| Title | Entity Boundary Detection in Social Texts Using BiLSTM-CRF with Integrated Social Features | Work title |
| Type | preprint | OpenAlex work type |
| Language | en | Primary language |
| Publication year | 2025 | Year of publication |
| Publication date | 2025-05-14 | Full publication date if available |
| Authors | Yufan Zhao, Weidong Zhang, Yu Cheng, Zhaoyang Xu, Yexin Tian, Zijing Wei | List of authors in order |
| Landing page | https://doi.org/10.20944/preprints202505.1069.v1 | Publisher landing page |
| PDF URL | https://www.preprints.org/frontend/manuscript/704c0a8609ba0d3f43929efebfa5c2b5/download_pub | Direct link to full text PDF |
| Open access | Yes | Whether a free full text is available |
| OA status | green | Open access status per OpenAlex |
| OA URL | https://www.preprints.org/frontend/manuscript/704c0a8609ba0d3f43929efebfa5c2b5/download_pub | Direct OA link when available |
| Concepts | Boundary (topology), Computer science, Artificial intelligence, Information retrieval, Natural language processing, Data science, Mathematics, Mathematical analysis | Top concepts (fields/topics) attached by OpenAlex |
| Cited by | 3 | Total citation count in OpenAlex |
| Citations by year (recent) | 2025: 3 | Per-year citation counts (last 5 years) |
| Related works (count) | 10 | Other works algorithmically related by OpenAlex |
Full payload
| id | https://openalex.org/W4410454738 |
|---|---|
| doi | https://doi.org/10.20944/preprints202505.1069.v1 |
| ids.doi | https://doi.org/10.20944/preprints202505.1069.v1 |
| ids.openalex | https://openalex.org/W4410454738 |
| fwci | 16.52046332 |
| type | preprint |
| title | Entity Boundary Detection in Social Texts Using BiLSTM-CRF with Integrated Social Features |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11719 |
| topics[0].field.id | https://openalex.org/fields/18 |
| topics[0].field.display_name | Decision Sciences |
| topics[0].score | 0.9855999946594238 |
| topics[0].domain.id | https://openalex.org/domains/2 |
| topics[0].domain.display_name | Social Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1803 |
| topics[0].subfield.display_name | Management Science and Operations Research |
| topics[0].display_name | Data Quality and Management |
| topics[1].id | https://openalex.org/T12016 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9555000066757202 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1710 |
| topics[1].subfield.display_name | Information Systems |
| topics[1].display_name | Web Data Mining and Analysis |
| topics[2].id | https://openalex.org/T10679 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9531000256538391 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1710 |
| topics[2].subfield.display_name | Information Systems |
| topics[2].display_name | Service-Oriented Architecture and Web Services |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C62354387 |
| concepts[0].level | 2 |
| concepts[0].score | 0.5692992210388184 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q875399 |
| concepts[0].display_name | Boundary (topology) |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.45729079842567444 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C154945302 |
| concepts[2].level | 1 |
| concepts[2].score | 0.4105333387851715 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[2].display_name | Artificial intelligence |
| concepts[3].id | https://openalex.org/C23123220 |
| concepts[3].level | 1 |
| concepts[3].score | 0.4001203775405884 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q816826 |
| concepts[3].display_name | Information retrieval |
| concepts[4].id | https://openalex.org/C204321447 |
| concepts[4].level | 1 |
| concepts[4].score | 0.39897093176841736 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q30642 |
| concepts[4].display_name | Natural language processing |
| concepts[5].id | https://openalex.org/C2522767166 |
| concepts[5].level | 1 |
| concepts[5].score | 0.3747330605983734 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q2374463 |
| concepts[5].display_name | Data science |
| concepts[6].id | https://openalex.org/C33923547 |
| concepts[6].level | 0 |
| concepts[6].score | 0.10740143060684204 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[6].display_name | Mathematics |
| concepts[7].id | https://openalex.org/C134306372 |
| concepts[7].level | 1 |
| concepts[7].score | 0.0 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q7754 |
| concepts[7].display_name | Mathematical analysis |
| keywords[0].id | https://openalex.org/keywords/boundary |
| keywords[0].score | 0.5692992210388184 |
| keywords[0].display_name | Boundary (topology) |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.45729079842567444 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[2].score | 0.4105333387851715 |
| keywords[2].display_name | Artificial intelligence |
| keywords[3].id | https://openalex.org/keywords/information-retrieval |
| keywords[3].score | 0.4001203775405884 |
| keywords[3].display_name | Information retrieval |
| keywords[4].id | https://openalex.org/keywords/natural-language-processing |
| keywords[4].score | 0.39897093176841736 |
| keywords[4].display_name | Natural language processing |
| keywords[5].id | https://openalex.org/keywords/data-science |
| keywords[5].score | 0.3747330605983734 |
| keywords[5].display_name | Data science |
| keywords[6].id | https://openalex.org/keywords/mathematics |
| keywords[6].score | 0.10740143060684204 |
| keywords[6].display_name | Mathematics |
| language | en |
| locations[0].id | doi:10.20944/preprints202505.1069.v1 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S6309402219 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | Preprints.org |
| locations[0].source.host_organization | |
| locations[0].source.host_organization_name | |
| locations[0].source.host_organization_lineage | https://openalex.org/P4310310987 |
| locations[0].source.host_organization_lineage_names | Multidisciplinary Digital Publishing Institute |
| locations[0].license | cc-by |
| locations[0].pdf_url | https://www.preprints.org/frontend/manuscript/704c0a8609ba0d3f43929efebfa5c2b5/download_pub |
| locations[0].version | acceptedVersion |
| locations[0].raw_type | posted-content |
| locations[0].license_id | https://openalex.org/licenses/cc-by |
| locations[0].is_accepted | True |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | https://doi.org/10.20944/preprints202505.1069.v1 |
| indexed_in | crossref |
| authorships[0].author.id | https://openalex.org/A5007228024 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Yufan Zhao |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Yufan Zhao |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5100428737 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-2433-6823 |
| authorships[1].author.display_name | Weidong Zhang |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Wuyang Zhang |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5102392489 |
| authorships[2].author.orcid | |
| authorships[2].author.display_name | Yu Cheng |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Yu Cheng |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5038456345 |
| authorships[3].author.orcid | |
| authorships[3].author.display_name | Zhaoyang Xu |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Zhaoyang Xu |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5113315307 |
| authorships[4].author.orcid | |
| authorships[4].author.display_name | Yexin Tian |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Yexin Tian |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5114244474 |
| authorships[5].author.orcid | |
| authorships[5].author.display_name | Zijing Wei |
| authorships[5].author_position | last |
| authorships[5].raw_author_name | Zijing Wei |
| authorships[5].is_corresponding | False |
| has_content.pdf | True |
| has_content.grobid_xml | True |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://www.preprints.org/frontend/manuscript/704c0a8609ba0d3f43929efebfa5c2b5/download_pub |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Entity Boundary Detection in Social Texts Using BiLSTM-CRF with Integrated Social Features |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T03:46:38.306776 |
| primary_topic.id | https://openalex.org/T11719 |
| primary_topic.field.id | https://openalex.org/fields/18 |
| primary_topic.field.display_name | Decision Sciences |
| primary_topic.score | 0.9855999946594238 |
| primary_topic.domain.id | https://openalex.org/domains/2 |
| primary_topic.domain.display_name | Social Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1803 |
| primary_topic.subfield.display_name | Management Science and Operations Research |
| primary_topic.display_name | Data Quality and Management |
| related_works | https://openalex.org/W3188962172, https://openalex.org/W2772917594, https://openalex.org/W4312825515, https://openalex.org/W4306742369, https://openalex.org/W4303457083, https://openalex.org/W2131146434, https://openalex.org/W2951359407, https://openalex.org/W4376623224, https://openalex.org/W4387849428, https://openalex.org/W3204019825 |
| cited_by_count | 3 |
| counts_by_year[0].year | 2025 |
| counts_by_year[0].cited_by_count | 3 |
| locations_count | 1 |
| best_oa_location.id | doi:10.20944/preprints202505.1069.v1 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S6309402219 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | Preprints.org |
| best_oa_location.source.host_organization | |
| best_oa_location.source.host_organization_name | |
| best_oa_location.source.host_organization_lineage | https://openalex.org/P4310310987 |
| best_oa_location.source.host_organization_lineage_names | Multidisciplinary Digital Publishing Institute |
| best_oa_location.license | cc-by |
| best_oa_location.pdf_url | https://www.preprints.org/frontend/manuscript/704c0a8609ba0d3f43929efebfa5c2b5/download_pub |
| best_oa_location.version | acceptedVersion |
| best_oa_location.raw_type | posted-content |
| best_oa_location.license_id | https://openalex.org/licenses/cc-by |
| best_oa_location.is_accepted | True |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | https://doi.org/10.20944/preprints202505.1069.v1 |
| primary_location.id | doi:10.20944/preprints202505.1069.v1 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S6309402219 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | Preprints.org |
| primary_location.source.host_organization | |
| primary_location.source.host_organization_name | |
| primary_location.source.host_organization_lineage | https://openalex.org/P4310310987 |
| primary_location.source.host_organization_lineage_names | Multidisciplinary Digital Publishing Institute |
| primary_location.license | cc-by |
| primary_location.pdf_url | https://www.preprints.org/frontend/manuscript/704c0a8609ba0d3f43929efebfa5c2b5/download_pub |
| primary_location.version | acceptedVersion |
| primary_location.raw_type | posted-content |
| primary_location.license_id | https://openalex.org/licenses/cc-by |
| primary_location.is_accepted | True |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | https://doi.org/10.20944/preprints202505.1069.v1 |
| publication_date | 2025-05-14 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.A | 21, 98 |
| abstract_inverted_index.a | 38, 51 |
| abstract_inverted_index.F1 | 132 |
| abstract_inverted_index.In | 134 |
| abstract_inverted_index.On | 60 |
| abstract_inverted_index.as | 67 |
| abstract_inverted_index.in | 13, 127, 156, 170 |
| abstract_inverted_index.is | 24, 101 |
| abstract_inverted_index.it | 136 |
| abstract_inverted_index.of | 5 |
| abstract_inverted_index.on | 18, 93 |
| abstract_inverted_index.to | 44, 84 |
| abstract_inverted_index.NER | 96 |
| abstract_inverted_index.The | 35, 116, 149 |
| abstract_inverted_index.and | 10, 49, 71, 88, 112, 131, 140, 152, 163 |
| abstract_inverted_index.are | 74, 91 |
| abstract_inverted_index.for | 55, 166 |
| abstract_inverted_index.the | 3, 28, 81, 94, 120, 124 |
| abstract_inverted_index.This | 0, 79 |
| abstract_inverted_index.long | 40 |
| abstract_inverted_index.show | 118 |
| abstract_inverted_index.such | 66 |
| abstract_inverted_index.text | 172 |
| abstract_inverted_index.that | 26, 119 |
| abstract_inverted_index.this | 61, 157 |
| abstract_inverted_index.user | 68 |
| abstract_inverted_index.uses | 37 |
| abstract_inverted_index.when | 143 |
| abstract_inverted_index.with | 31, 145 |
| abstract_inverted_index.word | 105 |
| abstract_inverted_index.field | 54 |
| abstract_inverted_index.input | 113 |
| abstract_inverted_index.model | 30, 150 |
| abstract_inverted_index.named | 14, 167 |
| abstract_inverted_index.noise | 11 |
| abstract_inverted_index.offer | 159 |
| abstract_inverted_index.paper | 158 |
| abstract_inverted_index.study | 1 |
| abstract_inverted_index.tasks | 17 |
| abstract_inverted_index.topic | 72 |
| abstract_inverted_index.across | 103 |
| abstract_inverted_index.basis, | 62 |
| abstract_inverted_index.entity | 15, 86, 168 |
| abstract_inverted_index.fusion | 110 |
| abstract_inverted_index.labels | 73 |
| abstract_inverted_index.memory | 42 |
| abstract_inverted_index.method | 23, 36, 122 |
| abstract_inverted_index.models | 126 |
| abstract_inverted_index.random | 53 |
| abstract_inverted_index.score. | 133 |
| abstract_inverted_index.social | 19, 33, 63, 147, 171 |
| abstract_inverted_index.texts. | 20, 148 |
| abstract_inverted_index.Twitter | 95 |
| abstract_inverted_index.ability | 83, 142 |
| abstract_inverted_index.applies | 50 |
| abstract_inverted_index.dealing | 144 |
| abstract_inverted_index.extract | 45 |
| abstract_inverted_index.feature | 77, 109 |
| abstract_inverted_index.model's | 82 |
| abstract_inverted_index.network | 43 |
| abstract_inverted_index.optimal | 57 |
| abstract_inverted_index.recall, | 130 |
| abstract_inverted_index.results | 117 |
| abstract_inverted_index.support | 162 |
| abstract_inverted_index.through | 76 |
| abstract_inverted_index.analysis | 154 |
| abstract_inverted_index.baseline | 125 |
| abstract_inverted_index.dataset. | 97 |
| abstract_inverted_index.enhances | 80 |
| abstract_inverted_index.features | 65 |
| abstract_inverted_index.globally | 56 |
| abstract_inverted_index.lengths. | 115 |
| abstract_inverted_index.proposed | 25, 121 |
| abstract_inverted_index.semantic | 8, 47, 64 |
| abstract_inverted_index.sequence | 58, 114 |
| abstract_inverted_index.stronger | 138 |
| abstract_inverted_index.accuracy, | 128 |
| abstract_inverted_index.addresses | 2 |
| abstract_inverted_index.conducted | 92 |
| abstract_inverted_index.different | 104 |
| abstract_inverted_index.effective | 160 |
| abstract_inverted_index.embedding | 106 |
| abstract_inverted_index.features. | 34 |
| abstract_inverted_index.framework | 151 |
| abstract_inverted_index.labeling. | 59 |
| abstract_inverted_index.performed | 102 |
| abstract_inverted_index.presented | 155 |
| abstract_inverted_index.reference | 165 |
| abstract_inverted_index.relations | 70 |
| abstract_inverted_index.settings, | 111 |
| abstract_inverted_index.technical | 161 |
| abstract_inverted_index.BiLSTM-CRF | 29 |
| abstract_inverted_index.ambiguity, | 9 |
| abstract_inverted_index.boundaries | 87 |
| abstract_inverted_index.challenges | 4 |
| abstract_inverted_index.comparison | 100 |
| abstract_inverted_index.contextual | 46 |
| abstract_inverted_index.integrates | 27 |
| abstract_inverted_index.precision, | 129 |
| abstract_inverted_index.robustness | 139 |
| abstract_inverted_index.short-term | 41 |
| abstract_inverted_index.systematic | 99 |
| abstract_inverted_index.Experiments | 90 |
| abstract_inverted_index.categories. | 89 |
| abstract_inverted_index.conditional | 52 |
| abstract_inverted_index.distinguish | 85 |
| abstract_inverted_index.information | 48 |
| abstract_inverted_index.interaction | 69 |
| abstract_inverted_index.outperforms | 123 |
| abstract_inverted_index.particular, | 135 |
| abstract_inverted_index.recognition | 16, 22, 141, 169 |
| abstract_inverted_index.strategies, | 107 |
| abstract_inverted_index.demonstrates | 137 |
| abstract_inverted_index.experimental | 153 |
| abstract_inverted_index.expressions, | 7 |
| abstract_inverted_index.incorporated | 75 |
| abstract_inverted_index.interference | 12 |
| abstract_inverted_index.multi-source | 32, 108 |
| abstract_inverted_index.non-standard | 146 |
| abstract_inverted_index.unstructured | 6 |
| abstract_inverted_index.bidirectional | 39 |
| abstract_inverted_index.environments. | 173 |
| abstract_inverted_index.concatenation. | 78 |
| abstract_inverted_index.methodological | 164 |
| cited_by_percentile_year.max | 97 |
| cited_by_percentile_year.min | 96 |
| countries_distinct_count | 0 |
| institutions_distinct_count | 6 |
| citation_normalized_percentile.value | 0.97697924 |
| citation_normalized_percentile.is_in_top_1_percent | True |
| citation_normalized_percentile.is_in_top_10_percent | True |
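
The `abstract_inverted_index` rows in the payload above store the abstract as a map from each word to the positions where it occurs. A minimal sketch of turning such a map back into running text follows. The helper name `rebuild_abstract` is hypothetical, and the sample dictionary is a truncated excerpt: it covers only the first six words, and the position lists for "This" and "the" are cut down to their first entry.

```python
from typing import Dict, List


def rebuild_abstract(inverted_index: Dict[str, List[int]]) -> str:
    """Place every word at each of its positions, then join in position order."""
    by_position = {}
    for word, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = word
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Truncated excerpt of the payload rows (first six word positions only).
sample_index = {
    "This": [0],
    "study": [1],
    "addresses": [2],
    "the": [3],
    "challenges": [4],
    "of": [5],
}

opening = rebuild_abstract(sample_index)
# opening == "This study addresses the challenges of"
```

Passing the complete index instead of the excerpt would reproduce the full abstract, since each of the positions 0 through 173 appears exactly once across the rows.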