A Statistical Language Model for Pre-Trained Sequence Labeling: A Case Study on Vietnamese
2021 · Open Access · DOI: https://doi.org/10.1145/3483524
In this article, we define a computable word segmentation unit, study its probability characteristics, and on that basis establish an unsupervised statistical language model (SLM) for a new pre-trained sequence labeling framework. The proposed SLM is an optimization model whose objective is to maximize the total binding force of all candidate word segmentation units in a sentence, without relying on annotated datasets or vocabularies. To solve SLM, we design a recursive divide-and-conquer dynamic programming algorithm. We integrate SLM with popular sequence labeling models and run Vietnamese word segmentation, part-of-speech tagging, and named entity recognition experiments. The results show that SLM effectively improves performance on sequence labeling tasks. Using less than 10% of the training data and no dictionary, our sequence labeling framework outperforms the state-of-the-art Vietnamese word segmentation toolkit VnCoreNLP on the cross-dataset test. SLM has no hyperparameters to tune, is completely unsupervised, and is applicable to any other analytic language, so it has good domain adaptability.
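The abstract describes the optimization only at a high level: choose the segmentation whose units have the largest total binding force, and solve it with a recursive dynamic program. The following is a minimal Python sketch of that general idea, not the authors' algorithm. The abstract does not define the binding force, so `make_score` below is a hypothetical stand-in (a simple co-occurrence ratio), and the names `train_counts`, `make_score`, `segment`, and the `max_len` limit are all assumptions made for illustration.

```python
from collections import Counter
from functools import lru_cache


def train_counts(corpus, max_len=2):
    """Count every run of 1..max_len consecutive syllables in unlabeled text."""
    counts = Counter()
    for sentence in corpus:
        syl = sentence.split()
        for i in range(len(syl)):
            for j in range(i + 1, min(i + max_len, len(syl)) + 1):
                counts[tuple(syl[i:j])] += 1
    return counts


def make_score(counts):
    """Hypothetical score standing in for the paper's binding force."""
    def score(unit):
        if len(unit) == 1:
            return 0.0  # a lone syllable neither gains nor loses
        joint = counts.get(unit, 0)
        if joint == 0:
            return float("-inf")  # never propose an unseen unit
        # Co-occurrence count relative to the rarest syllable in the unit.
        rarest = max(1, min(counts[(s,)] for s in unit))
        return (len(unit) - 1) * joint / rarest
    return score


def segment(syllables, score, max_len=2):
    """Return the segmentation that maximizes the summed score of its units."""
    syllables = tuple(syllables)

    @lru_cache(maxsize=None)
    def best(start):
        # Best (total score, units) for the suffix beginning at `start`.
        if start == len(syllables):
            return 0.0, ()
        candidates = []
        for end in range(start + 1, min(start + max_len, len(syllables)) + 1):
            unit = syllables[start:end]
            tail_score, tail_units = best(end)
            candidates.append((score(unit) + tail_score, (unit,) + tail_units))
        return max(candidates, key=lambda c: c[0])

    return [" ".join(unit) for unit in best(0)[1]]


# Toy unlabeled corpus; no dictionary or annotation is used.
corpus = ["học sinh đi học", "học sinh học bài", "sinh viên đi học"]
score = make_score(train_counts(corpus))
print(segment("học sinh đi học".split(), score))
```

Memoizing `best` means each suffix is solved once, so the run time grows linearly with sentence length for a fixed `max_len`. Swapping in a different `score` function changes the model without touching the search.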
- Type: article
- Language: en
- Landing Page: https://doi.org/10.1145/3483524, https://dl.acm.org/doi/pdf/10.1145/3483524
- OA Status: bronze
- References: 38
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4200144452
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4200144452 (Canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.1145/3483524 (Digital Object Identifier)
- Title: A Statistical Language Model for Pre-Trained Sequence Labeling: A Case Study on Vietnamese (Work title)
- Type: article (OpenAlex work type)
- Language: en (Primary language)
- Publication year: 2021 (Year of publication)
- Publication date: 2021-12-13 (Full publication date if available)
- Authors: Xianwen Liao, Yongzhong Huang, Peng Yang, Lei Chen (List of authors in order)
- Landing page: https://doi.org/10.1145/3483524 (Publisher landing page)
- PDF URL: https://dl.acm.org/doi/pdf/10.1145/3483524 (Direct link to full text PDF)
- Open access: Yes (Whether a free full text is available)
- OA status: bronze (Open access status per OpenAlex)
- OA URL: https://dl.acm.org/doi/pdf/10.1145/3483524 (Direct OA link when available)
- Concepts: Sequence labeling, Computer science, Word (group theory), Vietnamese, Segmentation, Artificial intelligence, Language model, Sequence (biology), Text segmentation, Natural language processing, Speech recognition, Pattern recognition (psychology), Mathematics, Linguistics, Philosophy, Genetics, Task (project management), Geometry, Management, Biology, Economics (Top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 0 (Total citation count in OpenAlex)
- References (count): 38 (Number of works referenced by this work)
- Related works (count): 10 (Other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4200144452 |
|---|---|
| doi | https://doi.org/10.1145/3483524 |
| ids.doi | https://doi.org/10.1145/3483524 |
| ids.openalex | https://openalex.org/W4200144452 |
| fwci | 0.0 |
| type | article |
| title | A Statistical Language Model for Pre-Trained Sequence Labeling: A Case Study on Vietnamese |
| awards[0].id | https://openalex.org/G4689664160 |
| awards[0].funder_id | https://openalex.org/F4320321001 |
| awards[0].display_name | |
| awards[0].funder_award_id | 61066008 and 61862011 |
| awards[0].funder_display_name | National Natural Science Foundation of China |
| biblio.issue | 3 |
| biblio.volume | 21 |
| biblio.last_page | 21 |
| biblio.first_page | 1 |
| topics[0].id | https://openalex.org/T10181 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 1.0 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Natural Language Processing Techniques |
| topics[1].id | https://openalex.org/T10028 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 1.0 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Topic Modeling |
| topics[2].id | https://openalex.org/T11714 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9939000010490417 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1707 |
| topics[2].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[2].display_name | Multimodal Machine Learning Applications |
| funders[0].id | https://openalex.org/F4320321001 |
| funders[0].ror | https://ror.org/01h0zpd94 |
| funders[0].display_name | National Natural Science Foundation of China |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C35639132 |
| concepts[0].level | 3 |
| concepts[0].score | 0.8636274337768555 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q7452468 |
| concepts[0].display_name | Sequence labeling |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.8277449607849121 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C90805587 |
| concepts[2].level | 2 |
| concepts[2].score | 0.669354259967804 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q10944557 |
| concepts[2].display_name | Word (group theory) |
| concepts[3].id | https://openalex.org/C103621254 |
| concepts[3].level | 2 |
| concepts[3].score | 0.6512935757637024 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q9199 |
| concepts[3].display_name | Vietnamese |
| concepts[4].id | https://openalex.org/C89600930 |
| concepts[4].level | 2 |
| concepts[4].score | 0.6453977823257446 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q1423946 |
| concepts[4].display_name | Segmentation |
| concepts[5].id | https://openalex.org/C154945302 |
| concepts[5].level | 1 |
| concepts[5].score | 0.6402838230133057 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[5].display_name | Artificial intelligence |
| concepts[6].id | https://openalex.org/C137293760 |
| concepts[6].level | 2 |
| concepts[6].score | 0.6336132884025574 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q3621696 |
| concepts[6].display_name | Language model |
| concepts[7].id | https://openalex.org/C2778112365 |
| concepts[7].level | 2 |
| concepts[7].score | 0.5939465165138245 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q3511065 |
| concepts[7].display_name | Sequence (biology) |
| concepts[8].id | https://openalex.org/C98501671 |
| concepts[8].level | 3 |
| concepts[8].score | 0.5462827086448669 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q1948408 |
| concepts[8].display_name | Text segmentation |
| concepts[9].id | https://openalex.org/C204321447 |
| concepts[9].level | 1 |
| concepts[9].score | 0.5406365990638733 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q30642 |
| concepts[9].display_name | Natural language processing |
| concepts[10].id | https://openalex.org/C28490314 |
| concepts[10].level | 1 |
| concepts[10].score | 0.46471866965293884 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q189436 |
| concepts[10].display_name | Speech recognition |
| concepts[11].id | https://openalex.org/C153180895 |
| concepts[11].level | 2 |
| concepts[11].score | 0.3334161043167114 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q7148389 |
| concepts[11].display_name | Pattern recognition (psychology) |
| concepts[12].id | https://openalex.org/C33923547 |
| concepts[12].level | 0 |
| concepts[12].score | 0.10607311129570007 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[12].display_name | Mathematics |
| concepts[13].id | https://openalex.org/C41895202 |
| concepts[13].level | 1 |
| concepts[13].score | 0.0 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q8162 |
| concepts[13].display_name | Linguistics |
| concepts[14].id | https://openalex.org/C138885662 |
| concepts[14].level | 0 |
| concepts[14].score | 0.0 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q5891 |
| concepts[14].display_name | Philosophy |
| concepts[15].id | https://openalex.org/C54355233 |
| concepts[15].level | 1 |
| concepts[15].score | 0.0 |
| concepts[15].wikidata | https://www.wikidata.org/wiki/Q7162 |
| concepts[15].display_name | Genetics |
| concepts[16].id | https://openalex.org/C2780451532 |
| concepts[16].level | 2 |
| concepts[16].score | 0.0 |
| concepts[16].wikidata | https://www.wikidata.org/wiki/Q759676 |
| concepts[16].display_name | Task (project management) |
| concepts[17].id | https://openalex.org/C2524010 |
| concepts[17].level | 1 |
| concepts[17].score | 0.0 |
| concepts[17].wikidata | https://www.wikidata.org/wiki/Q8087 |
| concepts[17].display_name | Geometry |
| concepts[18].id | https://openalex.org/C187736073 |
| concepts[18].level | 1 |
| concepts[18].score | 0.0 |
| concepts[18].wikidata | https://www.wikidata.org/wiki/Q2920921 |
| concepts[18].display_name | Management |
| concepts[19].id | https://openalex.org/C86803240 |
| concepts[19].level | 0 |
| concepts[19].score | 0.0 |
| concepts[19].wikidata | https://www.wikidata.org/wiki/Q420 |
| concepts[19].display_name | Biology |
| concepts[20].id | https://openalex.org/C162324750 |
| concepts[20].level | 0 |
| concepts[20].score | 0.0 |
| concepts[20].wikidata | https://www.wikidata.org/wiki/Q8134 |
| concepts[20].display_name | Economics |
| keywords[0].id | https://openalex.org/keywords/sequence-labeling |
| keywords[0].score | 0.8636274337768555 |
| keywords[0].display_name | Sequence labeling |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.8277449607849121 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/word |
| keywords[2].score | 0.669354259967804 |
| keywords[2].display_name | Word (group theory) |
| keywords[3].id | https://openalex.org/keywords/vietnamese |
| keywords[3].score | 0.6512935757637024 |
| keywords[3].display_name | Vietnamese |
| keywords[4].id | https://openalex.org/keywords/segmentation |
| keywords[4].score | 0.6453977823257446 |
| keywords[4].display_name | Segmentation |
| keywords[5].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[5].score | 0.6402838230133057 |
| keywords[5].display_name | Artificial intelligence |
| keywords[6].id | https://openalex.org/keywords/language-model |
| keywords[6].score | 0.6336132884025574 |
| keywords[6].display_name | Language model |
| keywords[7].id | https://openalex.org/keywords/sequence |
| keywords[7].score | 0.5939465165138245 |
| keywords[7].display_name | Sequence (biology) |
| keywords[8].id | https://openalex.org/keywords/text-segmentation |
| keywords[8].score | 0.5462827086448669 |
| keywords[8].display_name | Text segmentation |
| keywords[9].id | https://openalex.org/keywords/natural-language-processing |
| keywords[9].score | 0.5406365990638733 |
| keywords[9].display_name | Natural language processing |
| keywords[10].id | https://openalex.org/keywords/speech-recognition |
| keywords[10].score | 0.46471866965293884 |
| keywords[10].display_name | Speech recognition |
| keywords[11].id | https://openalex.org/keywords/pattern-recognition |
| keywords[11].score | 0.3334161043167114 |
| keywords[11].display_name | Pattern recognition (psychology) |
| keywords[12].id | https://openalex.org/keywords/mathematics |
| keywords[12].score | 0.10607311129570007 |
| keywords[12].display_name | Mathematics |
| language | en |
| locations[0].id | doi:10.1145/3483524 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306421405 |
| locations[0].source.issn | 2375-4699, 2375-4702 |
| locations[0].source.type | journal |
| locations[0].source.is_oa | False |
| locations[0].source.issn_l | 2375-4699 |
| locations[0].source.is_core | True |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | ACM Transactions on Asian and Low-Resource Language Information Processing |
| locations[0].source.host_organization | https://openalex.org/P4310319798 |
| locations[0].source.host_organization_name | Association for Computing Machinery |
| locations[0].source.host_organization_lineage | https://openalex.org/P4310319798 |
| locations[0].source.host_organization_lineage_names | Association for Computing Machinery |
| locations[0].license | |
| locations[0].pdf_url | https://dl.acm.org/doi/pdf/10.1145/3483524 |
| locations[0].version | publishedVersion |
| locations[0].raw_type | journal-article |
| locations[0].license_id | |
| locations[0].is_accepted | True |
| locations[0].is_published | True |
| locations[0].raw_source_name | ACM Transactions on Asian and Low-Resource Language Information Processing |
| locations[0].landing_page_url | https://doi.org/10.1145/3483524 |
| indexed_in | crossref |
| authorships[0].author.id | https://openalex.org/A5084117790 |
| authorships[0].author.orcid | https://orcid.org/0000-0001-5755-1944 |
| authorships[0].author.display_name | Xianwen Liao |
| authorships[0].countries | CN |
| authorships[0].affiliations[0].institution_ids | https://openalex.org/I5343935 |
| authorships[0].affiliations[0].raw_affiliation_string | Guilin University of Electronic Technology, Guilin, Guangxi, China |
| authorships[0].institutions[0].id | https://openalex.org/I5343935 |
| authorships[0].institutions[0].ror | https://ror.org/05arjae42 |
| authorships[0].institutions[0].type | education |
| authorships[0].institutions[0].lineage | https://openalex.org/I5343935 |
| authorships[0].institutions[0].country_code | CN |
| authorships[0].institutions[0].display_name | Guilin University of Electronic Technology |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Xianwen Liao |
| authorships[0].is_corresponding | False |
| authorships[0].raw_affiliation_strings | Guilin University of Electronic Technology, Guilin, Guangxi, China |
| authorships[1].author.id | https://openalex.org/A5101669174 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-1620-3107 |
| authorships[1].author.display_name | Yongzhong Huang |
| authorships[1].countries | CN |
| authorships[1].affiliations[0].institution_ids | https://openalex.org/I5343935 |
| authorships[1].affiliations[0].raw_affiliation_string | Guilin University of Electronic Technology, Guilin, Guangxi, China |
| authorships[1].institutions[0].id | https://openalex.org/I5343935 |
| authorships[1].institutions[0].ror | https://ror.org/05arjae42 |
| authorships[1].institutions[0].type | education |
| authorships[1].institutions[0].lineage | https://openalex.org/I5343935 |
| authorships[1].institutions[0].country_code | CN |
| authorships[1].institutions[0].display_name | Guilin University of Electronic Technology |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Yongzhong Huang |
| authorships[1].is_corresponding | False |
| authorships[1].raw_affiliation_strings | Guilin University of Electronic Technology, Guilin, Guangxi, China |
| authorships[2].author.id | https://openalex.org/A5025314679 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-1184-8117 |
| authorships[2].author.display_name | Peng Yang |
| authorships[2].countries | CN |
| authorships[2].affiliations[0].institution_ids | https://openalex.org/I5343935 |
| authorships[2].affiliations[0].raw_affiliation_string | Guilin University of Electronic Technology, Guilin, Guangxi, China |
| authorships[2].institutions[0].id | https://openalex.org/I5343935 |
| authorships[2].institutions[0].ror | https://ror.org/05arjae42 |
| authorships[2].institutions[0].type | education |
| authorships[2].institutions[0].lineage | https://openalex.org/I5343935 |
| authorships[2].institutions[0].country_code | CN |
| authorships[2].institutions[0].display_name | Guilin University of Electronic Technology |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Peng Yang |
| authorships[2].is_corresponding | False |
| authorships[2].raw_affiliation_strings | Guilin University of Electronic Technology, Guilin, Guangxi, China |
| authorships[3].author.id | https://openalex.org/A5100333436 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-0625-4808 |
| authorships[3].author.display_name | Lei Chen |
| authorships[3].countries | CN |
| authorships[3].affiliations[0].institution_ids | https://openalex.org/I5343935 |
| authorships[3].affiliations[0].raw_affiliation_string | Guilin University of Electronic Technology, Guilin, Guangxi, China |
| authorships[3].institutions[0].id | https://openalex.org/I5343935 |
| authorships[3].institutions[0].ror | https://ror.org/05arjae42 |
| authorships[3].institutions[0].type | education |
| authorships[3].institutions[0].lineage | https://openalex.org/I5343935 |
| authorships[3].institutions[0].country_code | CN |
| authorships[3].institutions[0].display_name | Guilin University of Electronic Technology |
| authorships[3].author_position | last |
| authorships[3].raw_author_name | Lei Chen |
| authorships[3].is_corresponding | False |
| authorships[3].raw_affiliation_strings | Guilin University of Electronic Technology, Guilin, Guangxi, China |
| has_content.pdf | True |
| has_content.grobid_xml | True |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://dl.acm.org/doi/pdf/10.1145/3483524 |
| open_access.oa_status | bronze |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | A Statistical Language Model for Pre-Trained Sequence Labeling: A Case Study on Vietnamese |
| has_fulltext | True |
| is_retracted | False |
| updated_date | 2025-11-06T03:46:38.306776 |
| primary_topic.id | https://openalex.org/T10181 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 1.0 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Natural Language Processing Techniques |
| related_works | https://openalex.org/W2946128423, https://openalex.org/W2944691285, https://openalex.org/W2918555272, https://openalex.org/W2393940967, https://openalex.org/W2159591557, https://openalex.org/W2346578824, https://openalex.org/W2366925922, https://openalex.org/W2115592387, https://openalex.org/W2905950556, https://openalex.org/W2385598138 |
| cited_by_count | 0 |
| locations_count | 1 |
| best_oa_location.id | doi:10.1145/3483524 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306421405 |
| best_oa_location.source.issn | 2375-4699, 2375-4702 |
| best_oa_location.source.type | journal |
| best_oa_location.source.is_oa | False |
| best_oa_location.source.issn_l | 2375-4699 |
| best_oa_location.source.is_core | True |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | ACM Transactions on Asian and Low-Resource Language Information Processing |
| best_oa_location.source.host_organization | https://openalex.org/P4310319798 |
| best_oa_location.source.host_organization_name | Association for Computing Machinery |
| best_oa_location.source.host_organization_lineage | https://openalex.org/P4310319798 |
| best_oa_location.source.host_organization_lineage_names | Association for Computing Machinery |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://dl.acm.org/doi/pdf/10.1145/3483524 |
| best_oa_location.version | publishedVersion |
| best_oa_location.raw_type | journal-article |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | True |
| best_oa_location.is_published | True |
| best_oa_location.raw_source_name | ACM Transactions on Asian and Low-Resource Language Information Processing |
| best_oa_location.landing_page_url | https://doi.org/10.1145/3483524 |
| primary_location.id | doi:10.1145/3483524 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306421405 |
| primary_location.source.issn | 2375-4699, 2375-4702 |
| primary_location.source.type | journal |
| primary_location.source.is_oa | False |
| primary_location.source.issn_l | 2375-4699 |
| primary_location.source.is_core | True |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | ACM Transactions on Asian and Low-Resource Language Information Processing |
| primary_location.source.host_organization | https://openalex.org/P4310319798 |
| primary_location.source.host_organization_name | Association for Computing Machinery |
| primary_location.source.host_organization_lineage | https://openalex.org/P4310319798 |
| primary_location.source.host_organization_lineage_names | Association for Computing Machinery |
| primary_location.license | |
| primary_location.pdf_url | https://dl.acm.org/doi/pdf/10.1145/3483524 |
| primary_location.version | publishedVersion |
| primary_location.raw_type | journal-article |
| primary_location.license_id | |
| primary_location.is_accepted | True |
| primary_location.is_published | True |
| primary_location.raw_source_name | ACM Transactions on Asian and Low-Resource Language Information Processing |
| primary_location.landing_page_url | https://doi.org/10.1145/3483524 |
| publication_date | 2021-12-13 |
| publication_year | 2021 |
| referenced_works | https://openalex.org/W2998566943, https://openalex.org/W2998448492, https://openalex.org/W2963628345, https://openalex.org/W2964352165, https://openalex.org/W2250739653, https://openalex.org/W2963341956, https://openalex.org/W2564413773, https://openalex.org/W2998153464, https://openalex.org/W2064675550, https://openalex.org/W2997919746, https://openalex.org/W2358307482, https://openalex.org/W2296283641, https://openalex.org/W1757859293, https://openalex.org/W2998635301, https://openalex.org/W2911855264, https://openalex.org/W2962950859, https://openalex.org/W2998017003, https://openalex.org/W2962902328, https://openalex.org/W2971254483, https://openalex.org/W2998320111, https://openalex.org/W2945210837, https://openalex.org/W3169250374, https://openalex.org/W2250539671, https://openalex.org/W2962739339, https://openalex.org/W2997918300, https://openalex.org/W2997792775, https://openalex.org/W3009619352, https://openalex.org/W2996160789, https://openalex.org/W2998456908, https://openalex.org/W2518150831, https://openalex.org/W2782238183, https://openalex.org/W2997124358, https://openalex.org/W6683955732, https://openalex.org/W2964093505, https://openalex.org/W3145501851, https://openalex.org/W3104453603, https://openalex.org/W4365799947, https://openalex.org/W1524281572 |
| referenced_works_count | 38 |
| abstract_inverted_index.a | 21, 69, 123 |
| abstract_inverted_index.By | 0, 75 |
| abstract_inverted_index.To | 64 |
| abstract_inverted_index.an | 14, 34 |
| abstract_inverted_index.be | 151 |
| abstract_inverted_index.in | 27, 53 |
| abstract_inverted_index.is | 33, 40, 132, 155 |
| abstract_inverted_index.it | 154, 166 |
| abstract_inverted_index.no | 59, 148 |
| abstract_inverted_index.of | 47, 58, 108, 117, 127 |
| abstract_inverted_index.on | 142 |
| abstract_inverted_index.to | 41, 150, 160 |
| abstract_inverted_index.we | 12, 67 |
| abstract_inverted_index.10% | 116 |
| abstract_inverted_index.SLM | 32, 77, 102, 146 |
| abstract_inverted_index.The | 30, 96 |
| abstract_inverted_index.all | 48 |
| abstract_inverted_index.and | 7, 37, 62, 89, 120, 153, 158 |
| abstract_inverted_index.any | 161 |
| abstract_inverted_index.are | 94 |
| abstract_inverted_index.can | 103 |
| abstract_inverted_index.for | 20 |
| abstract_inverted_index.has | 147, 167 |
| abstract_inverted_index.its | 9, 38 |
| abstract_inverted_index.new | 22 |
| abstract_inverted_index.our | 101, 128 |
| abstract_inverted_index.the | 2, 43, 56, 79, 106, 125, 135, 143 |
| abstract_inverted_index.Just | 112 |
| abstract_inverted_index.SLM, | 66 |
| abstract_inverted_index.data | 119 |
| abstract_inverted_index.good | 168 |
| abstract_inverted_index.less | 114 |
| abstract_inverted_index.show | 99 |
| abstract_inverted_index.than | 115, 134 |
| abstract_inverted_index.that | 100 |
| abstract_inverted_index.this | 28 |
| abstract_inverted_index.unit | 6 |
| abstract_inverted_index.with | 78 |
| abstract_inverted_index.word | 4, 50, 85, 138 |
| abstract_inverted_index.(SLM) | 19 |
| abstract_inverted_index.Thus, | 165 |
| abstract_inverted_index.force | 46 |
| abstract_inverted_index.model | 18 |
| abstract_inverted_index.named | 90 |
| abstract_inverted_index.other | 162 |
| abstract_inverted_index.solve | 65 |
| abstract_inverted_index.test. | 145 |
| abstract_inverted_index.total | 44 |
| abstract_inverted_index.under | 55 |
| abstract_inverted_index.units | 52 |
| abstract_inverted_index.using | 113, 122 |
| abstract_inverted_index.better | 133 |
| abstract_inverted_index.design | 68 |
| abstract_inverted_index.domain | 169 |
| abstract_inverted_index.entity | 91 |
| abstract_inverted_index.model, | 36 |
| abstract_inverted_index.tasks. | 111 |
| abstract_inverted_index.tuned, | 152 |
| abstract_inverted_index.binding | 45 |
| abstract_inverted_index.dynamic | 72 |
| abstract_inverted_index.models, | 83 |
| abstract_inverted_index.popular | 80 |
| abstract_inverted_index.promote | 105 |
| abstract_inverted_index.results | 98 |
| abstract_inverted_index.tagging | 88 |
| abstract_inverted_index.toolkit | 140 |
| abstract_inverted_index.without | 121 |
| abstract_inverted_index.analytic | 163 |
| abstract_inverted_index.article. | 29 |
| abstract_inverted_index.datasets | 61 |
| abstract_inverted_index.defining | 1 |
| abstract_inverted_index.labeling | 25, 82, 110, 130 |
| abstract_inverted_index.language | 17 |
| abstract_inverted_index.maximize | 42 |
| abstract_inverted_index.proposed | 31 |
| abstract_inverted_index.sequence | 24, 81, 109, 129 |
| abstract_inverted_index.studying | 8 |
| abstract_inverted_index.training | 118 |
| abstract_inverted_index.VnCoreNLP | 141 |
| abstract_inverted_index.annotated | 60 |
| abstract_inverted_index.candidate | 49 |
| abstract_inverted_index.condition | 57 |
| abstract_inverted_index.establish | 13 |
| abstract_inverted_index.framework | 26, 131 |
| abstract_inverted_index.language. | 164 |
| abstract_inverted_index.objective | 39 |
| abstract_inverted_index.recursive | 70 |
| abstract_inverted_index.sentences | 54 |
| abstract_inverted_index.Vietnamese | 84, 137 |
| abstract_inverted_index.algorithm. | 74 |
| abstract_inverted_index.applicable | 159 |
| abstract_inverted_index.completely | 156 |
| abstract_inverted_index.computable | 3 |
| abstract_inverted_index.performed. | 95 |
| abstract_inverted_index.dictionary, | 124 |
| abstract_inverted_index.effectively | 104 |
| abstract_inverted_index.experiments | 93 |
| abstract_inverted_index.integrating | 76 |
| abstract_inverted_index.performance | 107, 126 |
| abstract_inverted_index.pre-trained | 23 |
| abstract_inverted_index.probability | 10 |
| abstract_inverted_index.programming | 73 |
| abstract_inverted_index.recognition | 92 |
| abstract_inverted_index.statistical | 16 |
| abstract_inverted_index.experimental | 97 |
| abstract_inverted_index.optimization | 35 |
| abstract_inverted_index.segmentation | 5, 51, 139 |
| abstract_inverted_index.unsupervised | 15, 157 |
| abstract_inverted_index.adaptability. | 170 |
| abstract_inverted_index.cross-dataset | 144 |
| abstract_inverted_index.segmentation, | 86 |
| abstract_inverted_index.vocabularies. | 63 |
| abstract_inverted_index.part-of-speech | 87 |
| abstract_inverted_index.hyper-parameter | 149 |
| abstract_inverted_index.characteristics, | 11 |
| abstract_inverted_index.state-of-the-art | 136 |
| abstract_inverted_index.divide-and-conquer | 71 |
| cited_by_percentile_year | |
| countries_distinct_count | 1 |
| institutions_distinct_count | 4 |
| sustainable_development_goals[0].id | https://metadata.un.org/sdg/4 |
| sustainable_development_goals[0].score | 0.7799999713897705 |
| sustainable_development_goals[0].display_name | Quality Education |
| citation_normalized_percentile.value | 0.18565527 |
| citation_normalized_percentile.is_in_top_1_percent | False |
| citation_normalized_percentile.is_in_top_10_percent | False |
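
The `abstract_inverted_index.*` rows in the payload store the abstract as a map from each token to the word positions where it occurs. A minimal Python sketch for turning such a map back into running text follows; the function name `rebuild_abstract` is an assumption for illustration, and the sample holds only the first seven positions copied from the rows above.

```python
def rebuild_abstract(inverted_index):
    """Turn {token: [positions]} back into the token sequence it encodes."""
    by_position = {}
    for token, positions in inverted_index.items():
        for position in positions:
            by_position[position] = token
    # Emit tokens in position order; missing positions are simply skipped.
    return " ".join(by_position[p] for p in sorted(by_position))


# Positions 0-6 from the payload; later occurrences of each token are omitted.
sample = {
    "By": [0],
    "defining": [1],
    "the": [2],
    "computable": [3],
    "word": [4],
    "segmentation": [5],
    "unit": [6],
}
print(rebuild_abstract(sample))
```

Because punctuation is attached to tokens in the index (for example `SLM,` and `test.`), joining on single spaces is enough to recover readable text.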