A Comparative Study of Lexical Substitution Approaches based on Neural Language Models
2020 · Open Access · DOI: https://doi.org/10.48550/arxiv.2006.00031
Lexical substitution in context is an extremely powerful technology that can be used as a backbone of various NLP applications, such as word sense induction, lexical relation extraction, data augmentation, etc. In this paper, we present a large-scale comparative study of popular neural language and masked language models (LMs and MLMs), such as context2vec, ELMo, BERT, XLNet, applied to the task of lexical substitution. We show that already competitive results achieved by SOTA LMs/MLMs can be further improved if information about the target word is injected properly, and compare several target injection methods. In addition, we provide analysis of the types of semantic relations between the target and substitutes generated by different models providing insights into what kind of words are really generated or given by annotators as substitutes.
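The abstract does not say how target-word information is injected, so the following is only a minimal sketch of the general idea, not one of the methods compared in the paper. It assumes a hypothetical context-only distribution over candidate substitutes (as a masked LM might produce) and a hypothetical similarity score between each candidate and the target word, and combines them by a renormalized product. All names and numbers are illustrative assumptions.

```python
from typing import Dict, List


def inject_target(context_probs: Dict[str, float],
                  target_sims: Dict[str, float]) -> Dict[str, float]:
    """Combine context-only probabilities with target-word similarity.

    Illustrative assumption: a plain product of the two scores,
    renormalized to sum to 1. Candidates missing from target_sims
    are treated as having zero similarity.
    """
    scores = {w: p * target_sims.get(w, 0.0) for w, p in context_probs.items()}
    total = sum(scores.values())
    if total == 0:
        # No candidate is similar to the target: fall back to context only.
        return dict(context_probs)
    return {w: s / total for w, s in scores.items()}


def rank(dist: Dict[str, float]) -> List[str]:
    """Return candidates sorted from most to least probable."""
    return sorted(dist, key=dist.get, reverse=True)


# Hypothetical scores for replacing "bright" in "She is a bright student".
context_probs = {"new": 0.5, "good": 0.3, "clever": 0.2}   # context only
target_sims = {"new": 0.1, "good": 0.3, "clever": 0.6}     # similarity to "bright"

combined = inject_target(context_probs, target_sims)
print(rank(context_probs))  # ranking without target information
print(rank(combined))       # ranking after injecting the target
```

In this toy setting the context alone prefers a word that fits the sentence but ignores the meaning of the target, while the combined score promotes the candidate closest to the target.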
Related Topics
- Type
- preprint
- Language
- en
- Landing Page
- http://arxiv.org/abs/2006.00031
- https://arxiv.org/pdf/2006.00031
- OA Status
- green
- Cited By
- 4
- References
- 12
- Related Works
- 19
- OpenAlex ID
- https://openalex.org/W3031959796
Raw OpenAlex JSON
- OpenAlex ID
- https://openalex.org/W3031959796 (canonical identifier for this work in OpenAlex)
- DOI
- https://doi.org/10.48550/arxiv.2006.00031 (Digital Object Identifier)
- Title
- A Comparative Study of Lexical Substitution Approaches based on Neural Language Models (work title)
- Type
- preprint (OpenAlex work type)
- Language
- en (primary language)
- Publication year
- 2020 (year of publication)
- Publication date
- 2020-05-29 (full publication date if available)
- Authors
- Nikolay Arefyev, Boris Sheludko, Alexander Podolskiy, Alexander Panchenko (list of authors in order)
- Landing page
- https://arxiv.org/abs/2006.00031 (publisher landing page)
- PDF URL
- https://arxiv.org/pdf/2006.00031 (direct link to full text PDF)
- Open access
- Yes (whether a free full text is available)
- OA status
- green (open access status per OpenAlex)
- OA URL
- https://arxiv.org/pdf/2006.00031 (direct OA link when available)
- Concepts
- Substitution (logic), Computer science, Natural language processing, Artificial intelligence, Task (project management), Word (group theory), Context (archaeology), Language model, Linguistics, Programming language, Biology, Paleontology, Economics, Management, Philosophy (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by
- 4 (total citation count in OpenAlex)
- Citations by year (recent)
- 2022: 1, 2021: 3 (per-year citation counts, last 5 years)
- References (count)
- 12 (number of works referenced by this work)
- Related works (count)
- 19 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W3031959796 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2006.00031 |
| ids.doi | https://doi.org/10.48550/arxiv.2006.00031 |
| ids.mag | 3031959796 |
| ids.openalex | https://openalex.org/W3031959796 |
| fwci | |
| type | preprint |
| title | A Comparative Study of Lexical Substitution Approaches based on Neural Language Models |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10028 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 1.0 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Topic Modeling |
| topics[1].id | https://openalex.org/T10181 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9998999834060669 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Natural Language Processing Techniques |
| topics[2].id | https://openalex.org/T11714 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9973999857902527 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1707 |
| topics[2].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[2].display_name | Multimodal Machine Learning Applications |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C2778220771 |
| concepts[0].level | 2 |
| concepts[0].score | 0.8227494955062866 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q1522579 |
| concepts[0].display_name | Substitution (logic) |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.8201954364776611 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C204321447 |
| concepts[2].level | 1 |
| concepts[2].score | 0.6997715830802917 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q30642 |
| concepts[2].display_name | Natural language processing |
| concepts[3].id | https://openalex.org/C154945302 |
| concepts[3].level | 1 |
| concepts[3].score | 0.6448139548301697 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[3].display_name | Artificial intelligence |
| concepts[4].id | https://openalex.org/C2780451532 |
| concepts[4].level | 2 |
| concepts[4].score | 0.6051037311553955 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q759676 |
| concepts[4].display_name | Task (project management) |
| concepts[5].id | https://openalex.org/C90805587 |
| concepts[5].level | 2 |
| concepts[5].score | 0.5523218512535095 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q10944557 |
| concepts[5].display_name | Word (group theory) |
| concepts[6].id | https://openalex.org/C2779343474 |
| concepts[6].level | 2 |
| concepts[6].score | 0.5519659519195557 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q3109175 |
| concepts[6].display_name | Context (archaeology) |
| concepts[7].id | https://openalex.org/C137293760 |
| concepts[7].level | 2 |
| concepts[7].score | 0.4513194262981415 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q3621696 |
| concepts[7].display_name | Language model |
| concepts[8].id | https://openalex.org/C41895202 |
| concepts[8].level | 1 |
| concepts[8].score | 0.18097326159477234 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q8162 |
| concepts[8].display_name | Linguistics |
| concepts[9].id | https://openalex.org/C199360897 |
| concepts[9].level | 1 |
| concepts[9].score | 0.05939725041389465 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q9143 |
| concepts[9].display_name | Programming language |
| concepts[10].id | https://openalex.org/C86803240 |
| concepts[10].level | 0 |
| concepts[10].score | 0.0 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q420 |
| concepts[10].display_name | Biology |
| concepts[11].id | https://openalex.org/C151730666 |
| concepts[11].level | 1 |
| concepts[11].score | 0.0 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q7205 |
| concepts[11].display_name | Paleontology |
| concepts[12].id | https://openalex.org/C162324750 |
| concepts[12].level | 0 |
| concepts[12].score | 0.0 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q8134 |
| concepts[12].display_name | Economics |
| concepts[13].id | https://openalex.org/C187736073 |
| concepts[13].level | 1 |
| concepts[13].score | 0.0 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q2920921 |
| concepts[13].display_name | Management |
| concepts[14].id | https://openalex.org/C138885662 |
| concepts[14].level | 0 |
| concepts[14].score | 0.0 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q5891 |
| concepts[14].display_name | Philosophy |
| keywords[0].id | https://openalex.org/keywords/substitution |
| keywords[0].score | 0.8227494955062866 |
| keywords[0].display_name | Substitution (logic) |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.8201954364776611 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/natural-language-processing |
| keywords[2].score | 0.6997715830802917 |
| keywords[2].display_name | Natural language processing |
| keywords[3].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[3].score | 0.6448139548301697 |
| keywords[3].display_name | Artificial intelligence |
| keywords[4].id | https://openalex.org/keywords/task |
| keywords[4].score | 0.6051037311553955 |
| keywords[4].display_name | Task (project management) |
| keywords[5].id | https://openalex.org/keywords/word |
| keywords[5].score | 0.5523218512535095 |
| keywords[5].display_name | Word (group theory) |
| keywords[6].id | https://openalex.org/keywords/context |
| keywords[6].score | 0.5519659519195557 |
| keywords[6].display_name | Context (archaeology) |
| keywords[7].id | https://openalex.org/keywords/language-model |
| keywords[7].score | 0.4513194262981415 |
| keywords[7].display_name | Language model |
| keywords[8].id | https://openalex.org/keywords/linguistics |
| keywords[8].score | 0.18097326159477234 |
| keywords[8].display_name | Linguistics |
| keywords[9].id | https://openalex.org/keywords/programming-language |
| keywords[9].score | 0.05939725041389465 |
| keywords[9].display_name | Programming language |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2006.00031 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2006.00031 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2006.00031 |
| locations[1].id | mag:3031959796 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | submittedVersion |
| locations[1].raw_type | |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | False |
| locations[1].raw_source_name | arXiv (Cornell University) |
| locations[1].landing_page_url | https://aps.arxiv.org/pdf/2006.00031 |
| locations[2].id | doi:10.48550/arxiv.2006.00031 |
| locations[2].is_oa | True |
| locations[2].source.id | https://openalex.org/S4306400194 |
| locations[2].source.issn | |
| locations[2].source.type | repository |
| locations[2].source.is_oa | True |
| locations[2].source.issn_l | |
| locations[2].source.is_core | False |
| locations[2].source.is_in_doaj | False |
| locations[2].source.display_name | arXiv (Cornell University) |
| locations[2].source.host_organization | https://openalex.org/I205783295 |
| locations[2].source.host_organization_name | Cornell University |
| locations[2].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[2].license | cc-by |
| locations[2].pdf_url | |
| locations[2].version | |
| locations[2].raw_type | article |
| locations[2].license_id | https://openalex.org/licenses/cc-by |
| locations[2].is_accepted | False |
| locations[2].is_published | |
| locations[2].raw_source_name | |
| locations[2].landing_page_url | https://doi.org/10.48550/arxiv.2006.00031 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5047490968 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Nikolay Arefyev |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Nikolay Arefyev |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5043031382 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Boris Sheludko |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Boris Sheludko |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5076392792 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-2892-7356 |
| authorships[2].author.display_name | Alexander Podolskiy |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Alexander Podolskiy |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5026157285 |
| authorships[3].author.orcid | https://orcid.org/0000-0001-6097-6118 |
| authorships[3].author.display_name | Alexander Panchenko |
| authorships[3].author_position | last |
| authorships[3].raw_author_name | Alexander Panchenko |
| authorships[3].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2006.00031 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | A Comparative Study of Lexical Substitution Approaches based on Neural Language Models |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10028 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 1.0 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Topic Modeling |
| related_works | https://openalex.org/W107400853, https://openalex.org/W2985963672, https://openalex.org/W2251056711, https://openalex.org/W2953949198, https://openalex.org/W2121337471, https://openalex.org/W2951652751, https://openalex.org/W2997396640, https://openalex.org/W2804677078, https://openalex.org/W2954758923, https://openalex.org/W3115729981, https://openalex.org/W2152799930, https://openalex.org/W2602957597, https://openalex.org/W3165244476, https://openalex.org/W2942688450, https://openalex.org/W2995133078, https://openalex.org/W2164019165, https://openalex.org/W2137638032, https://openalex.org/W3104806203, https://openalex.org/W2913293333 |
| cited_by_count | 4 |
| counts_by_year[0].year | 2022 |
| counts_by_year[0].cited_by_count | 1 |
| counts_by_year[1].year | 2021 |
| counts_by_year[1].cited_by_count | 3 |
| locations_count | 3 |
| best_oa_location.id | pmh:oai:arXiv.org:2006.00031 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2006.00031 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2006.00031 |
| primary_location.id | pmh:oai:arXiv.org:2006.00031 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2006.00031 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2006.00031 |
| publication_date | 2020-05-29 |
| publication_year | 2020 |
| referenced_works | https://openalex.org/W2469565805, https://openalex.org/W2081580037, https://openalex.org/W2953868689, https://openalex.org/W2117805747, https://openalex.org/W2963545917, https://openalex.org/W2950813464, https://openalex.org/W2985963672, https://openalex.org/W2153579005, https://openalex.org/W2803609229, https://openalex.org/W2940152587, https://openalex.org/W2951652751, https://openalex.org/W2896457183 |
| referenced_works_count | 12 |
| abstract_inverted_index.a | 14, 36 |
| abstract_inverted_index.In | 31, 93 |
| abstract_inverted_index.We | 64 |
| abstract_inverted_index.an | 5 |
| abstract_inverted_index.as | 13, 21, 52, 127 |
| abstract_inverted_index.be | 11, 75 |
| abstract_inverted_index.by | 71, 110, 125 |
| abstract_inverted_index.if | 78 |
| abstract_inverted_index.in | 2 |
| abstract_inverted_index.is | 4, 84 |
| abstract_inverted_index.of | 16, 40, 61, 98, 101, 118 |
| abstract_inverted_index.or | 123 |
| abstract_inverted_index.to | 58 |
| abstract_inverted_index.we | 34, 95 |
| abstract_inverted_index.NLP | 18 |
| abstract_inverted_index.and | 44, 49, 87, 107 |
| abstract_inverted_index.are | 120 |
| abstract_inverted_index.can | 10, 74 |
| abstract_inverted_index.the | 59, 81, 99, 105 |
| abstract_inverted_index.(LMs | 48 |
| abstract_inverted_index.SOTA | 72 |
| abstract_inverted_index.data | 28 |
| abstract_inverted_index.etc. | 30 |
| abstract_inverted_index.into | 115 |
| abstract_inverted_index.kind | 117 |
| abstract_inverted_index.show | 65 |
| abstract_inverted_index.such | 20, 51 |
| abstract_inverted_index.task | 60 |
| abstract_inverted_index.that | 9, 66 |
| abstract_inverted_index.this | 32 |
| abstract_inverted_index.used | 12 |
| abstract_inverted_index.what | 116 |
| abstract_inverted_index.word | 22, 83 |
| abstract_inverted_index.BERT, | 55 |
| abstract_inverted_index.ELMo, | 54 |
| abstract_inverted_index.about | 80 |
| abstract_inverted_index.given | 124 |
| abstract_inverted_index.sense | 23 |
| abstract_inverted_index.study | 39 |
| abstract_inverted_index.types | 100 |
| abstract_inverted_index.words | 119 |
| abstract_inverted_index.MLMs), | 50 |
| abstract_inverted_index.XLNet, | 56 |
| abstract_inverted_index.masked | 45 |
| abstract_inverted_index.models | 47, 112 |
| abstract_inverted_index.neural | 42 |
| abstract_inverted_index.paper, | 33 |
| abstract_inverted_index.really | 121 |
| abstract_inverted_index.target | 82, 90, 106 |
| abstract_inverted_index.Lexical | 0 |
| abstract_inverted_index.already | 67 |
| abstract_inverted_index.applied | 57 |
| abstract_inverted_index.between | 104 |
| abstract_inverted_index.compare | 88 |
| abstract_inverted_index.context | 3 |
| abstract_inverted_index.further | 76 |
| abstract_inverted_index.lexical | 25, 62 |
| abstract_inverted_index.popular | 41 |
| abstract_inverted_index.present | 35 |
| abstract_inverted_index.provide | 96 |
| abstract_inverted_index.results | 69 |
| abstract_inverted_index.several | 89 |
| abstract_inverted_index.various | 17 |
| abstract_inverted_index.LMs/MLMs | 73 |
| abstract_inverted_index.achieved | 70 |
| abstract_inverted_index.analysis | 97 |
| abstract_inverted_index.backbone | 15 |
| abstract_inverted_index.improved | 77 |
| abstract_inverted_index.injected | 85 |
| abstract_inverted_index.insights | 114 |
| abstract_inverted_index.language | 43, 46 |
| abstract_inverted_index.methods. | 92 |
| abstract_inverted_index.powerful | 7 |
| abstract_inverted_index.relation | 26 |
| abstract_inverted_index.semantic | 102 |
| abstract_inverted_index.addition, | 94 |
| abstract_inverted_index.different | 111 |
| abstract_inverted_index.extremely | 6 |
| abstract_inverted_index.generated | 109, 122 |
| abstract_inverted_index.injection | 91 |
| abstract_inverted_index.properly, | 86 |
| abstract_inverted_index.providing | 113 |
| abstract_inverted_index.relations | 103 |
| abstract_inverted_index.annotators | 126 |
| abstract_inverted_index.induction, | 24 |
| abstract_inverted_index.technology | 8 |
| abstract_inverted_index.comparative | 38 |
| abstract_inverted_index.competitive | 68 |
| abstract_inverted_index.extraction, | 27 |
| abstract_inverted_index.information | 79 |
| abstract_inverted_index.large-scale | 37 |
| abstract_inverted_index.substitutes | 108 |
| abstract_inverted_index.context2vec, | 53 |
| abstract_inverted_index.substitutes. | 128 |
| abstract_inverted_index.substitution | 1 |
| abstract_inverted_index.applications, | 19 |
| abstract_inverted_index.augmentation, | 29 |
| abstract_inverted_index.substitution. | 63 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 4 |
| sustainable_development_goals[0].id | https://metadata.un.org/sdg/4 |
| sustainable_development_goals[0].score | 0.5299999713897705 |
| sustainable_development_goals[0].display_name | Quality Education |
| citation_normalized_percentile |
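
The `abstract_inverted_index.*` rows in the payload map each word of the abstract to the zero-based positions where it occurs. As a rough sketch, the running text can be rebuilt by inverting that mapping and reading the words back in position order. The function name `rebuild_abstract` and the `sample` dictionary are illustrative assumptions; the sample copies only the first nine positions from the rows above.

```python
from typing import Dict, List


def rebuild_abstract(inverted_index: Dict[str, List[int]]) -> str:
    """Rebuild text from a word -> positions mapping."""
    by_position: Dict[int, str] = {}
    for word, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = word
    # Read the words back in ascending position order.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Positions 0-8 only; in the full record "is" also occurs at position 84.
sample = {
    "Lexical": [0],
    "substitution": [1],
    "in": [2],
    "context": [3],
    "is": [4],
    "an": [5],
    "extremely": [6],
    "powerful": [7],
    "technology": [8],
}

print(rebuild_abstract(sample))
```

Passing the complete index (positions 0 through 128) should reproduce the abstract shown at the top of the record, with punctuation preserved because it is stored as part of each word.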