Aggregating residue-level protein language model embeddings with optimal transport
2024 · Open Access
· DOI: https://doi.org/10.1093/bioadv/vbaf060
Motivation: Protein language models (PLMs) have emerged as powerful approaches for mapping protein sequences into embeddings suitable for various applications. As protein representation schemes, PLMs generate per-token (i.e. per-residue) representations, resulting in variable-sized outputs based on protein length. This variability poses a challenge for protein-level prediction tasks that require uniform-sized embeddings for consistent analysis across different proteins. Previous work has typically used average pooling to summarize token-level PLM outputs, but it is unclear whether this method effectively prioritizes the relevant information across token-level representations.

Results: We introduce a novel method utilizing optimal transport to convert variable-length PLM outputs into fixed-length representations. We conceptualize per-token PLM outputs as samples from a probabilistic distribution and employ sliced-Wasserstein distances to map these samples against a reference set, creating a Euclidean embedding in the output space. The resulting embedding is agnostic to the length of the input and represents the entire protein. We demonstrate the superiority of our method over average pooling for several downstream prediction tasks, particularly with constrained PLM sizes, enabling smaller-scale PLMs to match or exceed the performance of average-pooled larger-scale PLMs. Our aggregation scheme is especially effective for longer protein sequences by capturing essential information that might be lost through average pooling.

Availability and implementation: Our implementation code can be found at https://github.com/navid-naderi/PLM_SWE.
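To make the aggregation idea in the abstract concrete, the following is a minimal NumPy sketch of sliced-Wasserstein-style pooling. It is not the authors' implementation (their code is in the repository linked above). The function name `swe_pool`, the random reference set, the random slicing directions, and the random arrays standing in for per-residue PLM outputs are all assumptions made for illustration; in practice the reference set and directions may well be learned rather than sampled.

```python
import numpy as np


def swe_pool(tokens, reference, directions):
    """Pool a variable-length set of token embeddings into a fixed-length vector.

    tokens:     (n, d) per-residue embeddings, n varies per protein
    reference:  (m, d) reference set (hypothetical; random in this sketch)
    directions: (d, L) slicing directions with unit-norm columns
    returns:    (m * L,) vector whose size does not depend on n
    """
    n, m = tokens.shape[0], reference.shape[0]
    num_slices = directions.shape[1]

    # Project both point sets onto each 1-D slice and sort along the set axis.
    tok_sorted = np.sort(tokens @ directions, axis=0)      # (n, L)
    ref_sorted = np.sort(reference @ directions, axis=0)   # (m, L)

    # Evaluate the empirical quantile function of the tokens at the m
    # quantile levels of the reference (linear interpolation handles n != m).
    tok_levels = (np.arange(n) + 0.5) / n
    ref_levels = (np.arange(m) + 0.5) / m
    tok_quant = np.stack(
        [np.interp(ref_levels, tok_levels, tok_sorted[:, k])
         for k in range(num_slices)],
        axis=1,
    )                                                      # (m, L)

    # Per-slice 1-D transport displacement from the reference to the tokens.
    return (tok_quant - ref_sorted).ravel()


# Usage with random stand-ins for PLM outputs of two proteins.
rng = np.random.default_rng(0)
d, m, L = 8, 16, 32
reference = rng.normal(size=(m, d))
directions = rng.normal(size=(d, L))
directions /= np.linalg.norm(directions, axis=0, keepdims=True)

short_protein = rng.normal(size=(50, d))
long_protein = rng.normal(size=(700, d))
print(swe_pool(short_protein, reference, directions).shape)
print(swe_pool(long_protein, reference, directions).shape)
```

Both calls return vectors of the same size, `m * L`, regardless of sequence length. In this sketch each slice is stored in sorted-reference order, so for two inputs that each contain exactly `m` tokens the mean squared difference between their pooled vectors equals the Monte Carlo estimate of the squared sliced-Wasserstein-2 distance over the chosen directions. Unlike average pooling, which keeps only the mean of the tokens, the pooled vector retains quantile information along every slice.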
- Type: article
- Language: en
- Landing Page: https://doi.org/10.1093/bioadv/vbaf060
- PDF: https://academic.oup.com/bioinformaticsadvances/advance-article-pdf/doi/10.1093/bioadv/vbaf060/62493443/vbaf060.pdf
- OA Status: gold
- Cited By: 3
- References: 35
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4408717486
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4408717486 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.1093/bioadv/vbaf060 (Digital Object Identifier)
- Title: Aggregating residue-level protein language model embeddings with optimal transport (work title)
- Type: article (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2024 (year of publication)
- Publication date: 2024-12-26 (full publication date if available)
- Authors: Navid Naderializadeh, Rohit Singh (list of authors in order)
- Landing page: https://doi.org/10.1093/bioadv/vbaf060 (publisher landing page)
- PDF URL: https://academic.oup.com/bioinformaticsadvances/advance-article-pdf/doi/10.1093/bioadv/vbaf060/62493443/vbaf060.pdf (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: gold (open access status per OpenAlex)
- OA URL: https://academic.oup.com/bioinformaticsadvances/advance-article-pdf/doi/10.1093/bioadv/vbaf060/62493443/vbaf060.pdf (direct OA link when available)
- Concepts: Pooling, Embedding, Computer science, Security token, Variable (mathematics), Representation (politics), Theoretical computer science, Data mining, Artificial intelligence, Mathematics, Political science, Politics, Law, Computer security, Mathematical analysis (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 3 (total citation count in OpenAlex)
- Citations by year (recent): 2025: 3 (per-year citation counts, last 5 years)
- References (count): 35 (number of works referenced by this work)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4408717486 |
|---|---|
| doi | https://doi.org/10.1093/bioadv/vbaf060 |
| ids.doi | https://doi.org/10.1093/bioadv/vbaf060 |
| ids.pmid | https://pubmed.ncbi.nlm.nih.gov/40170888 |
| ids.openalex | https://openalex.org/W4408717486 |
| fwci | 1.44074168 |
| type | article |
| title | Aggregating residue-level protein language model embeddings with optimal transport |
| biblio.issue | 1 |
| biblio.volume | 5 |
| biblio.last_page | vbaf060 |
| biblio.first_page | vbaf060 |
| topics[0].id | https://openalex.org/T12254 |
| topics[0].field.id | https://openalex.org/fields/13 |
| topics[0].field.display_name | Biochemistry, Genetics and Molecular Biology |
| topics[0].score | 0.9991999864578247 |
| topics[0].domain.id | https://openalex.org/domains/1 |
| topics[0].domain.display_name | Life Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1312 |
| topics[0].subfield.display_name | Molecular Biology |
| topics[0].display_name | Machine Learning in Bioinformatics |
| topics[1].id | https://openalex.org/T10015 |
| topics[1].field.id | https://openalex.org/fields/13 |
| topics[1].field.display_name | Biochemistry, Genetics and Molecular Biology |
| topics[1].score | 0.9940999746322632 |
| topics[1].domain.id | https://openalex.org/domains/1 |
| topics[1].domain.display_name | Life Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1312 |
| topics[1].subfield.display_name | Molecular Biology |
| topics[1].display_name | Genomics and Phylogenetic Studies |
| topics[2].id | https://openalex.org/T10521 |
| topics[2].field.id | https://openalex.org/fields/13 |
| topics[2].field.display_name | Biochemistry, Genetics and Molecular Biology |
| topics[2].score | 0.9927999973297119 |
| topics[2].domain.id | https://openalex.org/domains/1 |
| topics[2].domain.display_name | Life Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1312 |
| topics[2].subfield.display_name | Molecular Biology |
| topics[2].display_name | RNA and protein synthesis mechanisms |
| is_xpac | False |
| apc_list.value | 1765 |
| apc_list.currency | GBP |
| apc_list.value_usd | 2164 |
| apc_paid.value | 1765 |
| apc_paid.currency | GBP |
| apc_paid.value_usd | 2164 |
| concepts[0].id | https://openalex.org/C70437156 |
| concepts[0].level | 2 |
| concepts[0].score | 0.7530816793441772 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q7228652 |
| concepts[0].display_name | Pooling |
| concepts[1].id | https://openalex.org/C41608201 |
| concepts[1].level | 2 |
| concepts[1].score | 0.7071640491485596 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q980509 |
| concepts[1].display_name | Embedding |
| concepts[2].id | https://openalex.org/C41008148 |
| concepts[2].level | 0 |
| concepts[2].score | 0.6773527264595032 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[2].display_name | Computer science |
| concepts[3].id | https://openalex.org/C48145219 |
| concepts[3].level | 2 |
| concepts[3].score | 0.5914897918701172 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q1335365 |
| concepts[3].display_name | Security token |
| concepts[4].id | https://openalex.org/C182365436 |
| concepts[4].level | 2 |
| concepts[4].score | 0.4523504674434662 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q50701 |
| concepts[4].display_name | Variable (mathematics) |
| concepts[5].id | https://openalex.org/C2776359362 |
| concepts[5].level | 3 |
| concepts[5].score | 0.42664188146591187 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q2145286 |
| concepts[5].display_name | Representation (politics) |
| concepts[6].id | https://openalex.org/C80444323 |
| concepts[6].level | 1 |
| concepts[6].score | 0.3783003091812134 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q2878974 |
| concepts[6].display_name | Theoretical computer science |
| concepts[7].id | https://openalex.org/C124101348 |
| concepts[7].level | 1 |
| concepts[7].score | 0.35766589641571045 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q172491 |
| concepts[7].display_name | Data mining |
| concepts[8].id | https://openalex.org/C154945302 |
| concepts[8].level | 1 |
| concepts[8].score | 0.30074983835220337 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[8].display_name | Artificial intelligence |
| concepts[9].id | https://openalex.org/C33923547 |
| concepts[9].level | 0 |
| concepts[9].score | 0.1897859275341034 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[9].display_name | Mathematics |
| concepts[10].id | https://openalex.org/C17744445 |
| concepts[10].level | 0 |
| concepts[10].score | 0.0 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q36442 |
| concepts[10].display_name | Political science |
| concepts[11].id | https://openalex.org/C94625758 |
| concepts[11].level | 2 |
| concepts[11].score | 0.0 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q7163 |
| concepts[11].display_name | Politics |
| concepts[12].id | https://openalex.org/C199539241 |
| concepts[12].level | 1 |
| concepts[12].score | 0.0 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q7748 |
| concepts[12].display_name | Law |
| concepts[13].id | https://openalex.org/C38652104 |
| concepts[13].level | 1 |
| concepts[13].score | 0.0 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q3510521 |
| concepts[13].display_name | Computer security |
| concepts[14].id | https://openalex.org/C134306372 |
| concepts[14].level | 1 |
| concepts[14].score | 0.0 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q7754 |
| concepts[14].display_name | Mathematical analysis |
| keywords[0].id | https://openalex.org/keywords/pooling |
| keywords[0].score | 0.7530816793441772 |
| keywords[0].display_name | Pooling |
| keywords[1].id | https://openalex.org/keywords/embedding |
| keywords[1].score | 0.7071640491485596 |
| keywords[1].display_name | Embedding |
| keywords[2].id | https://openalex.org/keywords/computer-science |
| keywords[2].score | 0.6773527264595032 |
| keywords[2].display_name | Computer science |
| keywords[3].id | https://openalex.org/keywords/security-token |
| keywords[3].score | 0.5914897918701172 |
| keywords[3].display_name | Security token |
| keywords[4].id | https://openalex.org/keywords/variable |
| keywords[4].score | 0.4523504674434662 |
| keywords[4].display_name | Variable (mathematics) |
| keywords[5].id | https://openalex.org/keywords/representation |
| keywords[5].score | 0.42664188146591187 |
| keywords[5].display_name | Representation (politics) |
| keywords[6].id | https://openalex.org/keywords/theoretical-computer-science |
| keywords[6].score | 0.3783003091812134 |
| keywords[6].display_name | Theoretical computer science |
| keywords[7].id | https://openalex.org/keywords/data-mining |
| keywords[7].score | 0.35766589641571045 |
| keywords[7].display_name | Data mining |
| keywords[8].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[8].score | 0.30074983835220337 |
| keywords[8].display_name | Artificial intelligence |
| keywords[9].id | https://openalex.org/keywords/mathematics |
| keywords[9].score | 0.1897859275341034 |
| keywords[9].display_name | Mathematics |
| language | en |
| locations[0].id | doi:10.1093/bioadv/vbaf060 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4210234069 |
| locations[0].source.issn | 2635-0041 |
| locations[0].source.type | journal |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | 2635-0041 |
| locations[0].source.is_core | True |
| locations[0].source.is_in_doaj | True |
| locations[0].source.display_name | Bioinformatics Advances |
| locations[0].source.host_organization | https://openalex.org/P4310311648 |
| locations[0].source.host_organization_name | Oxford University Press |
| locations[0].source.host_organization_lineage | https://openalex.org/P4310311648, https://openalex.org/P4310311647 |
| locations[0].source.host_organization_lineage_names | Oxford University Press, University of Oxford |
| locations[0].license | cc-by |
| locations[0].pdf_url | https://academic.oup.com/bioinformaticsadvances/advance-article-pdf/doi/10.1093/bioadv/vbaf060/62493443/vbaf060.pdf |
| locations[0].version | publishedVersion |
| locations[0].raw_type | journal-article |
| locations[0].license_id | https://openalex.org/licenses/cc-by |
| locations[0].is_accepted | True |
| locations[0].is_published | True |
| locations[0].raw_source_name | Bioinformatics Advances |
| locations[0].landing_page_url | https://doi.org/10.1093/bioadv/vbaf060 |
| locations[1].id | pmid:40170888 |
| locations[1].is_oa | False |
| locations[1].source.id | https://openalex.org/S4306525036 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | False |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | PubMed |
| locations[1].source.host_organization | https://openalex.org/I1299303238 |
| locations[1].source.host_organization_name | National Institutes of Health |
| locations[1].source.host_organization_lineage | https://openalex.org/I1299303238 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | publishedVersion |
| locations[1].raw_type | |
| locations[1].license_id | |
| locations[1].is_accepted | True |
| locations[1].is_published | True |
| locations[1].raw_source_name | Bioinformatics advances |
| locations[1].landing_page_url | https://pubmed.ncbi.nlm.nih.gov/40170888 |
| locations[2].id | pmh:oai:europepmc.org:10774818 |
| locations[2].is_oa | True |
| locations[2].source.id | https://openalex.org/S4306400806 |
| locations[2].source.issn | |
| locations[2].source.type | repository |
| locations[2].source.is_oa | False |
| locations[2].source.issn_l | |
| locations[2].source.is_core | False |
| locations[2].source.is_in_doaj | False |
| locations[2].source.display_name | Europe PMC (PubMed Central) |
| locations[2].source.host_organization | https://openalex.org/I1303153112 |
| locations[2].source.host_organization_name | European Bioinformatics Institute |
| locations[2].source.host_organization_lineage | https://openalex.org/I1303153112 |
| locations[2].license | other-oa |
| locations[2].pdf_url | |
| locations[2].version | submittedVersion |
| locations[2].raw_type | Text |
| locations[2].license_id | https://openalex.org/licenses/other-oa |
| locations[2].is_accepted | False |
| locations[2].is_published | False |
| locations[2].raw_source_name | |
| locations[2].landing_page_url | https://www.ncbi.nlm.nih.gov/pmc/articles/11961220 |
| indexed_in | crossref, doaj, pubmed |
| authorships[0].author.id | https://openalex.org/A5065389164 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-4891-6726 |
| authorships[0].author.display_name | Navid Naderializadeh |
| authorships[0].countries | US |
| authorships[0].affiliations[0].institution_ids | https://openalex.org/I170897317 |
| authorships[0].affiliations[0].raw_affiliation_string | Department of Biostatistics and Bioinformatics, Duke University, Durham, NC 27705, USA |
| authorships[0].institutions[0].id | https://openalex.org/I170897317 |
| authorships[0].institutions[0].ror | https://ror.org/00py81415 |
| authorships[0].institutions[0].type | education |
| authorships[0].institutions[0].lineage | https://openalex.org/I170897317 |
| authorships[0].institutions[0].country_code | US |
| authorships[0].institutions[0].display_name | Duke University |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Navid NaderiAlizadeh |
| authorships[0].is_corresponding | False |
| authorships[0].raw_affiliation_strings | Department of Biostatistics and Bioinformatics, Duke University, Durham, NC 27705, USA |
| authorships[1].author.id | https://openalex.org/A5081688191 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-4084-7340 |
| authorships[1].author.display_name | Rohit Singh |
| authorships[1].countries | US |
| authorships[1].affiliations[0].institution_ids | https://openalex.org/I170897317 |
| authorships[1].affiliations[0].raw_affiliation_string | Department of Cell Biology, Duke University, Durham, NC 27705, USA |
| authorships[1].affiliations[1].institution_ids | https://openalex.org/I170897317 |
| authorships[1].affiliations[1].raw_affiliation_string | Department of Biostatistics and Bioinformatics, Duke University, Durham, NC 27705, USA |
| authorships[1].institutions[0].id | https://openalex.org/I170897317 |
| authorships[1].institutions[0].ror | https://ror.org/00py81415 |
| authorships[1].institutions[0].type | education |
| authorships[1].institutions[0].lineage | https://openalex.org/I170897317 |
| authorships[1].institutions[0].country_code | US |
| authorships[1].institutions[0].display_name | Duke University |
| authorships[1].author_position | last |
| authorships[1].raw_author_name | Rohit Singh |
| authorships[1].is_corresponding | False |
| authorships[1].raw_affiliation_strings | Department of Biostatistics and Bioinformatics, Duke University, Durham, NC 27705, USA, Department of Cell Biology, Duke University, Durham, NC 27705, USA |
| has_content.pdf | True |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://academic.oup.com/bioinformaticsadvances/advance-article-pdf/doi/10.1093/bioadv/vbaf060/62493443/vbaf060.pdf |
| open_access.oa_status | gold |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Aggregating residue-level protein language model embeddings with optimal transport |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-23T05:10:03.516525 |
| primary_topic.id | https://openalex.org/T12254 |
| primary_topic.field.id | https://openalex.org/fields/13 |
| primary_topic.field.display_name | Biochemistry, Genetics and Molecular Biology |
| primary_topic.score | 0.9991999864578247 |
| primary_topic.domain.id | https://openalex.org/domains/1 |
| primary_topic.domain.display_name | Life Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1312 |
| primary_topic.subfield.display_name | Molecular Biology |
| primary_topic.display_name | Machine Learning in Bioinformatics |
| related_works | https://openalex.org/W2953234277, https://openalex.org/W2626256601, https://openalex.org/W147410782, https://openalex.org/W2900413183, https://openalex.org/W3022252430, https://openalex.org/W4390975304, https://openalex.org/W4287804464, https://openalex.org/W3103989898, https://openalex.org/W1497619009, https://openalex.org/W3124312031 |
| cited_by_count | 3 |
| counts_by_year[0].year | 2025 |
| counts_by_year[0].cited_by_count | 3 |
| locations_count | 3 |
| best_oa_location.id | doi:10.1093/bioadv/vbaf060 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4210234069 |
| best_oa_location.source.issn | 2635-0041 |
| best_oa_location.source.type | journal |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | 2635-0041 |
| best_oa_location.source.is_core | True |
| best_oa_location.source.is_in_doaj | True |
| best_oa_location.source.display_name | Bioinformatics Advances |
| best_oa_location.source.host_organization | https://openalex.org/P4310311648 |
| best_oa_location.source.host_organization_name | Oxford University Press |
| best_oa_location.source.host_organization_lineage | https://openalex.org/P4310311648, https://openalex.org/P4310311647 |
| best_oa_location.source.host_organization_lineage_names | Oxford University Press, University of Oxford |
| best_oa_location.license | cc-by |
| best_oa_location.pdf_url | https://academic.oup.com/bioinformaticsadvances/advance-article-pdf/doi/10.1093/bioadv/vbaf060/62493443/vbaf060.pdf |
| best_oa_location.version | publishedVersion |
| best_oa_location.raw_type | journal-article |
| best_oa_location.license_id | https://openalex.org/licenses/cc-by |
| best_oa_location.is_accepted | True |
| best_oa_location.is_published | True |
| best_oa_location.raw_source_name | Bioinformatics Advances |
| best_oa_location.landing_page_url | https://doi.org/10.1093/bioadv/vbaf060 |
| primary_location.id | doi:10.1093/bioadv/vbaf060 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4210234069 |
| primary_location.source.issn | 2635-0041 |
| primary_location.source.type | journal |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | 2635-0041 |
| primary_location.source.is_core | True |
| primary_location.source.is_in_doaj | True |
| primary_location.source.display_name | Bioinformatics Advances |
| primary_location.source.host_organization | https://openalex.org/P4310311648 |
| primary_location.source.host_organization_name | Oxford University Press |
| primary_location.source.host_organization_lineage | https://openalex.org/P4310311648, https://openalex.org/P4310311647 |
| primary_location.source.host_organization_lineage_names | Oxford University Press, University of Oxford |
| primary_location.license | cc-by |
| primary_location.pdf_url | https://academic.oup.com/bioinformaticsadvances/advance-article-pdf/doi/10.1093/bioadv/vbaf060/62493443/vbaf060.pdf |
| primary_location.version | publishedVersion |
| primary_location.raw_type | journal-article |
| primary_location.license_id | https://openalex.org/licenses/cc-by |
| primary_location.is_accepted | True |
| primary_location.is_published | True |
| primary_location.raw_source_name | Bioinformatics Advances |
| primary_location.landing_page_url | https://doi.org/10.1093/bioadv/vbaf060 |
| publication_date | 2024-12-26 |
| publication_year | 2024 |
| referenced_works | https://openalex.org/W4407273337, https://openalex.org/W2730472814, https://openalex.org/W3166142427, https://openalex.org/W4385737488, https://openalex.org/W4390897322, https://openalex.org/W2957436444, https://openalex.org/W2086286404, https://openalex.org/W4223644783, https://openalex.org/W3177500196, https://openalex.org/W4288066876, https://openalex.org/W3164046276, https://openalex.org/W4406440058, https://openalex.org/W4387012333, https://openalex.org/W4386860638, https://openalex.org/W6759401939, https://openalex.org/W4327550249, https://openalex.org/W4225438928, https://openalex.org/W2096864392, https://openalex.org/W6803535088, https://openalex.org/W4388024559, https://openalex.org/W3146944767, https://openalex.org/W4401966068, https://openalex.org/W4406204888, https://openalex.org/W4404420622, https://openalex.org/W4405909579, https://openalex.org/W4379932151, https://openalex.org/W4399849668, https://openalex.org/W3199468887, https://openalex.org/W4242765109, https://openalex.org/W4220991280, https://openalex.org/W4307684269, https://openalex.org/W6739901393, https://openalex.org/W4233762729, https://openalex.org/W4225891318, https://openalex.org/W6744580074 |
| referenced_works_count | 35 |
| abstract_inverted_index.a | 42, 88, 110, 122, 126 |
| abstract_inverted_index.As | 21 |
| abstract_inverted_index.We | 86, 102, 149 |
| abstract_inverted_index.as | 8, 107 |
| abstract_inverted_index.at | 212 |
| abstract_inverted_index.be | 198, 210 |
| abstract_inverted_index.by | 192 |
| abstract_inverted_index.in | 32, 129 |
| abstract_inverted_index.is | 72, 136, 185 |
| abstract_inverted_index.it | 71 |
| abstract_inverted_index.of | 141, 153, 178 |
| abstract_inverted_index.on | 36 |
| abstract_inverted_index.or | 174 |
| abstract_inverted_index.to | 65, 94, 117, 138, 172 |
| abstract_inverted_index.Our | 182, 206 |
| abstract_inverted_index.PLM | 68, 97, 105, 167 |
| abstract_inverted_index.The | 133 |
| abstract_inverted_index.and | 113, 144, 204 |
| abstract_inverted_index.but | 70 |
| abstract_inverted_index.can | 209 |
| abstract_inverted_index.for | 11, 18, 44, 52, 159, 188 |
| abstract_inverted_index.has | 60 |
| abstract_inverted_index.map | 118 |
| abstract_inverted_index.our | 154 |
| abstract_inverted_index.the | 79, 130, 139, 142, 146, 151, 176 |
| abstract_inverted_index.PLMs | 25, 171 |
| abstract_inverted_index.This | 39 |
| abstract_inverted_index.code | 208 |
| abstract_inverted_index.from | 109 |
| abstract_inverted_index.have | 6 |
| abstract_inverted_index.into | 15, 99 |
| abstract_inverted_index.lost | 199 |
| abstract_inverted_index.over | 156 |
| abstract_inverted_index.set, | 124 |
| abstract_inverted_index.that | 48, 196 |
| abstract_inverted_index.this | 75 |
| abstract_inverted_index.used | 62 |
| abstract_inverted_index.with | 165 |
| abstract_inverted_index.work | 59 |
| abstract_inverted_index.(i.e. | 28 |
| abstract_inverted_index.PLMs. | 181 |
| abstract_inverted_index.based | 35 |
| abstract_inverted_index.found | 211 |
| abstract_inverted_index.input | 143 |
| abstract_inverted_index.match | 173 |
| abstract_inverted_index.might | 197 |
| abstract_inverted_index.novel | 89 |
| abstract_inverted_index.poses | 41 |
| abstract_inverted_index.tasks | 47 |
| abstract_inverted_index.these | 119 |
| abstract_inverted_index.(PLMs) | 5 |
| abstract_inverted_index.across | 55, 82 |
| abstract_inverted_index.employ | 114 |
| abstract_inverted_index.entire | 147 |
| abstract_inverted_index.exceed | 175 |
| abstract_inverted_index.length | 140 |
| abstract_inverted_index.longer | 189 |
| abstract_inverted_index.method | 76, 90, 155 |
| abstract_inverted_index.models | 4 |
| abstract_inverted_index.output | 131 |
| abstract_inverted_index.scheme | 184 |
| abstract_inverted_index.sizes, | 168 |
| abstract_inverted_index.space. | 132 |
| abstract_inverted_index.tasks, | 163 |
| abstract_inverted_index.Protein | 2 |
| abstract_inverted_index.Results | 85 |
| abstract_inverted_index.against | 121 |
| abstract_inverted_index.average | 63, 157, 201 |
| abstract_inverted_index.convert | 95 |
| abstract_inverted_index.emerged | 7 |
| abstract_inverted_index.length. | 38 |
| abstract_inverted_index.mapping | 12 |
| abstract_inverted_index.optimal | 92 |
| abstract_inverted_index.outputs | 34, 98, 106 |
| abstract_inverted_index.pooling | 64, 158 |
| abstract_inverted_index.protein | 13, 22, 37, 190 |
| abstract_inverted_index.require | 49 |
| abstract_inverted_index.samples | 108, 120 |
| abstract_inverted_index.several | 160 |
| abstract_inverted_index.through | 200 |
| abstract_inverted_index.unclear | 73 |
| abstract_inverted_index.various | 19 |
| abstract_inverted_index.whether | 74 |
| abstract_inverted_index.Abstract | 0 |
| abstract_inverted_index.Previous | 58 |
| abstract_inverted_index.agnostic | 137 |
| abstract_inverted_index.analysis | 54 |
| abstract_inverted_index.creating | 125 |
| abstract_inverted_index.enabling | 169 |
| abstract_inverted_index.generate | 26 |
| abstract_inverted_index.language | 3 |
| abstract_inverted_index.outputs, | 69 |
| abstract_inverted_index.pooling. | 202 |
| abstract_inverted_index.powerful | 9 |
| abstract_inverted_index.protein. | 148 |
| abstract_inverted_index.relevant | 80 |
| abstract_inverted_index.schemes, | 24 |
| abstract_inverted_index.suitable | 17 |
| abstract_inverted_index.Euclidean | 127 |
| abstract_inverted_index.capturing | 193 |
| abstract_inverted_index.challenge | 43 |
| abstract_inverted_index.different | 56 |
| abstract_inverted_index.distances | 116 |
| abstract_inverted_index.effective | 187 |
| abstract_inverted_index.embedding | 128, 135 |
| abstract_inverted_index.essential | 194 |
| abstract_inverted_index.introduce | 87 |
| abstract_inverted_index.per-token | 27, 104 |
| abstract_inverted_index.proteins. | 57 |
| abstract_inverted_index.reference | 123 |
| abstract_inverted_index.resulting | 31, 134 |
| abstract_inverted_index.sequences | 14, 191 |
| abstract_inverted_index.summarize | 66 |
| abstract_inverted_index.transport | 93 |
| abstract_inverted_index.typically | 61 |
| abstract_inverted_index.utilizing | 91 |
| abstract_inverted_index.Motivation | 1 |
| abstract_inverted_index.approaches | 10 |
| abstract_inverted_index.consistent | 53 |
| abstract_inverted_index.downstream | 161 |
| abstract_inverted_index.embeddings | 16, 51 |
| abstract_inverted_index.especially | 186 |
| abstract_inverted_index.prediction | 46, 162 |
| abstract_inverted_index.represents | 145 |
| abstract_inverted_index.aggregation | 183 |
| abstract_inverted_index.constrained | 166 |
| abstract_inverted_index.demonstrate | 150 |
| abstract_inverted_index.effectively | 77 |
| abstract_inverted_index.information | 81, 195 |
| abstract_inverted_index.performance | 177 |
| abstract_inverted_index.prioritizes | 78 |
| abstract_inverted_index.superiority | 152 |
| abstract_inverted_index.token-level | 67, 83 |
| abstract_inverted_index.variability | 40 |
| abstract_inverted_index.Availability | 203 |
| abstract_inverted_index.distribution | 112 |
| abstract_inverted_index.fixed-length | 100 |
| abstract_inverted_index.larger-scale | 180 |
| abstract_inverted_index.particularly | 164 |
| abstract_inverted_index.per-residue) | 29 |
| abstract_inverted_index.applications. | 20 |
| abstract_inverted_index.conceptualize | 103 |
| abstract_inverted_index.probabilistic | 111 |
| abstract_inverted_index.protein-level | 45 |
| abstract_inverted_index.smaller-scale | 170 |
| abstract_inverted_index.uniform-sized | 50 |
| abstract_inverted_index.average-pooled | 179 |
| abstract_inverted_index.implementation | 205, 207 |
| abstract_inverted_index.representation | 23 |
| abstract_inverted_index.variable-sized | 33 |
| abstract_inverted_index.variable-length | 96 |
| abstract_inverted_index.representations, | 30 |
| abstract_inverted_index.representations. | 84, 101 |
| abstract_inverted_index.sliced-Wasserstein | 115 |
| abstract_inverted_index.https://github.com/navid-naderi/PLM_SWE. | 213 |
| cited_by_percentile_year.max | 97 |
| cited_by_percentile_year.min | 96 |
| countries_distinct_count | 1 |
| institutions_distinct_count | 2 |
| citation_normalized_percentile.value | 0.7747445 |
| citation_normalized_percentile.is_in_top_1_percent | False |
| citation_normalized_percentile.is_in_top_10_percent | False |
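
The `abstract_inverted_index.*` rows in the payload above store the abstract as a mapping from each word to the positions at which it occurs. The sketch below shows one way to turn such a mapping back into running text. The function name `rebuild_abstract` is hypothetical, and the sample dictionary is a hand-copied excerpt covering only positions 0 to 9 of the table, written in the word-to-position-list form in which the OpenAlex API returns this field.

```python
def rebuild_abstract(inverted_index):
    """Reconstruct text from a {word: [positions]} inverted index."""
    by_position = {}
    for word, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = word
    # Emit the words in ascending position order.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Excerpt of the index above, truncated to the first ten positions.
sample_index = {
    "Abstract": [0],
    "Motivation": [1],
    "Protein": [2],
    "language": [3],
    "models": [4],
    "(PLMs)": [5],
    "have": [6],
    "emerged": [7],
    "as": [8],
    "powerful": [9],
}

print(rebuild_abstract(sample_index))
# prints: Abstract Motivation Protein language models (PLMs) have emerged as powerful
```

A word that occurs several times simply lists several positions, so the same loop handles repeated words without special cases.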