Human-in-the-loop approach to identify functionally important residues of proteins from literature
2024 · Open Access · DOI: https://doi.org/10.1101/2024.03.09.583700
We present a novel system that leverages curators in the loop to develop a dataset and model for detecting residue-level functional annotations and other protein structure features from standard publication text. Our approach involves the integration of data from multiple resources, including PDBe, EuropePMC, PubMedCentral, and PubMed, combined with annotation guidelines from UniProt, while employing LitSuggest and Huggingface models as tools in the annotation process. A team of seven annotators manually curated ten articles for named entities, which we utilized to train a starting PubmedBert model from Huggingface. Using a human-in-the-loop annotation system, we developed the best model with commendable performance metrics of 0.90 for precision, 0.92 for recall, and 0.91 for F1-measure. Our proposed system showcases a successful synergy of machine learning techniques and human expertise in curating a dataset for residue-level functional annotations and protein structure features. The results demonstrate the potential for broader applications in protein research, bridging the gap between advanced machine learning models and the indispensable insights of domain experts.
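As a quick consistency check on the reported metrics, the F1-measure can be recomputed as the harmonic mean of precision and recall. This is a minimal sketch: the function name `f1_score` is our own, and the abstract does not say which averaging scheme (micro, macro, or per-entity) the authors used, so the check covers only the headline numbers.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (hypothetical helper, not from the paper)."""
    if precision + recall == 0:
        # Avoid division by zero when both metrics are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Headline values quoted in the abstract.
reported_precision = 0.90
reported_recall = 0.92

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 2))  # 0.91, matching the reported F1-measure
```

The recomputed value rounds to the 0.91 stated in the abstract, so the three reported figures are mutually consistent.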
- Type: preprint
- Language: en
- Landing Page: https://doi.org/10.1101/2024.03.09.583700, https://www.biorxiv.org/content/biorxiv/early/2024/03/13/2024.03.09.583700.full.pdf
- OA Status: green
- References: 36
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4392761558
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4392761558 (Canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.1101/2024.03.09.583700 (Digital Object Identifier)
- Title: Human-in-the-loop approach to identify functionally important residues of proteins from literature (Work title)
- Type: preprint (OpenAlex work type)
- Language: en (Primary language)
- Publication year: 2024 (Year of publication)
- Publication date: 2024-03-13 (Full publication date if available)
- Authors: M. Vollmar, Santosh Tirunagari, Déborah Harrus, David Armstrong, Romana Gáborová, Deepti Gupta, Marcelo Querino Lima Afonso, Genevieve L. Evans, Sameer Velankar (List of authors in order)
- Landing page: https://doi.org/10.1101/2024.03.09.583700 (Publisher landing page)
- PDF URL: https://www.biorxiv.org/content/biorxiv/early/2024/03/13/2024.03.09.583700.full.pdf (Direct link to full text PDF)
- Open access: Yes (Whether a free full text is available)
- OA status: green (Open access status per OpenAlex)
- OA URL: https://www.biorxiv.org/content/biorxiv/early/2024/03/13/2024.03.09.583700.full.pdf (Direct OA link when available)
- Concepts: Computer science, Annotation, UniProt, Bridging (networking), Artificial intelligence, Precision and recall, Domain (mathematical analysis), Machine learning, Information retrieval, Biology, Computer network, Mathematics, Mathematical analysis, Gene, Biochemistry (Top concepts (fields/topics) attached by OpenAlex)
- Cited by: 0 (Total citation count in OpenAlex)
- References (count): 36 (Number of works referenced by this work)
- Related works (count): 10 (Other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4392761558 |
|---|---|
| doi | https://doi.org/10.1101/2024.03.09.583700 |
| ids.doi | https://doi.org/10.1101/2024.03.09.583700 |
| ids.openalex | https://openalex.org/W4392761558 |
| fwci | |
| type | preprint |
| title | Human-in-the-loop approach to identify functionally important residues of proteins from literature |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11710 |
| topics[0].field.id | https://openalex.org/fields/13 |
| topics[0].field.display_name | Biochemistry, Genetics and Molecular Biology |
| topics[0].score | 0.9973000288009644 |
| topics[0].domain.id | https://openalex.org/domains/1 |
| topics[0].domain.display_name | Life Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1312 |
| topics[0].subfield.display_name | Molecular Biology |
| topics[0].display_name | Biomedical Text Mining and Ontologies |
| topics[1].id | https://openalex.org/T11986 |
| topics[1].field.id | https://openalex.org/fields/18 |
| topics[1].field.display_name | Decision Sciences |
| topics[1].score | 0.9739000201225281 |
| topics[1].domain.id | https://openalex.org/domains/2 |
| topics[1].domain.display_name | Social Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1802 |
| topics[1].subfield.display_name | Information Systems and Management |
| topics[1].display_name | Scientific Computing and Data Management |
| topics[2].id | https://openalex.org/T10887 |
| topics[2].field.id | https://openalex.org/fields/13 |
| topics[2].field.display_name | Biochemistry, Genetics and Molecular Biology |
| topics[2].score | 0.9363999962806702 |
| topics[2].domain.id | https://openalex.org/domains/1 |
| topics[2].domain.display_name | Life Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1312 |
| topics[2].subfield.display_name | Molecular Biology |
| topics[2].display_name | Bioinformatics and Genomic Networks |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C41008148 |
| concepts[0].level | 0 |
| concepts[0].score | 0.7469708323478699 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[0].display_name | Computer science |
| concepts[1].id | https://openalex.org/C2776321320 |
| concepts[1].level | 2 |
| concepts[1].score | 0.7459747195243835 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q857525 |
| concepts[1].display_name | Annotation |
| concepts[2].id | https://openalex.org/C202264299 |
| concepts[2].level | 3 |
| concepts[2].score | 0.7373471260070801 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q905695 |
| concepts[2].display_name | UniProt |
| concepts[3].id | https://openalex.org/C174348530 |
| concepts[3].level | 2 |
| concepts[3].score | 0.7145683169364929 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q188635 |
| concepts[3].display_name | Bridging (networking) |
| concepts[4].id | https://openalex.org/C154945302 |
| concepts[4].level | 1 |
| concepts[4].score | 0.4879247844219208 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[4].display_name | Artificial intelligence |
| concepts[5].id | https://openalex.org/C81669768 |
| concepts[5].level | 2 |
| concepts[5].score | 0.45510849356651306 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q2359161 |
| concepts[5].display_name | Precision and recall |
| concepts[6].id | https://openalex.org/C36503486 |
| concepts[6].level | 2 |
| concepts[6].score | 0.42323756217956543 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q11235244 |
| concepts[6].display_name | Domain (mathematical analysis) |
| concepts[7].id | https://openalex.org/C119857082 |
| concepts[7].level | 1 |
| concepts[7].score | 0.3828672170639038 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q2539 |
| concepts[7].display_name | Machine learning |
| concepts[8].id | https://openalex.org/C23123220 |
| concepts[8].level | 1 |
| concepts[8].score | 0.3822616934776306 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q816826 |
| concepts[8].display_name | Information retrieval |
| concepts[9].id | https://openalex.org/C86803240 |
| concepts[9].level | 0 |
| concepts[9].score | 0.08494091033935547 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q420 |
| concepts[9].display_name | Biology |
| concepts[10].id | https://openalex.org/C31258907 |
| concepts[10].level | 1 |
| concepts[10].score | 0.0 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q1301371 |
| concepts[10].display_name | Computer network |
| concepts[11].id | https://openalex.org/C33923547 |
| concepts[11].level | 0 |
| concepts[11].score | 0.0 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[11].display_name | Mathematics |
| concepts[12].id | https://openalex.org/C134306372 |
| concepts[12].level | 1 |
| concepts[12].score | 0.0 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q7754 |
| concepts[12].display_name | Mathematical analysis |
| concepts[13].id | https://openalex.org/C104317684 |
| concepts[13].level | 2 |
| concepts[13].score | 0.0 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q7187 |
| concepts[13].display_name | Gene |
| concepts[14].id | https://openalex.org/C55493867 |
| concepts[14].level | 1 |
| concepts[14].score | 0.0 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q7094 |
| concepts[14].display_name | Biochemistry |
| keywords[0].id | https://openalex.org/keywords/computer-science |
| keywords[0].score | 0.7469708323478699 |
| keywords[0].display_name | Computer science |
| keywords[1].id | https://openalex.org/keywords/annotation |
| keywords[1].score | 0.7459747195243835 |
| keywords[1].display_name | Annotation |
| keywords[2].id | https://openalex.org/keywords/uniprot |
| keywords[2].score | 0.7373471260070801 |
| keywords[2].display_name | UniProt |
| keywords[3].id | https://openalex.org/keywords/bridging |
| keywords[3].score | 0.7145683169364929 |
| keywords[3].display_name | Bridging (networking) |
| keywords[4].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[4].score | 0.4879247844219208 |
| keywords[4].display_name | Artificial intelligence |
| keywords[5].id | https://openalex.org/keywords/precision-and-recall |
| keywords[5].score | 0.45510849356651306 |
| keywords[5].display_name | Precision and recall |
| keywords[6].id | https://openalex.org/keywords/domain |
| keywords[6].score | 0.42323756217956543 |
| keywords[6].display_name | Domain (mathematical analysis) |
| keywords[7].id | https://openalex.org/keywords/machine-learning |
| keywords[7].score | 0.3828672170639038 |
| keywords[7].display_name | Machine learning |
| keywords[8].id | https://openalex.org/keywords/information-retrieval |
| keywords[8].score | 0.3822616934776306 |
| keywords[8].display_name | Information retrieval |
| keywords[9].id | https://openalex.org/keywords/biology |
| keywords[9].score | 0.08494091033935547 |
| keywords[9].display_name | Biology |
| language | en |
| locations[0].id | doi:10.1101/2024.03.09.583700 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306402567 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | False |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | bioRxiv (Cold Spring Harbor Laboratory) |
| locations[0].source.host_organization | https://openalex.org/I2750212522 |
| locations[0].source.host_organization_name | Cold Spring Harbor Laboratory |
| locations[0].source.host_organization_lineage | https://openalex.org/I2750212522 |
| locations[0].license | cc-by |
| locations[0].pdf_url | https://www.biorxiv.org/content/biorxiv/early/2024/03/13/2024.03.09.583700.full.pdf |
| locations[0].version | acceptedVersion |
| locations[0].raw_type | posted-content |
| locations[0].license_id | https://openalex.org/licenses/cc-by |
| locations[0].is_accepted | True |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | https://doi.org/10.1101/2024.03.09.583700 |
| indexed_in | crossref |
| authorships[0].author.id | https://openalex.org/A5016859647 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-9162-9159 |
| authorships[0].author.display_name | M. Vollmar |
| authorships[0].countries | GB |
| authorships[0].affiliations[0].institution_ids | https://openalex.org/I1303153112 |
| authorships[0].affiliations[0].raw_affiliation_string | Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB0 SD, UK |
| authorships[0].institutions[0].id | https://openalex.org/I1303153112 |
| authorships[0].institutions[0].ror | https://ror.org/02catss52 |
| authorships[0].institutions[0].type | facility |
| authorships[0].institutions[0].lineage | https://openalex.org/I1303153112, https://openalex.org/I4210138560 |
| authorships[0].institutions[0].country_code | GB |
| authorships[0].institutions[0].display_name | European Bioinformatics Institute |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Melanie Vollmar |
| authorships[0].is_corresponding | True |
| authorships[0].raw_affiliation_strings | Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB0 SD, UK |
| authorships[1].author.id | https://openalex.org/A5013246415 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-9064-1965 |
| authorships[1].author.display_name | Santosh Tirunagari |
| authorships[1].countries | GB |
| authorships[1].affiliations[0].institution_ids | https://openalex.org/I1303153112 |
| authorships[1].affiliations[0].raw_affiliation_string | Literature Services, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK |
| authorships[1].institutions[0].id | https://openalex.org/I1303153112 |
| authorships[1].institutions[0].ror | https://ror.org/02catss52 |
| authorships[1].institutions[0].type | facility |
| authorships[1].institutions[0].lineage | https://openalex.org/I1303153112, https://openalex.org/I4210138560 |
| authorships[1].institutions[0].country_code | GB |
| authorships[1].institutions[0].display_name | European Bioinformatics Institute |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Santosh Tirunagari |
| authorships[1].is_corresponding | False |
| authorships[1].raw_affiliation_strings | Literature Services, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK |
| authorships[2].author.id | https://openalex.org/A5074638543 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-7651-672X |
| authorships[2].author.display_name | Déborah Harrus |
| authorships[2].countries | GB |
| authorships[2].affiliations[0].institution_ids | https://openalex.org/I1303153112 |
| authorships[2].affiliations[0].raw_affiliation_string | Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB0 SD, UK |
| authorships[2].institutions[0].id | https://openalex.org/I1303153112 |
| authorships[2].institutions[0].ror | https://ror.org/02catss52 |
| authorships[2].institutions[0].type | facility |
| authorships[2].institutions[0].lineage | https://openalex.org/I1303153112, https://openalex.org/I4210138560 |
| authorships[2].institutions[0].country_code | GB |
| authorships[2].institutions[0].display_name | European Bioinformatics Institute |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Deborah Harrus |
| authorships[2].is_corresponding | False |
| authorships[2].raw_affiliation_strings | Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB0 SD, UK |
| authorships[3].author.id | https://openalex.org/A5015459016 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-4986-1229 |
| authorships[3].author.display_name | David Armstrong |
| authorships[3].countries | GB |
| authorships[3].affiliations[0].institution_ids | https://openalex.org/I1303153112 |
| authorships[3].affiliations[0].raw_affiliation_string | Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB0 SD, UK |
| authorships[3].institutions[0].id | https://openalex.org/I1303153112 |
| authorships[3].institutions[0].ror | https://ror.org/02catss52 |
| authorships[3].institutions[0].type | facility |
| authorships[3].institutions[0].lineage | https://openalex.org/I1303153112, https://openalex.org/I4210138560 |
| authorships[3].institutions[0].country_code | GB |
| authorships[3].institutions[0].display_name | European Bioinformatics Institute |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | David Armstrong |
| authorships[3].is_corresponding | False |
| authorships[3].raw_affiliation_strings | Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB0 SD, UK |
| authorships[4].author.id | https://openalex.org/A5049223954 |
| authorships[4].author.orcid | https://orcid.org/0009-0009-5900-9513 |
| authorships[4].author.display_name | Romana Gáborová |
| authorships[4].countries | GB |
| authorships[4].affiliations[0].institution_ids | https://openalex.org/I1303153112 |
| authorships[4].affiliations[0].raw_affiliation_string | Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB0 SD, UK |
| authorships[4].institutions[0].id | https://openalex.org/I1303153112 |
| authorships[4].institutions[0].ror | https://ror.org/02catss52 |
| authorships[4].institutions[0].type | facility |
| authorships[4].institutions[0].lineage | https://openalex.org/I1303153112, https://openalex.org/I4210138560 |
| authorships[4].institutions[0].country_code | GB |
| authorships[4].institutions[0].display_name | European Bioinformatics Institute |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Romana Gaborova |
| authorships[4].is_corresponding | False |
| authorships[4].raw_affiliation_strings | Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB0 SD, UK |
| authorships[5].author.id | https://openalex.org/A5072651997 |
| authorships[5].author.orcid | https://orcid.org/0000-0001-9999-5607 |
| authorships[5].author.display_name | Deepti Gupta |
| authorships[5].countries | GB |
| authorships[5].affiliations[0].institution_ids | https://openalex.org/I1303153112 |
| authorships[5].affiliations[0].raw_affiliation_string | Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB0 SD, UK |
| authorships[5].institutions[0].id | https://openalex.org/I1303153112 |
| authorships[5].institutions[0].ror | https://ror.org/02catss52 |
| authorships[5].institutions[0].type | facility |
| authorships[5].institutions[0].lineage | https://openalex.org/I1303153112, https://openalex.org/I4210138560 |
| authorships[5].institutions[0].country_code | GB |
| authorships[5].institutions[0].display_name | European Bioinformatics Institute |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Deepti Gupta |
| authorships[5].is_corresponding | False |
| authorships[5].raw_affiliation_strings | Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB0 SD, UK |
| authorships[6].author.id | https://openalex.org/A5057489258 |
| authorships[6].author.orcid | https://orcid.org/0000-0002-9966-8187 |
| authorships[6].author.display_name | Marcelo Querino Lima Afonso |
| authorships[6].countries | GB |
| authorships[6].affiliations[0].institution_ids | https://openalex.org/I1303153112 |
| authorships[6].affiliations[0].raw_affiliation_string | Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB0 SD, UK |
| authorships[6].institutions[0].id | https://openalex.org/I1303153112 |
| authorships[6].institutions[0].ror | https://ror.org/02catss52 |
| authorships[6].institutions[0].type | facility |
| authorships[6].institutions[0].lineage | https://openalex.org/I1303153112, https://openalex.org/I4210138560 |
| authorships[6].institutions[0].country_code | GB |
| authorships[6].institutions[0].display_name | European Bioinformatics Institute |
| authorships[6].author_position | middle |
| authorships[6].raw_author_name | Marcelo Querino Lima Afonso |
| authorships[6].is_corresponding | False |
| authorships[6].raw_affiliation_strings | Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB0 SD, UK |
| authorships[7].author.id | https://openalex.org/A5003934048 |
| authorships[7].author.orcid | https://orcid.org/0000-0002-8612-9539 |
| authorships[7].author.display_name | Genevieve L. Evans |
| authorships[7].countries | GB |
| authorships[7].affiliations[0].institution_ids | https://openalex.org/I1303153112 |
| authorships[7].affiliations[0].raw_affiliation_string | Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB0 SD, UK |
| authorships[7].institutions[0].id | https://openalex.org/I1303153112 |
| authorships[7].institutions[0].ror | https://ror.org/02catss52 |
| authorships[7].institutions[0].type | facility |
| authorships[7].institutions[0].lineage | https://openalex.org/I1303153112, https://openalex.org/I4210138560 |
| authorships[7].institutions[0].country_code | GB |
| authorships[7].institutions[0].display_name | European Bioinformatics Institute |
| authorships[7].author_position | middle |
| authorships[7].raw_author_name | Genevieve Laura Evans |
| authorships[7].is_corresponding | False |
| authorships[7].raw_affiliation_strings | Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB0 SD, UK |
| authorships[8].author.id | https://openalex.org/A5042460017 |
| authorships[8].author.orcid | https://orcid.org/0000-0002-8439-5964 |
| authorships[8].author.display_name | Sameer Velankar |
| authorships[8].countries | GB |
| authorships[8].affiliations[0].institution_ids | https://openalex.org/I1303153112 |
| authorships[8].affiliations[0].raw_affiliation_string | Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB0 SD, UK |
| authorships[8].institutions[0].id | https://openalex.org/I1303153112 |
| authorships[8].institutions[0].ror | https://ror.org/02catss52 |
| authorships[8].institutions[0].type | facility |
| authorships[8].institutions[0].lineage | https://openalex.org/I1303153112, https://openalex.org/I4210138560 |
| authorships[8].institutions[0].country_code | GB |
| authorships[8].institutions[0].display_name | European Bioinformatics Institute |
| authorships[8].author_position | last |
| authorships[8].raw_author_name | Sameer Velankar |
| authorships[8].is_corresponding | False |
| authorships[8].raw_affiliation_strings | Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB0 SD, UK |
| has_content.pdf | True |
| has_content.grobid_xml | True |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://www.biorxiv.org/content/biorxiv/early/2024/03/13/2024.03.09.583700.full.pdf |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Human-in-the-loop approach to identify functionally important residues of proteins from literature |
| has_fulltext | True |
| is_retracted | False |
| updated_date | 2025-11-06T03:46:38.306776 |
| primary_topic.id | https://openalex.org/T11710 |
| primary_topic.field.id | https://openalex.org/fields/13 |
| primary_topic.field.display_name | Biochemistry, Genetics and Molecular Biology |
| primary_topic.score | 0.9973000288009644 |
| primary_topic.domain.id | https://openalex.org/domains/1 |
| primary_topic.domain.display_name | Life Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1312 |
| primary_topic.subfield.display_name | Molecular Biology |
| primary_topic.display_name | Biomedical Text Mining and Ontologies |
| related_works | https://openalex.org/W24553703, https://openalex.org/W4211000692, https://openalex.org/W2102027645, https://openalex.org/W2107785922, https://openalex.org/W2035528219, https://openalex.org/W2402478170, https://openalex.org/W2946410450, https://openalex.org/W2901823680, https://openalex.org/W2405355225, https://openalex.org/W2146638336 |
| cited_by_count | 0 |
| locations_count | 1 |
| best_oa_location.id | doi:10.1101/2024.03.09.583700 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306402567 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | False |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | bioRxiv (Cold Spring Harbor Laboratory) |
| best_oa_location.source.host_organization | https://openalex.org/I2750212522 |
| best_oa_location.source.host_organization_name | Cold Spring Harbor Laboratory |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I2750212522 |
| best_oa_location.license | cc-by |
| best_oa_location.pdf_url | https://www.biorxiv.org/content/biorxiv/early/2024/03/13/2024.03.09.583700.full.pdf |
| best_oa_location.version | acceptedVersion |
| best_oa_location.raw_type | posted-content |
| best_oa_location.license_id | https://openalex.org/licenses/cc-by |
| best_oa_location.is_accepted | True |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | https://doi.org/10.1101/2024.03.09.583700 |
| primary_location.id | doi:10.1101/2024.03.09.583700 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306402567 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | False |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | bioRxiv (Cold Spring Harbor Laboratory) |
| primary_location.source.host_organization | https://openalex.org/I2750212522 |
| primary_location.source.host_organization_name | Cold Spring Harbor Laboratory |
| primary_location.source.host_organization_lineage | https://openalex.org/I2750212522 |
| primary_location.license | cc-by |
| primary_location.pdf_url | https://www.biorxiv.org/content/biorxiv/early/2024/03/13/2024.03.09.583700.full.pdf |
| primary_location.version | acceptedVersion |
| primary_location.raw_type | posted-content |
| primary_location.license_id | https://openalex.org/licenses/cc-by |
| primary_location.is_accepted | True |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | https://doi.org/10.1101/2024.03.09.583700 |
| publication_date | 2024-03-13 |
| publication_year | 2024 |
| referenced_works | https://openalex.org/W2898210859, https://openalex.org/W2982977147, https://openalex.org/W4365137095, https://openalex.org/W4309506674, https://openalex.org/W3177828909, https://openalex.org/W3186179742, https://openalex.org/W3211795435, https://openalex.org/W2128484254, https://openalex.org/W4311287562, https://openalex.org/W2302501749, https://openalex.org/W3157132663, https://openalex.org/W2029017735, https://openalex.org/W183866246, https://openalex.org/W3023249475, https://openalex.org/W2915895751, https://openalex.org/W2104148262, https://openalex.org/W2126276057, https://openalex.org/W2103017472, https://openalex.org/W2123171788, https://openalex.org/W1935434993, https://openalex.org/W2107580398, https://openalex.org/W2557399158, https://openalex.org/W4311288223, https://openalex.org/W1623072288, https://openalex.org/W6739901393, https://openalex.org/W2911489562, https://openalex.org/W3046375318, https://openalex.org/W4319323476, https://openalex.org/W6930143165, https://openalex.org/W3115095335, https://openalex.org/W3173727191, https://openalex.org/W2148488766, https://openalex.org/W2257876130, https://openalex.org/W4385245566, https://openalex.org/W2951562155, https://openalex.org/W4323036511 |
| referenced_works_count | 36 |
| abstract_inverted_index.A | 65 |
| abstract_inverted_index.a | 2, 13, 82, 89, 117, 129 |
| abstract_inverted_index.We | 0 |
| abstract_inverted_index.as | 59 |
| abstract_inverted_index.in | 8, 61, 127, 147 |
| abstract_inverted_index.of | 36, 67, 102, 120, 162 |
| abstract_inverted_index.to | 11, 80 |
| abstract_inverted_index.we | 78, 93 |
| abstract_inverted_index.Our | 31, 113 |
| abstract_inverted_index.The | 139 |
| abstract_inverted_index.and | 15, 22, 45, 56, 109, 124, 135, 158 |
| abstract_inverted_index.for | 17, 74, 104, 107, 111, 131, 144 |
| abstract_inverted_index.gap | 152 |
| abstract_inverted_index.ten | 72 |
| abstract_inverted_index.the | 9, 34, 62, 95, 142, 151, 159 |
| abstract_inverted_index.0.90 | 103 |
| abstract_inverted_index.0.91 | 110 |
| abstract_inverted_index.0.92 | 106 |
| abstract_inverted_index.best | 96 |
| abstract_inverted_index.data | 37 |
| abstract_inverted_index.from | 27, 38, 51, 86 |
| abstract_inverted_index.loop | 10 |
| abstract_inverted_index.team | 66 |
| abstract_inverted_index.that | 5 |
| abstract_inverted_index.with | 48, 98 |
| abstract_inverted_index.PDBe, | 42 |
| abstract_inverted_index.Using | 88 |
| abstract_inverted_index.human | 125 |
| abstract_inverted_index.model | 16, 85, 97 |
| abstract_inverted_index.named | 75 |
| abstract_inverted_index.novel | 3 |
| abstract_inverted_index.other | 23 |
| abstract_inverted_index.seven | 68 |
| abstract_inverted_index.text. | 30 |
| abstract_inverted_index.tools | 60 |
| abstract_inverted_index.train | 81 |
| abstract_inverted_index.which | 77 |
| abstract_inverted_index.while | 53 |
| abstract_inverted_index.domain | 163 |
| abstract_inverted_index.models | 58, 157 |
| abstract_inverted_index.system | 4, 115 |
| abstract_inverted_index.PubMed, | 46 |
| abstract_inverted_index.between | 153 |
| abstract_inverted_index.broader | 145 |
| abstract_inverted_index.curated | 71 |
| abstract_inverted_index.dataset | 14, 130 |
| abstract_inverted_index.develop | 12 |
| abstract_inverted_index.machine | 121, 155 |
| abstract_inverted_index.metrics | 101 |
| abstract_inverted_index.present | 1 |
| abstract_inverted_index.protein | 24, 136, 148 |
| abstract_inverted_index.recall, | 108 |
| abstract_inverted_index.results | 140 |
| abstract_inverted_index.synergy | 119 |
| abstract_inverted_index.system, | 92 |
| abstract_inverted_index.UniProt, | 52 |
| abstract_inverted_index.advanced | 154 |
| abstract_inverted_index.approach | 32 |
| abstract_inverted_index.articles | 73 |
| abstract_inverted_index.bridging | 150 |
| abstract_inverted_index.combined | 47 |
| abstract_inverted_index.curating | 128 |
| abstract_inverted_index.curators | 7 |
| abstract_inverted_index.experts. | 164 |
| abstract_inverted_index.features | 26 |
| abstract_inverted_index.insights | 161 |
| abstract_inverted_index.involves | 33 |
| abstract_inverted_index.learning | 122, 156 |
| abstract_inverted_index.manually | 70 |
| abstract_inverted_index.multiple | 39 |
| abstract_inverted_index.process. | 64 |
| abstract_inverted_index.proposed | 114 |
| abstract_inverted_index.standard | 28 |
| abstract_inverted_index.starting | 83 |
| abstract_inverted_index.utilized | 79 |
| abstract_inverted_index.detecting | 18 |
| abstract_inverted_index.developed | 94 |
| abstract_inverted_index.employing | 54 |
| abstract_inverted_index.entities, | 76 |
| abstract_inverted_index.expertise | 126 |
| abstract_inverted_index.features. | 138 |
| abstract_inverted_index.including | 41 |
| abstract_inverted_index.leverages | 6 |
| abstract_inverted_index.potential | 143 |
| abstract_inverted_index.research, | 149 |
| abstract_inverted_index.showcases | 116 |
| abstract_inverted_index.structure | 25, 137 |
| abstract_inverted_index.EuropePMC, | 43 |
| abstract_inverted_index.LitSuggest | 55 |
| abstract_inverted_index.PubmedBert | 84 |
| abstract_inverted_index.annotation | 49, 63, 91 |
| abstract_inverted_index.annotators | 69 |
| abstract_inverted_index.functional | 20, 133 |
| abstract_inverted_index.guidelines | 50 |
| abstract_inverted_index.precision, | 105 |
| abstract_inverted_index.resources, | 40 |
| abstract_inverted_index.successful | 118 |
| abstract_inverted_index.techniques | 123 |
| abstract_inverted_index.F1-measure. | 112 |
| abstract_inverted_index.Huggingface | 57 |
| abstract_inverted_index.annotations | 21, 134 |
| abstract_inverted_index.commendable | 99 |
| abstract_inverted_index.demonstrate | 141 |
| abstract_inverted_index.integration | 35 |
| abstract_inverted_index.performance | 100 |
| abstract_inverted_index.publication | 29 |
| abstract_inverted_index.Huggingface. | 87 |
| abstract_inverted_index.applications | 146 |
| abstract_inverted_index.indispensable | 160 |
| abstract_inverted_index.residue-level | 19, 132 |
| abstract_inverted_index.PubMedCentral, | 44 |
| abstract_inverted_index.human-in-the-loop | 90 |
| cited_by_percentile_year | |
| corresponding_author_ids | https://openalex.org/A5016859647 |
| countries_distinct_count | 1 |
| institutions_distinct_count | 9 |
| corresponding_institution_ids | https://openalex.org/I1303153112 |
| citation_normalized_percentile.value | 0.04036489 |
| citation_normalized_percentile.is_in_top_1_percent | False |
| citation_normalized_percentile.is_in_top_10_percent | False |
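
The `abstract_inverted_index.*` rows in the payload map each word to the zero-based positions at which it occurs in the abstract. The sketch below shows one way to turn such a mapping back into running text. It assumes the index has already been loaded as a Python dict of word to position list; the function name `rebuild_abstract` and the truncated `sample` dict (only the entries for positions 0 to 10) are illustrative, not part of any OpenAlex client.

```python
def rebuild_abstract(inverted_index: dict[str, list[int]]) -> str:
    """Reassemble text from a word -> positions mapping (hypothetical helper)."""
    word_at: dict[int, str] = {}
    for word, positions in inverted_index.items():
        for position in positions:
            word_at[position] = word
    # Emit words in ascending position order, separated by single spaces.
    return " ".join(word_at[i] for i in sorted(word_at))


# First eleven positions only, copied from the payload rows above.
sample = {
    "We": [0],
    "present": [1],
    "a": [2],
    "novel": [3],
    "system": [4],
    "that": [5],
    "leverages": [6],
    "curators": [7],
    "in": [8],
    "the": [9],
    "loop": [10],
}

print(rebuild_abstract(sample))
# We present a novel system that leverages curators in the loop
```

Punctuation stays attached to its token (for example `PDBe,` or `text.`), so joining on spaces is enough to recover the original wording.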