Machine learning‐based prediction of enzyme substrate scope: Application to bacterial nitrilases
2020 · Open Access · DOI: https://doi.org/10.1002/prot.26019
Predicting the range of substrates accepted by an enzyme from its amino acid sequence is challenging. Although sequence‐ and structure‐based annotation approaches are often accurate for predicting broad categories of substrate specificity, they generally cannot predict which specific molecules will be accepted as substrates for a given enzyme, particularly within a class of closely related molecules. Combining targeted experimental activity data with structural modeling, ligand docking, and physicochemical properties of proteins and ligands with various machine learning models provides complementary information that can lead to accurate predictions of substrate scope for related enzymes. Here we describe such an approach that can predict the substrate scope of bacterial nitrilases, which catalyze the hydrolysis of nitrile compounds to the corresponding carboxylic acids and ammonia. Each of the four machine learning models (logistic regression, random forest, gradient‐boosted decision trees, and support vector machines) performed similarly (average ROC = 0.9, average accuracy = ~82%) for predicting substrate scope for this dataset, although random forest offers some advantages. This approach is intended to be highly modular with respect to physicochemical property calculations and software used for structural modeling and docking.
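The abstract reports model quality as an average ROC value and an average accuracy. As a minimal sketch of how such metrics can be computed for a binary substrate/non-substrate classifier, the Python below assumes that "ROC" refers to the area under the ROC curve (AUC). The labels and scores are hypothetical toy values, not data from the paper, and the function names `accuracy` and `roc_auc` are illustrative, not the authors' code.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of correct calls when a score >= threshold is read as 'substrate' (1)."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive scores higher than a randomly chosen negative (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical enzyme-substrate pairs: 1 = active, 0 = inactive (toy values only).
labels = [1, 1, 1, 0, 1, 0, 0, 0]
# Hypothetical predicted probabilities of activity from some classifier.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

print(f"accuracy = {accuracy(labels, scores):.4f}")
print(f"ROC AUC  = {roc_auc(labels, scores):.4f}")
```

On this toy data the accuracy is 0.75 and the AUC is 0.9375. Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the two are often reported together.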
- Type: article
- Language: en
- Landing Page: https://doi.org/10.1002/prot.26019
- OA Status: green
- Cited By: 63
- References: 89
- Related Works: 10
- OpenAlex ID: https://openalex.org/W3096046766
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W3096046766 (Canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.1002/prot.26019 (Digital Object Identifier)
- Title: Machine learning‐based prediction of enzyme substrate scope: Application to bacterial nitrilases (Work title)
- Type: article (OpenAlex work type)
- Language: en (Primary language)
- Publication year: 2020 (Year of publication)
- Publication date: 2020-10-29 (Full publication date if available)
- Authors: Zhongyu Mou, Jason Eakes, Connor J. Cooper, Carmen M. Foster, Robert F. Standaert, Mircea Podar, Mitchel J. Doktycz, Jerry M. Parks (List of authors in order)
- Landing page: https://doi.org/10.1002/prot.26019 (Publisher landing page)
- Open access: Yes (Whether a free full text is available)
- OA status: green (Open access status per OpenAlex)
- OA URL: https://www.osti.gov/biblio/1765501 (Direct OA link when available)
- Concepts: Scope (computer science), Substrate (aquarium), Enzyme, Computer science, Substrate specificity, Chemistry, Biochemical engineering, Computational biology, Biochemistry, Biology, Engineering, Ecology, Programming language (Top concepts (fields/topics) attached by OpenAlex)
- Cited by: 63 (Total citation count in OpenAlex)
- Citations by year (recent): 2025: 20, 2024: 17, 2023: 14, 2022: 7, 2021: 5 (Per-year citation counts, last 5 years)
- References (count): 89 (Number of works referenced by this work)
- Related works (count): 10 (Other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W3096046766 |
|---|---|
| doi | https://doi.org/10.1002/prot.26019 |
| ids.doi | https://doi.org/10.1002/prot.26019 |
| ids.mag | 3096046766 |
| ids.pmid | https://pubmed.ncbi.nlm.nih.gov/33118210 |
| ids.openalex | https://openalex.org/W3096046766 |
| fwci | 2.5357013 |
| mesh[0].qualifier_ui | Q000737 |
| mesh[0].descriptor_ui | D000619 |
| mesh[0].is_major_topic | True |
| mesh[0].qualifier_name | chemistry |
| mesh[0].descriptor_name | Aminohydrolases |
| mesh[1].qualifier_ui | Q000235 |
| mesh[1].descriptor_ui | D000619 |
| mesh[1].is_major_topic | True |
| mesh[1].qualifier_name | genetics |
| mesh[1].descriptor_name | Aminohydrolases |
| mesh[2].qualifier_ui | Q000378 |
| mesh[2].descriptor_ui | D000619 |
| mesh[2].is_major_topic | True |
| mesh[2].qualifier_name | metabolism |
| mesh[2].descriptor_name | Aminohydrolases |
| mesh[3].qualifier_ui | Q000737 |
| mesh[3].descriptor_ui | D001426 |
| mesh[3].is_major_topic | True |
| mesh[3].qualifier_name | chemistry |
| mesh[3].descriptor_name | Bacterial Proteins |
| mesh[4].qualifier_ui | Q000235 |
| mesh[4].descriptor_ui | D001426 |
| mesh[4].is_major_topic | True |
| mesh[4].qualifier_name | genetics |
| mesh[4].descriptor_name | Bacterial Proteins |
| mesh[5].qualifier_ui | Q000378 |
| mesh[5].descriptor_ui | D001426 |
| mesh[5].is_major_topic | True |
| mesh[5].qualifier_name | metabolism |
| mesh[5].descriptor_name | Bacterial Proteins |
| mesh[6].qualifier_ui | |
| mesh[6].descriptor_ui | D020134 |
| mesh[6].is_major_topic | False |
| mesh[6].qualifier_name | |
| mesh[6].descriptor_name | Catalytic Domain |
| mesh[7].qualifier_ui | |
| mesh[7].descriptor_ui | D055598 |
| mesh[7].is_major_topic | False |
| mesh[7].qualifier_name | |
| mesh[7].descriptor_name | Chemical Phenomena |
| mesh[8].qualifier_ui | |
| mesh[8].descriptor_ui | D008024 |
| mesh[8].is_major_topic | False |
| mesh[8].qualifier_name | |
| mesh[8].descriptor_name | Ligands |
| mesh[9].qualifier_ui | |
| mesh[9].descriptor_ui | D000069550 |
| mesh[9].is_major_topic | True |
| mesh[9].qualifier_name | |
| mesh[9].descriptor_name | Machine Learning |
| mesh[10].qualifier_ui | Q000379 |
| mesh[10].descriptor_ui | D062105 |
| mesh[10].is_major_topic | False |
| mesh[10].qualifier_name | methods |
| mesh[10].descriptor_name | Molecular Docking Simulation |
| mesh[11].qualifier_ui | Q000737 |
| mesh[11].descriptor_ui | D009570 |
| mesh[11].is_major_topic | False |
| mesh[11].qualifier_name | chemistry |
| mesh[11].descriptor_name | Nitriles |
| mesh[12].qualifier_ui | Q000378 |
| mesh[12].descriptor_ui | D009570 |
| mesh[12].is_major_topic | False |
| mesh[12].qualifier_name | metabolism |
| mesh[12].descriptor_name | Nitriles |
| mesh[13].qualifier_ui | |
| mesh[13].descriptor_ui | D011485 |
| mesh[13].is_major_topic | False |
| mesh[13].qualifier_name | |
| mesh[13].descriptor_name | Protein Binding |
| mesh[14].qualifier_ui | Q000737 |
| mesh[14].descriptor_ui | D000619 |
| mesh[14].is_major_topic | True |
| mesh[14].qualifier_name | chemistry |
| mesh[14].descriptor_name | Aminohydrolases |
| mesh[15].qualifier_ui | Q000235 |
| mesh[15].descriptor_ui | D000619 |
| mesh[15].is_major_topic | True |
| mesh[15].qualifier_name | genetics |
| mesh[15].descriptor_name | Aminohydrolases |
| mesh[16].qualifier_ui | Q000378 |
| mesh[16].descriptor_ui | D000619 |
| mesh[16].is_major_topic | True |
| mesh[16].qualifier_name | metabolism |
| mesh[16].descriptor_name | Aminohydrolases |
| mesh[17].qualifier_ui | Q000737 |
| mesh[17].descriptor_ui | D001426 |
| mesh[17].is_major_topic | True |
| mesh[17].qualifier_name | chemistry |
| mesh[17].descriptor_name | Bacterial Proteins |
| mesh[18].qualifier_ui | Q000235 |
| mesh[18].descriptor_ui | D001426 |
| mesh[18].is_major_topic | True |
| mesh[18].qualifier_name | genetics |
| mesh[18].descriptor_name | Bacterial Proteins |
| mesh[19].qualifier_ui | Q000378 |
| mesh[19].descriptor_ui | D001426 |
| mesh[19].is_major_topic | True |
| mesh[19].qualifier_name | metabolism |
| mesh[19].descriptor_name | Bacterial Proteins |
| mesh[20].qualifier_ui | |
| mesh[20].descriptor_ui | D020134 |
| mesh[20].is_major_topic | False |
| mesh[20].qualifier_name | |
| mesh[20].descriptor_name | Catalytic Domain |
| mesh[21].qualifier_ui | |
| mesh[21].descriptor_ui | D055598 |
| mesh[21].is_major_topic | False |
| mesh[21].qualifier_name | |
| mesh[21].descriptor_name | Chemical Phenomena |
| mesh[22].qualifier_ui | |
| mesh[22].descriptor_ui | D008024 |
| mesh[22].is_major_topic | False |
| mesh[22].qualifier_name | |
| mesh[22].descriptor_name | Ligands |
| mesh[23].qualifier_ui | |
| mesh[23].descriptor_ui | D000069550 |
| mesh[23].is_major_topic | True |
| mesh[23].qualifier_name | |
| mesh[23].descriptor_name | Machine Learning |
| mesh[24].qualifier_ui | Q000379 |
| mesh[24].descriptor_ui | D062105 |
| mesh[24].is_major_topic | False |
| mesh[24].qualifier_name | methods |
| mesh[24].descriptor_name | Molecular Docking Simulation |
| mesh[25].qualifier_ui | Q000737 |
| mesh[25].descriptor_ui | D009570 |
| mesh[25].is_major_topic | False |
| mesh[25].qualifier_name | chemistry |
| mesh[25].descriptor_name | Nitriles |
| mesh[26].qualifier_ui | Q000378 |
| mesh[26].descriptor_ui | D009570 |
| mesh[26].is_major_topic | False |
| mesh[26].qualifier_name | metabolism |
| mesh[26].descriptor_name | Nitriles |
| mesh[27].qualifier_ui | |
| mesh[27].descriptor_ui | D011485 |
| mesh[27].is_major_topic | False |
| mesh[27].qualifier_name | |
| mesh[27].descriptor_name | Protein Binding |
| type | article |
| title | Machine learning‐based prediction of enzyme substrate scope: Application to bacterial nitrilases |
| awards[0].id | https://openalex.org/G6308932649 |
| awards[0].funder_id | https://openalex.org/F4320338287 |
| awards[0].display_name | |
| awards[0].funder_award_id | DE‐AC05‐00OR22725 |
| awards[0].funder_display_name | Oak Ridge National Laboratory |
| awards[1].id | https://openalex.org/G3186066365 |
| awards[1].funder_id | https://openalex.org/F4320306076 |
| awards[1].display_name | |
| awards[1].funder_award_id | 2017219379 |
| awards[1].funder_display_name | National Science Foundation |
| biblio.issue | 3 |
| biblio.volume | 89 |
| biblio.last_page | 347 |
| biblio.first_page | 336 |
| topics[0].id | https://openalex.org/T10404 |
| topics[0].field.id | https://openalex.org/fields/13 |
| topics[0].field.display_name | Biochemistry, Genetics and Molecular Biology |
| topics[0].score | 0.9939000010490417 |
| topics[0].domain.id | https://openalex.org/domains/1 |
| topics[0].domain.display_name | Life Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1312 |
| topics[0].subfield.display_name | Molecular Biology |
| topics[0].display_name | Enzyme Catalysis and Immobilization |
| topics[1].id | https://openalex.org/T12254 |
| topics[1].field.id | https://openalex.org/fields/13 |
| topics[1].field.display_name | Biochemistry, Genetics and Molecular Biology |
| topics[1].score | 0.9718999862670898 |
| topics[1].domain.id | https://openalex.org/domains/1 |
| topics[1].domain.display_name | Life Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1312 |
| topics[1].subfield.display_name | Molecular Biology |
| topics[1].display_name | Machine Learning in Bioinformatics |
| topics[2].id | https://openalex.org/T10044 |
| topics[2].field.id | https://openalex.org/fields/13 |
| topics[2].field.display_name | Biochemistry, Genetics and Molecular Biology |
| topics[2].score | 0.9391000270843506 |
| topics[2].domain.id | https://openalex.org/domains/1 |
| topics[2].domain.display_name | Life Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1312 |
| topics[2].subfield.display_name | Molecular Biology |
| topics[2].display_name | Protein Structure and Dynamics |
| funders[0].id | https://openalex.org/F4320306076 |
| funders[0].ror | https://ror.org/021nxhr62 |
| funders[0].display_name | National Science Foundation |
| funders[1].id | https://openalex.org/F4320338287 |
| funders[1].ror | https://ror.org/01qz5mb56 |
| funders[1].display_name | Oak Ridge National Laboratory |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C2778012447 |
| concepts[0].level | 2 |
| concepts[0].score | 0.7958059906959534 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q1034415 |
| concepts[0].display_name | Scope (computer science) |
| concepts[1].id | https://openalex.org/C2777289219 |
| concepts[1].level | 2 |
| concepts[1].score | 0.6542609930038452 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q7632154 |
| concepts[1].display_name | Substrate (aquarium) |
| concepts[2].id | https://openalex.org/C181199279 |
| concepts[2].level | 2 |
| concepts[2].score | 0.5053051114082336 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q8047 |
| concepts[2].display_name | Enzyme |
| concepts[3].id | https://openalex.org/C41008148 |
| concepts[3].level | 0 |
| concepts[3].score | 0.4924262464046478 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[3].display_name | Computer science |
| concepts[4].id | https://openalex.org/C2994592520 |
| concepts[4].level | 3 |
| concepts[4].score | 0.475847452878952 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q5090496 |
| concepts[4].display_name | Substrate specificity |
| concepts[5].id | https://openalex.org/C185592680 |
| concepts[5].level | 0 |
| concepts[5].score | 0.42657819390296936 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q2329 |
| concepts[5].display_name | Chemistry |
| concepts[6].id | https://openalex.org/C183696295 |
| concepts[6].level | 1 |
| concepts[6].score | 0.3848150670528412 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q2487696 |
| concepts[6].display_name | Biochemical engineering |
| concepts[7].id | https://openalex.org/C70721500 |
| concepts[7].level | 1 |
| concepts[7].score | 0.32830333709716797 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q177005 |
| concepts[7].display_name | Computational biology |
| concepts[8].id | https://openalex.org/C55493867 |
| concepts[8].level | 1 |
| concepts[8].score | 0.27506524324417114 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q7094 |
| concepts[8].display_name | Biochemistry |
| concepts[9].id | https://openalex.org/C86803240 |
| concepts[9].level | 0 |
| concepts[9].score | 0.24200406670570374 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q420 |
| concepts[9].display_name | Biology |
| concepts[10].id | https://openalex.org/C127413603 |
| concepts[10].level | 0 |
| concepts[10].score | 0.20292958617210388 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q11023 |
| concepts[10].display_name | Engineering |
| concepts[11].id | https://openalex.org/C18903297 |
| concepts[11].level | 1 |
| concepts[11].score | 0.0726434588432312 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q7150 |
| concepts[11].display_name | Ecology |
| concepts[12].id | https://openalex.org/C199360897 |
| concepts[12].level | 1 |
| concepts[12].score | 0.06325992941856384 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q9143 |
| concepts[12].display_name | Programming language |
| keywords[0].id | https://openalex.org/keywords/scope |
| keywords[0].score | 0.7958059906959534 |
| keywords[0].display_name | Scope (computer science) |
| keywords[1].id | https://openalex.org/keywords/substrate |
| keywords[1].score | 0.6542609930038452 |
| keywords[1].display_name | Substrate (aquarium) |
| keywords[2].id | https://openalex.org/keywords/enzyme |
| keywords[2].score | 0.5053051114082336 |
| keywords[2].display_name | Enzyme |
| keywords[3].id | https://openalex.org/keywords/computer-science |
| keywords[3].score | 0.4924262464046478 |
| keywords[3].display_name | Computer science |
| keywords[4].id | https://openalex.org/keywords/substrate-specificity |
| keywords[4].score | 0.475847452878952 |
| keywords[4].display_name | Substrate specificity |
| keywords[5].id | https://openalex.org/keywords/chemistry |
| keywords[5].score | 0.42657819390296936 |
| keywords[5].display_name | Chemistry |
| keywords[6].id | https://openalex.org/keywords/biochemical-engineering |
| keywords[6].score | 0.3848150670528412 |
| keywords[6].display_name | Biochemical engineering |
| keywords[7].id | https://openalex.org/keywords/computational-biology |
| keywords[7].score | 0.32830333709716797 |
| keywords[7].display_name | Computational biology |
| keywords[8].id | https://openalex.org/keywords/biochemistry |
| keywords[8].score | 0.27506524324417114 |
| keywords[8].display_name | Biochemistry |
| keywords[9].id | https://openalex.org/keywords/biology |
| keywords[9].score | 0.24200406670570374 |
| keywords[9].display_name | Biology |
| keywords[10].id | https://openalex.org/keywords/engineering |
| keywords[10].score | 0.20292958617210388 |
| keywords[10].display_name | Engineering |
| keywords[11].id | https://openalex.org/keywords/ecology |
| keywords[11].score | 0.0726434588432312 |
| keywords[11].display_name | Ecology |
| keywords[12].id | https://openalex.org/keywords/programming-language |
| keywords[12].score | 0.06325992941856384 |
| keywords[12].display_name | Programming language |
| language | en |
| locations[0].id | doi:10.1002/prot.26019 |
| locations[0].is_oa | False |
| locations[0].source.id | https://openalex.org/S121161810 |
| locations[0].source.issn | 0887-3585, 1097-0134 |
| locations[0].source.type | journal |
| locations[0].source.is_oa | False |
| locations[0].source.issn_l | 0887-3585 |
| locations[0].source.is_core | True |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | Proteins Structure Function and Bioinformatics |
| locations[0].source.host_organization | https://openalex.org/P4310320595 |
| locations[0].source.host_organization_name | Wiley |
| locations[0].source.host_organization_lineage | https://openalex.org/P4310320595 |
| locations[0].source.host_organization_lineage_names | Wiley |
| locations[0].license | |
| locations[0].pdf_url | |
| locations[0].version | publishedVersion |
| locations[0].raw_type | journal-article |
| locations[0].license_id | |
| locations[0].is_accepted | True |
| locations[0].is_published | True |
| locations[0].raw_source_name | Proteins: Structure, Function, and Bioinformatics |
| locations[0].landing_page_url | https://doi.org/10.1002/prot.26019 |
| locations[1].id | pmid:33118210 |
| locations[1].is_oa | False |
| locations[1].source.id | https://openalex.org/S4306525036 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | False |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | PubMed |
| locations[1].source.host_organization | https://openalex.org/I1299303238 |
| locations[1].source.host_organization_name | National Institutes of Health |
| locations[1].source.host_organization_lineage | https://openalex.org/I1299303238 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | publishedVersion |
| locations[1].raw_type | |
| locations[1].license_id | |
| locations[1].is_accepted | True |
| locations[1].is_published | True |
| locations[1].raw_source_name | Proteins |
| locations[1].landing_page_url | https://pubmed.ncbi.nlm.nih.gov/33118210 |
| locations[2].id | pmh:oai:osti.gov:1765501 |
| locations[2].is_oa | True |
| locations[2].source.id | https://openalex.org/S4306402487 |
| locations[2].source.issn | |
| locations[2].source.type | repository |
| locations[2].source.is_oa | False |
| locations[2].source.issn_l | |
| locations[2].source.is_core | False |
| locations[2].source.is_in_doaj | False |
| locations[2].source.display_name | OSTI OAI (U.S. Department of Energy Office of Scientific and Technical Information) |
| locations[2].source.host_organization | https://openalex.org/I139351228 |
| locations[2].source.host_organization_name | Office of Scientific and Technical Information |
| locations[2].source.host_organization_lineage | https://openalex.org/I139351228 |
| locations[2].license | |
| locations[2].pdf_url | |
| locations[2].version | submittedVersion |
| locations[2].raw_type | |
| locations[2].license_id | |
| locations[2].is_accepted | False |
| locations[2].is_published | False |
| locations[2].raw_source_name | |
| locations[2].landing_page_url | https://www.osti.gov/biblio/1765501 |
| locations[3].id | pmh:oai:osti.gov:1804823 |
| locations[3].is_oa | True |
| locations[3].source.id | https://openalex.org/S4306402487 |
| locations[3].source.issn | |
| locations[3].source.type | repository |
| locations[3].source.is_oa | False |
| locations[3].source.issn_l | |
| locations[3].source.is_core | False |
| locations[3].source.is_in_doaj | False |
| locations[3].source.display_name | OSTI OAI (U.S. Department of Energy Office of Scientific and Technical Information) |
| locations[3].source.host_organization | https://openalex.org/I139351228 |
| locations[3].source.host_organization_name | Office of Scientific and Technical Information |
| locations[3].source.host_organization_lineage | https://openalex.org/I139351228 |
| locations[3].license | |
| locations[3].pdf_url | |
| locations[3].version | submittedVersion |
| locations[3].raw_type | |
| locations[3].license_id | |
| locations[3].is_accepted | False |
| locations[3].is_published | False |
| locations[3].raw_source_name | |
| locations[3].landing_page_url | https://www.osti.gov/biblio/1804823 |
| indexed_in | crossref, pubmed |
| authorships[0].author.id | https://openalex.org/A5074530006 |
| authorships[0].author.orcid | https://orcid.org/0000-0003-2240-3129 |
| authorships[0].author.display_name | Zhongyu Mou |
| authorships[0].countries | US |
| authorships[0].affiliations[0].institution_ids | https://openalex.org/I1289243028 |
| authorships[0].affiliations[0].raw_affiliation_string | Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA |
| authorships[0].institutions[0].id | https://openalex.org/I1289243028 |
| authorships[0].institutions[0].ror | https://ror.org/01qz5mb56 |
| authorships[0].institutions[0].type | facility |
| authorships[0].institutions[0].lineage | https://openalex.org/I1289243028, https://openalex.org/I1330989302, https://openalex.org/I39565521, https://openalex.org/I4210159294 |
| authorships[0].institutions[0].country_code | US |
| authorships[0].institutions[0].display_name | Oak Ridge National Laboratory |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Zhongyu Mou |
| authorships[0].is_corresponding | False |
| authorships[0].raw_affiliation_strings | Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA |
| authorships[1].author.id | https://openalex.org/A5088151456 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-6356-0772 |
| authorships[1].author.display_name | Jason Eakes |
| authorships[1].countries | US |
| authorships[1].affiliations[0].institution_ids | https://openalex.org/I1289243028 |
| authorships[1].affiliations[0].raw_affiliation_string | Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA |
| authorships[1].institutions[0].id | https://openalex.org/I1289243028 |
| authorships[1].institutions[0].ror | https://ror.org/01qz5mb56 |
| authorships[1].institutions[0].type | facility |
| authorships[1].institutions[0].lineage | https://openalex.org/I1289243028, https://openalex.org/I1330989302, https://openalex.org/I39565521, https://openalex.org/I4210159294 |
| authorships[1].institutions[0].country_code | US |
| authorships[1].institutions[0].display_name | Oak Ridge National Laboratory |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Jason Eakes |
| authorships[1].is_corresponding | False |
| authorships[1].raw_affiliation_strings | Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA |
| authorships[2].author.id | https://openalex.org/A5028609610 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-5527-9948 |
| authorships[2].author.display_name | Connor J. Cooper |
| authorships[2].countries | US |
| authorships[2].affiliations[0].institution_ids | https://openalex.org/I75027704 |
| authorships[2].affiliations[0].raw_affiliation_string | Graduate School of Genome Science and Technology, University of Tennessee, Walters Life Science, Knoxville, Tennessee, USA |
| authorships[2].institutions[0].id | https://openalex.org/I75027704 |
| authorships[2].institutions[0].ror | https://ror.org/020f3ap87 |
| authorships[2].institutions[0].type | education |
| authorships[2].institutions[0].lineage | https://openalex.org/I75027704 |
| authorships[2].institutions[0].country_code | US |
| authorships[2].institutions[0].display_name | University of Tennessee at Knoxville |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Connor J. Cooper |
| authorships[2].is_corresponding | False |
| authorships[2].raw_affiliation_strings | Graduate School of Genome Science and Technology, University of Tennessee, Walters Life Science, Knoxville, Tennessee, USA |
| authorships[3].author.id | https://openalex.org/A5002632947 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-0927-9859 |
| authorships[3].author.display_name | Carmen M. Foster |
| authorships[3].countries | US |
| authorships[3].affiliations[0].institution_ids | https://openalex.org/I1289243028 |
| authorships[3].affiliations[0].raw_affiliation_string | Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA |
| authorships[3].institutions[0].id | https://openalex.org/I1289243028 |
| authorships[3].institutions[0].ror | https://ror.org/01qz5mb56 |
| authorships[3].institutions[0].type | facility |
| authorships[3].institutions[0].lineage | https://openalex.org/I1289243028, https://openalex.org/I1330989302, https://openalex.org/I39565521, https://openalex.org/I4210159294 |
| authorships[3].institutions[0].country_code | US |
| authorships[3].institutions[0].display_name | Oak Ridge National Laboratory |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Carmen M. Foster |
| authorships[3].is_corresponding | False |
| authorships[3].raw_affiliation_strings | Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA |
| authorships[4].author.id | https://openalex.org/A5021060084 |
| authorships[4].author.orcid | https://orcid.org/0000-0002-5684-1322 |
| authorships[4].author.display_name | Robert F. Standaert |
| authorships[4].countries | US |
| authorships[4].affiliations[0].institution_ids | https://openalex.org/I1289243028 |
| authorships[4].affiliations[0].raw_affiliation_string | Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA |
| authorships[4].institutions[0].id | https://openalex.org/I1289243028 |
| authorships[4].institutions[0].ror | https://ror.org/01qz5mb56 |
| authorships[4].institutions[0].type | facility |
| authorships[4].institutions[0].lineage | https://openalex.org/I1289243028, https://openalex.org/I1330989302, https://openalex.org/I39565521, https://openalex.org/I4210159294 |
| authorships[4].institutions[0].country_code | US |
| authorships[4].institutions[0].display_name | Oak Ridge National Laboratory |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Robert F. Standaert |
| authorships[4].is_corresponding | False |
| authorships[4].raw_affiliation_strings | Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA |
| authorships[5].author.id | https://openalex.org/A5048218996 |
| authorships[5].author.orcid | https://orcid.org/0000-0003-2776-0205 |
| authorships[5].author.display_name | Mircea Podar |
| authorships[5].countries | US |
| authorships[5].affiliations[0].institution_ids | https://openalex.org/I1289243028 |
| authorships[5].affiliations[0].raw_affiliation_string | Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA |
| authorships[5].institutions[0].id | https://openalex.org/I1289243028 |
| authorships[5].institutions[0].ror | https://ror.org/01qz5mb56 |
| authorships[5].institutions[0].type | facility |
| authorships[5].institutions[0].lineage | https://openalex.org/I1289243028, https://openalex.org/I1330989302, https://openalex.org/I39565521, https://openalex.org/I4210159294 |
| authorships[5].institutions[0].country_code | US |
| authorships[5].institutions[0].display_name | Oak Ridge National Laboratory |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Mircea Podar |
| authorships[5].is_corresponding | False |
| authorships[5].raw_affiliation_strings | Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA |
| authorships[6].author.id | https://openalex.org/A5038289236 |
| authorships[6].author.orcid | https://orcid.org/0000-0003-4856-8343 |
| authorships[6].author.display_name | Mitchel J. Doktycz |
| authorships[6].countries | US |
| authorships[6].affiliations[0].institution_ids | https://openalex.org/I1289243028 |
| authorships[6].affiliations[0].raw_affiliation_string | Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA |
| authorships[6].affiliations[1].institution_ids | https://openalex.org/I75027704 |
| authorships[6].affiliations[1].raw_affiliation_string | Graduate School of Genome Science and Technology, University of Tennessee, Walters Life Science, Knoxville, Tennessee, USA |
| authorships[6].institutions[0].id | https://openalex.org/I1289243028 |
| authorships[6].institutions[0].ror | https://ror.org/01qz5mb56 |
| authorships[6].institutions[0].type | facility |
| authorships[6].institutions[0].lineage | https://openalex.org/I1289243028, https://openalex.org/I1330989302, https://openalex.org/I39565521, https://openalex.org/I4210159294 |
| authorships[6].institutions[0].country_code | US |
| authorships[6].institutions[0].display_name | Oak Ridge National Laboratory |
| authorships[6].institutions[1].id | https://openalex.org/I75027704 |
| authorships[6].institutions[1].ror | https://ror.org/020f3ap87 |
| authorships[6].institutions[1].type | education |
| authorships[6].institutions[1].lineage | https://openalex.org/I75027704 |
| authorships[6].institutions[1].country_code | US |
| authorships[6].institutions[1].display_name | University of Tennessee at Knoxville |
| authorships[6].author_position | middle |
| authorships[6].raw_author_name | Mitchel J. Doktycz |
| authorships[6].is_corresponding | False |
| authorships[6].raw_affiliation_strings | Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA, Graduate School of Genome Science and Technology, University of Tennessee, Walters Life Science, Knoxville, Tennessee, USA |
| authorships[7].author.id | https://openalex.org/A5037417242 |
| authorships[7].author.orcid | https://orcid.org/0000-0002-3103-9333 |
| authorships[7].author.display_name | Jerry M. Parks |
| authorships[7].countries | US |
| authorships[7].affiliations[0].institution_ids | https://openalex.org/I1289243028 |
| authorships[7].affiliations[0].raw_affiliation_string | Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA |
| authorships[7].affiliations[1].institution_ids | https://openalex.org/I75027704 |
| authorships[7].affiliations[1].raw_affiliation_string | Graduate School of Genome Science and Technology, University of Tennessee, Walters Life Science, Knoxville, Tennessee, USA |
| authorships[7].institutions[0].id | https://openalex.org/I1289243028 |
| authorships[7].institutions[0].ror | https://ror.org/01qz5mb56 |
| authorships[7].institutions[0].type | facility |
| authorships[7].institutions[0].lineage | https://openalex.org/I1289243028, https://openalex.org/I1330989302, https://openalex.org/I39565521, https://openalex.org/I4210159294 |
| authorships[7].institutions[0].country_code | US |
| authorships[7].institutions[0].display_name | Oak Ridge National Laboratory |
| authorships[7].institutions[1].id | https://openalex.org/I75027704 |
| authorships[7].institutions[1].ror | https://ror.org/020f3ap87 |
| authorships[7].institutions[1].type | education |
| authorships[7].institutions[1].lineage | https://openalex.org/I75027704 |
| authorships[7].institutions[1].country_code | US |
| authorships[7].institutions[1].display_name | University of Tennessee at Knoxville |
| authorships[7].author_position | last |
| authorships[7].raw_author_name | Jerry M. Parks |
| authorships[7].is_corresponding | True |
| authorships[7].raw_affiliation_strings | Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA, Graduate School of Genome Science and Technology, University of Tennessee, Walters Life Science, Knoxville, Tennessee, USA |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://www.osti.gov/biblio/1765501 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Machine learning‐based prediction of enzyme substrate scope: Application to bacterial nitrilases |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-20T23:13:51.555489 |
| primary_topic.id | https://openalex.org/T10404 |
| primary_topic.field.id | https://openalex.org/fields/13 |
| primary_topic.field.display_name | Biochemistry, Genetics and Molecular Biology |
| primary_topic.score | 0.9939000010490417 |
| primary_topic.domain.id | https://openalex.org/domains/1 |
| primary_topic.domain.display_name | Life Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1312 |
| primary_topic.subfield.display_name | Molecular Biology |
| primary_topic.display_name | Enzyme Catalysis and Immobilization |
| related_works | https://openalex.org/W4241523039, https://openalex.org/W2360028903, https://openalex.org/W4280543773, https://openalex.org/W178231042, https://openalex.org/W2366083136, https://openalex.org/W2387622493, https://openalex.org/W2117340465, https://openalex.org/W2046536786, https://openalex.org/W1981911623, https://openalex.org/W2587733587 |
| cited_by_count | 63 |
| counts_by_year[0].year | 2025 |
| counts_by_year[0].cited_by_count | 20 |
| counts_by_year[1].year | 2024 |
| counts_by_year[1].cited_by_count | 17 |
| counts_by_year[2].year | 2023 |
| counts_by_year[2].cited_by_count | 14 |
| counts_by_year[3].year | 2022 |
| counts_by_year[3].cited_by_count | 7 |
| counts_by_year[4].year | 2021 |
| counts_by_year[4].cited_by_count | 5 |
| locations_count | 4 |
| best_oa_location.id | pmh:oai:osti.gov:1765501 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306402487 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | False |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | OSTI OAI (U.S. Department of Energy Office of Scientific and Technical Information) |
| best_oa_location.source.host_organization | https://openalex.org/I139351228 |
| best_oa_location.source.host_organization_name | Office of Scientific and Technical Information |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I139351228 |
| best_oa_location.license | |
| best_oa_location.pdf_url | |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | https://www.osti.gov/biblio/1765501 |
| primary_location.id | doi:10.1002/prot.26019 |
| primary_location.is_oa | False |
| primary_location.source.id | https://openalex.org/S121161810 |
| primary_location.source.issn | 0887-3585, 1097-0134 |
| primary_location.source.type | journal |
| primary_location.source.is_oa | False |
| primary_location.source.issn_l | 0887-3585 |
| primary_location.source.is_core | True |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | Proteins Structure Function and Bioinformatics |
| primary_location.source.host_organization | https://openalex.org/P4310320595 |
| primary_location.source.host_organization_name | Wiley |
| primary_location.source.host_organization_lineage | https://openalex.org/P4310320595 |
| primary_location.source.host_organization_lineage_names | Wiley |
| primary_location.license | |
| primary_location.pdf_url | |
| primary_location.version | publishedVersion |
| primary_location.raw_type | journal-article |
| primary_location.license_id | |
| primary_location.is_accepted | True |
| primary_location.is_published | True |
| primary_location.raw_source_name | Proteins: Structure, Function, and Bioinformatics |
| primary_location.landing_page_url | https://doi.org/10.1002/prot.26019 |
| publication_date | 2020-10-29 |
| publication_year | 2020 |
| referenced_works | https://openalex.org/W2900359059, https://openalex.org/W2899903294, https://openalex.org/W2793366908, https://openalex.org/W2747225505, https://openalex.org/W2071811516, https://openalex.org/W2137780014, https://openalex.org/W2068991236, https://openalex.org/W2016679879, https://openalex.org/W2061664287, https://openalex.org/W2594183968, https://openalex.org/W2979343028, https://openalex.org/W2692027428, https://openalex.org/W2088065111, https://openalex.org/W2761231228, https://openalex.org/W2899677526, https://openalex.org/W2943689272, https://openalex.org/W1968942702, https://openalex.org/W2166696244, https://openalex.org/W4206223219, https://openalex.org/W1965794026, https://openalex.org/W2162544563, https://openalex.org/W2135151253, https://openalex.org/W2029243043, https://openalex.org/W2161042136, https://openalex.org/W1809713791, https://openalex.org/W1954179112, https://openalex.org/W2803771751, https://openalex.org/W2132926880, https://openalex.org/W2128880918, https://openalex.org/W2105329815, https://openalex.org/W2127322768, https://openalex.org/W2750791471, https://openalex.org/W2169478909, https://openalex.org/W2013425283, https://openalex.org/W3136918052, https://openalex.org/W2023368905, https://openalex.org/W2574496196, https://openalex.org/W2095036253, https://openalex.org/W2317863386, https://openalex.org/W1757990252, https://openalex.org/W2148284063, https://openalex.org/W1981745234, https://openalex.org/W2052573265, https://openalex.org/W1976161355, https://openalex.org/W2035199159, https://openalex.org/W2149263899, https://openalex.org/W2103994532, https://openalex.org/W2103591844, https://openalex.org/W2606439133, https://openalex.org/W1993383767, https://openalex.org/W2102537419, https://openalex.org/W2107029002, https://openalex.org/W1496257230, https://openalex.org/W6675354045, https://openalex.org/W2911964244, https://openalex.org/W2070493638, https://openalex.org/W4239510810, https://openalex.org/W2342249984, https://openalex.org/W303319180, https://openalex.org/W6728293375, https://openalex.org/W2011301426, https://openalex.org/W2148143831, https://openalex.org/W1967779445, https://openalex.org/W2931516320, https://openalex.org/W2162220273, https://openalex.org/W2037312364, https://openalex.org/W1987134040, https://openalex.org/W2094458603, https://openalex.org/W4211156111, https://openalex.org/W4211196614, https://openalex.org/W2114520383, https://openalex.org/W2099752571, https://openalex.org/W1980374511, https://openalex.org/W2804822363, https://openalex.org/W1803102843, https://openalex.org/W2050456292, https://openalex.org/W2102377211, https://openalex.org/W2000350990, https://openalex.org/W2134967712, https://openalex.org/W2009113274, https://openalex.org/W2241011319, https://openalex.org/W2159887157, https://openalex.org/W2791355014, https://openalex.org/W2774499496, https://openalex.org/W2783555909, https://openalex.org/W2101234009, https://openalex.org/W4211232835, https://openalex.org/W2242464395, https://openalex.org/W2531370029 |
| referenced_works_count | 89 |
| abstract_inverted_index.= | 145, 149 |
| abstract_inverted_index.a | 46, 51 |
| abstract_inverted_index.an | 8, 98 |
| abstract_inverted_index.as | 43 |
| abstract_inverted_index.be | 41, 169 |
| abstract_inverted_index.by | 7 |
| abstract_inverted_index.is | 15, 166 |
| abstract_inverted_index.of | 4, 30, 53, 70, 88, 106, 113, 124 |
| abstract_inverted_index.to | 85, 116, 168, 174 |
| abstract_inverted_index.we | 95 |
| abstract_inverted_index.ROC | 144 |
| abstract_inverted_index.and | 19, 67, 72, 121, 137, 178, 184 |
| abstract_inverted_index.are | 23 |
| abstract_inverted_index.can | 83, 101 |
| abstract_inverted_index.for | 26, 45, 91, 151, 155, 181 |
| abstract_inverted_index.its | 11 |
| abstract_inverted_index.the | 2, 103, 111, 117, 125 |
| abstract_inverted_index.0.9, | 146 |
| abstract_inverted_index.Each | 123 |
| abstract_inverted_index.Here | 94 |
| abstract_inverted_index.This | 164 |
| abstract_inverted_index.acid | 13 |
| abstract_inverted_index.data | 61 |
| abstract_inverted_index.four | 126 |
| abstract_inverted_index.from | 10 |
| abstract_inverted_index.lead | 84 |
| abstract_inverted_index.some | 162 |
| abstract_inverted_index.such | 97 |
| abstract_inverted_index.that | 82, 100 |
| abstract_inverted_index.they | 33 |
| abstract_inverted_index.this | 156 |
| abstract_inverted_index.used | 180 |
| abstract_inverted_index.will | 40 |
| abstract_inverted_index.with | 62, 74, 172 |
| abstract_inverted_index.acids | 120 |
| abstract_inverted_index.amino | 12 |
| abstract_inverted_index.broad | 28 |
| abstract_inverted_index.class | 52 |
| abstract_inverted_index.given | 47 |
| abstract_inverted_index.often | 24 |
| abstract_inverted_index.range | 3 |
| abstract_inverted_index.scope | 90, 105, 154 |
| abstract_inverted_index.which | 37, 109 |
| abstract_inverted_index.~82%) | 150 |
| abstract_inverted_index.cannot | 35 |
| abstract_inverted_index.enzyme | 9 |
| abstract_inverted_index.forest | 160 |
| abstract_inverted_index.highly | 170 |
| abstract_inverted_index.ligand | 65 |
| abstract_inverted_index.models | 78, 129 |
| abstract_inverted_index.offers | 161 |
| abstract_inverted_index.random | 132, 159 |
| abstract_inverted_index.trees, | 136 |
| abstract_inverted_index.vector | 139 |
| abstract_inverted_index.within | 50 |
| abstract_inverted_index.average | 147 |
| abstract_inverted_index.closely | 54 |
| abstract_inverted_index.enzyme, | 48 |
| abstract_inverted_index.forest, | 133 |
| abstract_inverted_index.ligands | 73 |
| abstract_inverted_index.machine | 76, 127 |
| abstract_inverted_index.modular | 171 |
| abstract_inverted_index.nitrile | 114 |
| abstract_inverted_index.predict | 36, 102 |
| abstract_inverted_index.related | 55, 92 |
| abstract_inverted_index.respect | 173 |
| abstract_inverted_index.support | 138 |
| abstract_inverted_index.various | 75 |
| abstract_inverted_index.(average | 143 |
| abstract_inverted_index.Abstract | 0 |
| abstract_inverted_index.Although | 17 |
| abstract_inverted_index.accepted | 6, 42 |
| abstract_inverted_index.accuracy | 148 |
| abstract_inverted_index.accurate | 25, 86 |
| abstract_inverted_index.activity | 60 |
| abstract_inverted_index.although | 158 |
| abstract_inverted_index.ammonia. | 122 |
| abstract_inverted_index.approach | 99, 165 |
| abstract_inverted_index.catalyze | 110 |
| abstract_inverted_index.dataset, | 157 |
| abstract_inverted_index.decision | 135 |
| abstract_inverted_index.describe | 96 |
| abstract_inverted_index.docking, | 66 |
| abstract_inverted_index.docking. | 185 |
| abstract_inverted_index.enzymes. | 93 |
| abstract_inverted_index.intended | 167 |
| abstract_inverted_index.learning | 77, 128 |
| abstract_inverted_index.modeling | 183 |
| abstract_inverted_index.property | 176 |
| abstract_inverted_index.proteins | 71 |
| abstract_inverted_index.provides | 79 |
| abstract_inverted_index.sequence | 14 |
| abstract_inverted_index.software | 179 |
| abstract_inverted_index.specific | 38 |
| abstract_inverted_index.targeted | 58 |
| abstract_inverted_index.(logistic | 130 |
| abstract_inverted_index.Combining | 57 |
| abstract_inverted_index.bacterial | 107 |
| abstract_inverted_index.compounds | 115 |
| abstract_inverted_index.generally | 34 |
| abstract_inverted_index.machines) | 140 |
| abstract_inverted_index.modeling, | 64 |
| abstract_inverted_index.molecules | 39 |
| abstract_inverted_index.performed | 141 |
| abstract_inverted_index.similarly | 142 |
| abstract_inverted_index.substrate | 31, 89, 104, 153 |
| abstract_inverted_index.Predicting | 1 |
| abstract_inverted_index.annotation | 21 |
| abstract_inverted_index.approaches | 22 |
| abstract_inverted_index.carboxylic | 119 |
| abstract_inverted_index.categories | 29 |
| abstract_inverted_index.hydrolysis | 112 |
| abstract_inverted_index.molecules. | 56 |
| abstract_inverted_index.predicting | 27, 152 |
| abstract_inverted_index.properties | 69 |
| abstract_inverted_index.structural | 63, 182 |
| abstract_inverted_index.substrates | 5, 44 |
| abstract_inverted_index.advantages. | 163 |
| abstract_inverted_index.information | 81 |
| abstract_inverted_index.nitrilases, | 108 |
| abstract_inverted_index.predictions | 87 |
| abstract_inverted_index.regression, | 131 |
| abstract_inverted_index.sequence‐ | 18 |
| abstract_inverted_index.calculations | 177 |
| abstract_inverted_index.challenging. | 16 |
| abstract_inverted_index.experimental | 59 |
| abstract_inverted_index.particularly | 49 |
| abstract_inverted_index.specificity, | 32 |
| abstract_inverted_index.complementary | 80 |
| abstract_inverted_index.corresponding | 118 |
| abstract_inverted_index.physicochemical | 68, 175 |
| abstract_inverted_index.structure‐based | 20 |
| abstract_inverted_index.gradient‐boosted | 134 |
| cited_by_percentile_year.max | 100 |
| cited_by_percentile_year.min | 97 |
| corresponding_author_ids | https://openalex.org/A5037417242 |
| countries_distinct_count | 1 |
| institutions_distinct_count | 8 |
| corresponding_institution_ids | https://openalex.org/I1289243028, https://openalex.org/I75027704 |
| sustainable_development_goals[0].id | https://metadata.un.org/sdg/15 |
| sustainable_development_goals[0].score | 0.44999998807907104 |
| sustainable_development_goals[0].display_name | Life on Land |
| citation_normalized_percentile.value | 0.89226933 |
| citation_normalized_percentile.is_in_top_1_percent | False |
| citation_normalized_percentile.is_in_top_10_percent | True |
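
The `abstract_inverted_index.*` rows above store the abstract as a mapping from each token to the word positions where it occurs. A minimal sketch of turning such a mapping back into running text is shown below. The function name `rebuild_abstract` and the dictionary `excerpt` are illustrative assumptions, not part of the record, and `excerpt` holds only the entries for positions 0 to 16 copied from the rows above.

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a {token: [positions]} inverted index.

    Hypothetical helper for illustration: it flattens the mapping into
    (position, token) pairs, sorts by position, and joins with spaces.
    """
    positioned = [
        (position, token)
        for token, positions in inverted_index.items()
        for position in positions
    ]
    return " ".join(token for _, token in sorted(positioned))


# Excerpt of the table above, limited to positions 0..16 (the first sentence).
excerpt = {
    "Abstract": [0],
    "Predicting": [1],
    "the": [2],
    "range": [3],
    "of": [4],
    "substrates": [5],
    "accepted": [6],
    "by": [7],
    "an": [8],
    "enzyme": [9],
    "from": [10],
    "its": [11],
    "amino": [12],
    "acid": [13],
    "sequence": [14],
    "is": [15],
    "challenging.": [16],
}

print(rebuild_abstract(excerpt))
```

Applied to the full set of rows, the same approach should recover the abstract text, with the leading `Abstract` token at position 0 acting as a heading rather than part of the first sentence.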