Translating the Molecules: Adapting Neural Machine Translation to Predict IUPAC Names from a Chemical Identifier
2021 · Open Access · DOI: https://doi.org/10.26434/chemrxiv.14170472.v1
We present a sequence-to-sequence machine learning model for predicting the IUPAC name of a chemical from its standard International Chemical Identifier (InChI). The model uses two stacks of transformers in an encoder-decoder architecture, a setup similar to the neural networks used in state-of-the-art machine translation. Unlike neural machine translation, which usually tokenizes input and output into words or sub-words, our model processes the InChI and predicts the IUPAC name character by character. The model was trained on a dataset of 10 million InChI/IUPAC name pairs freely downloaded from the National Library of Medicine’s online PubChem service. Training took five days on a Tesla K80 GPU, and the model achieved test-set accuracies of 95% (character-level) and 91% (whole name). The model performed particularly well on organics, with the exception of macrocycles. The predictions were less accurate for inorganic compounds, with a character-level accuracy of 71%. This can be explained by inherent limitations in InChI for representing inorganics, as well as low coverage (1.4%) of the training data.
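To illustrate the character-by-character processing and the two accuracy figures quoted in the abstract, here is a minimal Python sketch. It splits an InChI into single-character tokens and computes a position-wise character accuracy and an exact-match (whole-name) accuracy. The function names, the position-wise definition of character-level accuracy, and the example strings are assumptions made for illustration; the preprint's own tokenizer and metric definitions may differ.

```python
from typing import List, Sequence


def char_tokenize(text: str) -> List[str]:
    """Split a string into single-character tokens (no word or sub-word units)."""
    return list(text)


def character_accuracy(predicted: str, reference: str) -> float:
    """Fraction of aligned positions where predicted and reference characters agree.

    Assumption: characters are compared position by position, and any
    length mismatch counts as errors (the longer string sets the denominator).
    """
    if not predicted and not reference:
        return 1.0
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / max(len(predicted), len(reference))


def whole_name_accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions that match their reference name exactly."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    exact = sum(p == r for p, r in zip(predictions, references))
    return exact / len(references)


# Illustrative input: the standard InChI of ethanol.
tokens = char_tokenize("InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3")
print(tokens[:5])  # ['I', 'n', 'C', 'h', 'I']

# One wrong character out of seven.
print(character_accuracy("ethanal", "ethanol"))
```

Under these definitions a single wrong character lowers the character-level score only slightly but makes the whole name count as incorrect, which is consistent with the whole-name accuracy (91%) being lower than the character-level accuracy (95%).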
- Type: preprint
- Language: en
- Landing Page: https://doi.org/10.26434/chemrxiv.14170472.v1
- PDF URL: https://chemrxiv.org/engage/api-gateway/chemrxiv/assets/orp/resource/item/60c755d5567dfe21ffec6368/original/translating-the-molecules-adapting-neural-machine-translation-to-predict-iupac-names-from-a-chemical-identifier.pdf
- OA Status: gold
- Cited By: 4
- References: 15
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4235912386
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4235912386 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.26434/chemrxiv.14170472.v1 (Digital Object Identifier)
- Title: Translating the Molecules: Adapting Neural Machine Translation to Predict IUPAC Names from a Chemical Identifier (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2021 (year of publication)
- Publication date: 2021-03-08 (full publication date if available)
- Authors: Jennifer Handsel, Brian W. Matthews, Nicola Knight, Simon J. Coles (list of authors in order)
- Landing page: https://doi.org/10.26434/chemrxiv.14170472.v1 (publisher landing page)
- PDF URL: https://chemrxiv.org/engage/api-gateway/chemrxiv/assets/orp/resource/item/60c755d5567dfe21ffec6368/original/translating-the-molecules-adapting-neural-machine-translation-to-predict-iupac-names-from-a-chemical-identifier.pdf (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: gold (open access status per OpenAlex)
- OA URL: https://chemrxiv.org/engage/api-gateway/chemrxiv/assets/orp/resource/item/60c755d5567dfe21ffec6368/original/translating-the-molecules-adapting-neural-machine-translation-to-predict-iupac-names-from-a-chemical-identifier.pdf (direct OA link when available)
- Concepts: Chemical nomenclature, Computer science, Identifier, Natural language processing, Artificial intelligence, Machine translation, Character (mathematics), Cheminformatics, Artificial neural network, Encoder, Chemical database, Training set, Machine learning, Chemistry, Programming language, Mathematics, Geometry, Operating system, Organic chemistry, Computational chemistry (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 4 (total citation count in OpenAlex)
- Citations by year (recent): 2022: 1, 2021: 2, 2020: 1 (per-year citation counts, last 5 years)
- References (count): 15 (number of works referenced by this work)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4235912386 |
|---|---|
| doi | https://doi.org/10.26434/chemrxiv.14170472.v1 |
| ids.doi | https://doi.org/10.26434/chemrxiv.14170472.v1 |
| ids.openalex | https://openalex.org/W4235912386 |
| fwci | 0.48027636 |
| type | preprint |
| title | Translating the Molecules: Adapting Neural Machine Translation to Predict IUPAC Names from a Chemical Identifier |
| awards[0].id | https://openalex.org/G3684427489 |
| awards[0].funder_id | https://openalex.org/F4320334627 |
| awards[0].funder_award_id | EP/S020357/1 |
| awards[0].funder_display_name | Engineering and Physical Sciences Research Council |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| grants[0].funder | https://openalex.org/F4320334627 |
| grants[0].award_id | EP/S020357/1 |
| grants[0].funder_display_name | Engineering and Physical Sciences Research Council |
| topics[0].id | https://openalex.org/T10211 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9959999918937683 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1703 |
| topics[0].subfield.display_name | Computational Theory and Mathematics |
| topics[0].display_name | Computational Drug Discovery Methods |
| topics[1].id | https://openalex.org/T11710 |
| topics[1].field.id | https://openalex.org/fields/13 |
| topics[1].field.display_name | Biochemistry, Genetics and Molecular Biology |
| topics[1].score | 0.9703999757766724 |
| topics[1].domain.id | https://openalex.org/domains/1 |
| topics[1].domain.display_name | Life Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1312 |
| topics[1].subfield.display_name | Molecular Biology |
| topics[1].display_name | Biomedical Text Mining and Ontologies |
| topics[2].id | https://openalex.org/T11948 |
| topics[2].field.id | https://openalex.org/fields/25 |
| topics[2].field.display_name | Materials Science |
| topics[2].score | 0.9645000100135803 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/2505 |
| topics[2].subfield.display_name | Materials Chemistry |
| topics[2].display_name | Machine Learning in Materials Science |
| funders[0].id | https://openalex.org/F4320334627 |
| funders[0].ror | https://ror.org/0439y7842 |
| funders[0].display_name | Engineering and Physical Sciences Research Council |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C125832229 |
| concepts[0].level | 2 |
| concepts[0].score | 0.9715198874473572 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q6503924 |
| concepts[0].display_name | Chemical nomenclature |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.6829936504364014 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C154504017 |
| concepts[2].level | 2 |
| concepts[2].score | 0.6537322402000427 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q853614 |
| concepts[2].display_name | Identifier |
| concepts[3].id | https://openalex.org/C204321447 |
| concepts[3].level | 1 |
| concepts[3].score | 0.6530951261520386 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q30642 |
| concepts[3].display_name | Natural language processing |
| concepts[4].id | https://openalex.org/C154945302 |
| concepts[4].level | 1 |
| concepts[4].score | 0.634408175945282 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[4].display_name | Artificial intelligence |
| concepts[5].id | https://openalex.org/C203005215 |
| concepts[5].level | 2 |
| concepts[5].score | 0.6257988214492798 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q79798 |
| concepts[5].display_name | Machine translation |
| concepts[6].id | https://openalex.org/C2780861071 |
| concepts[6].level | 2 |
| concepts[6].score | 0.6254986524581909 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q1062934 |
| concepts[6].display_name | Character (mathematics) |
| concepts[7].id | https://openalex.org/C68762167 |
| concepts[7].level | 2 |
| concepts[7].score | 0.5517058372497559 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q910164 |
| concepts[7].display_name | Cheminformatics |
| concepts[8].id | https://openalex.org/C50644808 |
| concepts[8].level | 2 |
| concepts[8].score | 0.5305265188217163 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q192776 |
| concepts[8].display_name | Artificial neural network |
| concepts[9].id | https://openalex.org/C118505674 |
| concepts[9].level | 2 |
| concepts[9].score | 0.43662768602371216 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q42586063 |
| concepts[9].display_name | Encoder |
| concepts[10].id | https://openalex.org/C203394866 |
| concepts[10].level | 2 |
| concepts[10].score | 0.4268842935562134 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q2881060 |
| concepts[10].display_name | Chemical database |
| concepts[11].id | https://openalex.org/C51632099 |
| concepts[11].level | 2 |
| concepts[11].score | 0.4253823757171631 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q3985153 |
| concepts[11].display_name | Training set |
| concepts[12].id | https://openalex.org/C119857082 |
| concepts[12].level | 1 |
| concepts[12].score | 0.41066834330558777 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q2539 |
| concepts[12].display_name | Machine learning |
| concepts[13].id | https://openalex.org/C185592680 |
| concepts[13].level | 0 |
| concepts[13].score | 0.23369953036308289 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q2329 |
| concepts[13].display_name | Chemistry |
| concepts[14].id | https://openalex.org/C199360897 |
| concepts[14].level | 1 |
| concepts[14].score | 0.16183626651763916 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q9143 |
| concepts[14].display_name | Programming language |
| concepts[15].id | https://openalex.org/C33923547 |
| concepts[15].level | 0 |
| concepts[15].score | 0.11742293834686279 |
| concepts[15].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[15].display_name | Mathematics |
| concepts[16].id | https://openalex.org/C2524010 |
| concepts[16].level | 1 |
| concepts[16].score | 0.0 |
| concepts[16].wikidata | https://www.wikidata.org/wiki/Q8087 |
| concepts[16].display_name | Geometry |
| concepts[17].id | https://openalex.org/C111919701 |
| concepts[17].level | 1 |
| concepts[17].score | 0.0 |
| concepts[17].wikidata | https://www.wikidata.org/wiki/Q9135 |
| concepts[17].display_name | Operating system |
| concepts[18].id | https://openalex.org/C178790620 |
| concepts[18].level | 1 |
| concepts[18].score | 0.0 |
| concepts[18].wikidata | https://www.wikidata.org/wiki/Q11351 |
| concepts[18].display_name | Organic chemistry |
| concepts[19].id | https://openalex.org/C147597530 |
| concepts[19].level | 1 |
| concepts[19].score | 0.0 |
| concepts[19].wikidata | https://www.wikidata.org/wiki/Q369472 |
| concepts[19].display_name | Computational chemistry |
| keywords[0].id | https://openalex.org/keywords/chemical-nomenclature |
| keywords[0].score | 0.9715198874473572 |
| keywords[0].display_name | Chemical nomenclature |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.6829936504364014 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/identifier |
| keywords[2].score | 0.6537322402000427 |
| keywords[2].display_name | Identifier |
| keywords[3].id | https://openalex.org/keywords/natural-language-processing |
| keywords[3].score | 0.6530951261520386 |
| keywords[3].display_name | Natural language processing |
| keywords[4].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[4].score | 0.634408175945282 |
| keywords[4].display_name | Artificial intelligence |
| keywords[5].id | https://openalex.org/keywords/machine-translation |
| keywords[5].score | 0.6257988214492798 |
| keywords[5].display_name | Machine translation |
| keywords[6].id | https://openalex.org/keywords/character |
| keywords[6].score | 0.6254986524581909 |
| keywords[6].display_name | Character (mathematics) |
| keywords[7].id | https://openalex.org/keywords/cheminformatics |
| keywords[7].score | 0.5517058372497559 |
| keywords[7].display_name | Cheminformatics |
| keywords[8].id | https://openalex.org/keywords/artificial-neural-network |
| keywords[8].score | 0.5305265188217163 |
| keywords[8].display_name | Artificial neural network |
| keywords[9].id | https://openalex.org/keywords/encoder |
| keywords[9].score | 0.43662768602371216 |
| keywords[9].display_name | Encoder |
| keywords[10].id | https://openalex.org/keywords/chemical-database |
| keywords[10].score | 0.4268842935562134 |
| keywords[10].display_name | Chemical database |
| keywords[11].id | https://openalex.org/keywords/training-set |
| keywords[11].score | 0.4253823757171631 |
| keywords[11].display_name | Training set |
| keywords[12].id | https://openalex.org/keywords/machine-learning |
| keywords[12].score | 0.41066834330558777 |
| keywords[12].display_name | Machine learning |
| keywords[13].id | https://openalex.org/keywords/chemistry |
| keywords[13].score | 0.23369953036308289 |
| keywords[13].display_name | Chemistry |
| keywords[14].id | https://openalex.org/keywords/programming-language |
| keywords[14].score | 0.16183626651763916 |
| keywords[14].display_name | Programming language |
| keywords[15].id | https://openalex.org/keywords/mathematics |
| keywords[15].score | 0.11742293834686279 |
| keywords[15].display_name | Mathematics |
| language | en |
| locations[0].id | doi:10.26434/chemrxiv.14170472.v1 |
| locations[0].is_oa | True |
| locations[0].source | |
| locations[0].license | cc-by-nc-nd |
| locations[0].pdf_url | https://chemrxiv.org/engage/api-gateway/chemrxiv/assets/orp/resource/item/60c755d5567dfe21ffec6368/original/translating-the-molecules-adapting-neural-machine-translation-to-predict-iupac-names-from-a-chemical-identifier.pdf |
| locations[0].version | acceptedVersion |
| locations[0].raw_type | posted-content |
| locations[0].license_id | https://openalex.org/licenses/cc-by-nc-nd |
| locations[0].is_accepted | True |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | https://doi.org/10.26434/chemrxiv.14170472.v1 |
| locations[1].id | pmh:oai:figshare.com:article/14170472 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400572 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | False |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | OPAL (Open@LaTrobe) (La Trobe University) |
| locations[1].source.host_organization | https://openalex.org/I196829312 |
| locations[1].source.host_organization_name | La Trobe University |
| locations[1].source.host_organization_lineage | https://openalex.org/I196829312 |
| locations[1].license | cc-by-nc-nd |
| locations[1].pdf_url | |
| locations[1].version | submittedVersion |
| locations[1].raw_type | Text |
| locations[1].license_id | https://openalex.org/licenses/cc-by-nc-nd |
| locations[1].is_accepted | False |
| locations[1].is_published | False |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | |
| indexed_in | crossref |
| authorships[0].author.id | https://openalex.org/A5064582083 |
| authorships[0].author.orcid | https://orcid.org/0000-0003-1546-5911 |
| authorships[0].author.display_name | Jennifer Handsel |
| authorships[0].countries | GB |
| authorships[0].affiliations[0].institution_ids | https://openalex.org/I162524378 |
| authorships[0].affiliations[0].raw_affiliation_string | Science and Technology Facilities Council |
| authorships[0].institutions[0].id | https://openalex.org/I162524378 |
| authorships[0].institutions[0].ror | https://ror.org/057g20z61 |
| authorships[0].institutions[0].type | government |
| authorships[0].institutions[0].lineage | https://openalex.org/I162524378, https://openalex.org/I4210087105 |
| authorships[0].institutions[0].country_code | GB |
| authorships[0].institutions[0].display_name | Science and Technology Facilities Council |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Jennifer Handsel |
| authorships[0].is_corresponding | False |
| authorships[0].raw_affiliation_strings | Science and Technology Facilities Council |
| authorships[1].author.id | https://openalex.org/A5112526839 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Brian W. Matthews |
| authorships[1].countries | GB |
| authorships[1].affiliations[0].institution_ids | https://openalex.org/I162524378 |
| authorships[1].affiliations[0].raw_affiliation_string | Scientific Computing Department, Science and Technology Facilities Council, Didcot, OX11 0FA, UK. |
| authorships[1].institutions[0].id | https://openalex.org/I162524378 |
| authorships[1].institutions[0].ror | https://ror.org/057g20z61 |
| authorships[1].institutions[0].type | government |
| authorships[1].institutions[0].lineage | https://openalex.org/I162524378, https://openalex.org/I4210087105 |
| authorships[1].institutions[0].country_code | GB |
| authorships[1].institutions[0].display_name | Science and Technology Facilities Council |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Brian Matthews |
| authorships[1].is_corresponding | False |
| authorships[1].raw_affiliation_strings | Scientific Computing Department, Science and Technology Facilities Council, Didcot, OX11 0FA, UK. |
| authorships[2].author.id | https://openalex.org/A5084670412 |
| authorships[2].author.orcid | https://orcid.org/0000-0001-8286-3835 |
| authorships[2].author.display_name | Nicola Knight |
| authorships[2].countries | GB |
| authorships[2].affiliations[0].institution_ids | https://openalex.org/I43439940 |
| authorships[2].affiliations[0].raw_affiliation_string | School of Chemistry, Faculty of Engineering and Physical Sciences, University of Southampton, Southampton, SO17 1BJ, UK. |
| authorships[2].institutions[0].id | https://openalex.org/I43439940 |
| authorships[2].institutions[0].ror | https://ror.org/01ryk1543 |
| authorships[2].institutions[0].type | education |
| authorships[2].institutions[0].lineage | https://openalex.org/I43439940 |
| authorships[2].institutions[0].country_code | GB |
| authorships[2].institutions[0].display_name | University of Southampton |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Nicola Knight |
| authorships[2].is_corresponding | False |
| authorships[2].raw_affiliation_strings | School of Chemistry, Faculty of Engineering and Physical Sciences, University of Southampton, Southampton, SO17 1BJ, UK. |
| authorships[3].author.id | https://openalex.org/A5014634951 |
| authorships[3].author.orcid | https://orcid.org/0000-0001-8414-9272 |
| authorships[3].author.display_name | Simon J. Coles |
| authorships[3].countries | GB |
| authorships[3].affiliations[0].institution_ids | https://openalex.org/I43439940 |
| authorships[3].affiliations[0].raw_affiliation_string | School of Chemistry, Faculty of Engineering and Physical Sciences, University of Southampton, Southampton, SO17 1BJ, UK. |
| authorships[3].institutions[0].id | https://openalex.org/I43439940 |
| authorships[3].institutions[0].ror | https://ror.org/01ryk1543 |
| authorships[3].institutions[0].type | education |
| authorships[3].institutions[0].lineage | https://openalex.org/I43439940 |
| authorships[3].institutions[0].country_code | GB |
| authorships[3].institutions[0].display_name | University of Southampton |
| authorships[3].author_position | last |
| authorships[3].raw_author_name | Simon Coles |
| authorships[3].is_corresponding | False |
| authorships[3].raw_affiliation_strings | School of Chemistry, Faculty of Engineering and Physical Sciences, University of Southampton, Southampton, SO17 1BJ, UK. |
| has_content.pdf | True |
| has_content.grobid_xml | True |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://chemrxiv.org/engage/api-gateway/chemrxiv/assets/orp/resource/item/60c755d5567dfe21ffec6368/original/translating-the-molecules-adapting-neural-machine-translation-to-predict-iupac-names-from-a-chemical-identifier.pdf |
| open_access.oa_status | gold |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Translating the Molecules: Adapting Neural Machine Translation to Predict IUPAC Names from a Chemical Identifier |
| has_fulltext | True |
| is_retracted | False |
| updated_date | 2025-11-06T03:46:38.306776 |
| primary_topic.id | https://openalex.org/T10211 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9959999918937683 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1703 |
| primary_topic.subfield.display_name | Computational Theory and Mathematics |
| primary_topic.display_name | Computational Drug Discovery Methods |
| related_works | https://openalex.org/W3186979489, https://openalex.org/W2153037537, https://openalex.org/W2317928565, https://openalex.org/W3143892890, https://openalex.org/W2561309435, https://openalex.org/W2185159477, https://openalex.org/W2903373627, https://openalex.org/W158734140, https://openalex.org/W2988249053, https://openalex.org/W2066666235 |
| cited_by_count | 4 |
| counts_by_year[0].year | 2022 |
| counts_by_year[0].cited_by_count | 1 |
| counts_by_year[1].year | 2021 |
| counts_by_year[1].cited_by_count | 2 |
| counts_by_year[2].year | 2020 |
| counts_by_year[2].cited_by_count | 1 |
| locations_count | 2 |
| best_oa_location.id | doi:10.26434/chemrxiv.14170472.v1 |
| best_oa_location.is_oa | True |
| best_oa_location.source | |
| best_oa_location.license | cc-by-nc-nd |
| best_oa_location.pdf_url | https://chemrxiv.org/engage/api-gateway/chemrxiv/assets/orp/resource/item/60c755d5567dfe21ffec6368/original/translating-the-molecules-adapting-neural-machine-translation-to-predict-iupac-names-from-a-chemical-identifier.pdf |
| best_oa_location.version | acceptedVersion |
| best_oa_location.raw_type | posted-content |
| best_oa_location.license_id | https://openalex.org/licenses/cc-by-nc-nd |
| best_oa_location.is_accepted | True |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | https://doi.org/10.26434/chemrxiv.14170472.v1 |
| primary_location.id | doi:10.26434/chemrxiv.14170472.v1 |
| primary_location.is_oa | True |
| primary_location.source | |
| primary_location.license | cc-by-nc-nd |
| primary_location.pdf_url | https://chemrxiv.org/engage/api-gateway/chemrxiv/assets/orp/resource/item/60c755d5567dfe21ffec6368/original/translating-the-molecules-adapting-neural-machine-translation-to-predict-iupac-names-from-a-chemical-identifier.pdf |
| primary_location.version | acceptedVersion |
| primary_location.raw_type | posted-content |
| primary_location.license_id | https://openalex.org/licenses/cc-by-nc-nd |
| primary_location.is_accepted | True |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | https://doi.org/10.26434/chemrxiv.14170472.v1 |
| publication_date | 2021-03-08 |
| publication_year | 2021 |
| referenced_works | https://openalex.org/W2344597831, https://openalex.org/W2240521982, https://openalex.org/W3097145107, https://openalex.org/W2169678694, https://openalex.org/W1981276685, https://openalex.org/W575847245, https://openalex.org/W1976892175, https://openalex.org/W1904365287, https://openalex.org/W2183341477, https://openalex.org/W4385245566, https://openalex.org/W2962784628, https://openalex.org/W2963250244, https://openalex.org/W1508604947, https://openalex.org/W1902237438, https://openalex.org/W3099107321 |
| referenced_works_count | 15 |
| abstract_inverted_index.2 | 67 |
| abstract_inverted_index.a | 2, 13, 33, 78, 102, 140 |
| abstract_inverted_index.%) | 163 |
| abstract_inverted_index.10 | 81 |
| abstract_inverted_index.We | 0 |
| abstract_inverted_index.an | 30 |
| abstract_inverted_index.as | 157, 159 |
| abstract_inverted_index.be | 147 |
| abstract_inverted_index.by | 71, 149 |
| abstract_inverted_index.in | 29, 41, 152 |
| abstract_inverted_index.of | 12, 27, 80, 92, 112, 129, 143, 164 |
| abstract_inverted_index.on | 77, 101, 124 |
| abstract_inverted_index.or | 57 |
| abstract_inverted_index.to | 36 |
| abstract_inverted_index.91% | 116 |
| abstract_inverted_index.95% | 113 |
| abstract_inverted_index.K80 | 104 |
| abstract_inverted_index.The | 22, 73, 119, 131 |
| abstract_inverted_index.and | 53, 64, 106, 115 |
| abstract_inverted_index.can | 146 |
| abstract_inverted_index.for | 7, 136, 154 |
| abstract_inverted_index.its | 16 |
| abstract_inverted_index.low | 160 |
| abstract_inverted_index.our | 59 |
| abstract_inverted_index.the | 9, 37, 62, 66, 89, 107, 127, 165 |
| abstract_inverted_index.two | 25 |
| abstract_inverted_index.was | 75 |
| abstract_inverted_index.(1.4 | 162 |
| abstract_inverted_index.71%. | 144 |
| abstract_inverted_index.GPU, | 105 |
| abstract_inverted_index.This | 145 |
| abstract_inverted_index.days | 100 |
| abstract_inverted_index.five | 99 |
| abstract_inverted_index.from | 15, 88 |
| abstract_inverted_index.into | 55 |
| abstract_inverted_index.less | 134 |
| abstract_inverted_index.name | 11, 69, 84 |
| abstract_inverted_index.took | 98 |
| abstract_inverted_index.used | 40 |
| abstract_inverted_index.uses | 24 |
| abstract_inverted_index.well | 123, 158 |
| abstract_inverted_index.were | 133 |
| abstract_inverted_index.with | 126, 139 |
| abstract_inverted_index.IUPAC | 10, 68 |
| abstract_inverted_index.InChI | 63, 153 |
| abstract_inverted_index.Tesla | 103 |
| abstract_inverted_index.data. | 167 |
| abstract_inverted_index.input | 52 |
| abstract_inverted_index.model | 6, 23, 60, 74, 108, 120 |
| abstract_inverted_index.pairs | 85 |
| abstract_inverted_index.setup | 34 |
| abstract_inverted_index.which | 49 |
| abstract_inverted_index.words | 56 |
| abstract_inverted_index.(whole | 117 |
| abstract_inverted_index.Unlike | 45 |
| abstract_inverted_index.freely | 86 |
| abstract_inverted_index.name). | 118 |
| abstract_inverted_index.neural | 38, 46 |
| abstract_inverted_index.online | 94 |
| abstract_inverted_index.output | 54 |
| abstract_inverted_index.stacks | 26 |
| abstract_inverted_index.Library | 91 |
| abstract_inverted_index.PubChem | 95 |
| abstract_inverted_index.dataset | 79 |
| abstract_inverted_index.machine | 4, 43, 47 |
| abstract_inverted_index.million | 82 |
| abstract_inverted_index.present | 1 |
| abstract_inverted_index.similar | 35 |
| abstract_inverted_index.trained | 76 |
| abstract_inverted_index.usually | 50 |
| abstract_inverted_index.(InChI). | 21 |
| abstract_inverted_index.Chemical | 19 |
| abstract_inverted_index.National | 90 |
| abstract_inverted_index.Training | 97 |
| abstract_inverted_index.accuracy | 142 |
| abstract_inverted_index.accurate | 135 |
| abstract_inverted_index.achieved | 109 |
| abstract_inverted_index.chemical | 14 |
| abstract_inverted_index.coverage | 161 |
| abstract_inverted_index.inherent | 150 |
| abstract_inverted_index.learning | 5 |
| abstract_inverted_index.networks | 39 |
| abstract_inverted_index.predicts | 65 |
| abstract_inverted_index.service. | 96 |
| abstract_inverted_index.standard | 17 |
| abstract_inverted_index.test-set | 110 |
| abstract_inverted_index.training | 166 |
| abstract_inverted_index.character | 70 |
| abstract_inverted_index.exception | 128 |
| abstract_inverted_index.explained | 148 |
| abstract_inverted_index.inorganic | 137 |
| abstract_inverted_index.organics, | 125 |
| abstract_inverted_index.performed | 121 |
| abstract_inverted_index.processes | 61 |
| abstract_inverted_index.tokenizes | 51 |
| abstract_inverted_index.Identifier | 20 |
| abstract_inverted_index.accuracies | 111 |
| abstract_inverted_index.character. | 72 |
| abstract_inverted_index.compounds, | 138 |
| abstract_inverted_index.downloaded | 87 |
| abstract_inverted_index.predicting | 8 |
| abstract_inverted_index.sub-words, | 58 |
| abstract_inverted_index.InChI/IUPAC | 83 |
| abstract_inverted_index.inorganics, | 156 |
| abstract_inverted_index.limitations | 151 |
| abstract_inverted_index.predictions | 132 |
| abstract_inverted_index.Medicine’s | 93 |
| abstract_inverted_index.macrocycles. | 130 |
| abstract_inverted_index.particularly | 122 |
| abstract_inverted_index.representing | 155 |
| abstract_inverted_index.transformers | 28 |
| abstract_inverted_index.translation, | 48 |
| abstract_inverted_index.translation. | 44 |
| abstract_inverted_index.International | 18 |
| abstract_inverted_index.architecture, | 32 |
| abstract_inverted_index.character-level | 141 |
| abstract_inverted_index.encoder-decoder | 31 |
| abstract_inverted_index.state-of-the-art | 42 |
| abstract_inverted_index.(character-level) | 114 |
| abstract_inverted_index.sequence-to-sequence | 3 |
| cited_by_percentile_year.max | 95 |
| cited_by_percentile_year.min | 89 |
| countries_distinct_count | 1 |
| institutions_distinct_count | 4 |
| sustainable_development_goals[0].id | https://metadata.un.org/sdg/17 |
| sustainable_development_goals[0].score | 0.44999998807907104 |
| sustainable_development_goals[0].display_name | Partnerships for the goals |
| citation_normalized_percentile.value | 0.70453934 |
| citation_normalized_percentile.is_in_top_1_percent | False |
| citation_normalized_percentile.is_in_top_10_percent | False |
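
The `abstract_inverted_index.*` rows in the payload map each word of the abstract to the zero-based positions at which it occurs. As a minimal sketch, assuming the index has already been loaded into a plain Python dict (word to list of positions), the running text could be rebuilt as follows. The function name `rebuild_abstract` and the `sample` dict are illustrative and not part of any OpenAlex client library; the sample is an excerpt of the rows above, truncated to the first seven positions.

```python
from typing import Dict, List


def rebuild_abstract(inverted_index: Dict[str, List[int]]) -> str:
    """Rebuild running text from a word -> positions mapping."""
    # Invert the mapping: position -> word.
    words_by_position: Dict[int, str] = {}
    for word, positions in inverted_index.items():
        for position in positions:
            words_by_position[position] = word
    # Join the words in ascending position order.
    return " ".join(words_by_position[i] for i in sorted(words_by_position))


# Excerpt of the payload rows, limited to positions 0 through 6.
sample = {
    "We": [0],
    "present": [1],
    "a": [2],
    "sequence-to-sequence": [3],
    "machine": [4],
    "learning": [5],
    "model": [6],
}

print(rebuild_abstract(sample))
# We present a sequence-to-sequence machine learning model
```

Because punctuation is stored attached to the words (for example `(InChI).` and `GPU,`), joining on single spaces is enough to recover the original sentences.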