Pretraining Generative Flow Networks with Inexpensive Rewards for Molecular Graph Generation
2025 · Open Access
· DOI: https://doi.org/10.48550/arxiv.2503.06337
Generative Flow Networks (GFlowNets) have recently emerged as a suitable framework for generating diverse and high-quality molecular structures by learning from rewards treated as unnormalized distributions. Previous works in this framework often restrict exploration by using predefined molecular fragments as building blocks, limiting the chemical space that can be accessed. In this work, we introduce Atomic GFlowNets (A-GFNs), a foundational generative model leveraging individual atoms as building blocks to explore drug-like chemical space more comprehensively. We propose an unsupervised pre-training approach using drug-like molecule datasets, which teaches A-GFNs about inexpensive yet informative molecular descriptors such as drug-likeliness, topological polar surface area, and synthetic accessibility scores. These properties serve as proxy rewards, guiding A-GFNs towards regions of chemical space that exhibit desirable pharmacological properties. We further implement a goal-conditioned finetuning process, which adapts A-GFNs to optimize for specific target properties. In this work, we pretrain A-GFN on a subset of the ZINC dataset, and by employing robust evaluation metrics we show the effectiveness of our approach when compared to other relevant baseline methods for a wide range of drug design tasks. The code is accessible at https://github.com/diamondspark/AGFN.
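The phrase "rewards treated as unnormalized distributions" can be made concrete with a toy sketch. A GFlowNet is trained so that it samples an object x with probability proportional to its reward R(x). The candidate names, the descriptor scores, and the multiplicative `proxy_reward` below are hypothetical illustrations, not the reward used in the paper, and the explicit enumeration only works because the toy space has three elements.

```python
import random

# Hypothetical, already-normalized descriptor scores in [0, 1] for three
# made-up candidates; these numbers are illustrative, not from the paper.
CANDIDATES = {
    "mol_a": {"qed": 0.9, "tpsa": 0.8, "sa": 0.7},
    "mol_b": {"qed": 0.5, "tpsa": 0.6, "sa": 0.9},
    "mol_c": {"qed": 0.2, "tpsa": 0.4, "sa": 0.3},
}


def proxy_reward(scores):
    """Combine descriptor scores into one positive reward (assumed: a product)."""
    reward = 1.0
    for value in scores.values():
        reward *= value
    return reward


def target_distribution(candidates):
    """Normalize rewards so that p(x) = R(x) / Z, with Z the sum of all rewards."""
    rewards = {name: proxy_reward(s) for name, s in candidates.items()}
    z = sum(rewards.values())
    return {name: r / z for name, r in rewards.items()}


def sample(candidates, n, seed=0):
    """Draw n candidates with probability proportional to their reward."""
    rng = random.Random(seed)
    probs = target_distribution(candidates)
    names = list(probs)
    return rng.choices(names, weights=[probs[k] for k in names], k=n)
```

In a real molecular setting the space cannot be enumerated and Z is unknown, which is why a GFlowNet instead learns a step-by-step construction policy whose terminal samples approximately follow this distribution.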
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2503.06337, https://arxiv.org/pdf/2503.06337
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4408973838
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4408973838 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2503.06337 (Digital Object Identifier)
- Title: Pretraining Generative Flow Networks with Inexpensive Rewards for Molecular Graph Generation (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2025 (year of publication)
- Publication date: 2025-03-08 (full publication date if available)
- Authors: Mohit Pandey, Gopeshh Subbaraj, Artem Cherkasov, Martin Ester, Emmanuel Bengio (list of authors in order)
- Landing page: https://arxiv.org/abs/2503.06337 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2503.06337 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2503.06337 (direct OA link when available)
- Concepts: Generative grammar, Graph, Computer science, Flow (mathematics), Control flow graph, Artificial intelligence, Theoretical computer science, Mathematics, Geometry (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4408973838 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2503.06337 |
| ids.doi | https://doi.org/10.48550/arxiv.2503.06337 |
| ids.openalex | https://openalex.org/W4408973838 |
| fwci | |
| type | preprint |
| title | Pretraining Generative Flow Networks with Inexpensive Rewards for Molecular Graph Generation |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11407 |
| topics[0].field.id | https://openalex.org/fields/22 |
| topics[0].field.display_name | Engineering |
| topics[0].score | 0.9944000244140625 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/2204 |
| topics[0].subfield.display_name | Biomedical Engineering |
| topics[0].display_name | Innovative Microfluidic and Catalytic Techniques Innovation |
| topics[1].id | https://openalex.org/T11948 |
| topics[1].field.id | https://openalex.org/fields/25 |
| topics[1].field.display_name | Materials Science |
| topics[1].score | 0.9337000250816345 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/2505 |
| topics[1].subfield.display_name | Materials Chemistry |
| topics[1].display_name | Machine Learning in Materials Science |
| topics[2].id | https://openalex.org/T10211 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9279999732971191 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1703 |
| topics[2].subfield.display_name | Computational Theory and Mathematics |
| topics[2].display_name | Computational Drug Discovery Methods |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C39890363 |
| concepts[0].level | 2 |
| concepts[0].score | 0.8049079179763794 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q36108 |
| concepts[0].display_name | Generative grammar |
| concepts[1].id | https://openalex.org/C132525143 |
| concepts[1].level | 2 |
| concepts[1].score | 0.575545608997345 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q141488 |
| concepts[1].display_name | Graph |
| concepts[2].id | https://openalex.org/C41008148 |
| concepts[2].level | 0 |
| concepts[2].score | 0.5203951597213745 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[2].display_name | Computer science |
| concepts[3].id | https://openalex.org/C38349280 |
| concepts[3].level | 2 |
| concepts[3].score | 0.47800102829933167 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q1434290 |
| concepts[3].display_name | Flow (mathematics) |
| concepts[4].id | https://openalex.org/C27458966 |
| concepts[4].level | 2 |
| concepts[4].score | 0.4164818823337555 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q1187693 |
| concepts[4].display_name | Control flow graph |
| concepts[5].id | https://openalex.org/C154945302 |
| concepts[5].level | 1 |
| concepts[5].score | 0.32617396116256714 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[5].display_name | Artificial intelligence |
| concepts[6].id | https://openalex.org/C80444323 |
| concepts[6].level | 1 |
| concepts[6].score | 0.29187750816345215 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q2878974 |
| concepts[6].display_name | Theoretical computer science |
| concepts[7].id | https://openalex.org/C33923547 |
| concepts[7].level | 0 |
| concepts[7].score | 0.2306532859802246 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[7].display_name | Mathematics |
| concepts[8].id | https://openalex.org/C2524010 |
| concepts[8].level | 1 |
| concepts[8].score | 0.0 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q8087 |
| concepts[8].display_name | Geometry |
| keywords[0].id | https://openalex.org/keywords/generative-grammar |
| keywords[0].score | 0.8049079179763794 |
| keywords[0].display_name | Generative grammar |
| keywords[1].id | https://openalex.org/keywords/graph |
| keywords[1].score | 0.575545608997345 |
| keywords[1].display_name | Graph |
| keywords[2].id | https://openalex.org/keywords/computer-science |
| keywords[2].score | 0.5203951597213745 |
| keywords[2].display_name | Computer science |
| keywords[3].id | https://openalex.org/keywords/flow |
| keywords[3].score | 0.47800102829933167 |
| keywords[3].display_name | Flow (mathematics) |
| keywords[4].id | https://openalex.org/keywords/control-flow-graph |
| keywords[4].score | 0.4164818823337555 |
| keywords[4].display_name | Control flow graph |
| keywords[5].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[5].score | 0.32617396116256714 |
| keywords[5].display_name | Artificial intelligence |
| keywords[6].id | https://openalex.org/keywords/theoretical-computer-science |
| keywords[6].score | 0.29187750816345215 |
| keywords[6].display_name | Theoretical computer science |
| keywords[7].id | https://openalex.org/keywords/mathematics |
| keywords[7].score | 0.2306532859802246 |
| keywords[7].display_name | Mathematics |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2503.06337 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2503.06337 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2503.06337 |
| locations[1].id | doi:10.48550/arxiv.2503.06337 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2503.06337 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5014143640 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-2562-7155 |
| authorships[0].author.display_name | Mohit Pandey |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Pandey, Mohit |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5114369609 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Gopeshh Subbaraj |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Subbaraj, Gopeshh |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5070580886 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-1599-1439 |
| authorships[2].author.display_name | Artem Cherkasov |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Cherkasov, Artem |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5018267399 |
| authorships[3].author.orcid | https://orcid.org/0000-0001-7732-2815 |
| authorships[3].author.display_name | Martin Ester |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Ester, Martin |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5114369610 |
| authorships[4].author.orcid | |
| authorships[4].author.display_name | Emmanuel Bengio |
| authorships[4].author_position | last |
| authorships[4].raw_author_name | Bengio, Emmanuel |
| authorships[4].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2503.06337 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Pretraining Generative Flow Networks with Inexpensive Rewards for Molecular Graph Generation |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11407 |
| primary_topic.field.id | https://openalex.org/fields/22 |
| primary_topic.field.display_name | Engineering |
| primary_topic.score | 0.9944000244140625 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/2204 |
| primary_topic.subfield.display_name | Biomedical Engineering |
| primary_topic.display_name | Innovative Microfluidic and Catalytic Techniques Innovation |
| related_works | https://openalex.org/W2380075625, https://openalex.org/W4390718435, https://openalex.org/W4390549206, https://openalex.org/W3137171911, https://openalex.org/W4379540039, https://openalex.org/W4237784285, https://openalex.org/W2374712251, https://openalex.org/W4383031710, https://openalex.org/W3211753092, https://openalex.org/W2386000789 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2503.06337 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2503.06337 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2503.06337 |
| primary_location.id | pmh:oai:arXiv.org:2503.06337 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2503.06337 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2503.06337 |
| publication_date | 2025-03-08 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 8, 58, 126, 146, 172 |
| abstract_inverted_index.In | 50, 139 |
| abstract_inverted_index.We | 75, 123 |
| abstract_inverted_index.an | 77 |
| abstract_inverted_index.as | 7, 23, 39, 65, 95, 108 |
| abstract_inverted_index.at | 183 |
| abstract_inverted_index.be | 48 |
| abstract_inverted_index.by | 18, 34, 152 |
| abstract_inverted_index.in | 28 |
| abstract_inverted_index.is | 181 |
| abstract_inverted_index.of | 115, 148, 161, 175 |
| abstract_inverted_index.on | 145 |
| abstract_inverted_index.to | 68, 133, 166 |
| abstract_inverted_index.we | 53, 142, 157 |
| abstract_inverted_index.The | 179 |
| abstract_inverted_index.and | 14, 101, 151 |
| abstract_inverted_index.can | 47 |
| abstract_inverted_index.for | 11, 135, 171 |
| abstract_inverted_index.our | 162 |
| abstract_inverted_index.the | 43, 159 |
| abstract_inverted_index.yet | 90 |
| abstract_inverted_index.Flow | 1 |
| abstract_inverted_index.ZINC | 149 |
| abstract_inverted_index.code | 180 |
| abstract_inverted_index.drug | 176 |
| abstract_inverted_index.from | 20 |
| abstract_inverted_index.have | 4 |
| abstract_inverted_index.more | 73 |
| abstract_inverted_index.show | 158 |
| abstract_inverted_index.such | 94 |
| abstract_inverted_index.that | 46, 118 |
| abstract_inverted_index.this | 29, 51, 140 |
| abstract_inverted_index.when | 164 |
| abstract_inverted_index.wide | 173 |
| abstract_inverted_index.A-GFN | 144 |
| abstract_inverted_index.These | 105 |
| abstract_inverted_index.about | 88 |
| abstract_inverted_index.area, | 100 |
| abstract_inverted_index.atoms | 64 |
| abstract_inverted_index.model | 61 |
| abstract_inverted_index.often | 31 |
| abstract_inverted_index.other | 167 |
| abstract_inverted_index.polar | 98 |
| abstract_inverted_index.proxy | 109 |
| abstract_inverted_index.range | 174 |
| abstract_inverted_index.serve | 107 |
| abstract_inverted_index.space | 45, 72, 117 |
| abstract_inverted_index.using | 35, 81 |
| abstract_inverted_index.which | 85, 130 |
| abstract_inverted_index.work, | 52, 141 |
| abstract_inverted_index.works | 27 |
| abstract_inverted_index.A-GFNs | 87, 112, 132 |
| abstract_inverted_index.Atomic | 55 |
| abstract_inverted_index.adapts | 131 |
| abstract_inverted_index.blocks | 67 |
| abstract_inverted_index.design | 177 |
| abstract_inverted_index.robust | 154 |
| abstract_inverted_index.subset | 147 |
| abstract_inverted_index.target | 137 |
| abstract_inverted_index.tasks. | 178 |
| abstract_inverted_index.blocks, | 41 |
| abstract_inverted_index.diverse | 13 |
| abstract_inverted_index.emerged | 6 |
| abstract_inverted_index.exhibit | 119 |
| abstract_inverted_index.explore | 69 |
| abstract_inverted_index.further | 124 |
| abstract_inverted_index.guiding | 111 |
| abstract_inverted_index.methods | 170 |
| abstract_inverted_index.metrics | 156 |
| abstract_inverted_index.propose | 76 |
| abstract_inverted_index.regions | 114 |
| abstract_inverted_index.rewards | 21 |
| abstract_inverted_index.scores. | 104 |
| abstract_inverted_index.surface | 99 |
| abstract_inverted_index.teaches | 86 |
| abstract_inverted_index.towards | 113 |
| abstract_inverted_index.treated | 22 |
| abstract_inverted_index.Networks | 2 |
| abstract_inverted_index.Previous | 26 |
| abstract_inverted_index.approach | 80, 163 |
| abstract_inverted_index.baseline | 169 |
| abstract_inverted_index.building | 40, 66 |
| abstract_inverted_index.chemical | 44, 71, 116 |
| abstract_inverted_index.compared | 165 |
| abstract_inverted_index.dataset, | 150 |
| abstract_inverted_index.learning | 19 |
| abstract_inverted_index.limiting | 42 |
| abstract_inverted_index.molecule | 83 |
| abstract_inverted_index.optimize | 134 |
| abstract_inverted_index.pretrain | 143 |
| abstract_inverted_index.process, | 129 |
| abstract_inverted_index.recently | 5 |
| abstract_inverted_index.relevant | 168 |
| abstract_inverted_index.restrict | 32 |
| abstract_inverted_index.rewards, | 110 |
| abstract_inverted_index.specific | 136 |
| abstract_inverted_index.suitable | 9 |
| abstract_inverted_index.(A-GFNs), | 57 |
| abstract_inverted_index.GFlowNets | 56 |
| abstract_inverted_index.accessed. | 49 |
| abstract_inverted_index.datasets, | 84 |
| abstract_inverted_index.desirable | 120 |
| abstract_inverted_index.drug-like | 70, 82 |
| abstract_inverted_index.employing | 153 |
| abstract_inverted_index.fragments | 38 |
| abstract_inverted_index.framework | 10, 30 |
| abstract_inverted_index.implement | 125 |
| abstract_inverted_index.introduce | 54 |
| abstract_inverted_index.molecular | 16, 37, 92 |
| abstract_inverted_index.synthetic | 102 |
| abstract_inverted_index.Generative | 0 |
| abstract_inverted_index.accessible | 182 |
| abstract_inverted_index.evaluation | 155 |
| abstract_inverted_index.finetuning | 128 |
| abstract_inverted_index.generating | 12 |
| abstract_inverted_index.generative | 60 |
| abstract_inverted_index.individual | 63 |
| abstract_inverted_index.leveraging | 62 |
| abstract_inverted_index.predefined | 36 |
| abstract_inverted_index.properties | 106 |
| abstract_inverted_index.structures | 17 |
| abstract_inverted_index.(GFlowNets) | 3 |
| abstract_inverted_index.descriptors | 93 |
| abstract_inverted_index.exploration | 33 |
| abstract_inverted_index.inexpensive | 89 |
| abstract_inverted_index.informative | 91 |
| abstract_inverted_index.properties. | 122, 138 |
| abstract_inverted_index.topological | 97 |
| abstract_inverted_index.foundational | 59 |
| abstract_inverted_index.high-quality | 15 |
| abstract_inverted_index.pre-training | 79 |
| abstract_inverted_index.unnormalized | 24 |
| abstract_inverted_index.unsupervised | 78 |
| abstract_inverted_index.accessibility | 103 |
| abstract_inverted_index.effectiveness | 160 |
| abstract_inverted_index.distributions. | 25 |
| abstract_inverted_index.pharmacological | 121 |
| abstract_inverted_index.comprehensively. | 74 |
| abstract_inverted_index.drug-likeliness, | 96 |
| abstract_inverted_index.goal-conditioned | 127 |
| abstract_inverted_index.https://github.com/diamondspark/AGFN. | 184 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 5 |
| citation_normalized_percentile |
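
The `abstract_inverted_index.*` rows in the payload store the abstract as a map from each word to the zero-based positions where it occurs. A minimal sketch of turning such a map back into running text follows; the function name `rebuild_abstract` is hypothetical, and the example assumes the rows have been loaded into a plain dictionary of word to position list.

```python
def rebuild_abstract(inverted_index):
    """Place each word at every position listed for it, then join in order."""
    positions = {}
    for word, indices in inverted_index.items():
        for i in indices:
            positions[i] = word
    return " ".join(positions[i] for i in sorted(positions))


# First six words of this record's abstract, copied from the payload rows.
first_words = {
    "Generative": [0],
    "Flow": [1],
    "Networks": [2],
    "(GFlowNets)": [3],
    "have": [4],
    "recently": [5],
}

print(rebuild_abstract(first_words))
```

Words that occur more than once, such as `as` at positions 7, 23, 39, 65, 95 and 108, are simply written to each of their positions, so the same loop handles them without special cases.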