Archon: An Architecture Search Framework for Inference-Time Techniques
2024 · Open Access
· DOI: https://doi.org/10.48550/arxiv.2409.15254
Inference-time techniques, such as repeated sampling or iterative revisions, are emerging as powerful ways to enhance large language models (LLMs) at test time. However, best practices for developing systems that combine these techniques remain underdeveloped due to our limited understanding of the utility of each technique across models and tasks, the interactions between them, and the massive search space for combining them. To address these challenges, we introduce Archon, a modular and automated framework for optimizing the process of selecting and combining inference-time techniques and LLMs. Given a compute budget and a set of available LLMs, Archon explores a large design space to discover optimized configurations tailored to target benchmarks. It can design custom or general-purpose architectures that advance the Pareto frontier of accuracy vs. maximum token budget compared to top-performing baselines. Across instruction-following, reasoning, and coding tasks, we show that Archon can leverage additional inference compute budget to design systems that outperform frontier models such as OpenAI's o1, GPT-4o, and Claude 3.5 Sonnet by an average of 15.1%.
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2409.15254, https://arxiv.org/pdf/2409.15254
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4403853635
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4403853635 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2409.15254 (Digital Object Identifier)
- Title: Archon: An Architecture Search Framework for Inference-Time Techniques (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2024 (year of publication)
- Publication date: 2024-09-23 (full publication date if available)
- Authors: Jon Saad-Falcon, Adrian Gamarra Lafuente, Shlok Natarajan, Nahum Maru, Hristo Todorov, Etash Guha, E. Kelly Buchanan, Mayee Chen, Neel Guha, Christopher Ré, Azalia Mirhoseini (list of authors in order, as given in the raw author names)
- Landing page: https://arxiv.org/abs/2409.15254 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2409.15254 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2409.15254 (direct OA link when available)
- Concepts: Architecture, Inference, Computer science, Artificial intelligence, Natural language processing, History, Archaeology (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4403853635 |
| doi | https://doi.org/10.48550/arxiv.2409.15254 |
| ids.doi | https://doi.org/10.48550/arxiv.2409.15254 |
| ids.openalex | https://openalex.org/W4403853635 |
| fwci | |
| type | preprint |
| title | Archon: An Architecture Search Framework for Inference-Time Techniques |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10317 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9807999730110168 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1705 |
| topics[0].subfield.display_name | Computer Networks and Communications |
| topics[0].display_name | Advanced Database Systems and Queries |
| topics[1].id | https://openalex.org/T10215 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9394000172615051 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Semantic Web and Ontologies |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C123657996 |
| concepts[0].level | 2 |
| concepts[0].score | 0.6998903751373291 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q12271 |
| concepts[0].display_name | Architecture |
| concepts[1].id | https://openalex.org/C2776214188 |
| concepts[1].level | 2 |
| concepts[1].score | 0.6699793338775635 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q408386 |
| concepts[1].display_name | Inference |
| concepts[2].id | https://openalex.org/C41008148 |
| concepts[2].level | 0 |
| concepts[2].score | 0.5119815468788147 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[2].display_name | Computer science |
| concepts[3].id | https://openalex.org/C154945302 |
| concepts[3].level | 1 |
| concepts[3].score | 0.3698210120201111 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[3].display_name | Artificial intelligence |
| concepts[4].id | https://openalex.org/C204321447 |
| concepts[4].level | 1 |
| concepts[4].score | 0.3222328722476959 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q30642 |
| concepts[4].display_name | Natural language processing |
| concepts[5].id | https://openalex.org/C95457728 |
| concepts[5].level | 0 |
| concepts[5].score | 0.24345937371253967 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q309 |
| concepts[5].display_name | History |
| concepts[6].id | https://openalex.org/C166957645 |
| concepts[6].level | 1 |
| concepts[6].score | 0.11198320984840393 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q23498 |
| concepts[6].display_name | Archaeology |
| keywords[0].id | https://openalex.org/keywords/architecture |
| keywords[0].score | 0.6998903751373291 |
| keywords[0].display_name | Architecture |
| keywords[1].id | https://openalex.org/keywords/inference |
| keywords[1].score | 0.6699793338775635 |
| keywords[1].display_name | Inference |
| keywords[2].id | https://openalex.org/keywords/computer-science |
| keywords[2].score | 0.5119815468788147 |
| keywords[2].display_name | Computer science |
| keywords[3].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[3].score | 0.3698210120201111 |
| keywords[3].display_name | Artificial intelligence |
| keywords[4].id | https://openalex.org/keywords/natural-language-processing |
| keywords[4].score | 0.3222328722476959 |
| keywords[4].display_name | Natural language processing |
| keywords[5].id | https://openalex.org/keywords/history |
| keywords[5].score | 0.24345937371253967 |
| keywords[5].display_name | History |
| keywords[6].id | https://openalex.org/keywords/archaeology |
| keywords[6].score | 0.11198320984840393 |
| keywords[6].display_name | Archaeology |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2409.15254 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2409.15254 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2409.15254 |
| locations[1].id | doi:10.48550/arxiv.2409.15254 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2409.15254 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5065747091 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Jon Saad-Falcon |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Saad-Falcon, Jon |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5054527006 |
| authorships[1].author.orcid | https://orcid.org/0000-0001-7405-0818 |
| authorships[1].author.display_name | Alberto Lluch Lafuente |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Lafuente, Adrian Gamarra |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5111084962 |
| authorships[2].author.orcid | |
| authorships[2].author.display_name | Shlok Natarajan |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Natarajan, Shlok |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5016235344 |
| authorships[3].author.orcid | |
| authorships[3].author.display_name | Noriaki Maru |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Maru, Nahum |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5114439615 |
| authorships[4].author.orcid | |
| authorships[4].author.display_name | Hristo Todorov |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Todorov, Hristo |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5042004933 |
| authorships[5].author.orcid | |
| authorships[5].author.display_name | Etash Guha |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Guha, Etash |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5088549773 |
| authorships[6].author.orcid | https://orcid.org/0000-0003-1448-5662 |
| authorships[6].author.display_name | E. Kelly Buchanan |
| authorships[6].author_position | middle |
| authorships[6].raw_author_name | Buchanan, E. Kelly |
| authorships[6].is_corresponding | False |
| authorships[7].author.id | https://openalex.org/A5039665431 |
| authorships[7].author.orcid | |
| authorships[7].author.display_name | Mayee Chen |
| authorships[7].author_position | middle |
| authorships[7].raw_author_name | Chen, Mayee |
| authorships[7].is_corresponding | False |
| authorships[8].author.id | https://openalex.org/A5068178240 |
| authorships[8].author.orcid | https://orcid.org/0009-0003-5120-1726 |
| authorships[8].author.display_name | Neel Guha |
| authorships[8].author_position | middle |
| authorships[8].raw_author_name | Guha, Neel |
| authorships[8].is_corresponding | False |
| authorships[9].author.id | https://openalex.org/A5103852640 |
| authorships[9].author.orcid | |
| authorships[9].author.display_name | Christopher Ré |
| authorships[9].author_position | middle |
| authorships[9].raw_author_name | Ré, Christopher |
| authorships[9].is_corresponding | False |
| authorships[10].author.id | https://openalex.org/A5070731184 |
| authorships[10].author.orcid | https://orcid.org/0000-0002-2440-0944 |
| authorships[10].author.display_name | Azalia Mirhoseini |
| authorships[10].author_position | last |
| authorships[10].raw_author_name | Mirhoseini, Azalia |
| authorships[10].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2409.15254 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Archon: An Architecture Search Framework for Inference-Time Techniques |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10317 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9807999730110168 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1705 |
| primary_topic.subfield.display_name | Computer Networks and Communications |
| primary_topic.display_name | Advanced Database Systems and Queries |
| related_works | https://openalex.org/W4391375266, https://openalex.org/W2899084033, https://openalex.org/W2748952813, https://openalex.org/W2390279801, https://openalex.org/W4391913857, https://openalex.org/W2358668433, https://openalex.org/W4396701345, https://openalex.org/W2376932109, https://openalex.org/W2001405890, https://openalex.org/W3204019825 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2409.15254 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2409.15254 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2409.15254 |
| primary_location.id | pmh:oai:arXiv.org:2409.15254 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2409.15254 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2409.15254 |
| publication_date | 2024-09-23 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 68, 86, 90, 97 |
| abstract_inverted_index.It | 109 |
| abstract_inverted_index.To | 61 |
| abstract_inverted_index.an | 164 |
| abstract_inverted_index.as | 3, 11, 155 |
| abstract_inverted_index.at | 19 |
| abstract_inverted_index.by | 163 |
| abstract_inverted_index.of | 39, 42, 77, 92, 121, 166 |
| abstract_inverted_index.or | 6, 113 |
| abstract_inverted_index.to | 14, 35, 101, 106, 128, 147 |
| abstract_inverted_index.we | 65, 137 |
| abstract_inverted_index.3.5 | 161 |
| abstract_inverted_index.and | 47, 53, 70, 79, 83, 89, 134, 159 |
| abstract_inverted_index.are | 9 |
| abstract_inverted_index.can | 110, 141 |
| abstract_inverted_index.due | 34 |
| abstract_inverted_index.for | 25, 58, 73 |
| abstract_inverted_index.o1, | 157 |
| abstract_inverted_index.our | 36 |
| abstract_inverted_index.set | 91 |
| abstract_inverted_index.the | 40, 49, 54, 75, 118 |
| abstract_inverted_index.vs. | 123 |
| abstract_inverted_index.best | 23 |
| abstract_inverted_index.each | 43 |
| abstract_inverted_index.show | 138 |
| abstract_inverted_index.such | 2, 154 |
| abstract_inverted_index.test | 20 |
| abstract_inverted_index.that | 28, 116, 139, 150 |
| abstract_inverted_index.ways | 13 |
| abstract_inverted_index.Given | 85 |
| abstract_inverted_index.LLMs, | 94 |
| abstract_inverted_index.LLMs. | 84 |
| abstract_inverted_index.large | 98 |
| abstract_inverted_index.space | 57, 100 |
| abstract_inverted_index.them, | 52 |
| abstract_inverted_index.them. | 60 |
| abstract_inverted_index.these | 30, 63 |
| abstract_inverted_index.time. | 21 |
| abstract_inverted_index.token | 125 |
| abstract_inverted_index.(LLMs) | 18 |
| abstract_inverted_index.15.1%. | 167 |
| abstract_inverted_index.Across | 131 |
| abstract_inverted_index.Archon | 95, 140 |
| abstract_inverted_index.Claude | 160 |
| abstract_inverted_index.Pareto | 119 |
| abstract_inverted_index.Sonnet | 162 |
| abstract_inverted_index.across | 45 |
| abstract_inverted_index.budget | 88, 126, 146 |
| abstract_inverted_index.coding | 135 |
| abstract_inverted_index.custom | 112 |
| abstract_inverted_index.design | 99, 111, 148 |
| abstract_inverted_index.models | 17, 46, 153 |
| abstract_inverted_index.remain | 32 |
| abstract_inverted_index.search | 56 |
| abstract_inverted_index.target | 107 |
| abstract_inverted_index.tasks, | 48, 136 |
| abstract_inverted_index.Archon, | 67 |
| abstract_inverted_index.GPT-4o, | 158 |
| abstract_inverted_index.address | 62 |
| abstract_inverted_index.advance | 117 |
| abstract_inverted_index.average | 165 |
| abstract_inverted_index.between | 51 |
| abstract_inverted_index.combine | 29 |
| abstract_inverted_index.compute | 87, 145 |
| abstract_inverted_index.enhance | 15 |
| abstract_inverted_index.limited | 37 |
| abstract_inverted_index.massive | 55 |
| abstract_inverted_index.maximum | 124 |
| abstract_inverted_index.modular | 69 |
| abstract_inverted_index.process | 76 |
| abstract_inverted_index.systems | 27, 149 |
| abstract_inverted_index.utility | 41 |
| abstract_inverted_index.However, | 22 |
| abstract_inverted_index.OpenAI's | 156 |
| abstract_inverted_index.accuracy | 122 |
| abstract_inverted_index.compared | 127 |
| abstract_inverted_index.discover | 102 |
| abstract_inverted_index.emerging | 10 |
| abstract_inverted_index.explores | 96 |
| abstract_inverted_index.frontier | 120, 152 |
| abstract_inverted_index.leverage | 142 |
| abstract_inverted_index.powerful | 12 |
| abstract_inverted_index.repeated | 4 |
| abstract_inverted_index.sampling | 5 |
| abstract_inverted_index.tailored | 105 |
| abstract_inverted_index.automated | 71 |
| abstract_inverted_index.available | 93 |
| abstract_inverted_index.combining | 59, 80 |
| abstract_inverted_index.framework | 72 |
| abstract_inverted_index.inference | 144 |
| abstract_inverted_index.introduce | 66 |
| abstract_inverted_index.iterative | 7 |
| abstract_inverted_index.optimized | 103 |
| abstract_inverted_index.practices | 24 |
| abstract_inverted_index.selecting | 78 |
| abstract_inverted_index.technique | 44 |
| abstract_inverted_index.additional | 143 |
| abstract_inverted_index.baselines. | 130 |
| abstract_inverted_index.developing | 26 |
| abstract_inverted_index.optimizing | 74 |
| abstract_inverted_index.outperform | 151 |
| abstract_inverted_index.reasoning, | 133 |
| abstract_inverted_index.revisions, | 8 |
| abstract_inverted_index.techniques | 31, 82 |
| abstract_inverted_index.benchmarks. | 108 |
| abstract_inverted_index.challenges, | 64 |
| abstract_inverted_index.techniques, | 1 |
| abstract_inverted_index.interactions | 50 |
| abstract_inverted_index.architectures | 115 |
| abstract_inverted_index.understanding | 38 |
| abstract_inverted_index.Inference-time | 0 |
| abstract_inverted_index.configurations | 104 |
| abstract_inverted_index.inference-time | 81 |
| abstract_inverted_index.large-language | 16 |
| abstract_inverted_index.top-performing | 129 |
| abstract_inverted_index.underdeveloped | 33 |
| abstract_inverted_index.general-purpose | 114 |
| abstract_inverted_index.instruction-following, | 132 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 11 |
| citation_normalized_percentile |
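
The `abstract_inverted_index.*` rows above store the abstract as a map from each token to the word positions where it occurs. As a minimal sketch of how such an index can be turned back into running text, the Python below inverts the map and joins the tokens in position order. The function name `rebuild_abstract` and the `sample` dictionary are illustrative assumptions, not part of any OpenAlex client; `sample` copies only positions 0 to 8 from the rows above and omits the later positions listed for "such", "as", and "or".

```python
def rebuild_abstract(inverted_index: dict[str, list[int]]) -> str:
    """Rebuild text from a token -> positions map (hypothetical helper)."""
    by_position: dict[int, str] = {}
    for token, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = token
    # Join tokens in ascending position order, separated by single spaces.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Excerpt of the index above, truncated to positions 0..8 for illustration.
sample = {
    "Inference-time": [0],
    "techniques,": [1],
    "such": [2],
    "as": [3],
    "repeated": [4],
    "sampling": [5],
    "or": [6],
    "iterative": [7],
    "revisions,": [8],
}

print(rebuild_abstract(sample))
```

Passing the full set of rows instead of the excerpt should reproduce the abstract as indexed, since punctuation is stored as part of each token.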