Searching for Best Practices in Retrieval-Augmented Generation
2024 · Open Access · DOI: https://doi.org/10.48550/arxiv.2407.01219
Retrieval-augmented generation (RAG) techniques have proven to be effective in integrating up-to-date information, mitigating hallucinations, and enhancing response quality, particularly in specialized domains. While many RAG approaches have been proposed to enhance large language models through query-dependent retrievals, these approaches still suffer from their complex implementation and prolonged response times. Typically, a RAG workflow involves multiple processing steps, each of which can be executed in various ways. Here, we investigate existing RAG approaches and their potential combinations to identify optimal RAG practices. Through extensive experiments, we suggest several strategies for deploying RAG that balance both performance and efficiency. Moreover, we demonstrate that multimodal retrieval techniques can significantly enhance question-answering capabilities about visual inputs and accelerate the generation of multimodal content using a "retrieval as generation" strategy.
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2407.01219, https://arxiv.org/pdf/2407.01219
- OA Status: green
- Cited By: 6
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4400341981
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4400341981 (Canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2407.01219 (Digital Object Identifier)
- Title: Searching for Best Practices in Retrieval-Augmented Generation (Work title)
- Type: preprint (OpenAlex work type)
- Language: en (Primary language)
- Publication year: 2024 (Year of publication)
- Publication date: 2024-07-01 (Full publication date if available)
- Authors: Xiaohua Wang, Zhenhua Wang, Xuan Gao, Feiran Zhang, Yixin Wu, Zhibo Xu, Tianyuan Shi, Zhengyuan Wang, Shizheng Li, Qian Qi, Ruicheng Yin, Changze Lv, Xiaoqing Zheng, Jimmy Xiangji Huang (List of authors in order)
- Landing page: https://arxiv.org/abs/2407.01219 (Publisher landing page)
- PDF URL: https://arxiv.org/pdf/2407.01219 (Direct link to full text PDF)
- Open access: Yes (Whether a free full text is available)
- OA status: green (Open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2407.01219 (Direct OA link when available)
- Concepts: Information retrieval, Computer science, Artificial intelligence (Top concepts (fields/topics) attached by OpenAlex)
- Cited by: 6 (Total citation count in OpenAlex)
- Citations by year (recent): 2025: 5, 2024: 1 (Per-year citation counts (last 5 years))
- Related works (count): 10 (Other works algorithmically related by OpenAlex)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4400341981 |
| doi | https://doi.org/10.48550/arxiv.2407.01219 |
| ids.doi | https://doi.org/10.48550/arxiv.2407.01219 |
| ids.openalex | https://openalex.org/W4400341981 |
| fwci | |
| type | preprint |
| title | Searching for Best Practices in Retrieval-Augmented Generation |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10203 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.7422999739646912 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1710 |
| topics[0].subfield.display_name | Information Systems |
| topics[0].display_name | Recommender Systems and Techniques |
| topics[1].id | https://openalex.org/T10286 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.6974999904632568 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1710 |
| topics[1].subfield.display_name | Information Systems |
| topics[1].display_name | Information Retrieval and Search Behavior |
| topics[2].id | https://openalex.org/T12031 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.6973000168800354 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1702 |
| topics[2].subfield.display_name | Artificial Intelligence |
| topics[2].display_name | Speech and dialogue systems |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C23123220 |
| concepts[0].level | 1 |
| concepts[0].score | 0.5638186931610107 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q816826 |
| concepts[0].display_name | Information retrieval |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.5195934772491455 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C154945302 |
| concepts[2].level | 1 |
| concepts[2].score | 0.3322594165802002 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[2].display_name | Artificial intelligence |
| keywords[0].id | https://openalex.org/keywords/information-retrieval |
| keywords[0].score | 0.5638186931610107 |
| keywords[0].display_name | Information retrieval |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.5195934772491455 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[2].score | 0.3322594165802002 |
| keywords[2].display_name | Artificial intelligence |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2407.01219 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | cc-by |
| locations[0].pdf_url | https://arxiv.org/pdf/2407.01219 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | |
| locations[0].license_id | https://openalex.org/licenses/cc-by |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2407.01219 |
| locations[1].id | doi:10.48550/arxiv.2407.01219 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2407.01219 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5100438593 |
| authorships[0].author.orcid | https://orcid.org/0000-0003-0136-9806 |
| authorships[0].author.display_name | Xiaohua Wang |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Wang, Xiaohua |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5100395206 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-0369-2765 |
| authorships[1].author.display_name | Zhenhua Wang |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Wang, Zhenghua |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5112109082 |
| authorships[2].author.orcid | |
| authorships[2].author.display_name | Xuan Gao |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Gao, Xuan |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5000450267 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-2890-5915 |
| authorships[3].author.display_name | Feiran Zhang |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Zhang, Feiran |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5102863764 |
| authorships[4].author.orcid | https://orcid.org/0000-0002-3000-9423 |
| authorships[4].author.display_name | Yixin Wu |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Wu, Yixin |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5050397343 |
| authorships[5].author.orcid | https://orcid.org/0000-0001-6331-6940 |
| authorships[5].author.display_name | Zhibo Xu |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Xu, Zhibo |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5101107477 |
| authorships[6].author.orcid | |
| authorships[6].author.display_name | Tianyuan Shi |
| authorships[6].author_position | middle |
| authorships[6].raw_author_name | Shi, Tianyuan |
| authorships[6].is_corresponding | False |
| authorships[7].author.id | https://openalex.org/A5037791789 |
| authorships[7].author.orcid | |
| authorships[7].author.display_name | Zhengyuan Wang |
| authorships[7].author_position | middle |
| authorships[7].raw_author_name | Wang, Zhengyuan |
| authorships[7].is_corresponding | False |
| authorships[8].author.id | https://openalex.org/A5011617835 |
| authorships[8].author.orcid | |
| authorships[8].author.display_name | Shizheng Li |
| authorships[8].author_position | middle |
| authorships[8].raw_author_name | Li, Shizheng |
| authorships[8].is_corresponding | False |
| authorships[9].author.id | https://openalex.org/A5102026071 |
| authorships[9].author.orcid | https://orcid.org/0000-0002-9853-0936 |
| authorships[9].author.display_name | Qian Qi |
| authorships[9].author_position | middle |
| authorships[9].raw_author_name | Qian, Qi |
| authorships[9].is_corresponding | False |
| authorships[10].author.id | https://openalex.org/A5109737650 |
| authorships[10].author.orcid | |
| authorships[10].author.display_name | Ruicheng Yin |
| authorships[10].author_position | middle |
| authorships[10].raw_author_name | Yin, Ruicheng |
| authorships[10].is_corresponding | False |
| authorships[11].author.id | https://openalex.org/A5113112983 |
| authorships[11].author.orcid | |
| authorships[11].author.display_name | Changze Lv |
| authorships[11].author_position | middle |
| authorships[11].raw_author_name | Lv, Changze |
| authorships[11].is_corresponding | False |
| authorships[12].author.id | https://openalex.org/A5017835517 |
| authorships[12].author.orcid | https://orcid.org/0000-0003-4430-5036 |
| authorships[12].author.display_name | Xiaoqing Zheng |
| authorships[12].author_position | middle |
| authorships[12].raw_author_name | Zheng, Xiaoqing |
| authorships[12].is_corresponding | False |
| authorships[13].author.id | https://openalex.org/A5000409439 |
| authorships[13].author.orcid | https://orcid.org/0000-0003-1292-1491 |
| authorships[13].author.display_name | Jimmy Xiangji Huang |
| authorships[13].author_position | last |
| authorships[13].raw_author_name | Huang, Xuanjing |
| authorships[13].is_corresponding | False |
| has_content.pdf | True |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2407.01219 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Searching for Best Practices in Retrieval-Augmented Generation |
| has_fulltext | True |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10203 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.7422999739646912 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1710 |
| primary_topic.subfield.display_name | Information Systems |
| primary_topic.display_name | Recommender Systems and Techniques |
| related_works | https://openalex.org/W4391375266, https://openalex.org/W2748952813, https://openalex.org/W2390279801, https://openalex.org/W2358668433, https://openalex.org/W4396701345, https://openalex.org/W2376932109, https://openalex.org/W2001405890, https://openalex.org/W4396696052, https://openalex.org/W2382290278, https://openalex.org/W4395014643 |
| cited_by_count | 6 |
| counts_by_year[0].year | 2025 |
| counts_by_year[0].cited_by_count | 5 |
| counts_by_year[1].year | 2024 |
| counts_by_year[1].cited_by_count | 1 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2407.01219 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | cc-by |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2407.01219 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | |
| best_oa_location.license_id | https://openalex.org/licenses/cc-by |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2407.01219 |
| primary_location.id | pmh:oai:arXiv.org:2407.01219 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | cc-by |
| primary_location.pdf_url | https://arxiv.org/pdf/2407.01219 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | |
| primary_location.license_id | https://openalex.org/licenses/cc-by |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2407.01219 |
| publication_date | 2024-07-01 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 51, 121 |
| abstract_inverted_index.as | 123 |
| abstract_inverted_index.be | 7, 62 |
| abstract_inverted_index.in | 9, 20, 64 |
| abstract_inverted_index.of | 59, 117 |
| abstract_inverted_index.to | 6, 30, 77 |
| abstract_inverted_index.we | 68, 85, 99 |
| abstract_inverted_index.RAG | 25, 52, 71, 80, 91 |
| abstract_inverted_index.and | 15, 46, 73, 96, 113 |
| abstract_inverted_index.can | 61, 105 |
| abstract_inverted_index.for | 89 |
| abstract_inverted_index.the | 115 |
| abstract_inverted_index.been | 28 |
| abstract_inverted_index.both | 94 |
| abstract_inverted_index.each | 58 |
| abstract_inverted_index.from | 42 |
| abstract_inverted_index.have | 4, 27 |
| abstract_inverted_index.many | 24 |
| abstract_inverted_index.that | 92, 101 |
| abstract_inverted_index.(RAG) | 2 |
| abstract_inverted_index.Here, | 67 |
| abstract_inverted_index.While | 23 |
| abstract_inverted_index.about | 110 |
| abstract_inverted_index.large | 32 |
| abstract_inverted_index.still | 40 |
| abstract_inverted_index.their | 43, 74 |
| abstract_inverted_index.these | 38 |
| abstract_inverted_index.using | 120 |
| abstract_inverted_index.ways. | 66 |
| abstract_inverted_index.which | 60 |
| abstract_inverted_index.inputs | 112 |
| abstract_inverted_index.models | 34 |
| abstract_inverted_index.proven | 5 |
| abstract_inverted_index.steps, | 57 |
| abstract_inverted_index.suffer | 41 |
| abstract_inverted_index.times. | 49 |
| abstract_inverted_index.visual | 111 |
| abstract_inverted_index.Through | 82 |
| abstract_inverted_index.balance | 93 |
| abstract_inverted_index.complex | 44 |
| abstract_inverted_index.content | 119 |
| abstract_inverted_index.enhance | 31, 107 |
| abstract_inverted_index.optimal | 79 |
| abstract_inverted_index.several | 87 |
| abstract_inverted_index.suggest | 86 |
| abstract_inverted_index.through | 35 |
| abstract_inverted_index.various | 65 |
| abstract_inverted_index.domains. | 22 |
| abstract_inverted_index.executed | 63 |
| abstract_inverted_index.existing | 70 |
| abstract_inverted_index.identify | 78 |
| abstract_inverted_index.involves | 54 |
| abstract_inverted_index.language | 33 |
| abstract_inverted_index.multiple | 55 |
| abstract_inverted_index.proposed | 29 |
| abstract_inverted_index.quality, | 18 |
| abstract_inverted_index.response | 17, 48 |
| abstract_inverted_index.workflow | 53 |
| abstract_inverted_index.Moreover, | 98 |
| abstract_inverted_index.deploying | 90 |
| abstract_inverted_index.effective | 8 |
| abstract_inverted_index.enhancing | 16 |
| abstract_inverted_index.extensive | 83 |
| abstract_inverted_index.potential | 75 |
| abstract_inverted_index.prolonged | 47 |
| abstract_inverted_index.retrieval | 103 |
| abstract_inverted_index.strategy. | 125 |
| abstract_inverted_index."retrieval | 122 |
| abstract_inverted_index.Typically, | 50 |
| abstract_inverted_index.accelerate | 114 |
| abstract_inverted_index.approaches | 26, 39, 72 |
| abstract_inverted_index.generation | 1, 116 |
| abstract_inverted_index.mitigating | 13 |
| abstract_inverted_index.multimodal | 102, 118 |
| abstract_inverted_index.practices. | 81 |
| abstract_inverted_index.processing | 56 |
| abstract_inverted_index.strategies | 88 |
| abstract_inverted_index.techniques | 3, 104 |
| abstract_inverted_index.up-to-date | 11 |
| abstract_inverted_index.demonstrate | 100 |
| abstract_inverted_index.efficiency. | 97 |
| abstract_inverted_index.generation" | 124 |
| abstract_inverted_index.integrating | 10 |
| abstract_inverted_index.investigate | 69 |
| abstract_inverted_index.performance | 95 |
| abstract_inverted_index.retrievals, | 37 |
| abstract_inverted_index.specialized | 21 |
| abstract_inverted_index.capabilities | 109 |
| abstract_inverted_index.combinations | 76 |
| abstract_inverted_index.experiments, | 84 |
| abstract_inverted_index.information, | 12 |
| abstract_inverted_index.particularly | 19 |
| abstract_inverted_index.significantly | 106 |
| abstract_inverted_index.implementation | 45 |
| abstract_inverted_index.hallucinations, | 14 |
| abstract_inverted_index.query-dependent | 36 |
| abstract_inverted_index.question-answering | 108 |
| abstract_inverted_index.Retrieval-augmented | 0 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 14 |
| citation_normalized_percentile | |
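
The `abstract_inverted_index.*` rows in the table above store the abstract as a map from each token to the word positions where it occurs. As a minimal sketch of how such an index can be turned back into running text, the Python below inverts the map and joins the tokens in position order. The function name `rebuild_abstract` and the `sample` dictionary are illustrative assumptions, not part of OpenAlex; the sample keeps only positions 0 to 8 from the rows above so that the excerpt is contiguous.

```python
def rebuild_abstract(inverted_index: dict[str, list[int]]) -> str:
    """Rebuild text from a token -> positions map (hypothetical helper)."""
    # Invert the mapping: each word position points back to its token.
    by_position: dict[int, str] = {}
    for token, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = token
    # Join the tokens in ascending position order.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Excerpt of the rows above, limited to positions 0-8 (later positions
# such as 116 for "generation" are dropped here for brevity).
sample = {
    "Retrieval-augmented": [0],
    "generation": [1],
    "(RAG)": [2],
    "techniques": [3],
    "have": [4],
    "proven": [5],
    "to": [6],
    "be": [7],
    "effective": [8],
}

print(rebuild_abstract(sample))
```

Run on the full set of rows, the same approach should reproduce the abstract shown at the top of this page, provided every position from 0 to 125 is present in the index.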