FLAME: Factuality-Aware Alignment for Large Language Models
2024 · Open Access · DOI: https://doi.org/10.48550/arxiv.2405.01525
Alignment is a standard procedure to fine-tune pre-trained large language models (LLMs) to follow natural language instructions and serve as helpful AI assistants. We have observed, however, that the conventional alignment process fails to enhance the factual accuracy of LLMs and often leads to the generation of more false facts (i.e., hallucination). In this paper, we study how to make the LLM alignment process more factual by first identifying factors that lead to hallucination in both alignment steps: supervised fine-tuning (SFT) and reinforcement learning (RL). In particular, we find that training the LLM on new knowledge or unfamiliar texts can encourage hallucination. This makes SFT less factual, as it trains on human-labeled data that may be novel to the LLM. Furthermore, reward functions used in standard RL can also encourage hallucination, because they guide the LLM to provide more helpful responses on a diverse set of instructions, often preferring longer and more detailed responses. Based on these observations, we propose factuality-aware alignment, comprised of factuality-aware SFT and factuality-aware RL through direct preference optimization. Experiments show that our proposed factuality-aware alignment guides LLMs to output more factual responses while maintaining instruction-following capability.
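The abstract names direct preference optimization (DPO) as the RL step but gives no formula. As a rough illustration only, the sketch below computes the standard DPO loss for a single preference pair. It is not the paper's factuality-aware variant, and the function name, argument names, and the default `beta` are assumptions made for this example.

```python
import math


def dpo_loss(policy_chosen_logp: float,
             policy_rejected_logp: float,
             ref_chosen_logp: float,
             ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one (chosen, rejected) response pair.

    Inputs are summed token log-probabilities of each response under the
    policy being trained and under a frozen reference model.
    NOTE: generic DPO only; how FLAME builds its preference pairs is not
    described in the abstract and is not modelled here.
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)) equals softplus(-margin); use the stable form.
    x = -margin
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# With identical policy and reference scores the margin is zero,
# so the loss is log(2).
neutral = dpo_loss(0.0, 0.0, 0.0, 0.0)
# Raising the chosen response relative to the reference lowers the loss.
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)
print(neutral, improved)
```

The loss falls as the policy assigns relatively more probability to the preferred response than the reference model does, which is why the choice of preference pairs determines what behavior is reinforced.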
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2405.01525, https://arxiv.org/pdf/2405.01525
- OA Status: green
- Cited By: 3
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4396651341
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4396651341 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2405.01525 (Digital Object Identifier)
- Title: FLAME: Factuality-Aware Alignment for Large Language Models (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2024 (year of publication)
- Publication date: 2024-05-02 (full publication date if available)
- Authors: Sheng-Chieh Lin, Luyu Gao, Barlas Oğuz, Wenhan Xiong, Jimmy Lin, Wen-tau Yih, Xilun Chen (list of authors in order)
- Landing page: https://arxiv.org/abs/2405.01525 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2405.01525 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2405.01525 (direct OA link when available)
- Concepts: Computer science, Natural language processing, Artificial intelligence, Linguistics, Philosophy (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 3 (total citation count in OpenAlex)
- Citations by year (recent): 2025: 1, 2024: 2 (per-year citation counts, last 5 years)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4396651341 |
| doi | https://doi.org/10.48550/arxiv.2405.01525 |
| ids.doi | https://doi.org/10.48550/arxiv.2405.01525 |
| ids.openalex | https://openalex.org/W4396651341 |
| fwci | |
| type | preprint |
| title | FLAME: Factuality-Aware Alignment for Large Language Models |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10028 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9952999949455261 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Topic Modeling |
| topics[1].id | https://openalex.org/T10181 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9789999723434448 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Natural Language Processing Techniques |
| topics[2].id | https://openalex.org/T13910 |
| topics[2].field.id | https://openalex.org/fields/33 |
| topics[2].field.display_name | Social Sciences |
| topics[2].score | 0.9714999794960022 |
| topics[2].domain.id | https://openalex.org/domains/2 |
| topics[2].domain.display_name | Social Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/3300 |
| topics[2].subfield.display_name | General Social Sciences |
| topics[2].display_name | Computational and Text Analysis Methods |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C41008148 |
| concepts[0].level | 0 |
| concepts[0].score | 0.5570462942123413 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[0].display_name | Computer science |
| concepts[1].id | https://openalex.org/C204321447 |
| concepts[1].level | 1 |
| concepts[1].score | 0.5001060962677002 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q30642 |
| concepts[1].display_name | Natural language processing |
| concepts[2].id | https://openalex.org/C154945302 |
| concepts[2].level | 1 |
| concepts[2].score | 0.4350415766239166 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[2].display_name | Artificial intelligence |
| concepts[3].id | https://openalex.org/C41895202 |
| concepts[3].level | 1 |
| concepts[3].score | 0.3727790117263794 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q8162 |
| concepts[3].display_name | Linguistics |
| concepts[4].id | https://openalex.org/C138885662 |
| concepts[4].level | 0 |
| concepts[4].score | 0.11257016658782959 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q5891 |
| concepts[4].display_name | Philosophy |
| keywords[0].id | https://openalex.org/keywords/computer-science |
| keywords[0].score | 0.5570462942123413 |
| keywords[0].display_name | Computer science |
| keywords[1].id | https://openalex.org/keywords/natural-language-processing |
| keywords[1].score | 0.5001060962677002 |
| keywords[1].display_name | Natural language processing |
| keywords[2].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[2].score | 0.4350415766239166 |
| keywords[2].display_name | Artificial intelligence |
| keywords[3].id | https://openalex.org/keywords/linguistics |
| keywords[3].score | 0.3727790117263794 |
| keywords[3].display_name | Linguistics |
| keywords[4].id | https://openalex.org/keywords/philosophy |
| keywords[4].score | 0.11257016658782959 |
| keywords[4].display_name | Philosophy |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2405.01525 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2405.01525 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2405.01525 |
| locations[1].id | doi:10.48550/arxiv.2405.01525 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2405.01525 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5032699557 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-7989-9703 |
| authorships[0].author.display_name | Sheng-Chieh Lin |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Lin, Sheng-Chieh |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5060885692 |
| authorships[1].author.orcid | https://orcid.org/0009-0006-5806-3022 |
| authorships[1].author.display_name | Luyu Gao |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Gao, Luyu |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5071728146 |
| authorships[2].author.orcid | |
| authorships[2].author.display_name | Barlas Oğuz |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Oguz, Barlas |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5110635444 |
| authorships[3].author.orcid | |
| authorships[3].author.display_name | Wenhan Xiong |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Xiong, Wenhan |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5082997975 |
| authorships[4].author.orcid | https://orcid.org/0000-0002-0661-7189 |
| authorships[4].author.display_name | Jimmy Lin |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Lin, Jimmy |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5066873932 |
| authorships[5].author.orcid | |
| authorships[5].author.display_name | Wen-tau Yih |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Yih, Wen-tau |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5065674779 |
| authorships[6].author.orcid | |
| authorships[6].author.display_name | Xilun Chen |
| authorships[6].author_position | last |
| authorships[6].raw_author_name | Chen, Xilun |
| authorships[6].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2405.01525 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | FLAME: Factuality-Aware Alignment for Large Language Models |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10028 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9952999949455261 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Topic Modeling |
| related_works | https://openalex.org/W4391375266, https://openalex.org/W2748952813, https://openalex.org/W2390279801, https://openalex.org/W2358668433, https://openalex.org/W2376932109, https://openalex.org/W2001405890, https://openalex.org/W2382290278, https://openalex.org/W4395014643, https://openalex.org/W4391913857, https://openalex.org/W3204019825 |
| cited_by_count | 3 |
| counts_by_year[0].year | 2025 |
| counts_by_year[0].cited_by_count | 1 |
| counts_by_year[1].year | 2024 |
| counts_by_year[1].cited_by_count | 2 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2405.01525 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2405.01525 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2405.01525 |
| primary_location.id | pmh:oai:arXiv.org:2405.01525 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2405.01525 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2405.01525 |
| publication_date | 2024-05-02 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 2, 143 |
| abstract_inverted_index.AI | 21 |
| abstract_inverted_index.In | 52, 85 |
| abstract_inverted_index.RL | 127, 169 |
| abstract_inverted_index.We | 23 |
| abstract_inverted_index.as | 19, 107 |
| abstract_inverted_index.be | 116 |
| abstract_inverted_index.by | 66 |
| abstract_inverted_index.in | 74, 125 |
| abstract_inverted_index.is | 1 |
| abstract_inverted_index.it | 108, 133 |
| abstract_inverted_index.of | 38, 46, 146, 164 |
| abstract_inverted_index.on | 93, 110, 142, 156 |
| abstract_inverted_index.or | 96 |
| abstract_inverted_index.to | 5, 12, 33, 43, 58, 72, 118, 137, 183 |
| abstract_inverted_index.we | 55, 87, 159 |
| abstract_inverted_index.LLM | 61, 92, 136 |
| abstract_inverted_index.SFT | 104, 166 |
| abstract_inverted_index.and | 17, 40, 81, 151, 167 |
| abstract_inverted_index.can | 99, 128 |
| abstract_inverted_index.how | 57 |
| abstract_inverted_index.may | 115 |
| abstract_inverted_index.new | 94 |
| abstract_inverted_index.our | 177 |
| abstract_inverted_index.set | 145 |
| abstract_inverted_index.the | 28, 35, 44, 60, 91, 119, 135 |
| abstract_inverted_index.LLM. | 120 |
| abstract_inverted_index.LLMs | 182 |
| abstract_inverted_index.This | 102 |
| abstract_inverted_index.also | 129 |
| abstract_inverted_index.both | 75 |
| abstract_inverted_index.data | 113 |
| abstract_inverted_index.find | 88 |
| abstract_inverted_index.have | 24 |
| abstract_inverted_index.lead | 71 |
| abstract_inverted_index.less | 105 |
| abstract_inverted_index.make | 59 |
| abstract_inverted_index.more | 47, 64, 139, 152, 185 |
| abstract_inverted_index.show | 175 |
| abstract_inverted_index.that | 27, 70, 89, 114, 176 |
| abstract_inverted_index.this | 53 |
| abstract_inverted_index.used | 124 |
| abstract_inverted_index.(RL). | 84 |
| abstract_inverted_index.(SFT) | 80 |
| abstract_inverted_index.(i.e. | 50 |
| abstract_inverted_index.Based | 155 |
| abstract_inverted_index.LLMs, | 39 |
| abstract_inverted_index.facts | 49 |
| abstract_inverted_index.fails | 32 |
| abstract_inverted_index.false | 48 |
| abstract_inverted_index.first | 67 |
| abstract_inverted_index.human | 111 |
| abstract_inverted_index.large | 8 |
| abstract_inverted_index.leads | 42 |
| abstract_inverted_index.makes | 103 |
| abstract_inverted_index.novel | 117 |
| abstract_inverted_index.often | 41, 148 |
| abstract_inverted_index.serve | 18 |
| abstract_inverted_index.study | 56 |
| abstract_inverted_index.texts | 98 |
| abstract_inverted_index.these | 157 |
| abstract_inverted_index.while | 188 |
| abstract_inverted_index.(LLMs) | 11 |
| abstract_inverted_index.direct | 171 |
| abstract_inverted_index.follow | 13 |
| abstract_inverted_index.guides | 134, 181 |
| abstract_inverted_index.longer | 150 |
| abstract_inverted_index.models | 10 |
| abstract_inverted_index.output | 184 |
| abstract_inverted_index.paper, | 54 |
| abstract_inverted_index.reward | 122 |
| abstract_inverted_index.trains | 109 |
| abstract_inverted_index.because | 132 |
| abstract_inverted_index.diverse | 144 |
| abstract_inverted_index.enhance | 34 |
| abstract_inverted_index.factors | 69 |
| abstract_inverted_index.factual | 36, 106, 186 |
| abstract_inverted_index.helpful | 20, 140 |
| abstract_inverted_index.labeled | 112 |
| abstract_inverted_index.natural | 14 |
| abstract_inverted_index.process | 31, 63 |
| abstract_inverted_index.propose | 160 |
| abstract_inverted_index.provide | 138 |
| abstract_inverted_index.steps:\ | 77 |
| abstract_inverted_index.through | 170 |
| abstract_inverted_index.accuracy | 37 |
| abstract_inverted_index.detailed | 153 |
| abstract_inverted_index.factual, | 65 |
| abstract_inverted_index.however, | 26 |
| abstract_inverted_index.language | 9, 15 |
| abstract_inverted_index.learning | 83 |
| abstract_inverted_index.proposed | 178 |
| abstract_inverted_index.standard | 3, 126 |
| abstract_inverted_index.training | 90 |
| abstract_inverted_index.Alignment | 0 |
| abstract_inverted_index.alignment | 30, 62, 76, 180 |
| abstract_inverted_index.comprised | 163 |
| abstract_inverted_index.encourage | 100, 130 |
| abstract_inverted_index.fine-tune | 6 |
| abstract_inverted_index.functions | 123 |
| abstract_inverted_index.knowledge | 95 |
| abstract_inverted_index.observed, | 25 |
| abstract_inverted_index.procedure | 4 |
| abstract_inverted_index.responses | 141, 187 |
| abstract_inverted_index.alignment, | 162 |
| abstract_inverted_index.generation | 45 |
| abstract_inverted_index.preference | 172 |
| abstract_inverted_index.preferring | 149 |
| abstract_inverted_index.responses. | 154 |
| abstract_inverted_index.supervised | 78 |
| abstract_inverted_index.unfamiliar | 97 |
| abstract_inverted_index.Experiments | 174 |
| abstract_inverted_index.assistants. | 22 |
| abstract_inverted_index.capability. | 191 |
| abstract_inverted_index.fine-tuning | 79 |
| abstract_inverted_index.identifying | 68 |
| abstract_inverted_index.maintaining | 189 |
| abstract_inverted_index.particular, | 86 |
| abstract_inverted_index.pre-trained | 7 |
| abstract_inverted_index.Furthermore, | 121 |
| abstract_inverted_index.conventional | 29 |
| abstract_inverted_index.instructions | 16 |
| abstract_inverted_index.hallucination | 73 |
| abstract_inverted_index.instructions, | 147 |
| abstract_inverted_index.observations, | 158 |
| abstract_inverted_index.optimization. | 173 |
| abstract_inverted_index.reinforcement | 82 |
| abstract_inverted_index.hallucination, | 131 |
| abstract_inverted_index.hallucination. | 101 |
| abstract_inverted_index.hallucination). | 51 |
| abstract_inverted_index.factuality-aware | 161, 165, 168, 179 |
| abstract_inverted_index.instruction-following | 190 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 7 |
| citation_normalized_percentile | |
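
The `abstract_inverted_index.*` rows in the payload store the abstract as a map from each token to the word positions where it occurs. A minimal sketch of turning such rows back into running text follows. The `rebuild_abstract` helper and the `rows` sample are illustrative assumptions, and the sample is trimmed to positions 0 through 6 so the result is easy to check by eye.

```python
PREFIX = "abstract_inverted_index."


def rebuild_abstract(rows: list[tuple[str, str]]) -> str:
    """Rebuild text from flattened (field, value) rows of the payload table.

    Each relevant row looks like ("abstract_inverted_index.to", "5, 12, 33"):
    the token follows the prefix and the value lists its word positions.
    """
    positions: dict[int, str] = {}
    for field, value in rows:
        if not field.startswith(PREFIX):
            continue  # skip unrelated payload fields
        # Strip only the prefix, so tokens containing dots ("LLM.") survive.
        token = field[len(PREFIX):]
        for pos in value.split(","):
            positions[int(pos)] = token
    # Emit tokens in position order, separated by single spaces.
    return " ".join(positions[i] for i in sorted(positions))


# Sample rows, trimmed to the first seven word positions of the abstract.
rows = [
    ("abstract_inverted_index.a", "2"),
    ("abstract_inverted_index.is", "1"),
    ("abstract_inverted_index.to", "5"),
    ("abstract_inverted_index.standard", "3"),
    ("abstract_inverted_index.Alignment", "0"),
    ("abstract_inverted_index.fine-tune", "6"),
    ("abstract_inverted_index.procedure", "4"),
    ("language", "en"),
]
print(rebuild_abstract(rows))  # prints "Alignment is a standard procedure to fine-tune"
```

Feeding in every `abstract_inverted_index.*` row with its full position list should reproduce the abstract as tokenized by OpenAlex, punctuation attached to words included.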