ULLME: A Unified Framework for Large Language Model Embeddings with Generation-Augmented Learning
2024 · Open Access
· DOI: https://doi.org/10.48550/arxiv.2408.03402
Large Language Models (LLMs) excel in various natural language processing tasks, but leveraging them for dense passage embedding remains challenging. This is due to their causal attention mechanism and the misalignment between their pre-training objectives and the text ranking tasks. Despite some recent efforts to address these issues, existing frameworks for LLM-based text embeddings have been limited by their support for only a limited range of LLM architectures and fine-tuning strategies, limiting their practical application and versatility. In this work, we introduce the Unified framework for Large Language Model Embedding (ULLME), a flexible, plug-and-play implementation that enables bidirectional attention across various LLMs and supports a range of fine-tuning strategies. We also propose Generation-augmented Representation Learning (GRL), a novel fine-tuning method to boost LLMs for text embedding tasks. GRL enforces consistency between representation-based and generation-based relevance scores, leveraging LLMs' powerful generative abilities for learning passage embeddings. To showcase our framework's flexibility and effectiveness, we release three pre-trained models from ULLME with different backbone architectures, ranging from 1.5B to 8B parameters, all of which demonstrate strong performance on the Massive Text Embedding Benchmark. Our framework is publicly available at: https://github.com/nlp-uoregon/ullme. A demo video for ULLME can also be found at https://rb.gy/ws1ile.
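The abstract says GRL enforces consistency between representation-based and generation-based relevance scores but does not give the loss. The sketch below shows one plausible reading under stated assumptions: the representation-based score is the cosine similarity between query and passage embeddings, the generation-based score is a log-likelihood the LLM assigns to each passage given the query, and consistency is measured with a KL divergence between the two softmax-normalized score distributions. The function names, the temperature parameter, and the choice of KL divergence are all illustrative assumptions, not the ULLME API or the authors' exact formulation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def softmax(scores, temperature=1.0):
    """Turn raw scores into a probability distribution (numerically stable)."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same candidates."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def consistency_loss(query_emb, passage_embs, gen_scores, temperature=1.0):
    """Hypothetical consistency term: how far the embedding-based ranking
    distribution is from the generation-based one (assumed form, not ULLME's)."""
    rep_scores = [cosine(query_emb, p) for p in passage_embs]
    rep_dist = softmax(rep_scores, temperature)
    gen_dist = softmax(gen_scores, temperature)
    return kl_divergence(gen_dist, rep_dist)

# Toy data: one query, two candidate passages, made-up generation scores.
query = [1.0, 0.0]
passages = [[1.0, 0.0], [0.0, 1.0]]
agreeing = consistency_loss(query, passages, gen_scores=[1.0, 0.0])
conflicting = consistency_loss(query, passages, gen_scores=[0.0, 1.0])
print(agreeing, conflicting)
```

When the two scoring routes rank the passages identically the term vanishes, and it grows as they disagree, which is the behavior a consistency objective of this kind is meant to have.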
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2408.03402, https://arxiv.org/pdf/2408.03402
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4403965028
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4403965028 (Canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2408.03402 (Digital Object Identifier)
- Title: ULLME: A Unified Framework for Large Language Model Embeddings with Generation-Augmented Learning (Work title)
- Type: preprint (OpenAlex work type)
- Language: en (Primary language)
- Publication year: 2024 (Year of publication)
- Publication date: 2024-08-06 (Full publication date if available)
- Authors: Hieu Man, Nghia Trung Ngo, Franck Dernoncourt, Thien Huu Nguyen (List of authors in order)
- Landing page: https://arxiv.org/abs/2408.03402 (Publisher landing page)
- PDF URL: https://arxiv.org/pdf/2408.03402 (Direct link to full text PDF)
- Open access: Yes (Whether a free full text is available)
- OA status: green (Open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2408.03402 (Direct OA link when available)
- Concepts: Computer science, Augmented reality, Natural language processing, Artificial intelligence (Top concepts, fields or topics, attached by OpenAlex)
- Cited by: 0 (Total citation count in OpenAlex)
- Related works (count): 10 (Other works algorithmically related by OpenAlex)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4403965028 |
| doi | https://doi.org/10.48550/arxiv.2408.03402 |
| ids.doi | https://doi.org/10.48550/arxiv.2408.03402 |
| ids.openalex | https://openalex.org/W4403965028 |
| fwci | |
| type | preprint |
| title | ULLME: A Unified Framework for Large Language Model Embeddings with Generation-Augmented Learning |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10028 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.987500011920929 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Topic Modeling |
| topics[1].id | https://openalex.org/T10181 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9769999980926514 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Natural Language Processing Techniques |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C41008148 |
| concepts[0].level | 0 |
| concepts[0].score | 0.598594069480896 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[0].display_name | Computer science |
| concepts[1].id | https://openalex.org/C153715457 |
| concepts[1].level | 2 |
| concepts[1].score | 0.456174373626709 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q254183 |
| concepts[1].display_name | Augmented reality |
| concepts[2].id | https://openalex.org/C204321447 |
| concepts[2].level | 1 |
| concepts[2].score | 0.3985273838043213 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q30642 |
| concepts[2].display_name | Natural language processing |
| concepts[3].id | https://openalex.org/C154945302 |
| concepts[3].level | 1 |
| concepts[3].score | 0.3815784454345703 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[3].display_name | Artificial intelligence |
| keywords[0].id | https://openalex.org/keywords/computer-science |
| keywords[0].score | 0.598594069480896 |
| keywords[0].display_name | Computer science |
| keywords[1].id | https://openalex.org/keywords/augmented-reality |
| keywords[1].score | 0.456174373626709 |
| keywords[1].display_name | Augmented reality |
| keywords[2].id | https://openalex.org/keywords/natural-language-processing |
| keywords[2].score | 0.3985273838043213 |
| keywords[2].display_name | Natural language processing |
| keywords[3].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[3].score | 0.3815784454345703 |
| keywords[3].display_name | Artificial intelligence |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2408.03402 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | cc-by |
| locations[0].pdf_url | https://arxiv.org/pdf/2408.03402 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | https://openalex.org/licenses/cc-by |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2408.03402 |
| locations[1].id | doi:10.48550/arxiv.2408.03402 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2408.03402 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5083073497 |
| authorships[0].author.orcid | https://orcid.org/0009-0005-7236-6890 |
| authorships[0].author.display_name | Hieu Man |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Man, Hieu |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5087736585 |
| authorships[1].author.orcid | https://orcid.org/0009-0003-4314-5365 |
| authorships[1].author.display_name | Nghia Trung Ngo |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Ngo, Nghia Trung |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5028863551 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-1119-1346 |
| authorships[2].author.display_name | Franck Dernoncourt |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Dernoncourt, Franck |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5026113034 |
| authorships[3].author.orcid | https://orcid.org/0000-0003-3768-4736 |
| authorships[3].author.display_name | Thien Huu Nguyen |
| authorships[3].author_position | last |
| authorships[3].raw_author_name | Nguyen, Thien Huu |
| authorships[3].is_corresponding | False |
| has_content.pdf | True |
| has_content.grobid_xml | True |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2408.03402 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2024-11-01T00:00:00 |
| display_name | ULLME: A Unified Framework for Large Language Model Embeddings with Generation-Augmented Learning |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10028 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.987500011920929 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Topic Modeling |
| related_works | https://openalex.org/W4391375266, https://openalex.org/W2899084033, https://openalex.org/W2748952813, https://openalex.org/W2172197285, https://openalex.org/W2991048842, https://openalex.org/W2750280393, https://openalex.org/W2355696739, https://openalex.org/W3158001554, https://openalex.org/W2771909920, https://openalex.org/W3204019825 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2408.03402 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | cc-by |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2408.03402 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | https://openalex.org/licenses/cc-by |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2408.03402 |
| primary_location.id | pmh:oai:arXiv.org:2408.03402 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | cc-by |
| primary_location.pdf_url | https://arxiv.org/pdf/2408.03402 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | https://openalex.org/licenses/cc-by |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2408.03402 |
| publication_date | 2024-08-06 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.A | 188 |
| abstract_inverted_index.a | 62, 91, 104, 116 |
| abstract_inverted_index.8B | 167 |
| abstract_inverted_index.In | 77 |
| abstract_inverted_index.To | 145 |
| abstract_inverted_index.We | 109 |
| abstract_inverted_index.at | 197 |
| abstract_inverted_index.be | 195 |
| abstract_inverted_index.by | 57 |
| abstract_inverted_index.in | 5 |
| abstract_inverted_index.is | 21, 183 |
| abstract_inverted_index.of | 65, 106, 170 |
| abstract_inverted_index.on | 175 |
| abstract_inverted_index.to | 23, 44, 120, 166 |
| abstract_inverted_index.we | 80, 152 |
| abstract_inverted_index.GRL | 127 |
| abstract_inverted_index.LLM | 66 |
| abstract_inverted_index.Our | 181 |
| abstract_inverted_index.all | 169 |
| abstract_inverted_index.and | 28, 35, 68, 75, 102, 132, 150 |
| abstract_inverted_index.at: | 186 |
| abstract_inverted_index.but | 11 |
| abstract_inverted_index.can | 193 |
| abstract_inverted_index.due | 22 |
| abstract_inverted_index.for | 14, 50, 60, 85, 123, 141, 191 |
| abstract_inverted_index.our | 147 |
| abstract_inverted_index.the | 29, 36, 82, 176 |
| abstract_inverted_index.1.5B | 165 |
| abstract_inverted_index.LLMs | 101, 122 |
| abstract_inverted_index.Text | 178 |
| abstract_inverted_index.This | 20 |
| abstract_inverted_index.also | 110, 194 |
| abstract_inverted_index.been | 55 |
| abstract_inverted_index.demo | 189 |
| abstract_inverted_index.from | 157, 164 |
| abstract_inverted_index.have | 54 |
| abstract_inverted_index.only | 61 |
| abstract_inverted_index.some | 41 |
| abstract_inverted_index.text | 37, 52, 124 |
| abstract_inverted_index.that | 95 |
| abstract_inverted_index.them | 13 |
| abstract_inverted_index.this | 78 |
| abstract_inverted_index.with | 159 |
| abstract_inverted_index.LLMs' | 137 |
| abstract_inverted_index.Large | 0, 86 |
| abstract_inverted_index.Model | 88 |
| abstract_inverted_index.ULLME | 158, 192 |
| abstract_inverted_index.boost | 121 |
| abstract_inverted_index.dense | 15 |
| abstract_inverted_index.excel | 4 |
| abstract_inverted_index.found | 196 |
| abstract_inverted_index.novel | 117 |
| abstract_inverted_index.range | 64, 105 |
| abstract_inverted_index.their | 24, 32, 58, 72 |
| abstract_inverted_index.these | 46 |
| abstract_inverted_index.three | 154 |
| abstract_inverted_index.video | 190 |
| abstract_inverted_index.which | 171 |
| abstract_inverted_index.work, | 79 |
| abstract_inverted_index.(GRL), | 115 |
| abstract_inverted_index.(LLMs) | 3 |
| abstract_inverted_index.Models | 2 |
| abstract_inverted_index.across | 99 |
| abstract_inverted_index.causal | 25 |
| abstract_inverted_index.method | 119 |
| abstract_inverted_index.models | 156 |
| abstract_inverted_index.recent | 42 |
| abstract_inverted_index.strong | 173 |
| abstract_inverted_index.tasks, | 10 |
| abstract_inverted_index.tasks. | 39, 126 |
| abstract_inverted_index.Despite | 40 |
| abstract_inverted_index.Massive | 177 |
| abstract_inverted_index.Unified | 83 |
| abstract_inverted_index.address | 45 |
| abstract_inverted_index.between | 31, 130 |
| abstract_inverted_index.efforts | 43 |
| abstract_inverted_index.enables | 96 |
| abstract_inverted_index.issues, | 47 |
| abstract_inverted_index.limited | 56, 63 |
| abstract_inverted_index.natural | 7 |
| abstract_inverted_index.passage | 16, 143 |
| abstract_inverted_index.propose | 111 |
| abstract_inverted_index.ranging | 163 |
| abstract_inverted_index.ranking | 38 |
| abstract_inverted_index.release | 153 |
| abstract_inverted_index.remains | 18 |
| abstract_inverted_index.scores, | 135 |
| abstract_inverted_index.support | 59 |
| abstract_inverted_index.various | 6, 100 |
| abstract_inverted_index.(ULLME), | 90 |
| abstract_inverted_index.Language | 1, 87 |
| abstract_inverted_index.Learning | 114 |
| abstract_inverted_index.backbone | 161 |
| abstract_inverted_index.enforces | 128 |
| abstract_inverted_index.existing | 48 |
| abstract_inverted_index.language | 8 |
| abstract_inverted_index.learning | 142 |
| abstract_inverted_index.limiting | 71 |
| abstract_inverted_index.powerful | 138 |
| abstract_inverted_index.publicly | 184 |
| abstract_inverted_index.showcase | 146 |
| abstract_inverted_index.supports | 103 |
| abstract_inverted_index.Embedding | 89, 179 |
| abstract_inverted_index.LLM-based | 51 |
| abstract_inverted_index.abilities | 140 |
| abstract_inverted_index.attention | 26, 98 |
| abstract_inverted_index.available | 185 |
| abstract_inverted_index.different | 160 |
| abstract_inverted_index.embedding | 17, 125 |
| abstract_inverted_index.flexible, | 92 |
| abstract_inverted_index.framework | 84, 182 |
| abstract_inverted_index.introduce | 81 |
| abstract_inverted_index.mechanism | 27 |
| abstract_inverted_index.practical | 73 |
| abstract_inverted_index.relevance | 134 |
| abstract_inverted_index.Benchmark. | 180 |
| abstract_inverted_index.embeddings | 53 |
| abstract_inverted_index.frameworks | 49 |
| abstract_inverted_index.generative | 139 |
| abstract_inverted_index.leveraging | 12, 136 |
| abstract_inverted_index.objectives | 34 |
| abstract_inverted_index.processing | 9 |
| abstract_inverted_index.application | 74 |
| abstract_inverted_index.consistency | 129 |
| abstract_inverted_index.demonstrate | 172 |
| abstract_inverted_index.embeddings. | 144 |
| abstract_inverted_index.fine-tuning | 69, 107, 118 |
| abstract_inverted_index.flexibility | 149 |
| abstract_inverted_index.framework's | 148 |
| abstract_inverted_index.parameters, | 168 |
| abstract_inverted_index.performance | 174 |
| abstract_inverted_index.pre-trained | 155 |
| abstract_inverted_index.strategies, | 70 |
| abstract_inverted_index.strategies. | 108 |
| abstract_inverted_index.challenging. | 19 |
| abstract_inverted_index.misalignment | 30 |
| abstract_inverted_index.pre-training | 33 |
| abstract_inverted_index.versatility. | 76 |
| abstract_inverted_index.architectures | 67 |
| abstract_inverted_index.bidirectional | 97 |
| abstract_inverted_index.plug-and-play | 93 |
| abstract_inverted_index.Representation | 113 |
| abstract_inverted_index.architectures, | 162 |
| abstract_inverted_index.effectiveness, | 151 |
| abstract_inverted_index.implementation | 94 |
| abstract_inverted_index.generation-based | 133 |
| abstract_inverted_index.Generation-augmented | 112 |
| abstract_inverted_index.representation-based | 131 |
| abstract_inverted_index.https://rb.gy/ws1ile. | 198 |
| abstract_inverted_index.https://github.com/nlp-uoregon/ullme. | 187 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 4 |
| citation_normalized_percentile | |
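The `abstract_inverted_index.*` rows in the table above store the abstract as a map from each token to the word positions where it occurs. A minimal sketch of how the running text can be recovered from that structure follows; the function name `rebuild_abstract` is illustrative, and the sample dictionary is a hand-copied excerpt restricted to positions 0 through 10, not the full payload.

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a {token: [positions]} inverted index."""
    by_position = {}
    for token, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = token
    # Emit tokens in ascending position order, separated by single spaces.
    return " ".join(by_position[pos] for pos in sorted(by_position))

# Excerpt of the rows above, limited to the first eleven word positions.
sample = {
    "Large": [0],
    "Language": [1],
    "Models": [2],
    "(LLMs)": [3],
    "excel": [4],
    "in": [5],
    "various": [6],
    "natural": [7],
    "language": [8],
    "processing": [9],
    "tasks,": [10],
}

opening = rebuild_abstract(sample)
print(opening)
# Large Language Models (LLMs) excel in various natural language processing tasks,
```

Tokens keep their original punctuation and casing, so "tasks," and "tasks." are separate keys, and a token that occurs several times simply lists several positions.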