GNNavi: Navigating the Information Flow in Large Language Models by Graph Neural Network

Shuzhou Yuan, Ercong Nie, Michael Färber, Helmut Schmid, Hinrich Schütze

2024 · Open Access · DOI: https://doi.org/10.48550/arxiv.2402.11709

Large Language Models (LLMs) exhibit strong In-Context Learning (ICL) capabilities when prompts with demonstrations are used. However, fine-tuning still remains crucial to further enhance their adaptability. Prompt-based fine-tuning proves to be an effective fine-tuning method in low-data scenarios, but high demands on computing resources limit its practicality. We address this issue by introducing a prompt-based parameter-efficient fine-tuning (PEFT) approach. GNNavi leverages insights into ICL's information flow dynamics, which indicate that label words act in prompts as anchors for information propagation. GNNavi employs a Graph Neural Network (GNN) layer to precisely guide the aggregation and distribution of information flow during the processing of prompts by hardwiring the desired information flow into the GNN. Our experiments on text classification tasks with GPT-2 and Llama2 show GNNavi surpasses standard prompt-based fine-tuning methods in few-shot settings by updating just 0.2% to 0.5% of parameters. We compare GNNavi with prevalent PEFT approaches, such as prefix tuning, LoRA and Adapter in terms of performance and efficiency. Our analysis reveals that GNNavi enhances information flow and ensures a clear aggregation process.
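
The abstract's idea of hardwiring an information flow, with label words acting as anchors, can be pictured with a small toy example. The sketch below is not the authors' implementation: the function names, the exact edge layout (demonstration tokens feed their label token, label tokens feed the final token), and the parameter-free mean aggregation are all assumptions made for illustration. A trainable GNN layer would additionally apply learned weights.

```python
from typing import List


def build_flow_adjacency(label_positions: List[int], seq_len: int) -> List[List[int]]:
    """Return adj where adj[t][s] == 1 means token t receives a message from token s.

    Hypothetical wiring, assumed for illustration only:
      * aggregation: each label token receives from the tokens of its own
        demonstration (those after the previous label, before the label);
      * distribution: the final token receives from every label token.
    Every token also keeps a self-loop.
    """
    adj = [[0] * seq_len for _ in range(seq_len)]
    for i in range(seq_len):
        adj[i][i] = 1  # self-loop
    start = 0
    for pos in label_positions:
        for src in range(start, pos):
            adj[pos][src] = 1  # aggregation into the label anchor
        start = pos + 1
    last = seq_len - 1
    for pos in label_positions:
        adj[last][pos] = 1  # distribution from anchors to the final token
    return adj


def mean_aggregate(hidden: List[List[float]], adj: List[List[int]]) -> List[List[float]]:
    """One parameter-free message-passing step: each token averages its in-neighbours."""
    dim = len(hidden[0])
    out = []
    for row in adj:
        neighbours = [hidden[s] for s, edge in enumerate(row) if edge]
        out.append([sum(vec[d] for vec in neighbours) / len(neighbours) for d in range(dim)])
    return out


# Toy prompt of 8 tokens: two demonstrations whose label words sit at
# positions 2 and 5, followed by a query that ends at position 7.
adj = build_flow_adjacency(label_positions=[2, 5], seq_len=8)
hidden = [[float(i)] for i in range(8)]  # one-dimensional dummy hidden states
updated = mean_aggregate(hidden, adj)
```

In this toy graph the label tokens collect information only from their own demonstrations, and the final token sees the rest of the prompt only through the label anchors, which is the aggregation-then-distribution pattern the abstract describes.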

Related Topics

Computer Science

Artificial Intelligence

Theoretical Computer Science

Mathematics

Philosophy

Geometry

Concepts

Computer science, Information flow, Graph, Artificial neural network, Flow (mathematics), Artificial intelligence, Theoretical computer science, Linguistics, Mathematics, Philosophy, Geometry

Metadata

Type: preprint
Language: en
Landing Page: http://arxiv.org/abs/2402.11709
PDF: https://arxiv.org/pdf/2402.11709
OA Status: green
Related Works: 10
OpenAlex ID: https://openalex.org/W4391986290

All OpenAlex metadata

OpenAlex ID: https://openalex.org/W4391986290
Canonical identifier for this work in OpenAlex

DOI: https://doi.org/10.48550/arxiv.2402.11709
Digital Object Identifier

Title: GNNavi: Navigating the Information Flow in Large Language Models by Graph Neural Network
Work title

Type: preprint
OpenAlex work type

Language: en
Primary language

Publication year: 2024
Year of publication

Publication date: 2024-02-18
Full publication date if available

Authors: Shuzhou Yuan, Ercong Nie, Michael Färber, Helmut Schmid, Hinrich Schütze
List of authors in order

Landing page: https://arxiv.org/abs/2402.11709
Publisher landing page

PDF URL: https://arxiv.org/pdf/2402.11709
Direct link to full text PDF

Open access: Yes
Whether a free full text is available

OA status: green
Open access status per OpenAlex

OA URL: https://arxiv.org/pdf/2402.11709
Direct OA link when available

Concepts: Computer science, Information flow, Graph, Artificial neural network, Flow (mathematics), Artificial intelligence, Theoretical computer science, Linguistics, Mathematics, Philosophy, Geometry
Top concepts (fields/topics) attached by OpenAlex

Cited by: 0
Total citation count in OpenAlex

Related works (count): 10
Other works algorithmically related by OpenAlex

Full payload

id	https://openalex.org/W4391986290
doi	https://doi.org/10.48550/arxiv.2402.11709
ids.doi	https://doi.org/10.48550/arxiv.2402.11709
ids.openalex	https://openalex.org/W4391986290
fwci
type	preprint
title	GNNavi: Navigating the Information Flow in Large Language Models by Graph Neural Network
biblio.issue
biblio.volume
biblio.last_page
biblio.first_page
topics[0].id	https://openalex.org/T11273
topics[0].field.id	https://openalex.org/fields/17
topics[0].field.display_name	Computer Science
topics[0].score	0.9638000130653381
topics[0].domain.id	https://openalex.org/domains/3
topics[0].domain.display_name	Physical Sciences
topics[0].subfield.id	https://openalex.org/subfields/1702
topics[0].subfield.display_name	Artificial Intelligence
topics[0].display_name	Advanced Graph Neural Networks
topics[1].id	https://openalex.org/T10028
topics[1].field.id	https://openalex.org/fields/17
topics[1].field.display_name	Computer Science
topics[1].score	0.9470999836921692
topics[1].domain.id	https://openalex.org/domains/3
topics[1].domain.display_name	Physical Sciences
topics[1].subfield.id	https://openalex.org/subfields/1702
topics[1].subfield.display_name	Artificial Intelligence
topics[1].display_name	Topic Modeling
is_xpac	False
apc_list
apc_paid
concepts[0].id	https://openalex.org/C41008148
concepts[0].level	0
concepts[0].score	0.6324517726898193
concepts[0].wikidata	https://www.wikidata.org/wiki/Q21198
concepts[0].display_name	Computer science
concepts[1].id	https://openalex.org/C2779136372
concepts[1].level	2
concepts[1].score	0.5292937755584717
concepts[1].wikidata	https://www.wikidata.org/wiki/Q10283002
concepts[1].display_name	Information flow
concepts[2].id	https://openalex.org/C132525143
concepts[2].level	2
concepts[2].score	0.48839300870895386
concepts[2].wikidata	https://www.wikidata.org/wiki/Q141488
concepts[2].display_name	Graph
concepts[3].id	https://openalex.org/C50644808
concepts[3].level	2
concepts[3].score	0.4625309109687805
concepts[3].wikidata	https://www.wikidata.org/wiki/Q192776
concepts[3].display_name	Artificial neural network
concepts[4].id	https://openalex.org/C38349280
concepts[4].level	2
concepts[4].score	0.44092506170272827
concepts[4].wikidata	https://www.wikidata.org/wiki/Q1434290
concepts[4].display_name	Flow (mathematics)
concepts[5].id	https://openalex.org/C154945302
concepts[5].level	1
concepts[5].score	0.3372572660446167
concepts[5].wikidata	https://www.wikidata.org/wiki/Q11660
concepts[5].display_name	Artificial intelligence
concepts[6].id	https://openalex.org/C80444323
concepts[6].level	1
concepts[6].score	0.30030155181884766
concepts[6].wikidata	https://www.wikidata.org/wiki/Q2878974
concepts[6].display_name	Theoretical computer science
concepts[7].id	https://openalex.org/C41895202
concepts[7].level	1
concepts[7].score	0.1971280574798584
concepts[7].wikidata	https://www.wikidata.org/wiki/Q8162
concepts[7].display_name	Linguistics
concepts[8].id	https://openalex.org/C33923547
concepts[8].level	0
concepts[8].score	0.11869436502456665
concepts[8].wikidata	https://www.wikidata.org/wiki/Q395
concepts[8].display_name	Mathematics
concepts[9].id	https://openalex.org/C138885662
concepts[9].level	0
concepts[9].score	0.06503498554229736
concepts[9].wikidata	https://www.wikidata.org/wiki/Q5891
concepts[9].display_name	Philosophy
concepts[10].id	https://openalex.org/C2524010
concepts[10].level	1
concepts[10].score	0.0
concepts[10].wikidata	https://www.wikidata.org/wiki/Q8087
concepts[10].display_name	Geometry
keywords[0].id	https://openalex.org/keywords/computer-science
keywords[0].score	0.6324517726898193
keywords[0].display_name	Computer science
keywords[1].id	https://openalex.org/keywords/information-flow
keywords[1].score	0.5292937755584717
keywords[1].display_name	Information flow
keywords[2].id	https://openalex.org/keywords/graph
keywords[2].score	0.48839300870895386
keywords[2].display_name	Graph
keywords[3].id	https://openalex.org/keywords/artificial-neural-network
keywords[3].score	0.4625309109687805
keywords[3].display_name	Artificial neural network
keywords[4].id	https://openalex.org/keywords/flow
keywords[4].score	0.44092506170272827
keywords[4].display_name	Flow (mathematics)
keywords[5].id	https://openalex.org/keywords/artificial-intelligence
keywords[5].score	0.3372572660446167
keywords[5].display_name	Artificial intelligence
keywords[6].id	https://openalex.org/keywords/theoretical-computer-science
keywords[6].score	0.30030155181884766
keywords[6].display_name	Theoretical computer science
keywords[7].id	https://openalex.org/keywords/linguistics
keywords[7].score	0.1971280574798584
keywords[7].display_name	Linguistics
keywords[8].id	https://openalex.org/keywords/mathematics
keywords[8].score	0.11869436502456665
keywords[8].display_name	Mathematics
keywords[9].id	https://openalex.org/keywords/philosophy
keywords[9].score	0.06503498554229736
keywords[9].display_name	Philosophy
language	en
locations[0].id	pmh:oai:arXiv.org:2402.11709
locations[0].is_oa	True
locations[0].source.id	https://openalex.org/S4306400194
locations[0].source.issn
locations[0].source.type	repository
locations[0].source.is_oa	True
locations[0].source.issn_l
locations[0].source.is_core	False
locations[0].source.is_in_doaj	False
locations[0].source.display_name	arXiv (Cornell University)
locations[0].source.host_organization	https://openalex.org/I205783295
locations[0].source.host_organization_name	Cornell University
locations[0].source.host_organization_lineage	https://openalex.org/I205783295
locations[0].license	cc-by-sa
locations[0].pdf_url	https://arxiv.org/pdf/2402.11709
locations[0].version	submittedVersion
locations[0].raw_type	text
locations[0].license_id	https://openalex.org/licenses/cc-by-sa
locations[0].is_accepted	False
locations[0].is_published	False
locations[0].raw_source_name
locations[0].landing_page_url	http://arxiv.org/abs/2402.11709
locations[1].id	doi:10.48550/arxiv.2402.11709
locations[1].is_oa	True
locations[1].source.id	https://openalex.org/S4306400194
locations[1].source.issn
locations[1].source.type	repository
locations[1].source.is_oa	True
locations[1].source.issn_l
locations[1].source.is_core	False
locations[1].source.is_in_doaj	False
locations[1].source.display_name	arXiv (Cornell University)
locations[1].source.host_organization	https://openalex.org/I205783295
locations[1].source.host_organization_name	Cornell University
locations[1].source.host_organization_lineage	https://openalex.org/I205783295
locations[1].license
locations[1].pdf_url
locations[1].version
locations[1].raw_type	article
locations[1].license_id
locations[1].is_accepted	False
locations[1].is_published
locations[1].raw_source_name
locations[1].landing_page_url	https://doi.org/10.48550/arxiv.2402.11709
indexed_in	arxiv, datacite
authorships[0].author.id	https://openalex.org/A5008824325
authorships[0].author.orcid
authorships[0].author.display_name	Shuzhou Yuan
authorships[0].author_position	first
authorships[0].raw_author_name	Yuan, Shuzhou
authorships[0].is_corresponding	False
authorships[1].author.id	https://openalex.org/A5054893267
authorships[1].author.orcid	https://orcid.org/0000-0003-1453-4460
authorships[1].author.display_name	Ercong Nie
authorships[1].author_position	middle
authorships[1].raw_author_name	Nie, Ercong
authorships[1].is_corresponding	False
authorships[2].author.id	https://openalex.org/A5031600582
authorships[2].author.orcid	https://orcid.org/0000-0001-5458-8645
authorships[2].author.display_name	Michael Färber
authorships[2].author_position	middle
authorships[2].raw_author_name	Färber, Michael
authorships[2].is_corresponding	False
authorships[3].author.id	https://openalex.org/A5027452875
authorships[3].author.orcid	https://orcid.org/0000-0001-8003-7708
authorships[3].author.display_name	Helmut Schmid
authorships[3].author_position	middle
authorships[3].raw_author_name	Schmid, Helmut
authorships[3].is_corresponding	False
authorships[4].author.id	https://openalex.org/A5071144367
authorships[4].author.orcid
authorships[4].author.display_name	Hinrich Schütze
authorships[4].author_position	last
authorships[4].raw_author_name	Schütze, Hinrich
authorships[4].is_corresponding	False
has_content.pdf	True
has_content.grobid_xml	False
is_paratext	False
open_access.is_oa	True
open_access.oa_url	https://arxiv.org/pdf/2402.11709
open_access.oa_status	green
open_access.any_repository_has_fulltext	False
created_date	2024-02-21T00:00:00
display_name	GNNavi: Navigating the Information Flow in Large Language Models by Graph Neural Network
has_fulltext	False
is_retracted	False
updated_date	2025-11-06T06:51:31.235846
primary_topic.id	https://openalex.org/T11273
primary_topic.field.id	https://openalex.org/fields/17
primary_topic.field.display_name	Computer Science
primary_topic.score	0.9638000130653381
primary_topic.domain.id	https://openalex.org/domains/3
primary_topic.domain.display_name	Physical Sciences
primary_topic.subfield.id	https://openalex.org/subfields/1702
primary_topic.subfield.display_name	Artificial Intelligence
primary_topic.display_name	Advanced Graph Neural Networks
related_works	https://openalex.org/W2391251536, https://openalex.org/W2362198218, https://openalex.org/W2019521278, https://openalex.org/W2388615687, https://openalex.org/W2129858673, https://openalex.org/W2147462260, https://openalex.org/W2518891271, https://openalex.org/W4288600360, https://openalex.org/W1892058059, https://openalex.org/W3016796767
cited_by_count	0
locations_count	2
best_oa_location.id	pmh:oai:arXiv.org:2402.11709
best_oa_location.is_oa	True
best_oa_location.source.id	https://openalex.org/S4306400194
best_oa_location.source.issn
best_oa_location.source.type	repository
best_oa_location.source.is_oa	True
best_oa_location.source.issn_l
best_oa_location.source.is_core	False
best_oa_location.source.is_in_doaj	False
best_oa_location.source.display_name	arXiv (Cornell University)
best_oa_location.source.host_organization	https://openalex.org/I205783295
best_oa_location.source.host_organization_name	Cornell University
best_oa_location.source.host_organization_lineage	https://openalex.org/I205783295
best_oa_location.license	cc-by-sa
best_oa_location.pdf_url	https://arxiv.org/pdf/2402.11709
best_oa_location.version	submittedVersion
best_oa_location.raw_type	text
best_oa_location.license_id	https://openalex.org/licenses/cc-by-sa
best_oa_location.is_accepted	False
best_oa_location.is_published	False
best_oa_location.raw_source_name
best_oa_location.landing_page_url	http://arxiv.org/abs/2402.11709
primary_location.id	pmh:oai:arXiv.org:2402.11709
primary_location.is_oa	True
primary_location.source.id	https://openalex.org/S4306400194
primary_location.source.issn
primary_location.source.type	repository
primary_location.source.is_oa	True
primary_location.source.issn_l
primary_location.source.is_core	False
primary_location.source.is_in_doaj	False
primary_location.source.display_name	arXiv (Cornell University)
primary_location.source.host_organization	https://openalex.org/I205783295
primary_location.source.host_organization_name	Cornell University
primary_location.source.host_organization_lineage	https://openalex.org/I205783295
primary_location.license	cc-by-sa
primary_location.pdf_url	https://arxiv.org/pdf/2402.11709
primary_location.version	submittedVersion
primary_location.raw_type	text
primary_location.license_id	https://openalex.org/licenses/cc-by-sa
primary_location.is_accepted	False
primary_location.is_published	False
primary_location.raw_source_name
primary_location.landing_page_url	http://arxiv.org/abs/2402.11709
publication_date	2024-02-18
publication_year	2024
referenced_works_count	0
abstract_inverted_index.a	53, 82, 170
abstract_inverted_index.We	47, 140
abstract_inverted_index.an	31
abstract_inverted_index.as	75, 148
abstract_inverted_index.be	30
abstract_inverted_index.by	51, 103, 132
abstract_inverted_index.in	35, 73, 129, 154
abstract_inverted_index.of	95, 101, 138, 156
abstract_inverted_index.on	41, 114
abstract_inverted_index.to	21, 29, 88, 136
abstract_inverted_index.Our	112, 160
abstract_inverted_index.act	72
abstract_inverted_index.and	93, 120, 152, 158, 168
abstract_inverted_index.are	14
abstract_inverted_index.but	38
abstract_inverted_index.for	77
abstract_inverted_index.its	45
abstract_inverted_index.the	91, 99, 105, 110
abstract_inverted_index.0.2%	135
abstract_inverted_index.0.5%	137
abstract_inverted_index.GNN.	111
abstract_inverted_index.LoRA	151
abstract_inverted_index.PEFT	145
abstract_inverted_index.flow	65, 97, 108, 167
abstract_inverted_index.high	39
abstract_inverted_index.into	62, 109
abstract_inverted_index.just	134
abstract_inverted_index.show	122
abstract_inverted_index.such	147
abstract_inverted_index.text	115
abstract_inverted_index.that	69, 163
abstract_inverted_index.this	49
abstract_inverted_index.when	10
abstract_inverted_index.with	12, 118, 143
abstract_inverted_index.(GNN)	86
abstract_inverted_index.(ICL)	8
abstract_inverted_index.GPT-2	119
abstract_inverted_index.Graph	83
abstract_inverted_index.ICL's	63
abstract_inverted_index.Large	0
abstract_inverted_index.clear	171
abstract_inverted_index.guide	90
abstract_inverted_index.issue	50
abstract_inverted_index.label	70
abstract_inverted_index.layer	87
abstract_inverted_index.limit	44
abstract_inverted_index.still	18
abstract_inverted_index.tasks	117
abstract_inverted_index.terms	155
abstract_inverted_index.their	24
abstract_inverted_index.used.	15
abstract_inverted_index.which	67
abstract_inverted_index.words	71
abstract_inverted_index.(LLMs)	3
abstract_inverted_index.(PEFT)	57
abstract_inverted_index.GNNavi	59, 80, 123, 142, 164
abstract_inverted_index.Llama2	121
abstract_inverted_index.Models	2
abstract_inverted_index.Neural	84
abstract_inverted_index.during	98
abstract_inverted_index.method	34
abstract_inverted_index.prefix	149
abstract_inverted_index.proves	28
abstract_inverted_index.strong	5
abstract_inverted_index.Adapter	153
abstract_inverted_index.Network	85
abstract_inverted_index.address	48
abstract_inverted_index.anchors	76
abstract_inverted_index.compare	141
abstract_inverted_index.crucial	20
abstract_inverted_index.demands	40
abstract_inverted_index.desired	106
abstract_inverted_index.employs	81
abstract_inverted_index.enhance	23
abstract_inverted_index.ensures	169
abstract_inverted_index.exhibit	4
abstract_inverted_index.further	22
abstract_inverted_index.methods	128
abstract_inverted_index.prompts	11, 74, 102
abstract_inverted_index.remains	19
abstract_inverted_index.reveals	162
abstract_inverted_index.tuning,	150
abstract_inverted_index.However,	16
abstract_inverted_index.Language	1
abstract_inverted_index.Learning	7
abstract_inverted_index.analysis	161
abstract_inverted_index.enhances	165
abstract_inverted_index.few-shot	130
abstract_inverted_index.insights	61
abstract_inverted_index.low-data	36
abstract_inverted_index.process.	173
abstract_inverted_index.settings	131
abstract_inverted_index.standard	125
abstract_inverted_index.updating	133
abstract_inverted_index.approach.	58
abstract_inverted_index.computing	42
abstract_inverted_index.dynamics,	66
abstract_inverted_index.effective	32
abstract_inverted_index.indicates	68
abstract_inverted_index.leverages	60
abstract_inverted_index.precisely	89
abstract_inverted_index.prevalent	144
abstract_inverted_index.resources	43
abstract_inverted_index.surpasses	124
abstract_inverted_index.In-Context	6
abstract_inverted_index.hardwiring	104
abstract_inverted_index.processing	100
abstract_inverted_index.scenarios,	37
abstract_inverted_index.aggregation	92, 172
abstract_inverted_index.approaches,	146
abstract_inverted_index.efficiency.	159
abstract_inverted_index.experiments	113
abstract_inverted_index.fine-tuning	17, 27, 33, 56, 127
abstract_inverted_index.information	64, 78, 96, 107, 166
abstract_inverted_index.introducing	52
abstract_inverted_index.parameters.	139
abstract_inverted_index.performance	157
abstract_inverted_index.Prompt-based	26
abstract_inverted_index.capabilities	9
abstract_inverted_index.distribution	94
abstract_inverted_index.prompt-based	54, 126
abstract_inverted_index.propagation.	79
abstract_inverted_index.adaptability.	25
abstract_inverted_index.practicality.	46
abstract_inverted_index.classification	116
abstract_inverted_index.demonstrations	13
abstract_inverted_index.parameter-efficient	55
cited_by_percentile_year
countries_distinct_count	0
institutions_distinct_count	5
citation_normalized_percentile
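
The abstract_inverted_index rows in the payload above map each token of the abstract to the word positions where it occurs. A minimal sketch of turning such rows back into running text follows. The helper names parse_rows and rebuild_abstract are hypothetical, and the parser assumes each row is a tab-separated key/value pair, as in the dump above.

```python
from typing import Dict, Iterable, List

PREFIX = "abstract_inverted_index."


def parse_rows(rows: Iterable[str]) -> Dict[str, List[int]]:
    """Turn 'abstract_inverted_index.<token>\\t<p1>, <p2>' rows into {token: [p1, p2]}."""
    index: Dict[str, List[int]] = {}
    for row in rows:
        key, _, value = row.partition("\t")
        if key.startswith(PREFIX) and value.strip():
            index[key[len(PREFIX):]] = [int(p) for p in value.split(",")]
    return index


def rebuild_abstract(inverted_index: Dict[str, List[int]]) -> str:
    """Place each token at every position listed for it, then join in position order."""
    positions: Dict[int, str] = {}
    for token, indexes in inverted_index.items():
        for idx in indexes:
            positions[idx] = token
    return " ".join(positions[i] for i in sorted(positions))


# Rows copied from the payload above, covering positions 0 to 9 of the abstract.
rows = [
    "abstract_inverted_index.(ICL)\t8",
    "abstract_inverted_index.Large\t0",
    "abstract_inverted_index.strong\t5",
    "abstract_inverted_index.(LLMs)\t3",
    "abstract_inverted_index.Models\t2",
    "abstract_inverted_index.exhibit\t4",
    "abstract_inverted_index.Language\t1",
    "abstract_inverted_index.Learning\t7",
    "abstract_inverted_index.In-Context\t6",
    "abstract_inverted_index.capabilities\t9",
]
opening = rebuild_abstract(parse_rows(rows))
print(opening)
# prints: Large Language Models (LLMs) exhibit strong In-Context Learning (ICL) capabilities
```

Feeding all abstract_inverted_index rows through the same two functions should reproduce the full abstract, since punctuation is stored as part of each token.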