Dual Instruction Tuning with Large Language Models for Mathematical Reasoning

2024 · Open Access · DOI: https://doi.org/10.48550/arxiv.2403.18295

Recent advancements highlight the success of instruction tuning with large language models (LLMs) utilizing Chain-of-Thought (CoT) data for mathematical reasoning tasks. Even after fine-tuning, challenges persist: incorrect, missing, and redundant steps in the generated CoT lead to inaccurate answer predictions. To alleviate this problem, we propose a dual instruction tuning strategy that models mathematical reasoning in both the forward and reverse directions. The strategy introduces two tasks, Intermediate Reasoning State Prediction (forward reasoning) and Instruction Reconstruction (reverse reasoning), to enhance the LLMs' understanding and execution of instructions. Training instances for these tasks are constructed from existing mathematical instruction tuning datasets. The LLMs are then fine-tuned in a multi-task setting on both the existing mathematical instructions and the newly created data. Comprehensive experiments validate the effectiveness and domain generalization of the dual instruction tuning strategy across various mathematical reasoning tasks.
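
The abstract does not say how the training instances for the two tasks are built, so the following is only a rough sketch under stated assumptions, not the authors' procedure. It assumes that a forward instance hides one step of an existing CoT solution and asks for it back, and that a reverse instance asks for the original question given the full reasoning chain. The `<mask>` token, the prompt wording, the function names, and the toy example are all hypothetical.

```python
from typing import Dict, List

MASK = "<mask>"  # hypothetical placeholder token, not taken from the paper


def forward_instance(question: str, steps: List[str], k: int) -> Dict[str, str]:
    """Assumed forward task: hide reasoning step k and ask the model to predict it."""
    if not 0 <= k < len(steps):
        raise IndexError("k must index an existing reasoning step")
    visible = steps[:k] + [MASK] + steps[k + 1:]
    prompt = (
        f"Question: {question}\n"
        f"Reasoning: {' '.join(visible)}\n"
        "Predict the masked step."
    )
    return {"task": "intermediate_state_prediction", "input": prompt, "output": steps[k]}


def reverse_instance(question: str, steps: List[str]) -> Dict[str, str]:
    """Assumed reverse task: recover the question from its reasoning chain."""
    prompt = f"Reasoning: {' '.join(steps)}\nReconstruct the original question."
    return {"task": "instruction_reconstruction", "input": prompt, "output": question}


def build_multitask_set(examples: List[Dict[str, object]]) -> List[Dict[str, str]]:
    """Mix the original CoT instances with the two kinds of derived instances."""
    mixed: List[Dict[str, str]] = []
    for example in examples:
        question = str(example["question"])
        steps = list(example["steps"])
        # Original instruction-tuning instance: question in, full CoT out.
        mixed.append({"task": "original", "input": question, "output": " ".join(steps)})
        # One forward instance per reasoning step, plus one reverse instance.
        mixed.extend(forward_instance(question, steps, k) for k in range(len(steps)))
        mixed.append(reverse_instance(question, steps))
    return mixed


# Toy example written for illustration only; it is not from any dataset.
toy = {
    "question": "Tom has 3 apples and buys 2 more. How many apples does he have?",
    "steps": ["Tom starts with 3 apples.", "He buys 2 more, so 3 + 2 = 5.", "The answer is 5."],
}
dataset = build_multitask_set([toy])
```

Under these assumptions, a CoT example with n steps yields n + 2 training instances, which a standard fine-tuning loop could then sample from jointly.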

Related Topics

Computer Science

Mathematics Education

Philosophy

Concepts

Dual language, Computer science, Dual (grammatical number), Language model, Cognitive science, Natural language processing, Linguistics, Mathematics education, Psychology, Philosophy

Metadata

Type: preprint
Language: en
Landing Page: http://arxiv.org/abs/2403.18295
PDF: https://arxiv.org/pdf/2403.18295
OA Status: green
Related Works: 10
OpenAlex ID: https://openalex.org/W4393300084

All OpenAlex metadata

OpenAlex ID: https://openalex.org/W4393300084
Canonical identifier for this work in OpenAlex

DOI: https://doi.org/10.48550/arxiv.2403.18295
Digital Object Identifier

Title: Dual Instruction Tuning with Large Language Models for Mathematical Reasoning
Work title

Type: preprint
OpenAlex work type

Language: en
Primary language

Publication year: 2024
Year of publication

Publication date: 2024-03-27
Full publication date if available

Authors: Yongwei Zhou, Tiejun Zhao
List of authors in order

Landing page: https://arxiv.org/abs/2403.18295
Publisher landing page

PDF URL: https://arxiv.org/pdf/2403.18295
Direct link to full text PDF

Open access: Yes
Whether a free full text is available

OA status: green
Open access status per OpenAlex

OA URL: https://arxiv.org/pdf/2403.18295
Direct OA link when available

Concepts: Dual language, Computer science, Dual (grammatical number), Language model, Cognitive science, Natural language processing, Linguistics, Mathematics education, Psychology, Philosophy
Top concepts (fields/topics) attached by OpenAlex

Cited by: 0
Total citation count in OpenAlex

Related works (count): 10
Other works algorithmically related by OpenAlex

Full payload

id	https://openalex.org/W4393300084
doi	https://doi.org/10.48550/arxiv.2403.18295
ids.doi	https://doi.org/10.48550/arxiv.2403.18295
ids.openalex	https://openalex.org/W4393300084
fwci
type	preprint
title	Dual Instruction Tuning with Large Language Models for Mathematical Reasoning
biblio.issue
biblio.volume
biblio.last_page
biblio.first_page
topics[0].id	https://openalex.org/T11902
topics[0].field.id	https://openalex.org/fields/17
topics[0].field.display_name	Computer Science
topics[0].score	0.8561000227928162
topics[0].domain.id	https://openalex.org/domains/3
topics[0].domain.display_name	Physical Sciences
topics[0].subfield.id	https://openalex.org/subfields/1702
topics[0].subfield.display_name	Artificial Intelligence
topics[0].display_name	Intelligent Tutoring Systems and Adaptive Learning
is_xpac	False
apc_list
apc_paid
concepts[0].id	https://openalex.org/C2992540719
concepts[0].level	2
concepts[0].score	0.7452183961868286
concepts[0].wikidata	https://www.wikidata.org/wiki/Q5310213
concepts[0].display_name	Dual language
concepts[1].id	https://openalex.org/C41008148
concepts[1].level	0
concepts[1].score	0.6390169858932495
concepts[1].wikidata	https://www.wikidata.org/wiki/Q21198
concepts[1].display_name	Computer science
concepts[2].id	https://openalex.org/C2780980858
concepts[2].level	2
concepts[2].score	0.6060367822647095
concepts[2].wikidata	https://www.wikidata.org/wiki/Q110022
concepts[2].display_name	Dual (grammatical number)
concepts[3].id	https://openalex.org/C137293760
concepts[3].level	2
concepts[3].score	0.4334193170070648
concepts[3].wikidata	https://www.wikidata.org/wiki/Q3621696
concepts[3].display_name	Language model
concepts[4].id	https://openalex.org/C188147891
concepts[4].level	1
concepts[4].score	0.33535236120224
concepts[4].wikidata	https://www.wikidata.org/wiki/Q147638
concepts[4].display_name	Cognitive science
concepts[5].id	https://openalex.org/C204321447
concepts[5].level	1
concepts[5].score	0.3128809332847595
concepts[5].wikidata	https://www.wikidata.org/wiki/Q30642
concepts[5].display_name	Natural language processing
concepts[6].id	https://openalex.org/C41895202
concepts[6].level	1
concepts[6].score	0.3100670874118805
concepts[6].wikidata	https://www.wikidata.org/wiki/Q8162
concepts[6].display_name	Linguistics
concepts[7].id	https://openalex.org/C145420912
concepts[7].level	1
concepts[7].score	0.3048100471496582
concepts[7].wikidata	https://www.wikidata.org/wiki/Q853077
concepts[7].display_name	Mathematics education
concepts[8].id	https://openalex.org/C15744967
concepts[8].level	0
concepts[8].score	0.21332550048828125
concepts[8].wikidata	https://www.wikidata.org/wiki/Q9418
concepts[8].display_name	Psychology
concepts[9].id	https://openalex.org/C138885662
concepts[9].level	0
concepts[9].score	0.08900758624076843
concepts[9].wikidata	https://www.wikidata.org/wiki/Q5891
concepts[9].display_name	Philosophy
keywords[0].id	https://openalex.org/keywords/dual-language
keywords[0].score	0.7452183961868286
keywords[0].display_name	Dual language
keywords[1].id	https://openalex.org/keywords/computer-science
keywords[1].score	0.6390169858932495
keywords[1].display_name	Computer science
keywords[2].id	https://openalex.org/keywords/dual
keywords[2].score	0.6060367822647095
keywords[2].display_name	Dual (grammatical number)
keywords[3].id	https://openalex.org/keywords/language-model
keywords[3].score	0.4334193170070648
keywords[3].display_name	Language model
keywords[4].id	https://openalex.org/keywords/cognitive-science
keywords[4].score	0.33535236120224
keywords[4].display_name	Cognitive science
keywords[5].id	https://openalex.org/keywords/natural-language-processing
keywords[5].score	0.3128809332847595
keywords[5].display_name	Natural language processing
keywords[6].id	https://openalex.org/keywords/linguistics
keywords[6].score	0.3100670874118805
keywords[6].display_name	Linguistics
keywords[7].id	https://openalex.org/keywords/mathematics-education
keywords[7].score	0.3048100471496582
keywords[7].display_name	Mathematics education
keywords[8].id	https://openalex.org/keywords/psychology
keywords[8].score	0.21332550048828125
keywords[8].display_name	Psychology
keywords[9].id	https://openalex.org/keywords/philosophy
keywords[9].score	0.08900758624076843
keywords[9].display_name	Philosophy
language	en
locations[0].id	pmh:oai:arXiv.org:2403.18295
locations[0].is_oa	True
locations[0].source.id	https://openalex.org/S4306400194
locations[0].source.issn
locations[0].source.type	repository
locations[0].source.is_oa	True
locations[0].source.issn_l
locations[0].source.is_core	False
locations[0].source.is_in_doaj	False
locations[0].source.display_name	arXiv (Cornell University)
locations[0].source.host_organization	https://openalex.org/I205783295
locations[0].source.host_organization_name	Cornell University
locations[0].source.host_organization_lineage	https://openalex.org/I205783295
locations[0].license
locations[0].pdf_url	https://arxiv.org/pdf/2403.18295
locations[0].version	submittedVersion
locations[0].raw_type	text
locations[0].license_id
locations[0].is_accepted	False
locations[0].is_published	False
locations[0].raw_source_name
locations[0].landing_page_url	http://arxiv.org/abs/2403.18295
locations[1].id	doi:10.48550/arxiv.2403.18295
locations[1].is_oa	True
locations[1].source.id	https://openalex.org/S4306400194
locations[1].source.issn
locations[1].source.type	repository
locations[1].source.is_oa	True
locations[1].source.issn_l
locations[1].source.is_core	False
locations[1].source.is_in_doaj	False
locations[1].source.display_name	arXiv (Cornell University)
locations[1].source.host_organization	https://openalex.org/I205783295
locations[1].source.host_organization_name	Cornell University
locations[1].source.host_organization_lineage	https://openalex.org/I205783295
locations[1].license
locations[1].pdf_url
locations[1].version
locations[1].raw_type	article
locations[1].license_id
locations[1].is_accepted	False
locations[1].is_published
locations[1].raw_source_name
locations[1].landing_page_url	https://doi.org/10.48550/arxiv.2403.18295
indexed_in	arxiv, datacite
authorships[0].author.id	https://openalex.org/A5102195851
authorships[0].author.orcid
authorships[0].author.display_name	Yongwei Zhou
authorships[0].author_position	first
authorships[0].raw_author_name	Zhou, Yongwei
authorships[0].is_corresponding	False
authorships[1].author.id	https://openalex.org/A5101661008
authorships[1].author.orcid	https://orcid.org/0000-0003-4659-4935
authorships[1].author.display_name	Tiejun Zhao
authorships[1].author_position	last
authorships[1].raw_author_name	Zhao, Tiejun
authorships[1].is_corresponding	False
has_content.pdf	False
has_content.grobid_xml	False
is_paratext	False
open_access.is_oa	True
open_access.oa_url	https://arxiv.org/pdf/2403.18295
open_access.oa_status	green
open_access.any_repository_has_fulltext	False
created_date	2025-10-10T00:00:00
display_name	Dual Instruction Tuning with Large Language Models for Mathematical Reasoning
has_fulltext	False
is_retracted	False
updated_date	2025-11-06T06:51:31.235846
primary_topic.id	https://openalex.org/T11902
primary_topic.field.id	https://openalex.org/fields/17
primary_topic.field.display_name	Computer Science
primary_topic.score	0.8561000227928162
primary_topic.domain.id	https://openalex.org/domains/3
primary_topic.domain.display_name	Physical Sciences
primary_topic.subfield.id	https://openalex.org/subfields/1702
primary_topic.subfield.display_name	Artificial Intelligence
primary_topic.display_name	Intelligent Tutoring Systems and Adaptive Learning
related_works	https://openalex.org/W2742579949, https://openalex.org/W4385978307, https://openalex.org/W2339968826, https://openalex.org/W2981284406, https://openalex.org/W4200412772, https://openalex.org/W2975760732, https://openalex.org/W1707019247, https://openalex.org/W2221456461, https://openalex.org/W4307244554, https://openalex.org/W2323798523
cited_by_count	0
locations_count	2
best_oa_location.id	pmh:oai:arXiv.org:2403.18295
best_oa_location.is_oa	True
best_oa_location.source.id	https://openalex.org/S4306400194
best_oa_location.source.issn
best_oa_location.source.type	repository
best_oa_location.source.is_oa	True
best_oa_location.source.issn_l
best_oa_location.source.is_core	False
best_oa_location.source.is_in_doaj	False
best_oa_location.source.display_name	arXiv (Cornell University)
best_oa_location.source.host_organization	https://openalex.org/I205783295
best_oa_location.source.host_organization_name	Cornell University
best_oa_location.source.host_organization_lineage	https://openalex.org/I205783295
best_oa_location.license
best_oa_location.pdf_url	https://arxiv.org/pdf/2403.18295
best_oa_location.version	submittedVersion
best_oa_location.raw_type	text
best_oa_location.license_id
best_oa_location.is_accepted	False
best_oa_location.is_published	False
best_oa_location.raw_source_name
best_oa_location.landing_page_url	http://arxiv.org/abs/2403.18295
primary_location.id	pmh:oai:arXiv.org:2403.18295
primary_location.is_oa	True
primary_location.source.id	https://openalex.org/S4306400194
primary_location.source.issn
primary_location.source.type	repository
primary_location.source.is_oa	True
primary_location.source.issn_l
primary_location.source.is_core	False
primary_location.source.is_in_doaj	False
primary_location.source.display_name	arXiv (Cornell University)
primary_location.source.host_organization	https://openalex.org/I205783295
primary_location.source.host_organization_name	Cornell University
primary_location.source.host_organization_lineage	https://openalex.org/I205783295
primary_location.license
primary_location.pdf_url	https://arxiv.org/pdf/2403.18295
primary_location.version	submittedVersion
primary_location.raw_type	text
primary_location.license_id
primary_location.is_accepted	False
primary_location.is_published	False
primary_location.raw_source_name
primary_location.landing_page_url	http://arxiv.org/abs/2403.18295
publication_date	2024-03-27
publication_year	2024
referenced_works_count	0
abstract_inverted_index.a	49
abstract_inverted_index.To	43
abstract_inverted_index.as	28
abstract_inverted_index.in	34, 40
abstract_inverted_index.of	5, 90, 129
abstract_inverted_index.on	100
abstract_inverted_index.to	38, 54, 83
abstract_inverted_index.we	47
abstract_inverted_index.CoT	35
abstract_inverted_index.and	31, 62, 76, 88, 116, 126
abstract_inverted_index.are	97
abstract_inverted_index.for	17, 94
abstract_inverted_index.the	3, 22, 68, 77, 85, 117, 124, 130
abstract_inverted_index.LLMs	107
abstract_inverted_index.This	65
abstract_inverted_index.both	60, 112
abstract_inverted_index.data	16
abstract_inverted_index.dual	50, 131
abstract_inverted_index.from	59
abstract_inverted_index.such	27
abstract_inverted_index.task	73, 80
abstract_inverted_index.this	45
abstract_inverted_index.with	8
abstract_inverted_index.(CoT)	15
abstract_inverted_index.LLMs'	86
abstract_inverted_index.LLMs,	24
abstract_inverted_index.State	71
abstract_inverted_index.based	99
abstract_inverted_index.data.	120
abstract_inverted_index.large	9
abstract_inverted_index.model	56
abstract_inverted_index.newly	118
abstract_inverted_index.steps	33
abstract_inverted_index.tasks	96
abstract_inverted_index.these	95
abstract_inverted_index.using	111
abstract_inverted_index.(LLMs)	12
abstract_inverted_index.Recent	0
abstract_inverted_index.across	135
abstract_inverted_index.answer	41
abstract_inverted_index.domain	127
abstract_inverted_index.models	11
abstract_inverted_index.tasks.	20, 139
abstract_inverted_index.tuning	7, 52, 104, 133
abstract_inverted_index.Despite	21
abstract_inverted_index.created	119
abstract_inverted_index.enhance	84
abstract_inverted_index.forward	61
abstract_inverted_index.leading	37
abstract_inverted_index.propose	48
abstract_inverted_index.reverse	63
abstract_inverted_index.success	4
abstract_inverted_index.undergo	108
abstract_inverted_index.various	136
abstract_inverted_index.(forward	74
abstract_inverted_index.(reverse	81
abstract_inverted_index.Training	92
abstract_inverted_index.existing	101, 113
abstract_inverted_index.involves	66
abstract_inverted_index.language	10
abstract_inverted_index.missing,	30
abstract_inverted_index.persist,	26
abstract_inverted_index.problem,	46
abstract_inverted_index.strategy	53, 134
abstract_inverted_index.validate	123
abstract_inverted_index.Reasoning	70
abstract_inverted_index.alleviate	44
abstract_inverted_index.datasets.	105
abstract_inverted_index.execution	89
abstract_inverted_index.highlight	2
abstract_inverted_index.instances	93
abstract_inverted_index.reasoning	19, 58, 138
abstract_inverted_index.redundant	32
abstract_inverted_index.utilizing	13
abstract_inverted_index.Prediction	72
abstract_inverted_index.challenges	25
abstract_inverted_index.fine-tuned	23
abstract_inverted_index.generation	36
abstract_inverted_index.incorrect,	29
abstract_inverted_index.multi-task	109
abstract_inverted_index.reasoning)	75, 82
abstract_inverted_index.Instruction	78
abstract_inverted_index.constructed	98
abstract_inverted_index.directions.	64
abstract_inverted_index.experiments	122
abstract_inverted_index.fine-tuning	110
abstract_inverted_index.instruction	6, 51, 103, 132
abstract_inverted_index.introducing	67
abstract_inverted_index.Intermediate	69
abstract_inverted_index.advancements	1
abstract_inverted_index.inaccuracies	39
abstract_inverted_index.instructions	115
abstract_inverted_index.mathematical	18, 57, 102, 114, 137
abstract_inverted_index.meticulously	55
abstract_inverted_index.predictions.	42
abstract_inverted_index.Comprehensive	121
abstract_inverted_index.Subsequently,	106
abstract_inverted_index.effectiveness	125
abstract_inverted_index.instructions.	91
abstract_inverted_index.understanding	87
abstract_inverted_index.Reconstruction	79
abstract_inverted_index.generalization	128
abstract_inverted_index.Chain-of-Thought	14
cited_by_percentile_year
countries_distinct_count	0
institutions_distinct_count	2
citation_normalized_percentile
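
The `abstract_inverted_index` rows in the payload map each token to the word positions at which it occurs in the abstract. As a minimal sketch, assuming the rows have already been parsed into a Python dict of token to position list, the running text can be recovered by sorting tokens by position. The function name `rebuild_text` and the truncated `excerpt` dict are illustrative; the positions in the excerpt are copied from the payload above.

```python
from typing import Dict, List


def rebuild_text(inverted_index: Dict[str, List[int]]) -> str:
    """Rebuild running text from a {token: [positions]} inverted index."""
    positioned = [
        (position, token)
        for token, positions in inverted_index.items()
        for position in positions
    ]
    positioned.sort()  # order tokens by their position in the abstract
    return " ".join(token for _, token in positioned)


# Excerpt of the payload above, truncated to word positions 0 through 11.
excerpt = {
    "Recent": [0],
    "advancements": [1],
    "highlight": [2],
    "the": [3],
    "success": [4],
    "of": [5],
    "instruction": [6],
    "tuning": [7],
    "with": [8],
    "large": [9],
    "language": [10],
    "models": [11],
}

print(rebuild_text(excerpt))
# prints "Recent advancements highlight the success of instruction tuning with large language models"
```

Tokens keep their attached punctuation (for example `tasks.` and `LLMs,` in the payload), so joining on single spaces is enough to restore the abstract when the full index is supplied.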