TACT: Advancing Complex Aggregative Reasoning with Information Extraction Tools
2024 · Open Access · DOI: https://doi.org/10.48550/arxiv.2406.03618
Large Language Models (LLMs) often do not perform well on queries that require the aggregation of information across texts. To better evaluate this setting and facilitate modeling efforts, we introduce TACT (Text And Calculations through Tables), a dataset crafted to evaluate LLMs' reasoning and computational abilities using complex instructions. TACT contains challenging instructions that demand stitching information scattered across one or more texts, and performing complex integration on this information to generate the answer. We construct this dataset by leveraging an existing dataset of texts and their associated tables. For each such table, we formulate new queries and gather their respective answers. We demonstrate that all contemporary LLMs perform poorly on this dataset, achieving an accuracy below 38%. To pinpoint the difficulties and thoroughly dissect the problem, we analyze model performance across three components: table-generation, Pandas command-generation, and execution. Unexpectedly, we discover that each component presents substantial challenges for current LLMs. These insights lead us to propose a focused modeling framework, which we refer to as "IE as a tool". Specifically, we propose to add "tools" for each of the above steps, and implement each such tool with few-shot prompting. This approach shows an improvement over existing prompting techniques, offering a promising direction for enhancing model capabilities in these tasks.
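The three components named in the abstract (table generation, Pandas command generation, execution) can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the authors' implementation: the text, the function names `generate_table`, `generate_command`, and `execute`, and the rule-based bodies are all hypothetical stand-ins for what the paper implements with few-shot prompted LLM tools. In particular, the execution step here runs the command with a real interpreter, which is a choice made for this sketch.

```python
import pandas as pd

# Hypothetical source text; not taken from the TACT dataset.
TEXT = (
    "Maple Street School enrolled 310 students in 2022. "
    "Oak Hill School enrolled 275 students in 2022. "
    "Pine Grove School enrolled 190 students in 2022."
)


def generate_table(text: str) -> pd.DataFrame:
    """Step 1, table generation: a rule-based stand-in for an LLM extraction call."""
    rows = []
    for sentence in text.split(". "):
        words = sentence.rstrip(".").split()
        # Assumed sentence pattern: "<name> enrolled <count> students in <year>"
        idx = words.index("enrolled")
        rows.append({"school": " ".join(words[:idx]), "students": int(words[idx + 1])})
    return pd.DataFrame(rows)


def generate_command(instruction: str) -> str:
    """Step 2, Pandas command generation: a fixed stand-in for an LLM call."""
    return "df['students'].sum()"


def execute(command: str, df: pd.DataFrame):
    """Step 3, execution: evaluate the command with only `df` in scope."""
    return eval(command, {"__builtins__": {}}, {"df": df})


table = generate_table(TEXT)
command = generate_command("How many students were enrolled in total?")
answer = execute(command, table)
print(int(answer))  # 775
```

Keeping the steps as separate functions mirrors the abstract's point that each stage can fail on its own, so each can be evaluated (or swapped out) independently.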
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2406.03618, https://arxiv.org/pdf/2406.03618
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4399453901
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4399453901 (Canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2406.03618 (Digital Object Identifier)
- Title: TACT: Advancing Complex Aggregative Reasoning with Information Extraction Tools (Work title)
- Type: preprint (OpenAlex work type)
- Language: en (Primary language)
- Publication year: 2024 (Year of publication)
- Publication date: 2024-06-05 (Full publication date if available)
- Authors: Avi Caciularu, Alon Jacovi, Eyal Ben-David, Sasha Goldshtein, Tal Schuster, Jonathan Herzig, Gal Elidan, Amir Globerson (List of authors in order)
- Landing page: https://arxiv.org/abs/2406.03618 (Publisher landing page)
- PDF URL: https://arxiv.org/pdf/2406.03618 (Direct link to full text PDF)
- Open access: Yes (Whether a free full text is available)
- OA status: green (Open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2406.03618 (Direct OA link when available)
- Concepts: Tact, Computer science, Psychology, Psychotherapist (Top concepts (fields/topics) attached by OpenAlex)
- Cited by: 0 (Total citation count in OpenAlex)
- Related works (count): 10 (Other works algorithmically related by OpenAlex)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4399453901 |
| doi | https://doi.org/10.48550/arxiv.2406.03618 |
| ids.doi | https://doi.org/10.48550/arxiv.2406.03618 |
| ids.openalex | https://openalex.org/W4399453901 |
| fwci | |
| type | preprint |
| title | TACT: Advancing Complex Aggregative Reasoning with Information Extraction Tools |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10215 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9472000002861023 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Semantic Web and Ontologies |
| topics[1].id | https://openalex.org/T11303 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9333999752998352 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Bayesian Modeling and Causal Inference |
| topics[2].id | https://openalex.org/T11010 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9297000169754028 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1702 |
| topics[2].subfield.display_name | Artificial Intelligence |
| topics[2].display_name | Logic, Reasoning, and Knowledge |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C2778833722 |
| concepts[0].level | 2 |
| concepts[0].score | 0.9522445201873779 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q28457372 |
| concepts[0].display_name | Tact |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.39723923802375793 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C15744967 |
| concepts[2].level | 0 |
| concepts[2].score | 0.28368932008743286 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q9418 |
| concepts[2].display_name | Psychology |
| concepts[3].id | https://openalex.org/C542102704 |
| concepts[3].level | 1 |
| concepts[3].score | 0.0 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q183257 |
| concepts[3].display_name | Psychotherapist |
| keywords[0].id | https://openalex.org/keywords/tact |
| keywords[0].score | 0.9522445201873779 |
| keywords[0].display_name | Tact |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.39723923802375793 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/psychology |
| keywords[2].score | 0.28368932008743286 |
| keywords[2].display_name | Psychology |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2406.03618 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2406.03618 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2406.03618 |
| locations[1].id | doi:10.48550/arxiv.2406.03618 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2406.03618 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5077903013 |
| authorships[0].author.orcid | https://orcid.org/0000-0003-0573-1075 |
| authorships[0].author.display_name | Avi Caciularu |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Caciularu, Avi |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5045156755 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-7263-2061 |
| authorships[1].author.display_name | Alon Jacovi |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Jacovi, Alon |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5099056955 |
| authorships[2].author.orcid | |
| authorships[2].author.display_name | Eyal Ben-David |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Ben-David, Eyal |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5099056956 |
| authorships[3].author.orcid | |
| authorships[3].author.display_name | Sasha Goldshtein |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Goldshtein, Sasha |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5058037733 |
| authorships[4].author.orcid | https://orcid.org/0000-0002-7772-8230 |
| authorships[4].author.display_name | Tal Schuster |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Schuster, Tal |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5071893787 |
| authorships[5].author.orcid | https://orcid.org/0009-0000-7227-6557 |
| authorships[5].author.display_name | Jonathan Herzig |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Herzig, Jonathan |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5026687501 |
| authorships[6].author.orcid | |
| authorships[6].author.display_name | Gal Elidan |
| authorships[6].author_position | middle |
| authorships[6].raw_author_name | Elidan, Gal |
| authorships[6].is_corresponding | False |
| authorships[7].author.id | https://openalex.org/A5047817959 |
| authorships[7].author.orcid | |
| authorships[7].author.display_name | Amir Globerson |
| authorships[7].author_position | last |
| authorships[7].raw_author_name | Globerson, Amir |
| authorships[7].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2406.03618 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | TACT: Advancing Complex Aggregative Reasoning with Information Extraction Tools |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10215 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9472000002861023 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Semantic Web and Ontologies |
| related_works | https://openalex.org/W4391375266, https://openalex.org/W2748952813, https://openalex.org/W2067695523, https://openalex.org/W325698339, https://openalex.org/W2311375138, https://openalex.org/W2238176571, https://openalex.org/W2587815166, https://openalex.org/W2241513875, https://openalex.org/W1599691175, https://openalex.org/W2062900861 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2406.03618 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2406.03618 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2406.03618 |
| primary_location.id | pmh:oai:arXiv.org:2406.03618 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2406.03618 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2406.03618 |
| publication_date | 2024-06-05 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.- | 31 |
| abstract_inverted_index.a | 37, 158, 169, 201 |
| abstract_inverted_index.IE | 167 |
| abstract_inverted_index.To | 19, 119 |
| abstract_inverted_index.We | 75, 103 |
| abstract_inverted_index.an | 81, 115, 194 |
| abstract_inverted_index.as | 166, 168 |
| abstract_inverted_index.by | 79 |
| abstract_inverted_index.do | 5 |
| abstract_inverted_index.in | 208 |
| abstract_inverted_index.of | 15, 84, 179 |
| abstract_inverted_index.on | 9, 68, 111 |
| abstract_inverted_index.or | 61 |
| abstract_inverted_index.to | 40, 71, 156, 165, 174 |
| abstract_inverted_index.us | 155 |
| abstract_inverted_index.we | 28, 94, 128, 141, 163, 172 |
| abstract_inverted_index.And | 33 |
| abstract_inverted_index.For | 90 |
| abstract_inverted_index.add | 175 |
| abstract_inverted_index.all | 106 |
| abstract_inverted_index.and | 24, 44, 64, 86, 98, 123, 138, 183 |
| abstract_inverted_index.for | 149, 177, 204 |
| abstract_inverted_index.new | 96 |
| abstract_inverted_index.not | 6 |
| abstract_inverted_index.one | 60 |
| abstract_inverted_index.the | 13, 73, 121, 126, 180 |
| abstract_inverted_index.38%. | 118 |
| abstract_inverted_index.LLMs | 108 |
| abstract_inverted_index.TACT | 30, 50 |
| abstract_inverted_index.Text | 32 |
| abstract_inverted_index.This | 191 |
| abstract_inverted_index.each | 91, 144, 178, 185 |
| abstract_inverted_index.lead | 154 |
| abstract_inverted_index.more | 62 |
| abstract_inverted_index.over | 196 |
| abstract_inverted_index.such | 92, 186 |
| abstract_inverted_index.that | 11, 54, 105, 143 |
| abstract_inverted_index.this | 22, 69, 77, 112 |
| abstract_inverted_index.tool | 187 |
| abstract_inverted_index.well | 8 |
| abstract_inverted_index.with | 188 |
| abstract_inverted_index.LLMs' | 42 |
| abstract_inverted_index.LLMs. | 151 |
| abstract_inverted_index.Large | 0 |
| abstract_inverted_index.These | 152 |
| abstract_inverted_index.above | 181 |
| abstract_inverted_index.below | 117 |
| abstract_inverted_index.model | 130, 206 |
| abstract_inverted_index.often | 4 |
| abstract_inverted_index.refer | 164 |
| abstract_inverted_index.shows | 193 |
| abstract_inverted_index.texts | 85 |
| abstract_inverted_index.their | 87, 100 |
| abstract_inverted_index.these | 209 |
| abstract_inverted_index.three | 133 |
| abstract_inverted_index.tool. | 170 |
| abstract_inverted_index.using | 47 |
| abstract_inverted_index.which | 162 |
| abstract_inverted_index.(LLMs) | 3 |
| abstract_inverted_index.Models | 2 |
| abstract_inverted_index.Pandas | 136 |
| abstract_inverted_index.across | 17, 59, 132 |
| abstract_inverted_index.better | 20 |
| abstract_inverted_index.demand | 55 |
| abstract_inverted_index.gather | 99 |
| abstract_inverted_index.poorly | 110 |
| abstract_inverted_index.steps, | 182 |
| abstract_inverted_index.tasks. | 210 |
| abstract_inverted_index.texts, | 63 |
| abstract_inverted_index.texts. | 18 |
| abstract_inverted_index."tools" | 176 |
| abstract_inverted_index.Tables, | 36 |
| abstract_inverted_index.analyze | 129 |
| abstract_inverted_index.answer. | 74 |
| abstract_inverted_index.complex | 48, 66 |
| abstract_inverted_index.crafted | 39 |
| abstract_inverted_index.current | 150 |
| abstract_inverted_index.dataset | 38, 78, 83 |
| abstract_inverted_index.dissect | 125 |
| abstract_inverted_index.focused | 159 |
| abstract_inverted_index.perform | 7, 109 |
| abstract_inverted_index.propose | 157, 173 |
| abstract_inverted_index.queries | 10 |
| abstract_inverted_index.require | 12 |
| abstract_inverted_index.setting | 23 |
| abstract_inverted_index.tables, | 93 |
| abstract_inverted_index.tables. | 89 |
| abstract_inverted_index.through | 35 |
| abstract_inverted_index.Language | 1 |
| abstract_inverted_index.accuracy | 116 |
| abstract_inverted_index.answers. | 102 |
| abstract_inverted_index.approach | 192 |
| abstract_inverted_index.contains | 51 |
| abstract_inverted_index.dataset, | 113 |
| abstract_inverted_index.discover | 142 |
| abstract_inverted_index.efforts, | 27 |
| abstract_inverted_index.evaluate | 21, 41 |
| abstract_inverted_index.existing | 82, 197 |
| abstract_inverted_index.few-shot | 189 |
| abstract_inverted_index.generate | 72 |
| abstract_inverted_index.insights | 153 |
| abstract_inverted_index.modeling | 26, 160 |
| abstract_inverted_index.offering | 200 |
| abstract_inverted_index.pinpoint | 120 |
| abstract_inverted_index.presents | 146 |
| abstract_inverted_index.problem, | 127 |
| abstract_inverted_index.queries, | 97 |
| abstract_inverted_index.abilities | 46 |
| abstract_inverted_index.achieving | 114 |
| abstract_inverted_index.component | 145 |
| abstract_inverted_index.construct | 76 |
| abstract_inverted_index.direction | 203 |
| abstract_inverted_index.enhancing | 205 |
| abstract_inverted_index.formulate | 95 |
| abstract_inverted_index.implement | 184 |
| abstract_inverted_index.introduce | 29 |
| abstract_inverted_index.promising | 202 |
| abstract_inverted_index.prompting | 198 |
| abstract_inverted_index.reasoning | 43 |
| abstract_inverted_index.scattered | 58 |
| abstract_inverted_index.stitching | 56 |
| abstract_inverted_index.associated | 88 |
| abstract_inverted_index.challenges | 148 |
| abstract_inverted_index.execution. | 139 |
| abstract_inverted_index.facilitate | 25 |
| abstract_inverted_index.framework, | 161 |
| abstract_inverted_index.leveraging | 80 |
| abstract_inverted_index.performing | 65 |
| abstract_inverted_index.prompting. | 190 |
| abstract_inverted_index.respective | 101 |
| abstract_inverted_index.thoroughly | 124 |
| abstract_inverted_index.aggregation | 14 |
| abstract_inverted_index.challenging | 52 |
| abstract_inverted_index.components: | 134 |
| abstract_inverted_index.demonstrate | 104 |
| abstract_inverted_index.improvement | 195 |
| abstract_inverted_index.information | 16, 57, 70 |
| abstract_inverted_index.integration | 67 |
| abstract_inverted_index.performance | 131 |
| abstract_inverted_index.substantial | 147 |
| abstract_inverted_index.techniques, | 199 |
| abstract_inverted_index.Calculations | 34 |
| abstract_inverted_index.capabilities | 207 |
| abstract_inverted_index.contemporary | 107 |
| abstract_inverted_index.difficulties | 122 |
| abstract_inverted_index.instructions | 53 |
| abstract_inverted_index.Specifically, | 171 |
| abstract_inverted_index.Unexpectedly, | 140 |
| abstract_inverted_index.computational | 45 |
| abstract_inverted_index.instructions. | 49 |
| abstract_inverted_index.table-generation, | 135 |
| abstract_inverted_index.command-generation, | 137 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 8 |
| citation_normalized_percentile |
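
The `abstract_inverted_index.*` rows in the payload store the abstract as a mapping from each token to the word positions where it occurs. As a minimal sketch of how such a mapping can be turned back into running text, the Python below places each token at its positions and joins them in order. The function name `rebuild_abstract` is hypothetical, and the `sample` dictionary holds only the tokens at positions 0 through 8, copied from the rows of this record (the later occurrence of "perform" is left out to keep the sample short).

```python
def rebuild_abstract(inverted_index: dict[str, list[int]]) -> str:
    """Rebuild text from a token -> list-of-positions mapping."""
    positions = {}
    for token, indexes in inverted_index.items():
        for i in indexes:
            positions[i] = token
    # Join tokens in ascending position order.
    return " ".join(positions[i] for i in sorted(positions))


# Tokens at positions 0-8 of this record's index (truncated sample).
sample = {
    "Large": [0], "Language": [1], "Models": [2], "(LLMs)": [3],
    "often": [4], "do": [5], "not": [6], "perform": [7], "well": [8],
}
print(rebuild_abstract(sample))
# Large Language Models (LLMs) often do not perform well
```

Applied to the full set of rows, the same function would reproduce the abstract text shown at the top of this record.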