Towards the Assessment of Task-based Chatbots: From the TOFU-R Snapshot to the BRASATO Curated Dataset
2025 · Open Access · DOI: https://doi.org/10.48550/arxiv.2508.15496
Task-based chatbots are increasingly being used to deliver real services, yet the assessment of their reliability, security, and robustness remains underexplored, in part because large-scale, high-quality datasets are lacking. Emerging automated quality assessment techniques for chatbots often rely on limited pools of subjects, such as custom-made toy examples or agents that are outdated, no longer available, or scarcely popular, which complicates the evaluation of these techniques. In this paper, we present two datasets and the tool support necessary to create and maintain them. The first dataset, RASA TASK-BASED CHATBOTS FROM GITHUB (TOFU-R), is a snapshot of the Rasa chatbots available on GitHub and represents the state of the practice in open-source chatbot development with Rasa. The second dataset, BOT RASA COLLECTION (BRASATO), is a curated selection of the chatbots most relevant in terms of dialogue complexity, functional complexity, and utility; its goal is to ease reproducibility and facilitate research on chatbot reliability.
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2508.15496 (PDF: https://arxiv.org/pdf/2508.15496)
- OA Status: green
- OpenAlex ID: https://openalex.org/W4416051114
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4416051114
- DOI: https://doi.org/10.48550/arxiv.2508.15496
- Title: Towards the Assessment of Task-based Chatbots: From the TOFU-R Snapshot to the BRASATO Curated Dataset
- Type: preprint
- Language: en
- Publication year: 2025
- Publication date: 2025-08-21
- Authors: Elena Masserini, Diego Clerissi, Daniela Micucci, João R. Campos, Leonardo Mariani
- Landing page: https://arxiv.org/abs/2508.15496
- PDF URL: https://arxiv.org/pdf/2508.15496
- Open access: Yes
- OA status: green
- OA URL: https://arxiv.org/pdf/2508.15496
- Cited by: 0
Full payload
| id | https://openalex.org/W4416051114 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2508.15496 |
| ids.doi | https://doi.org/10.48550/arxiv.2508.15496 |
| ids.openalex | https://openalex.org/W4416051114 |
| fwci | |
| type | preprint |
| title | Towards the Assessment of Task-based Chatbots: From the TOFU-R Snapshot to the BRASATO Curated Dataset |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2508.15496 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | cc-by |
| locations[0].pdf_url | https://arxiv.org/pdf/2508.15496 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | https://openalex.org/licenses/cc-by |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2508.15496 |
| locations[1].id | doi:10.48550/arxiv.2508.15496 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2508.15496 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5065145526 |
| authorships[0].author.orcid | https://orcid.org/0009-0002-6969-1500 |
| authorships[0].author.display_name | Elena Masserini |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Masserini, Elena |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5015794004 |
| authorships[1].author.orcid | https://orcid.org/0000-0001-7651-0400 |
| authorships[1].author.display_name | Diego Clerissi |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Clerissi, Diego |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5015645148 |
| authorships[2].author.orcid | https://orcid.org/0000-0003-1261-2234 |
| authorships[2].author.display_name | Daniela Micucci |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Micucci, Daniela |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5045188001 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-4623-764X |
| authorships[3].author.display_name | João R. Campos |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Campos, João R. |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5036120394 |
| authorships[4].author.orcid | https://orcid.org/0000-0001-9527-7042 |
| authorships[4].author.display_name | Leonardo Mariani |
| authorships[4].author_position | last |
| authorships[4].raw_author_name | Mariani, Leonardo |
| authorships[4].is_corresponding | False |
| has_content.pdf | True |
| has_content.grobid_xml | True |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2508.15496 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Towards the Assessment of Task-based Chatbots: From the TOFU-R Snapshot to the BRASATO Curated Dataset |
| has_fulltext | True |
| is_retracted | False |
| updated_date | 2025-11-28T10:50:16.911541 |
| primary_topic | |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2508.15496 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | cc-by |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2508.15496 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | https://openalex.org/licenses/cc-by |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2508.15496 |
| primary_location.id | pmh:oai:arXiv.org:2508.15496 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | cc-by |
| primary_location.pdf_url | https://arxiv.org/pdf/2508.15496 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | https://openalex.org/licenses/cc-by |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2508.15496 |
| publication_date | 2025-08-21 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 93, 122 |
| abstract_inverted_index.In | 63 |
| abstract_inverted_index.as | 44 |
| abstract_inverted_index.in | 108 |
| abstract_inverted_index.is | 84, 92, 117, 139 |
| abstract_inverted_index.no | 50 |
| abstract_inverted_index.of | 24, 41, 60, 95, 105, 125 |
| abstract_inverted_index.on | 38, 100, 146 |
| abstract_inverted_index.or | 48, 53 |
| abstract_inverted_index.to | 6, 21, 75, 140 |
| abstract_inverted_index.we | 66 |
| abstract_inverted_index.BOT | 118 |
| abstract_inverted_index.The | 28, 81, 114 |
| abstract_inverted_index.and | 15, 70, 77, 135, 143 |
| abstract_inverted_index.are | 2 |
| abstract_inverted_index.due | 20 |
| abstract_inverted_index.for | 130 |
| abstract_inverted_index.the | 22, 58, 71, 96, 103, 106, 126 |
| abstract_inverted_index.toy | 46 |
| abstract_inverted_index.two | 68 |
| abstract_inverted_index.yet | 10 |
| abstract_inverted_index.FROM | 88 |
| abstract_inverted_index.RASA | 85, 119 |
| abstract_inverted_index.Rasa | 97 |
| abstract_inverted_index.also | 19 |
| abstract_inverted_index.ease | 141 |
| abstract_inverted_index.goal | 138 |
| abstract_inverted_index.lack | 23 |
| abstract_inverted_index.most | 127 |
| abstract_inverted_index.real | 8 |
| abstract_inverted_index.rely | 37 |
| abstract_inverted_index.such | 43, 61 |
| abstract_inverted_index.this | 64 |
| abstract_inverted_index.tool | 72 |
| abstract_inverted_index.used | 5 |
| abstract_inverted_index.with | 112 |
| abstract_inverted_index.Rasa. | 113 |
| abstract_inverted_index.being | 4 |
| abstract_inverted_index.first | 82 |
| abstract_inverted_index.often | 36 |
| abstract_inverted_index.pools | 40 |
| abstract_inverted_index.state | 104 |
| abstract_inverted_index.their | 12 |
| abstract_inverted_index.these | 79 |
| abstract_inverted_index.which | 91 |
| abstract_inverted_index.whose | 137 |
| abstract_inverted_index.GITHUB | 89 |
| abstract_inverted_index.create | 76 |
| abstract_inverted_index.longer | 51 |
| abstract_inverted_index.paper, | 65 |
| abstract_inverted_index.second | 115 |
| abstract_inverted_index.GitHub, | 101 |
| abstract_inverted_index.agents, | 56 |
| abstract_inverted_index.chatbot | 110, 147 |
| abstract_inverted_index.curated | 123 |
| abstract_inverted_index.dataset | 83, 116 |
| abstract_inverted_index.deliver | 7 |
| abstract_inverted_index.limited | 39 |
| abstract_inverted_index.popular | 55 |
| abstract_inverted_index.present | 67 |
| abstract_inverted_index.quality | 31 |
| abstract_inverted_index.remains | 17 |
| abstract_inverted_index.support | 73 |
| abstract_inverted_index.CHATBOTS | 87 |
| abstract_inverted_index.chatbots | 1, 35, 98, 129 |
| abstract_inverted_index.datasets | 69 |
| abstract_inverted_index.dialogue | 131 |
| abstract_inverted_index.emerging | 29 |
| abstract_inverted_index.maintain | 78 |
| abstract_inverted_index.practice | 107 |
| abstract_inverted_index.relevant | 128 |
| abstract_inverted_index.research | 145 |
| abstract_inverted_index.scarcely | 54 |
| abstract_inverted_index.snapshot | 94 |
| abstract_inverted_index.utility, | 136 |
| abstract_inverted_index.(TOFU-R), | 90 |
| abstract_inverted_index.assessing | 11 |
| abstract_inverted_index.automated | 30 |
| abstract_inverted_index.available | 99 |
| abstract_inverted_index.datasets. | 27, 80 |
| abstract_inverted_index.examples, | 47 |
| abstract_inverted_index.necessary | 74 |
| abstract_inverted_index.outdated, | 49 |
| abstract_inverted_index.security, | 14 |
| abstract_inverted_index.selection | 124 |
| abstract_inverted_index.services, | 9 |
| abstract_inverted_index.subjects, | 42 |
| abstract_inverted_index.targeting | 34 |
| abstract_inverted_index.(BRASATO), | 121 |
| abstract_inverted_index.COLLECTION | 120 |
| abstract_inverted_index.TASK-BASED | 86 |
| abstract_inverted_index.Task-based | 0 |
| abstract_inverted_index.assessment | 32 |
| abstract_inverted_index.available, | 52 |
| abstract_inverted_index.evaluation | 59 |
| abstract_inverted_index.facilitate | 144 |
| abstract_inverted_index.functional | 133 |
| abstract_inverted_index.robustness | 16 |
| abstract_inverted_index.techniques | 33 |
| abstract_inverted_index.complexity, | 132, 134 |
| abstract_inverted_index.custom-made | 45 |
| abstract_inverted_index.development | 111 |
| abstract_inverted_index.open-source | 109 |
| abstract_inverted_index.techniques. | 62 |
| abstract_inverted_index.complicating | 57 |
| abstract_inverted_index.high-quality | 26 |
| abstract_inverted_index.increasingly | 3 |
| abstract_inverted_index.large-scale, | 25 |
| abstract_inverted_index.reliability, | 13 |
| abstract_inverted_index.reliability. | 148 |
| abstract_inverted_index.representing | 102 |
| abstract_inverted_index.underexplored, | 18 |
| abstract_inverted_index.reproducibility | 142 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 5 |
| citation_normalized_percentile | |
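
The `abstract_inverted_index.*` rows in the payload store the abstract as a mapping from each token to the zero-based word positions where it occurs. As a minimal sketch of how the running text can be recovered from such a mapping, assuming it has already been loaded into a Python dictionary: the function name `rebuild_abstract` and the ten-token `excerpt` (copied from positions 0 to 9 of the table, with later positions of `chatbots` and `to` left out) are illustrative and not part of OpenAlex.

```python
def rebuild_abstract(inverted_index: dict[str, list[int]]) -> str:
    """Rebuild running text from a token -> positions mapping."""
    # Flatten the mapping into (position, token) pairs.
    positioned = [
        (position, token)
        for token, positions in inverted_index.items()
        for position in positions
    ]
    # Sorting by position restores the original word order.
    positioned.sort()
    return " ".join(token for _, token in positioned)


# Illustrative excerpt: only the first ten positions of the payload.
excerpt = {
    "Task-based": [0],
    "chatbots": [1],
    "are": [2],
    "increasingly": [3],
    "being": [4],
    "used": [5],
    "to": [6],
    "deliver": [7],
    "real": [8],
    "services,": [9],
}

print(rebuild_abstract(excerpt))
```

Because tokens keep their attached punctuation (for example `services,`), joining them with single spaces is enough to reproduce the sentence; a token that occurs several times simply contributes one pair per listed position.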