Taming LLMs with Negative Samples: A Reference-Free Framework to Evaluate Presentation Content with Actionable Feedback
2025 · Open Access · DOI: https://doi.org/10.48550/arxiv.2505.18240
The generation of presentation slides automatically is an important problem in the era of generative AI. This paper focuses on evaluating multimodal content in presentation slides that can effectively summarize a document and convey concepts to a broad audience. We introduce a benchmark dataset, RefSlides, consisting of human-made high-quality presentations that span various topics. Next, we propose a set of metrics to characterize different intrinsic properties of the content of a presentation and present REFLEX, an evaluation approach that generates scores and actionable feedback for these metrics. We achieve this by generating negative presentation samples with different degrees of metric-specific perturbations and use them to fine-tune LLMs. This reference-free evaluation technique does not require ground truth presentations during inference. Our extensive automated and human experiments demonstrate that our evaluation approach outperforms classical heuristic-based and state-of-the-art large language model-based evaluations in generating scores and explanations.
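The abstract describes generating negative samples by applying metric-specific perturbations at different degrees. The sketch below illustrates that general idea only; it is not the authors' procedure. The slide representation (a list of bullet strings), the bullet-dropping perturbation, and the names `drop_bullets` and `make_negative_samples` are all assumptions introduced for illustration, since this record does not specify the actual metrics or perturbations.

```python
import random
from typing import List, Tuple

# Assumption: a slide is modeled as a list of bullet strings.
Slide = List[str]


def drop_bullets(slides: List[Slide], degree: float, seed: int = 0) -> List[Slide]:
    """Return a copy of `slides` with about `degree` of all bullets removed.

    Hypothetical stand-in for one metric-specific perturbation.
    """
    if not 0.0 <= degree <= 1.0:
        raise ValueError("degree must be between 0 and 1")
    rng = random.Random(seed)
    # Enumerate every (slide index, bullet index) pair in the deck.
    positions = [(i, j) for i, slide in enumerate(slides) for j in range(len(slide))]
    n_drop = round(degree * len(positions))
    dropped = set(rng.sample(positions, n_drop))
    # Build new lists so the original deck is left unchanged.
    return [
        [bullet for j, bullet in enumerate(slide) if (i, j) not in dropped]
        for i, slide in enumerate(slides)
    ]


def make_negative_samples(
    slides: List[Slide],
    degrees: Tuple[float, ...] = (0.25, 0.5, 0.75),
    seed: int = 0,
) -> List[Tuple[float, List[Slide]]]:
    """Pair each perturbed deck with the degree that produced it."""
    return [(d, drop_bullets(slides, d, seed)) for d in degrees]
```

In a setup like this, the degree attached to each sample could serve as a graded training signal, but how the paper maps perturbation degrees to scores or feedback is not stated in this record.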
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2505.18240, https://arxiv.org/pdf/2505.18240
- OA Status: green
- OpenAlex ID: https://openalex.org/W4414581222
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4414581222 (Canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2505.18240 (Digital Object Identifier)
- Title: Taming LLMs with Negative Samples: A Reference-Free Framework to Evaluate Presentation Content with Actionable Feedback (Work title)
- Type: preprint (OpenAlex work type)
- Language: en (Primary language)
- Publication year: 2025 (Year of publication)
- Publication date: 2025-05-23 (Full publication date if available)
- Authors: Ananth Muppidi, Tarak Das, Sambaran Bandyopadhyay, Tripti Shukla, D A Dharun (List of authors in order)
- Landing page: https://arxiv.org/abs/2505.18240 (Publisher landing page)
- PDF URL: https://arxiv.org/pdf/2505.18240 (Direct link to full text PDF)
- Open access: Yes (Whether a free full text is available)
- OA status: green (Open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2505.18240 (Direct OA link when available)
- Cited by: 0 (Total citation count in OpenAlex)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4414581222 |
| doi | https://doi.org/10.48550/arxiv.2505.18240 |
| ids.doi | https://doi.org/10.48550/arxiv.2505.18240 |
| ids.openalex | https://openalex.org/W4414581222 |
| fwci | |
| type | preprint |
| title | Taming LLMs with Negative Samples: A Reference-Free Framework to Evaluate Presentation Content with Actionable Feedback |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10181 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9208999872207642 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Natural Language Processing Techniques |
| topics[1].id | https://openalex.org/T13643 |
| topics[1].field.id | https://openalex.org/fields/33 |
| topics[1].field.display_name | Social Sciences |
| topics[1].score | 0.9096999764442444 |
| topics[1].domain.id | https://openalex.org/domains/2 |
| topics[1].domain.display_name | Social Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/3320 |
| topics[1].subfield.display_name | Political Science and International Relations |
| topics[1].display_name | Artificial Intelligence in Law |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2505.18240 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2505.18240 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2505.18240 |
| locations[1].id | doi:10.48550/arxiv.2505.18240 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2505.18240 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5119181797 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Ananth Muppidi |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Muppidi, Ananth |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5072004888 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Tarak Das |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Das, Tarak |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5111227747 |
| authorships[2].author.orcid | |
| authorships[2].author.display_name | Sambaran Bandyopadhyay |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Bandyopadhyay, Sambaran |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5013402151 |
| authorships[3].author.orcid | https://orcid.org/0000-0001-8250-2431 |
| authorships[3].author.display_name | Tripti Shukla |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Shukla, Tripti |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5119753504 |
| authorships[4].author.orcid | |
| authorships[4].author.display_name | D A Dharun |
| authorships[4].author_position | last |
| authorships[4].raw_author_name | A, Dharun D |
| authorships[4].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2505.18240 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Taming LLMs with Negative Samples: A Reference-Free Framework to Evaluate Presentation Content with Actionable Feedback |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10181 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9208999872207642 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Natural Language Processing Techniques |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2505.18240 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2505.18240 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2505.18240 |
| primary_location.id | pmh:oai:arXiv.org:2505.18240 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2505.18240 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2505.18240 |
| publication_date | 2025-05-23 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 30, 36, 41, 57, 70 |
| abstract_inverted_index.We | 39, 87 |
| abstract_inverted_index.an | 7, 75 |
| abstract_inverted_index.by | 90 |
| abstract_inverted_index.in | 10, 23, 139 |
| abstract_inverted_index.is | 6 |
| abstract_inverted_index.of | 2, 13, 46, 59, 66, 69, 98 |
| abstract_inverted_index.on | 19 |
| abstract_inverted_index.to | 35, 61, 104 |
| abstract_inverted_index.we | 55 |
| abstract_inverted_index.AI. | 15 |
| abstract_inverted_index.Our | 119 |
| abstract_inverted_index.The | 0 |
| abstract_inverted_index.and | 32, 72, 81, 101, 122, 133, 142 |
| abstract_inverted_index.can | 27 |
| abstract_inverted_index.era | 12 |
| abstract_inverted_index.for | 84 |
| abstract_inverted_index.not | 112 |
| abstract_inverted_index.our | 127 |
| abstract_inverted_index.set | 58 |
| abstract_inverted_index.the | 11, 67 |
| abstract_inverted_index.use | 102 |
| abstract_inverted_index.This | 16, 107 |
| abstract_inverted_index.does | 111 |
| abstract_inverted_index.span | 51 |
| abstract_inverted_index.that | 26, 50, 78, 126 |
| abstract_inverted_index.them | 103 |
| abstract_inverted_index.this | 89 |
| abstract_inverted_index.with | 95 |
| abstract_inverted_index.LLMs. | 106 |
| abstract_inverted_index.Next, | 54 |
| abstract_inverted_index.broad | 37 |
| abstract_inverted_index.human | 123 |
| abstract_inverted_index.large | 135 |
| abstract_inverted_index.paper | 17 |
| abstract_inverted_index.these | 85 |
| abstract_inverted_index.truth | 115 |
| abstract_inverted_index.convey | 33 |
| abstract_inverted_index.during | 117 |
| abstract_inverted_index.ground | 114 |
| abstract_inverted_index.scores | 80, 141 |
| abstract_inverted_index.slides | 4, 25 |
| abstract_inverted_index.REFLEX, | 74 |
| abstract_inverted_index.achieve | 88 |
| abstract_inverted_index.content | 22, 68 |
| abstract_inverted_index.degrees | 97 |
| abstract_inverted_index.focuses | 18 |
| abstract_inverted_index.metrics | 60 |
| abstract_inverted_index.present | 73 |
| abstract_inverted_index.problem | 9 |
| abstract_inverted_index.propose | 56 |
| abstract_inverted_index.require | 113 |
| abstract_inverted_index.samples | 94 |
| abstract_inverted_index.topics. | 53 |
| abstract_inverted_index.various | 52 |
| abstract_inverted_index.approach | 77, 129 |
| abstract_inverted_index.concepts | 34 |
| abstract_inverted_index.dataset, | 43 |
| abstract_inverted_index.document | 31 |
| abstract_inverted_index.feedback | 83 |
| abstract_inverted_index.language | 136 |
| abstract_inverted_index.metrics. | 86 |
| abstract_inverted_index.negative | 92 |
| abstract_inverted_index.audience. | 38 |
| abstract_inverted_index.automated | 121 |
| abstract_inverted_index.benchmark | 42 |
| abstract_inverted_index.classical | 131 |
| abstract_inverted_index.different | 63, 96 |
| abstract_inverted_index.extensive | 120 |
| abstract_inverted_index.fine-tune | 105 |
| abstract_inverted_index.generates | 79 |
| abstract_inverted_index.important | 8 |
| abstract_inverted_index.intrinsic | 64 |
| abstract_inverted_index.introduce | 40 |
| abstract_inverted_index.summarize | 29 |
| abstract_inverted_index.technique | 110 |
| abstract_inverted_index.RefSlides, | 44 |
| abstract_inverted_index.actionable | 82 |
| abstract_inverted_index.consisting | 45 |
| abstract_inverted_index.evaluating | 20 |
| abstract_inverted_index.evaluation | 76, 109, 128 |
| abstract_inverted_index.generating | 91, 140 |
| abstract_inverted_index.generation | 1 |
| abstract_inverted_index.generative | 14 |
| abstract_inverted_index.human-made | 47 |
| abstract_inverted_index.inference. | 118 |
| abstract_inverted_index.multimodal | 21 |
| abstract_inverted_index.properties | 65 |
| abstract_inverted_index.demonstrate | 125 |
| abstract_inverted_index.effectively | 28 |
| abstract_inverted_index.evaluations | 138 |
| abstract_inverted_index.experiments | 124 |
| abstract_inverted_index.model-based | 137 |
| abstract_inverted_index.outperforms | 130 |
| abstract_inverted_index.characterize | 62 |
| abstract_inverted_index.high-quality | 48 |
| abstract_inverted_index.presentation | 3, 24, 71, 93 |
| abstract_inverted_index.automatically | 5 |
| abstract_inverted_index.explanations. | 143 |
| abstract_inverted_index.perturbations | 100 |
| abstract_inverted_index.presentations | 49, 116 |
| abstract_inverted_index.reference-free | 108 |
| abstract_inverted_index.heuristic-based | 132 |
| abstract_inverted_index.metric-specific | 99 |
| abstract_inverted_index.state-of-the-art | 134 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 5 |
| citation_normalized_percentile |
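
The `abstract_inverted_index.*` rows above store the abstract as a map from each word to the positions where it occurs. A minimal sketch of turning such a map back into running text follows; the function name `rebuild_abstract` is hypothetical, and the sample dictionary copies only the entries for positions 0 to 5 from the table rather than the full index.

```python
from typing import Dict, List


def rebuild_abstract(inverted_index: Dict[str, List[int]]) -> str:
    """Rebuild text from a word -> positions map by sorting on position."""
    by_position: Dict[int, str] = {}
    for word, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = word
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Partial sample: only positions 0-5 from the table above.
sample = {
    "The": [0],
    "generation": [1],
    "of": [2],
    "presentation": [3],
    "slides": [4],
    "automatically": [5],
}

print(rebuild_abstract(sample))
# prints "The generation of presentation slides automatically"
```

Sorting on position rather than relying on dictionary order matters because the table lists words grouped by length, not by where they appear in the abstract.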