LitAI: Enhancing Multimodal Literature Understanding and Mining with Generative AI
2024 · Open Access · DOI: https://doi.org/10.1109/mipr62202.2024.00080
Information processing and retrieval in literature are critical for advancing scientific research and knowledge discovery. The inherent multimodality and diverse literature formats, including text, tables, and figures, present significant challenges in literature information retrieval. This paper introduces LitAI, a novel approach that employs readily available generative AI tools to enhance multimodal information retrieval from literature documents. By integrating tools such as optical character recognition (OCR) with generative AI services, LitAI facilitates the retrieval of text, tables, and figures from PDF documents. We have developed specific prompts that leverage in-context learning and prompt engineering within Generative AI to achieve precise information extraction. Our empirical evaluations, conducted on datasets from the ecological and biological sciences, demonstrate the superiority of our approach over several established baselines including Tesseract-OCR and GPT-4. The implementation of LitAI is accessible at https://github.com/ResponsibleAILab/LitAI.
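The abstract mentions prompts that use in-context learning to extract information from OCR output. As a rough illustration only, the sketch below assembles a few-shot prompt from OCR text. The `Example` class, the `build_prompt` function, the instruction wording, and the sample strings are all hypothetical and are not taken from the LitAI repository; the call to a generative AI service is deliberately left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Example:
    """One in-context demonstration: noisy OCR text and its cleaned form (hypothetical)."""
    ocr_text: str
    cleaned_text: str


def build_prompt(instruction: str, examples: List[Example], ocr_text: str) -> str:
    """Assemble a few-shot prompt: instruction, demonstrations, then the new input."""
    parts = [instruction.strip(), ""]
    for i, ex in enumerate(examples, start=1):
        parts.append(f"Example {i} input:\n{ex.ocr_text}")
        parts.append(f"Example {i} output:\n{ex.cleaned_text}")
        parts.append("")
    # The final block leaves the output open for the model to complete.
    parts.append(f"Input:\n{ocr_text}")
    parts.append("Output:")
    return "\n".join(parts)


# Hypothetical demonstration data, invented for this sketch.
demos = [Example("Specles rlchness was hlgh.", "Species richness was high.")]
prompt = build_prompt(
    "Correct OCR errors in the input text without changing its meaning.",
    demos,
    "The strearn samp1es were col1ected in May.",
)
print(prompt)
```

The resulting string would then be sent to whichever generative AI service is in use; how LitAI itself structures its prompts is documented in the repository linked above.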
Related Topics
- Type
- article
- Language
- en
- Landing Page
- https://doi.org/10.1109/mipr62202.2024.00080
- OA Status
- green
- Cited By
- 3
- References
- 15
- Related Works
- 10
- OpenAlex ID
- https://openalex.org/W4403421871
Raw OpenAlex JSON
- OpenAlex ID (canonical identifier for this work in OpenAlex)
- https://openalex.org/W4403421871
- DOI (Digital Object Identifier)
- https://doi.org/10.1109/mipr62202.2024.00080
- Title (work title)
- LitAI: Enhancing Multimodal Literature Understanding and Mining with Generative AI
- Type (OpenAlex work type)
- article
- Language (primary language)
- en
- Publication year (year of publication)
- 2024
- Publication date (full publication date if available)
- 2024-08-07
- Authors (list of authors in order)
- Gowtham Medisetti, Zacchaeus G. Compson, Heng Fan, Huaxiao Yang, Yunhe Feng
- Landing page (publisher landing page)
- https://doi.org/10.1109/mipr62202.2024.00080
- Open access (whether a free full text is available)
- Yes
- OA status (open access status per OpenAlex)
- green
- OA URL (direct OA link when available)
- https://www.ncbi.nlm.nih.gov/pmc/articles/11526646
- Concepts (top concepts, i.e. fields/topics, attached by OpenAlex)
- Generative grammar, Computer science, Artificial intelligence, Natural language processing
- Cited by (total citation count in OpenAlex)
- 3
- Citations by year, recent (per-year citation counts, last 5 years)
- 2025: 3
- References, count (number of works referenced by this work)
- 15
- Related works, count (other works algorithmically related by OpenAlex)
- 10
Full payload
| id | https://openalex.org/W4403421871 |
|---|---|
| doi | https://doi.org/10.1109/mipr62202.2024.00080 |
| ids.doi | https://doi.org/10.1109/mipr62202.2024.00080 |
| ids.pmid | https://pubmed.ncbi.nlm.nih.gov/39483325 |
| ids.openalex | https://openalex.org/W4403421871 |
| fwci | 1.91633565 |
| type | article |
| title | LitAI: Enhancing Multimodal Literature Understanding and Mining with Generative AI |
| biblio.issue | |
| biblio.volume | 13 |
| biblio.last_page | 476 |
| biblio.first_page | 471 |
| topics[0].id | https://openalex.org/T10181 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9994000196456909 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Natural Language Processing Techniques |
| topics[1].id | https://openalex.org/T10215 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9987000226974487 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Semantic Web and Ontologies |
| topics[2].id | https://openalex.org/T10028 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9977999925613403 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1702 |
| topics[2].subfield.display_name | Artificial Intelligence |
| topics[2].display_name | Topic Modeling |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C39890363 |
| concepts[0].level | 2 |
| concepts[0].score | 0.7684775590896606 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q36108 |
| concepts[0].display_name | Generative grammar |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.6884309649467468 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C154945302 |
| concepts[2].level | 1 |
| concepts[2].score | 0.51751708984375 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[2].display_name | Artificial intelligence |
| concepts[3].id | https://openalex.org/C204321447 |
| concepts[3].level | 1 |
| concepts[3].score | 0.39744043350219727 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q30642 |
| concepts[3].display_name | Natural language processing |
| keywords[0].id | https://openalex.org/keywords/generative-grammar |
| keywords[0].score | 0.7684775590896606 |
| keywords[0].display_name | Generative grammar |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.6884309649467468 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[2].score | 0.51751708984375 |
| keywords[2].display_name | Artificial intelligence |
| keywords[3].id | https://openalex.org/keywords/natural-language-processing |
| keywords[3].score | 0.39744043350219727 |
| keywords[3].display_name | Natural language processing |
| language | en |
| locations[0].id | doi:10.1109/mipr62202.2024.00080 |
| locations[0].is_oa | False |
| locations[0].source | |
| locations[0].license | |
| locations[0].pdf_url | |
| locations[0].version | publishedVersion |
| locations[0].raw_type | proceedings-article |
| locations[0].license_id | |
| locations[0].is_accepted | True |
| locations[0].is_published | True |
| locations[0].raw_source_name | 2024 IEEE 7th International Conference on Multimedia Information Processing and Retrieval (MIPR) |
| locations[0].landing_page_url | https://doi.org/10.1109/mipr62202.2024.00080 |
| locations[1].id | pmid:39483325 |
| locations[1].is_oa | False |
| locations[1].source.id | https://openalex.org/S4306525036 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | False |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | PubMed |
| locations[1].source.host_organization | https://openalex.org/I1299303238 |
| locations[1].source.host_organization_name | National Institutes of Health |
| locations[1].source.host_organization_lineage | https://openalex.org/I1299303238 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | publishedVersion |
| locations[1].raw_type | |
| locations[1].license_id | |
| locations[1].is_accepted | True |
| locations[1].is_published | True |
| locations[1].raw_source_name | Proceedings. IEEE Conference on Multimedia Information Processing and Retrieval |
| locations[1].landing_page_url | https://pubmed.ncbi.nlm.nih.gov/39483325 |
| locations[2].id | pmh:oai:pubmedcentral.nih.gov:11526646 |
| locations[2].is_oa | True |
| locations[2].source.id | https://openalex.org/S2764455111 |
| locations[2].source.issn | |
| locations[2].source.type | repository |
| locations[2].source.is_oa | False |
| locations[2].source.issn_l | |
| locations[2].source.is_core | False |
| locations[2].source.is_in_doaj | False |
| locations[2].source.display_name | PubMed Central |
| locations[2].source.host_organization | https://openalex.org/I1299303238 |
| locations[2].source.host_organization_name | National Institutes of Health |
| locations[2].source.host_organization_lineage | https://openalex.org/I1299303238 |
| locations[2].license | |
| locations[2].pdf_url | |
| locations[2].version | submittedVersion |
| locations[2].raw_type | Text |
| locations[2].license_id | |
| locations[2].is_accepted | False |
| locations[2].is_published | False |
| locations[2].raw_source_name | Proc (IEEE Conf Multimed Inf Process Retr) |
| locations[2].landing_page_url | https://www.ncbi.nlm.nih.gov/pmc/articles/11526646 |
| indexed_in | crossref, pubmed |
| authorships[0].author.id | https://openalex.org/A5114275420 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Gowtham Medisetti |
| authorships[0].countries | US |
| authorships[0].affiliations[0].institution_ids | https://openalex.org/I123534392 |
| authorships[0].affiliations[0].raw_affiliation_string | University of North Texas,Denton,TX,USA |
| authorships[0].institutions[0].id | https://openalex.org/I123534392 |
| authorships[0].institutions[0].ror | https://ror.org/00v97ad02 |
| authorships[0].institutions[0].type | education |
| authorships[0].institutions[0].lineage | https://openalex.org/I123534392 |
| authorships[0].institutions[0].country_code | US |
| authorships[0].institutions[0].display_name | University of North Texas |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Gowtham Medisetti |
| authorships[0].is_corresponding | False |
| authorships[0].raw_affiliation_strings | University of North Texas,Denton,TX,USA |
| authorships[1].author.id | https://openalex.org/A5077171703 |
| authorships[1].author.orcid | https://orcid.org/0000-0003-2015-3015 |
| authorships[1].author.display_name | Zacchaeus G. Compson |
| authorships[1].countries | US |
| authorships[1].affiliations[0].institution_ids | https://openalex.org/I123534392 |
| authorships[1].affiliations[0].raw_affiliation_string | University of North Texas,Denton,TX,USA |
| authorships[1].institutions[0].id | https://openalex.org/I123534392 |
| authorships[1].institutions[0].ror | https://ror.org/00v97ad02 |
| authorships[1].institutions[0].type | education |
| authorships[1].institutions[0].lineage | https://openalex.org/I123534392 |
| authorships[1].institutions[0].country_code | US |
| authorships[1].institutions[0].display_name | University of North Texas |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Zacchaeus Compson |
| authorships[1].is_corresponding | False |
| authorships[1].raw_affiliation_strings | University of North Texas,Denton,TX,USA |
| authorships[2].author.id | https://openalex.org/A5047220188 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-7033-3690 |
| authorships[2].author.display_name | Heng Fan |
| authorships[2].countries | US |
| authorships[2].affiliations[0].institution_ids | https://openalex.org/I123534392 |
| authorships[2].affiliations[0].raw_affiliation_string | University of North Texas,Denton,TX,USA |
| authorships[2].institutions[0].id | https://openalex.org/I123534392 |
| authorships[2].institutions[0].ror | https://ror.org/00v97ad02 |
| authorships[2].institutions[0].type | education |
| authorships[2].institutions[0].lineage | https://openalex.org/I123534392 |
| authorships[2].institutions[0].country_code | US |
| authorships[2].institutions[0].display_name | University of North Texas |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Heng Fan |
| authorships[2].is_corresponding | False |
| authorships[2].raw_affiliation_strings | University of North Texas,Denton,TX,USA |
| authorships[3].author.id | https://openalex.org/A5062744580 |
| authorships[3].author.orcid | https://orcid.org/0000-0001-9335-8201 |
| authorships[3].author.display_name | Huaxiao Yang |
| authorships[3].countries | US |
| authorships[3].affiliations[0].institution_ids | https://openalex.org/I123534392 |
| authorships[3].affiliations[0].raw_affiliation_string | University of North Texas,Denton,TX,USA |
| authorships[3].institutions[0].id | https://openalex.org/I123534392 |
| authorships[3].institutions[0].ror | https://ror.org/00v97ad02 |
| authorships[3].institutions[0].type | education |
| authorships[3].institutions[0].lineage | https://openalex.org/I123534392 |
| authorships[3].institutions[0].country_code | US |
| authorships[3].institutions[0].display_name | University of North Texas |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Huaxiao Yang |
| authorships[3].is_corresponding | False |
| authorships[3].raw_affiliation_strings | University of North Texas,Denton,TX,USA |
| authorships[4].author.id | https://openalex.org/A5073748933 |
| authorships[4].author.orcid | https://orcid.org/0000-0001-6577-227X |
| authorships[4].author.display_name | Yunhe Feng |
| authorships[4].countries | US |
| authorships[4].affiliations[0].institution_ids | https://openalex.org/I123534392 |
| authorships[4].affiliations[0].raw_affiliation_string | University of North Texas,Denton,TX,USA |
| authorships[4].institutions[0].id | https://openalex.org/I123534392 |
| authorships[4].institutions[0].ror | https://ror.org/00v97ad02 |
| authorships[4].institutions[0].type | education |
| authorships[4].institutions[0].lineage | https://openalex.org/I123534392 |
| authorships[4].institutions[0].country_code | US |
| authorships[4].institutions[0].display_name | University of North Texas |
| authorships[4].author_position | last |
| authorships[4].raw_author_name | Yunhe Feng |
| authorships[4].is_corresponding | False |
| authorships[4].raw_affiliation_strings | University of North Texas,Denton,TX,USA |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://www.ncbi.nlm.nih.gov/pmc/articles/11526646 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2024-10-16T00:00:00 |
| display_name | LitAI: Enhancing Multimodal Literature Understanding and Mining with Generative AI |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T03:46:38.306776 |
| primary_topic.id | https://openalex.org/T10181 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9994000196456909 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Natural Language Processing Techniques |
| related_works | https://openalex.org/W4391375266, https://openalex.org/W2899084033, https://openalex.org/W2748952813, https://openalex.org/W2380075625, https://openalex.org/W2390279801, https://openalex.org/W4391913857, https://openalex.org/W2358668433, https://openalex.org/W4396701345, https://openalex.org/W2376932109, https://openalex.org/W3204019825 |
| cited_by_count | 3 |
| counts_by_year[0].year | 2025 |
| counts_by_year[0].cited_by_count | 3 |
| locations_count | 3 |
| best_oa_location.id | pmh:oai:pubmedcentral.nih.gov:11526646 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S2764455111 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | False |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | PubMed Central |
| best_oa_location.source.host_organization | https://openalex.org/I1299303238 |
| best_oa_location.source.host_organization_name | National Institutes of Health |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I1299303238 |
| best_oa_location.license | |
| best_oa_location.pdf_url | |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | Text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | Proc (IEEE Conf Multimed Inf Process Retr) |
| best_oa_location.landing_page_url | https://www.ncbi.nlm.nih.gov/pmc/articles/11526646 |
| primary_location.id | doi:10.1109/mipr62202.2024.00080 |
| primary_location.is_oa | False |
| primary_location.source | |
| primary_location.license | |
| primary_location.pdf_url | |
| primary_location.version | publishedVersion |
| primary_location.raw_type | proceedings-article |
| primary_location.license_id | |
| primary_location.is_accepted | True |
| primary_location.is_published | True |
| primary_location.raw_source_name | 2024 IEEE 7th International Conference on Multimedia Information Processing and Retrieval (MIPR) |
| primary_location.landing_page_url | https://doi.org/10.1109/mipr62202.2024.00080 |
| publication_date | 2024-08-07 |
| publication_year | 2024 |
| referenced_works | https://openalex.org/W2031094399, https://openalex.org/W4394779950, https://openalex.org/W2153554054, https://openalex.org/W6727255303, https://openalex.org/W2164314981, https://openalex.org/W2001642682, https://openalex.org/W4389045144, https://openalex.org/W2137998699, https://openalex.org/W2023704244, https://openalex.org/W4389519217, https://openalex.org/W4399069273, https://openalex.org/W2142809404, https://openalex.org/W6621906925, https://openalex.org/W4226182655, https://openalex.org/W4300081969 |
| referenced_works_count | 15 |
| abstract_inverted_index.a | 38 |
| abstract_inverted_index.AI | 46, 67, 95 |
| abstract_inverted_index.By | 56 |
| abstract_inverted_index.We | 81 |
| abstract_inverted_index.as | 60 |
| abstract_inverted_index.at | 133 |
| abstract_inverted_index.in | 4, 30 |
| abstract_inverted_index.is | 131 |
| abstract_inverted_index.of | 73, 116, 129 |
| abstract_inverted_index.on | 105 |
| abstract_inverted_index.to | 48, 96 |
| abstract_inverted_index.Our | 101 |
| abstract_inverted_index.PDF | 79 |
| abstract_inverted_index.The | 15, 127 |
| abstract_inverted_index.and | 2, 12, 18, 25, 76, 90, 110, 125 |
| abstract_inverted_index.are | 6 |
| abstract_inverted_index.for | 8 |
| abstract_inverted_index.our | 117 |
| abstract_inverted_index.the | 71, 108, 114 |
| abstract_inverted_index.This | 34 |
| abstract_inverted_index.from | 53, 78, 107 |
| abstract_inverted_index.have | 82 |
| abstract_inverted_index.over | 119 |
| abstract_inverted_index.such | 59 |
| abstract_inverted_index.that | 41, 86 |
| abstract_inverted_index.with | 65 |
| abstract_inverted_index.(OCR) | 64 |
| abstract_inverted_index.novel | 39 |
| abstract_inverted_index.paper | 35 |
| abstract_inverted_index.text, | 23, 74 |
| abstract_inverted_index.tools | 47, 58 |
| abstract_inverted_index.GPT-4. | 126 |
| abstract_inverted_index.prompt | 91 |
| abstract_inverted_index.within | 93 |
| abstract_inverted_index.achieve | 97 |
| abstract_inverted_index.diverse | 19 |
| abstract_inverted_index.employs | 42 |
| abstract_inverted_index.enhance | 49 |
| abstract_inverted_index.figures | 77 |
| abstract_inverted_index.optical | 61 |
| abstract_inverted_index.precise | 98 |
| abstract_inverted_index.present | 27 |
| abstract_inverted_index.prompts | 85 |
| abstract_inverted_index.readily | 43 |
| abstract_inverted_index.several | 120 |
| abstract_inverted_index.tables, | 24, 75 |
| abstract_inverted_index.approach | 40, 118 |
| abstract_inverted_index.critical | 7 |
| abstract_inverted_index.datasets | 106 |
| abstract_inverted_index.figures, | 26 |
| abstract_inverted_index.formats, | 21 |
| abstract_inverted_index.inherent | 16 |
| abstract_inverted_index.learning | 89 |
| abstract_inverted_index.leverage | 87 |
| abstract_inverted_index.research | 11 |
| abstract_inverted_index.specific | 84 |
| abstract_inverted_index.advancing | 9 |
| abstract_inverted_index.available | 44 |
| abstract_inverted_index.baselines | 122 |
| abstract_inverted_index.character | 62 |
| abstract_inverted_index.conducted | 104 |
| abstract_inverted_index.developed | 83 |
| abstract_inverted_index.empirical | 102 |
| abstract_inverted_index.including | 22, 123 |
| abstract_inverted_index.knowledge | 13 |
| abstract_inverted_index.retrieval | 3, 52, 72 |
| abstract_inverted_index.sciences, | 112 |
| abstract_inverted_index.services, | 68 |
| abstract_inverted_index.Generative | 94 |
| abstract_inverted_index.accessible | 132 |
| abstract_inverted_index.biological | 111 |
| abstract_inverted_index.challenges | 29 |
| abstract_inverted_index.discovery. | 14 |
| abstract_inverted_index.documents. | 55, 80 |
| abstract_inverted_index.ecological | 109 |
| abstract_inverted_index.generative | 45, 66 |
| abstract_inverted_index.in-context | 88 |
| abstract_inverted_index.introduces | 36 |
| abstract_inverted_index.literature | 5, 20, 31, 54 |
| abstract_inverted_index.multimodal | 50 |
| abstract_inverted_index.processing | 1 |
| abstract_inverted_index.retrieval. | 33 |
| abstract_inverted_index.scientific | 10 |
| abstract_inverted_index.Information | 0 |
| abstract_inverted_index.demonstrate | 113 |
| abstract_inverted_index.engineering | 92 |
| abstract_inverted_index.established | 121 |
| abstract_inverted_index.extraction. | 100 |
| abstract_inverted_index.facilitates | 70 |
| abstract_inverted_index.information | 32, 51, 99 |
| abstract_inverted_index.integrating | 57 |
| abstract_inverted_index.recognition | 63 |
| abstract_inverted_index.significant | 28 |
| abstract_inverted_index.superiority | 115 |
| abstract_inverted_index.<i>LitAI</i> | 69, 130 |
| abstract_inverted_index.evaluations, | 103 |
| abstract_inverted_index.<i>LitAI</i>, | 37 |
| abstract_inverted_index.Tesseract-OCR | 124 |
| abstract_inverted_index.multimodality | 17 |
| abstract_inverted_index.implementation | 128 |
| abstract_inverted_index.https://github.com/ResponsibleAILab/LitAI. | 134 |
| cited_by_percentile_year.max | 97 |
| cited_by_percentile_year.min | 96 |
| countries_distinct_count | 1 |
| institutions_distinct_count | 5 |
| citation_normalized_percentile.value | 0.85221317 |
| citation_normalized_percentile.is_in_top_1_percent | False |
| citation_normalized_percentile.is_in_top_10_percent | False |
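
The `abstract_inverted_index.*` rows in the payload above store the abstract as a mapping from each token to the word positions where it occurs. A minimal sketch of turning such a mapping back into running text follows; the function name `rebuild_abstract` is an illustrative choice, and the sample dictionary is only a short excerpt (positions 0 to 7) of the payload, with later positions of repeated tokens omitted.

```python
from typing import Dict, List


def rebuild_abstract(inverted_index: Dict[str, List[int]]) -> str:
    """Rebuild plain text from a token -> positions mapping."""
    # Flatten to (position, token) pairs, then order them by position.
    positioned = [
        (pos, token)
        for token, positions in inverted_index.items()
        for pos in positions
    ]
    positioned.sort()
    return " ".join(token for _, token in positioned)


# Excerpt of the payload: only the first eight word positions are included.
sample = {
    "Information": [0],
    "processing": [1],
    "and": [2],
    "retrieval": [3],
    "in": [4],
    "literature": [5],
    "are": [6],
    "critical": [7],
}

print(rebuild_abstract(sample))
# prints: Information processing and retrieval in literature are critical
```

Applied to the full set of rows, the same function should yield the abstract shown at the top of this record, with markup such as `<i>LitAI</i>` preserved as it appears in the index.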