Surgical video workflow analysis via visual-language learning
· 2025
· Open Access
· DOI: https://doi.org/10.1038/s44401-024-00010-3
Surgical video workflow analysis has advanced rapidly in computer-assisted surgery through the use of deep learning models, with the aim of enhancing surgical scene analysis and decision-making. However, previous research has primarily focused on coarse-grained analysis of surgical videos, e.g., phase recognition, instrument recognition, and triplet recognition that only considers relationships within surgical triplets. To provide a more comprehensive, fine-grained analysis of surgical videos, this work focuses on accurately identifying <instrument, verb, target> triplets from surgical videos. Specifically, we propose a vision-language deep learning framework that incorporates intra- and inter-triplet modeling, termed I²TM, to explore the relationships among triplets and leverage the model's understanding of the entire surgical process, thereby enhancing the accuracy and robustness of recognition. We also develop a new surgical triplet semantic enhancer (TSE) to establish semantic relationships, both intra- and inter-triplet, across visual and textual modalities. Extensive experimental results on surgical video benchmark datasets demonstrate that our approach can capture finer semantics and achieve effective surgical video understanding and analysis, with potential for widespread medical applications.
Related Topics
- Type: article
- Language: en
- Landing Page: https://doi.org/10.1038/s44401-024-00010-3, https://www.nature.com/articles/s44401-024-00010-3.pdf
- OA Status: hybrid
- References: 53
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4406813556
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4406813556 (Canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.1038/s44401-024-00010-3 (Digital Object Identifier)
- Title: Surgical video workflow analysis via visual-language learning (Work title)
- Type: article (OpenAlex work type)
- Language: en (Primary language)
- Publication year: 2025 (Year of publication)
- Publication date: 2025-01-25 (Full publication date if available)
- Authors: Pengpeng Li, Xiangbo Shu, Chun-Mei Feng, Yifei Feng, Wangmeng Zuo, Jinhui Tang (List of authors in order)
- Landing page: https://doi.org/10.1038/s44401-024-00010-3 (Publisher landing page)
- PDF URL: https://www.nature.com/articles/s44401-024-00010-3.pdf (Direct link to full text PDF)
- Open access: Yes (Whether a free full text is available)
- OA status: hybrid (Open access status per OpenAlex)
- OA URL: https://www.nature.com/articles/s44401-024-00010-3.pdf (Direct OA link when available)
- Concepts: Workflow, Computer science, Natural language processing, Multimedia, Artificial intelligence, Database (Top concepts (fields/topics) attached by OpenAlex)
- Cited by: 0 (Total citation count in OpenAlex)
- References (count): 53 (Number of works referenced by this work)
- Related works (count): 10 (Other works algorithmically related by OpenAlex)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4406813556 |
| doi | https://doi.org/10.1038/s44401-024-00010-3 |
| ids.doi | https://doi.org/10.1038/s44401-024-00010-3 |
| ids.openalex | https://openalex.org/W4406813556 |
| fwci | 0.0 |
| type | article |
| title | Surgical video workflow analysis via visual-language learning |
| awards[0].id | https://openalex.org/G5512729199 |
| awards[0].funder_id | https://openalex.org/F4320321001 |
| awards[0].display_name | |
| awards[0].funder_award_id | 62222207 |
| awards[0].funder_display_name | National Natural Science Foundation of China |
| awards[1].id | https://openalex.org/G6329772822 |
| awards[1].funder_id | https://openalex.org/F4320321001 |
| awards[1].display_name | |
| awards[1].funder_award_id | 61925204 |
| awards[1].funder_display_name | National Natural Science Foundation of China |
| biblio.issue | 1 |
| biblio.volume | 2 |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10916 |
| topics[0].field.id | https://openalex.org/fields/27 |
| topics[0].field.display_name | Medicine |
| topics[0].score | 0.9983999729156494 |
| topics[0].domain.id | https://openalex.org/domains/4 |
| topics[0].domain.display_name | Health Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/2746 |
| topics[0].subfield.display_name | Surgery |
| topics[0].display_name | Surgical Simulation and Training |
| topics[1].id | https://openalex.org/T13953 |
| topics[1].field.id | https://openalex.org/fields/27 |
| topics[1].field.display_name | Medicine |
| topics[1].score | 0.9962999820709229 |
| topics[1].domain.id | https://openalex.org/domains/4 |
| topics[1].domain.display_name | Health Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/2739 |
| topics[1].subfield.display_name | Public Health, Environmental and Occupational Health |
| topics[1].display_name | Digital Imaging in Medicine |
| topics[2].id | https://openalex.org/T11894 |
| topics[2].field.id | https://openalex.org/fields/27 |
| topics[2].field.display_name | Medicine |
| topics[2].score | 0.9900000095367432 |
| topics[2].domain.id | https://openalex.org/domains/4 |
| topics[2].domain.display_name | Health Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/2741 |
| topics[2].subfield.display_name | Radiology, Nuclear Medicine and Imaging |
| topics[2].display_name | Radiology practices and education |
| funders[0].id | https://openalex.org/F4320321001 |
| funders[0].ror | https://ror.org/01h0zpd94 |
| funders[0].display_name | National Natural Science Foundation of China |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C177212765 |
| concepts[0].level | 2 |
| concepts[0].score | 0.7157450318336487 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q627335 |
| concepts[0].display_name | Workflow |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.6606315970420837 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C204321447 |
| concepts[2].level | 1 |
| concepts[2].score | 0.3921440541744232 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q30642 |
| concepts[2].display_name | Natural language processing |
| concepts[3].id | https://openalex.org/C49774154 |
| concepts[3].level | 1 |
| concepts[3].score | 0.3713897466659546 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q131765 |
| concepts[3].display_name | Multimedia |
| concepts[4].id | https://openalex.org/C154945302 |
| concepts[4].level | 1 |
| concepts[4].score | 0.34703364968299866 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[4].display_name | Artificial intelligence |
| concepts[5].id | https://openalex.org/C77088390 |
| concepts[5].level | 1 |
| concepts[5].score | 0.09634050726890564 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q8513 |
| concepts[5].display_name | Database |
| keywords[0].id | https://openalex.org/keywords/workflow |
| keywords[0].score | 0.7157450318336487 |
| keywords[0].display_name | Workflow |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.6606315970420837 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/natural-language-processing |
| keywords[2].score | 0.3921440541744232 |
| keywords[2].display_name | Natural language processing |
| keywords[3].id | https://openalex.org/keywords/multimedia |
| keywords[3].score | 0.3713897466659546 |
| keywords[3].display_name | Multimedia |
| keywords[4].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[4].score | 0.34703364968299866 |
| keywords[4].display_name | Artificial intelligence |
| keywords[5].id | https://openalex.org/keywords/database |
| keywords[5].score | 0.09634050726890564 |
| keywords[5].display_name | Database |
| language | en |
| locations[0].id | doi:10.1038/s44401-024-00010-3 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S5407042918 |
| locations[0].source.issn | 3005-1959 |
| locations[0].source.type | journal |
| locations[0].source.is_oa | False |
| locations[0].source.issn_l | 3005-1959 |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | npj Health Systems |
| locations[0].source.host_organization | |
| locations[0].source.host_organization_name | |
| locations[0].license | cc-by |
| locations[0].pdf_url | https://www.nature.com/articles/s44401-024-00010-3.pdf |
| locations[0].version | publishedVersion |
| locations[0].raw_type | journal-article |
| locations[0].license_id | https://openalex.org/licenses/cc-by |
| locations[0].is_accepted | True |
| locations[0].is_published | True |
| locations[0].raw_source_name | npj Health Systems |
| locations[0].landing_page_url | https://doi.org/10.1038/s44401-024-00010-3 |
| indexed_in | crossref |
| authorships[0].author.id | https://openalex.org/A5069405687 |
| authorships[0].author.orcid | https://orcid.org/0000-0001-7563-601X |
| authorships[0].author.display_name | Pengpeng Li |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Pengpeng Li |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5040437528 |
| authorships[1].author.orcid | https://orcid.org/0000-0003-4902-4663 |
| authorships[1].author.display_name | Xiangbo Shu |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Xiangbo Shu |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5049444898 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-3025-8964 |
| authorships[2].author.display_name | Chun-Mei Feng |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Chun-Mei Feng |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5071041884 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-9394-4814 |
| authorships[3].author.display_name | Yifei Feng |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Yifei Feng |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5100636655 |
| authorships[4].author.orcid | https://orcid.org/0000-0002-3330-783X |
| authorships[4].author.display_name | Wangmeng Zuo |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Wangmeng Zuo |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5035112538 |
| authorships[5].author.orcid | https://orcid.org/0000-0001-9008-222X |
| authorships[5].author.display_name | Jinhui Tang |
| authorships[5].author_position | last |
| authorships[5].raw_author_name | Jinhui Tang |
| authorships[5].is_corresponding | False |
| has_content.pdf | True |
| has_content.grobid_xml | True |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://www.nature.com/articles/s44401-024-00010-3.pdf |
| open_access.oa_status | hybrid |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Surgical video workflow analysis via visual-language learning |
| has_fulltext | True |
| is_retracted | False |
| updated_date | 2025-11-06T03:46:38.306776 |
| primary_topic.id | https://openalex.org/T10916 |
| primary_topic.field.id | https://openalex.org/fields/27 |
| primary_topic.field.display_name | Medicine |
| primary_topic.score | 0.9983999729156494 |
| primary_topic.domain.id | https://openalex.org/domains/4 |
| primary_topic.domain.display_name | Health Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/2746 |
| primary_topic.subfield.display_name | Surgery |
| primary_topic.display_name | Surgical Simulation and Training |
| related_works | https://openalex.org/W4391375266, https://openalex.org/W2899084033, https://openalex.org/W2748952813, https://openalex.org/W1981780420, https://openalex.org/W2182707996, https://openalex.org/W45233828, https://openalex.org/W2964988449, https://openalex.org/W2397952901, https://openalex.org/W2029380707, https://openalex.org/W3204019825 |
| cited_by_count | 0 |
| locations_count | 1 |
| best_oa_location.id | doi:10.1038/s44401-024-00010-3 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S5407042918 |
| best_oa_location.source.issn | 3005-1959 |
| best_oa_location.source.type | journal |
| best_oa_location.source.is_oa | False |
| best_oa_location.source.issn_l | 3005-1959 |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | npj Health Systems |
| best_oa_location.source.host_organization | |
| best_oa_location.source.host_organization_name | |
| best_oa_location.license | cc-by |
| best_oa_location.pdf_url | https://www.nature.com/articles/s44401-024-00010-3.pdf |
| best_oa_location.version | publishedVersion |
| best_oa_location.raw_type | journal-article |
| best_oa_location.license_id | https://openalex.org/licenses/cc-by |
| best_oa_location.is_accepted | True |
| best_oa_location.is_published | True |
| best_oa_location.raw_source_name | npj Health Systems |
| best_oa_location.landing_page_url | https://doi.org/10.1038/s44401-024-00010-3 |
| primary_location.id | doi:10.1038/s44401-024-00010-3 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S5407042918 |
| primary_location.source.issn | 3005-1959 |
| primary_location.source.type | journal |
| primary_location.source.is_oa | False |
| primary_location.source.issn_l | 3005-1959 |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | npj Health Systems |
| primary_location.source.host_organization | |
| primary_location.source.host_organization_name | |
| primary_location.license | cc-by |
| primary_location.pdf_url | https://www.nature.com/articles/s44401-024-00010-3.pdf |
| primary_location.version | publishedVersion |
| primary_location.raw_type | journal-article |
| primary_location.license_id | https://openalex.org/licenses/cc-by |
| primary_location.is_accepted | True |
| primary_location.is_published | True |
| primary_location.raw_source_name | npj Health Systems |
| primary_location.landing_page_url | https://doi.org/10.1038/s44401-024-00010-3 |
| publication_date | 2025-01-25 |
| publication_year | 2025 |
| referenced_works | https://openalex.org/W2509882994, https://openalex.org/W4243063216, https://openalex.org/W3203204495, https://openalex.org/W4282969740, https://openalex.org/W2777273430, https://openalex.org/W4309095845, https://openalex.org/W3087253276, https://openalex.org/W2884036902, https://openalex.org/W4311633645, https://openalex.org/W4388321696, https://openalex.org/W3091747220, https://openalex.org/W3207159596, https://openalex.org/W3091935435, https://openalex.org/W3091895136, https://openalex.org/W4385324520, https://openalex.org/W3198379050, https://openalex.org/W4366985998, https://openalex.org/W4285818716, https://openalex.org/W4387211888, https://openalex.org/W4388186341, https://openalex.org/W6601949647, https://openalex.org/W6601487527, https://openalex.org/W4392172801, https://openalex.org/W3135745701, https://openalex.org/W4385300713, https://openalex.org/W4316128849, https://openalex.org/W4383070152, https://openalex.org/W3092562667, https://openalex.org/W4387225560, https://openalex.org/W2995887567, https://openalex.org/W4280517006, https://openalex.org/W2580456502, https://openalex.org/W4387845481, https://openalex.org/W4210656949, https://openalex.org/W4323921292, https://openalex.org/W2884361029, https://openalex.org/W4310858512, https://openalex.org/W4380631538, https://openalex.org/W4379740528, https://openalex.org/W4360604432, https://openalex.org/W6828894009, https://openalex.org/W3151130473, https://openalex.org/W4386071863, https://openalex.org/W3007075806, https://openalex.org/W3004952083, https://openalex.org/W3135367836, https://openalex.org/W4386076454, https://openalex.org/W4387075457, https://openalex.org/W2041004593, https://openalex.org/W2108598243, https://openalex.org/W2991391304, https://openalex.org/W4365143687, https://openalex.org/W2266464013 |
| referenced_works_count | 53 |
| abstract_inverted_index., | 73, 75 |
| abstract_inverted_index.2 | 98 |
| abstract_inverted_index.I | 97 |
| abstract_inverted_index.a | 56, 84, 128 |
| abstract_inverted_index.In | 52 |
| abstract_inverted_index.by | 12 |
| abstract_inverted_index.in | 9 |
| abstract_inverted_index.of | 34, 61, 111, 122 |
| abstract_inverted_index.on | 31, 67, 151 |
| abstract_inverted_index.to | 18, 54, 100, 135 |
| abstract_inverted_index.we | 82, 125 |
| abstract_inverted_index.TM, | 99 |
| abstract_inverted_index.and | 23, 42, 92, 106, 120, 141, 145, 169 |
| abstract_inverted_index.can | 160 |
| abstract_inverted_index.for | 173 |
| abstract_inverted_index.has | 5, 28 |
| abstract_inverted_index.new | 129 |
| abstract_inverted_index.our | 158 |
| abstract_inverted_index.the | 102, 108, 112, 118 |
| abstract_inverted_index.> | 77 |
| abstract_inverted_index.< | 71 |
| abstract_inverted_index.also | 126 |
| abstract_inverted_index.both | 139 |
| abstract_inverted_index.deep | 14, 86 |
| abstract_inverted_index.from | 78 |
| abstract_inverted_index.made | 6 |
| abstract_inverted_index.more | 57 |
| abstract_inverted_index.only | 46 |
| abstract_inverted_index.that | 45, 89, 157 |
| abstract_inverted_index.this | 64 |
| abstract_inverted_index.verb | 74 |
| abstract_inverted_index.with | 171 |
| abstract_inverted_index.work | 65 |
| abstract_inverted_index.(TSE) | 134 |
| abstract_inverted_index.among | 104 |
| abstract_inverted_index.e.g., | 37 |
| abstract_inverted_index.finer | 162 |
| abstract_inverted_index.model | 109 |
| abstract_inverted_index.order | 53 |
| abstract_inverted_index.phase | 38 |
| abstract_inverted_index.scene | 21 |
| abstract_inverted_index.video | 2, 153, 167 |
| abstract_inverted_index.across | 143 |
| abstract_inverted_index.aiming | 17 |
| abstract_inverted_index.entire | 113 |
| abstract_inverted_index.inter- | 93 |
| abstract_inverted_index.intra- | 91, 140 |
| abstract_inverted_index.target | 76 |
| abstract_inverted_index.termed | 96 |
| abstract_inverted_index.visual | 144 |
| abstract_inverted_index.within | 49 |
| abstract_inverted_index.achieve | 164 |
| abstract_inverted_index.capture | 161 |
| abstract_inverted_index.develop | 127 |
| abstract_inverted_index.enhance | 19 |
| abstract_inverted_index.explore | 101 |
| abstract_inverted_index.focused | 30 |
| abstract_inverted_index.focuses | 66 |
| abstract_inverted_index.medical | 175 |
| abstract_inverted_index.models, | 16 |
| abstract_inverted_index.propose | 83 |
| abstract_inverted_index.provide | 55 |
| abstract_inverted_index.results | 150 |
| abstract_inverted_index.surgery | 11 |
| abstract_inverted_index.textual | 146 |
| abstract_inverted_index.thereby | 116 |
| abstract_inverted_index.triplet | 43, 94, 131 |
| abstract_inverted_index.videos, | 36, 63 |
| abstract_inverted_index.videos. | 80 |
| abstract_inverted_index.Abstract | 0 |
| abstract_inverted_index.Besides, | 124 |
| abstract_inverted_index.However, | 25 |
| abstract_inverted_index.Surgical | 1 |
| abstract_inverted_index.accuracy | 119 |
| abstract_inverted_index.analysis | 4, 22, 33, 60 |
| abstract_inverted_index.approach | 159 |
| abstract_inverted_index.datasets | 155 |
| abstract_inverted_index.enhancer | 133 |
| abstract_inverted_index.learning | 15, 87 |
| abstract_inverted_index.leverage | 107 |
| abstract_inverted_index.previous | 26 |
| abstract_inverted_index.process, | 115 |
| abstract_inverted_index.research | 27 |
| abstract_inverted_index.semantic | 132, 137 |
| abstract_inverted_index.surgical | 20, 35, 50, 62, 79, 114, 130, 152, 166 |
| abstract_inverted_index.triplets | 70, 105 |
| abstract_inverted_index.workflow | 3 |
| abstract_inverted_index.Extensive | 148 |
| abstract_inverted_index.analysis, | 170 |
| abstract_inverted_index.benchmark | 154 |
| abstract_inverted_index.combining | 13 |
| abstract_inverted_index.considers | 47 |
| abstract_inverted_index.effective | 165 |
| abstract_inverted_index.enhancing | 117 |
| abstract_inverted_index.establish | 136 |
| abstract_inverted_index.framework | 88 |
| abstract_inverted_index.intensive | 7 |
| abstract_inverted_index.modeling, | 95 |
| abstract_inverted_index.potential | 172 |
| abstract_inverted_index.primarily | 29 |
| abstract_inverted_index.triplets. | 51 |
| abstract_inverted_index.accurately | 68 |
| abstract_inverted_index.instrument | 40, 72 |
| abstract_inverted_index.robustness | 121 |
| abstract_inverted_index.semantics, | 163 |
| abstract_inverted_index.widespread | 174 |
| abstract_inverted_index.demonstrate | 156 |
| abstract_inverted_index.development | 8 |
| abstract_inverted_index.identifying | 69 |
| abstract_inverted_index.modalities. | 147 |
| abstract_inverted_index.recognition | 44 |
| abstract_inverted_index.experimental | 149 |
| abstract_inverted_index.fine-grained | 59 |
| abstract_inverted_index.incorporates | 90 |
| abstract_inverted_index.recognition, | 39, 41 |
| abstract_inverted_index.recognition. | 123 |
| abstract_inverted_index.Specifically, | 81 |
| abstract_inverted_index.applications. | 176 |
| abstract_inverted_index.comprehensive | 58 |
| abstract_inverted_index.relationships | 48, 103 |
| abstract_inverted_index.understanding | 110, 168 |
| abstract_inverted_index.coarse-grained | 32 |
| abstract_inverted_index.relationships, | 138 |
| abstract_inverted_index.inter-triplets, | 142 |
| abstract_inverted_index.vision-language | 85 |
| abstract_inverted_index.decision-making. | 24 |
| abstract_inverted_index.computer-assisted | 10 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 6 |
| citation_normalized_percentile.value | 0.03539221 |
| citation_normalized_percentile.is_in_top_1_percent | False |
| citation_normalized_percentile.is_in_top_10_percent | True |
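
The `abstract_inverted_index.*` rows in the payload above store the abstract as a mapping from each token to the word positions where it occurs. The following is a minimal sketch of how such a mapping can be turned back into running text, assuming the payload has already been parsed into a Python dictionary of token to list of integer positions. The function name `rebuild_abstract` and the `sample` dictionary are illustrative assumptions, not part of any OpenAlex client; the sample is a hand-copied subset limited to positions 0 through 6 of the rows above.

```python
def rebuild_abstract(inverted_index: dict[str, list[int]]) -> str:
    """Rebuild plain text from a token -> positions mapping (hypothetical helper)."""
    # Expand every (token, positions) entry into individual (position, token) pairs.
    positioned = [
        (pos, token)
        for token, positions in inverted_index.items()
        for pos in positions
    ]
    # Sorting by position restores the original word order.
    positioned.sort()
    return " ".join(token for _, token in positioned)


# Illustrative subset of the table above, restricted to positions 0-6.
sample = {
    "Abstract": [0],
    "Surgical": [1],
    "video": [2],
    "workflow": [3],
    "analysis": [4],
    "has": [5],
    "made": [6],
}

print(rebuild_abstract(sample))  # prints "Abstract Surgical video workflow analysis has made"
```

Because punctuation is stored as part of the tokens (for example `videos,` and `videos.` are separate keys), joining on single spaces is enough to recover the text; the leading `Abstract` token at position 0 is part of the index and may need to be dropped by the caller.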