HiCoS-Net: hierarchical cross-modal graph learning with dynamic attention for hard negative-aware image-text matching
2025 · Open Access
· DOI: https://doi.org/10.1007/s44443-025-00313-x
Fine-grained image-text matching, which is pivotal to multimodal intelligence, has advanced semantic correspondence inference through inter-modal region-word aggregation. Effective as this approach is, it cannot accommodate the semantic associations of hard negative samples. For example, it fails to leverage knowledge shared across multiple samples on analogous topics, which leaves it with an inadequate capacity to differentiate hard negative samples. This study posits that establishing sample relationships facilitates learning the semantic associations between different samples. This, in turn, enables subtle differences between hard negative samples to be identified effectively, thereby enhancing the overall embedding process. This paper proposes HiCoS-Net, a novel hierarchical inter-modal semantic network that learns robust embeddings through local-to-sample semantic interaction propagation. Specifically, at the local level, a dynamic graph attention mechanism is designed to achieve fine-grained region-lexicon interactions; at the sample level, an embedding similarity graph is constructed by combining the relational mapping matrix with the semantic matching matrix to explicitly model the topological associations and semantic coupling strengths of inter-modal samples. Extensive experiments validate the advantages of HiCoS-Net, which achieves state-of-the-art image-text matching results on the public benchmark datasets Flickr30K and MS-COCO.
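The abstract does not give formulas for the sample-level graph, so the following is only a rough sketch of one way such a graph could be assembled, not the authors' implementation. It assumes (these are illustrative choices, not taken from the paper) that the semantic matching matrix is the cosine similarity between image and text embeddings, that the relational mapping matrix is a binary top-k neighbour mask, and that the two are combined by element-wise product. The function names, the parameter `k`, and the convention that image `i` is paired with text `i` are all hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)


def sample_similarity_graph(img_emb, txt_emb, k=2):
    """Build an illustrative sample-level graph between images and texts.

    Assumptions (not from the paper):
      - semantic matching matrix = cosine similarity, shape (n_img, n_txt)
      - relational mapping matrix = 1 where a text is among the top-k
        most similar texts for an image, else 0
      - the graph weights are their element-wise product
    """
    semantic = l2_normalize(img_emb) @ l2_normalize(txt_emb).T
    topk = np.argsort(-semantic, axis=1)[:, :k]       # k best texts per image
    relation = np.zeros_like(semantic)
    np.put_along_axis(relation, topk, 1.0, axis=1)    # binary adjacency mask
    graph = relation * semantic                       # edges weighted by similarity
    return semantic, relation, graph


def hardest_negatives(semantic):
    """For each image i (assumed matched with text i), return the index of the
    most similar non-matching text, i.e. its hardest negative."""
    masked = semantic.copy()
    np.fill_diagonal(masked, -np.inf)                 # exclude the true pair
    return masked.argmax(axis=1)
```

Calling `sample_similarity_graph(img_emb, txt_emb)` on two arrays of shape `(n, d)` returns the three matrices, and `hardest_negatives(semantic)` then points at the near-duplicate captions that a hard negative-aware loss would need to separate.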
- Type: article
- Language: en
- Landing Page:
  - https://doi.org/10.1007/s44443-025-00313-x
  - https://link.springer.com/content/pdf/10.1007/s44443-025-00313-x.pdf
- OA Status: gold
- References: 41
- OpenAlex ID: https://openalex.org/W4415618821
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4415618821 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.1007/s44443-025-00313-x (Digital Object Identifier)
- Title: HiCoS-Net: hierarchical cross-modal graph learning with dynamic attention for hard negative-aware image-text matching (work title)
- Type: article (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2025 (year of publication)
- Publication date: 2025-10-28 (full publication date if available)
- Authors: Dingcheng Feng, Ning Luo, Shudong Zhang, Lijuan Zhou, Bing Wei (list of authors in order)
- Landing page: https://doi.org/10.1007/s44443-025-00313-x (publisher landing page)
- PDF URL: https://link.springer.com/content/pdf/10.1007/s44443-025-00313-x.pdf (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: gold (open access status per OpenAlex)
- OA URL: https://link.springer.com/content/pdf/10.1007/s44443-025-00313-x.pdf (direct OA link when available)
- Cited by: 0 (total citation count in OpenAlex)
- References (count): 41 (number of works referenced by this work)
Full payload
| id | https://openalex.org/W4415618821 |
|---|---|
| doi | https://doi.org/10.1007/s44443-025-00313-x |
| ids.doi | https://doi.org/10.1007/s44443-025-00313-x |
| ids.openalex | https://openalex.org/W4415618821 |
| fwci | |
| type | article |
| title | HiCoS-Net: hierarchical cross-modal graph learning with dynamic attention for hard negative-aware image-text matching |
| biblio.issue | 9 |
| biblio.volume | 37 |
| biblio.last_page | |
| biblio.first_page | |
| is_xpac | False |
| apc_list.value | 1350 |
| apc_list.currency | USD |
| apc_list.value_usd | 1350 |
| apc_paid.value | 1350 |
| apc_paid.currency | USD |
| apc_paid.value_usd | 1350 |
| language | en |
| locations[0].id | doi:10.1007/s44443-025-00313-x |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S2764955546 |
| locations[0].source.issn | 1319-1578, 2213-1248 |
| locations[0].source.type | journal |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | 1319-1578 |
| locations[0].source.is_core | True |
| locations[0].source.is_in_doaj | True |
| locations[0].source.display_name | Journal of King Saud University - Computer and Information Sciences |
| locations[0].source.host_organization | https://openalex.org/P4310320990 |
| locations[0].source.host_organization_name | Elsevier BV |
| locations[0].source.host_organization_lineage | https://openalex.org/P4310320990 |
| locations[0].source.host_organization_lineage_names | Elsevier BV |
| locations[0].license | cc-by-nc-nd |
| locations[0].pdf_url | https://link.springer.com/content/pdf/10.1007/s44443-025-00313-x.pdf |
| locations[0].version | publishedVersion |
| locations[0].raw_type | journal-article |
| locations[0].license_id | https://openalex.org/licenses/cc-by-nc-nd |
| locations[0].is_accepted | True |
| locations[0].is_published | True |
| locations[0].raw_source_name | Journal of King Saud University Computer and Information Sciences |
| locations[0].landing_page_url | https://doi.org/10.1007/s44443-025-00313-x |
| locations[1].id | pmh:oai:doaj.org/article:c6ce5a05115444f7b80d31cda2058bf7 |
| locations[1].is_oa | False |
| locations[1].source.id | https://openalex.org/S4306401280 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | False |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | DOAJ (DOAJ: Directory of Open Access Journals) |
| locations[1].source.host_organization | |
| locations[1].source.host_organization_name | |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | submittedVersion |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | False |
| locations[1].raw_source_name | Journal of King Saud University: Computer and Information Sciences, Vol 37, Iss 9, Pp 1-30 (2025) |
| locations[1].landing_page_url | https://doaj.org/article/c6ce5a05115444f7b80d31cda2058bf7 |
| indexed_in | crossref, doaj |
| authorships[0].author.id | https://openalex.org/A5030996490 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Dingcheng Feng |
| authorships[0].countries | CN |
| authorships[0].affiliations[0].institution_ids | https://openalex.org/I20942203 |
| authorships[0].affiliations[0].raw_affiliation_string | School of Cyberspace Security, Hainan University, No. 58, Renmin Avenue, Haikou, Hainan, 570228, China |
| authorships[0].institutions[0].id | https://openalex.org/I20942203 |
| authorships[0].institutions[0].ror | https://ror.org/03q648j11 |
| authorships[0].institutions[0].type | education |
| authorships[0].institutions[0].lineage | https://openalex.org/I20942203 |
| authorships[0].institutions[0].country_code | CN |
| authorships[0].institutions[0].display_name | Hainan University |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Dingcheng Feng |
| authorships[0].is_corresponding | False |
| authorships[0].raw_affiliation_strings | School of Cyberspace Security, Hainan University, No. 58, Renmin Avenue, Haikou, Hainan, 570228, China |
| authorships[1].author.id | https://openalex.org/A5104330299 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Ning Luo |
| authorships[1].countries | CN |
| authorships[1].affiliations[0].institution_ids | https://openalex.org/I20942203 |
| authorships[1].affiliations[0].raw_affiliation_string | School of Cyberspace Security, Hainan University, No. 58, Renmin Avenue, Haikou, Hainan, 570228, China |
| authorships[1].institutions[0].id | https://openalex.org/I20942203 |
| authorships[1].institutions[0].ror | https://ror.org/03q648j11 |
| authorships[1].institutions[0].type | education |
| authorships[1].institutions[0].lineage | https://openalex.org/I20942203 |
| authorships[1].institutions[0].country_code | CN |
| authorships[1].institutions[0].display_name | Hainan University |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Ning Luo |
| authorships[1].is_corresponding | False |
| authorships[1].raw_affiliation_strings | School of Cyberspace Security, Hainan University, No. 58, Renmin Avenue, Haikou, Hainan, 570228, China |
| authorships[2].author.id | https://openalex.org/A5078758349 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-7721-0167 |
| authorships[2].author.display_name | Shudong Zhang |
| authorships[2].countries | CN |
| authorships[2].affiliations[0].institution_ids | https://openalex.org/I20942203 |
| authorships[2].affiliations[0].raw_affiliation_string | School of Cyberspace Security, Hainan University, No. 58, Renmin Avenue, Haikou, Hainan, 570228, China |
| authorships[2].institutions[0].id | https://openalex.org/I20942203 |
| authorships[2].institutions[0].ror | https://ror.org/03q648j11 |
| authorships[2].institutions[0].type | education |
| authorships[2].institutions[0].lineage | https://openalex.org/I20942203 |
| authorships[2].institutions[0].country_code | CN |
| authorships[2].institutions[0].display_name | Hainan University |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Shudong Zhang |
| authorships[2].is_corresponding | False |
| authorships[2].raw_affiliation_strings | School of Cyberspace Security, Hainan University, No. 58, Renmin Avenue, Haikou, Hainan, 570228, China |
| authorships[3].author.id | https://openalex.org/A5101798227 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-9951-6837 |
| authorships[3].author.display_name | Lijuan Zhou |
| authorships[3].countries | CN |
| authorships[3].affiliations[0].institution_ids | https://openalex.org/I20942203 |
| authorships[3].affiliations[0].raw_affiliation_string | School of Cyberspace Security, Hainan University, No. 58, Renmin Avenue, Haikou, Hainan, 570228, China |
| authorships[3].institutions[0].id | https://openalex.org/I20942203 |
| authorships[3].institutions[0].ror | https://ror.org/03q648j11 |
| authorships[3].institutions[0].type | education |
| authorships[3].institutions[0].lineage | https://openalex.org/I20942203 |
| authorships[3].institutions[0].country_code | CN |
| authorships[3].institutions[0].display_name | Hainan University |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Lijuan Zhou |
| authorships[3].is_corresponding | False |
| authorships[3].raw_affiliation_strings | School of Cyberspace Security, Hainan University, No. 58, Renmin Avenue, Haikou, Hainan, 570228, China |
| authorships[4].author.id | https://openalex.org/A5109716075 |
| authorships[4].author.orcid | |
| authorships[4].author.display_name | Bing Wei |
| authorships[4].countries | CN |
| authorships[4].affiliations[0].institution_ids | https://openalex.org/I20942203 |
| authorships[4].affiliations[0].raw_affiliation_string | School of Cyberspace Security, Hainan University, No. 58, Renmin Avenue, Haikou, Hainan, 570228, China |
| authorships[4].institutions[0].id | https://openalex.org/I20942203 |
| authorships[4].institutions[0].ror | https://ror.org/03q648j11 |
| authorships[4].institutions[0].type | education |
| authorships[4].institutions[0].lineage | https://openalex.org/I20942203 |
| authorships[4].institutions[0].country_code | CN |
| authorships[4].institutions[0].display_name | Hainan University |
| authorships[4].author_position | last |
| authorships[4].raw_author_name | Bing Wei |
| authorships[4].is_corresponding | False |
| authorships[4].raw_affiliation_strings | School of Cyberspace Security, Hainan University, No. 58, Renmin Avenue, Haikou, Hainan, 570228, China |
| has_content.pdf | True |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://link.springer.com/content/pdf/10.1007/s44443-025-00313-x.pdf |
| open_access.oa_status | gold |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-28T00:00:00 |
| display_name | HiCoS-Net: hierarchical cross-modal graph learning with dynamic attention for hard negative-aware image-text matching |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T03:46:38.306776 |
| primary_topic | |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | doi:10.1007/s44443-025-00313-x |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S2764955546 |
| best_oa_location.source.issn | 1319-1578, 2213-1248 |
| best_oa_location.source.type | journal |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | 1319-1578 |
| best_oa_location.source.is_core | True |
| best_oa_location.source.is_in_doaj | True |
| best_oa_location.source.display_name | Journal of King Saud University - Computer and Information Sciences |
| best_oa_location.source.host_organization | https://openalex.org/P4310320990 |
| best_oa_location.source.host_organization_name | Elsevier BV |
| best_oa_location.source.host_organization_lineage | https://openalex.org/P4310320990 |
| best_oa_location.source.host_organization_lineage_names | Elsevier BV |
| best_oa_location.license | cc-by-nc-nd |
| best_oa_location.pdf_url | https://link.springer.com/content/pdf/10.1007/s44443-025-00313-x.pdf |
| best_oa_location.version | publishedVersion |
| best_oa_location.raw_type | journal-article |
| best_oa_location.license_id | https://openalex.org/licenses/cc-by-nc-nd |
| best_oa_location.is_accepted | True |
| best_oa_location.is_published | True |
| best_oa_location.raw_source_name | Journal of King Saud University Computer and Information Sciences |
| best_oa_location.landing_page_url | https://doi.org/10.1007/s44443-025-00313-x |
| primary_location.id | doi:10.1007/s44443-025-00313-x |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S2764955546 |
| primary_location.source.issn | 1319-1578, 2213-1248 |
| primary_location.source.type | journal |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | 1319-1578 |
| primary_location.source.is_core | True |
| primary_location.source.is_in_doaj | True |
| primary_location.source.display_name | Journal of King Saud University - Computer and Information Sciences |
| primary_location.source.host_organization | https://openalex.org/P4310320990 |
| primary_location.source.host_organization_name | Elsevier BV |
| primary_location.source.host_organization_lineage | https://openalex.org/P4310320990 |
| primary_location.source.host_organization_lineage_names | Elsevier BV |
| primary_location.license | cc-by-nc-nd |
| primary_location.pdf_url | https://link.springer.com/content/pdf/10.1007/s44443-025-00313-x.pdf |
| primary_location.version | publishedVersion |
| primary_location.raw_type | journal-article |
| primary_location.license_id | https://openalex.org/licenses/cc-by-nc-nd |
| primary_location.is_accepted | True |
| primary_location.is_published | True |
| primary_location.raw_source_name | Journal of King Saud University Computer and Information Sciences |
| primary_location.landing_page_url | https://doi.org/10.1007/s44443-025-00313-x |
| publication_date | 2025-10-28 |
| publication_year | 2025 |
| referenced_works | https://openalex.org/W2967957126, https://openalex.org/W1933349210, https://openalex.org/W2803686446, https://openalex.org/W4391462706, https://openalex.org/W4283276478, https://openalex.org/W3146366485, https://openalex.org/W2767724106, https://openalex.org/W2594167370, https://openalex.org/W3168997536, https://openalex.org/W3005339439, https://openalex.org/W4403535081, https://openalex.org/W2951459719, https://openalex.org/W3110446398, https://openalex.org/W3043149731, https://openalex.org/W4210894218, https://openalex.org/W1973445088, https://openalex.org/W3166285241, https://openalex.org/W4387211558, https://openalex.org/W2895340641, https://openalex.org/W3213100861, https://openalex.org/W4393160904, https://openalex.org/W4390720784, https://openalex.org/W4312977351, https://openalex.org/W2470673105, https://openalex.org/W2585123518, https://openalex.org/W2983446232, https://openalex.org/W4394805464, https://openalex.org/W2463955103, https://openalex.org/W2117301471, https://openalex.org/W4323797084, https://openalex.org/W4285602451, https://openalex.org/W1773149199, https://openalex.org/W4312761738, https://openalex.org/W2994818707, https://openalex.org/W3035454331, https://openalex.org/W3189718596, https://openalex.org/W4285601999, https://openalex.org/W3035588244, https://openalex.org/W3175888430, https://openalex.org/W3035605030, https://openalex.org/W3198377975 |
| referenced_works_count | 41 |
| abstract_inverted_index.A | 190 |
| abstract_inverted_index.a | 124, 144 |
| abstract_inverted_index.In | 69 |
| abstract_inverted_index.To | 40 |
| abstract_inverted_index.an | 61, 160 |
| abstract_inverted_index.at | 140, 156 |
| abstract_inverted_index.by | 28, 166 |
| abstract_inverted_index.in | 60, 91 |
| abstract_inverted_index.is | 5, 73, 114, 123, 149, 164, 195 |
| abstract_inverted_index.it | 25, 72 |
| abstract_inverted_index.of | 22, 36, 78, 84, 97, 112, 117, 187, 193, 201 |
| abstract_inverted_index.on | 54, 216 |
| abstract_inverted_index.to | 7, 31, 47, 64, 151, 177, 197, 210 |
| abstract_inverted_index.The | 110, 120 |
| abstract_inverted_index.and | 183, 222 |
| abstract_inverted_index.has | 10, 207 |
| abstract_inverted_index.its | 29 |
| abstract_inverted_index.the | 20, 33, 45, 76, 82, 94, 106, 115, 141, 157, 168, 173, 180, 199, 202, 217 |
| abstract_inverted_index.This | 57, 206 |
| abstract_inverted_index.been | 208 |
| abstract_inverted_index.hard | 37, 66, 101 |
| abstract_inverted_index.that | 75, 130 |
| abstract_inverted_index.this | 23, 42, 70, 118 |
| abstract_inverted_index.with | 172 |
| abstract_inverted_index.This, | 90 |
| abstract_inverted_index.graph | 146, 163 |
| abstract_inverted_index.local | 142 |
| abstract_inverted_index.model | 122, 179 |
| abstract_inverted_index.novel | 125 |
| abstract_inverted_index.turn, | 92 |
| abstract_inverted_index.which | 4 |
| abstract_inverted_index.across | 51 |
| abstract_inverted_index.learns | 131 |
| abstract_inverted_index.level, | 143, 159 |
| abstract_inverted_index.matrix | 171, 176 |
| abstract_inverted_index.paper. | 119 |
| abstract_inverted_index.point, | 43 |
| abstract_inverted_index.public | 218 |
| abstract_inverted_index.robust | 132 |
| abstract_inverted_index.sample | 79, 158 |
| abstract_inverted_index.shared | 49 |
| abstract_inverted_index.study, | 71 |
| abstract_inverted_index.subtle | 98 |
| abstract_inverted_index.Despite | 19 |
| abstract_inverted_index.achieve | 152, 211 |
| abstract_inverted_index.between | 87, 100 |
| abstract_inverted_index.dynamic | 145 |
| abstract_inverted_index.enables | 93 |
| abstract_inverted_index.failure | 46, 58 |
| abstract_inverted_index.limited | 27 |
| abstract_inverted_index.mapping | 170 |
| abstract_inverted_index.method. | 205 |
| abstract_inverted_index.network | 129 |
| abstract_inverted_index.overall | 107 |
| abstract_inverted_index.pivotal | 6 |
| abstract_inverted_index.posited | 74 |
| abstract_inverted_index.remains | 26 |
| abstract_inverted_index.results | 59, 215 |
| abstract_inverted_index.samples | 53 |
| abstract_inverted_index.subject | 116 |
| abstract_inverted_index.thereby | 104 |
| abstract_inverted_index.through | 15, 134 |
| abstract_inverted_index.topics. | 56 |
| abstract_inverted_index.Abstract | 0 |
| abstract_inverted_index.MS-COCO. | 223 |
| abstract_inverted_index.advanced | 11 |
| abstract_inverted_index.capacity | 63 |
| abstract_inverted_index.consider | 44 |
| abstract_inverted_index.coupling | 185 |
| abstract_inverted_index.datasets | 220 |
| abstract_inverted_index.designed | 150 |
| abstract_inverted_index.efficacy | 21 |
| abstract_inverted_index.learning | 83 |
| abstract_inverted_index.leverage | 48 |
| abstract_inverted_index.matching | 175, 214 |
| abstract_inverted_index.multiple | 52 |
| abstract_inverted_index.negative | 38, 67, 102 |
| abstract_inverted_index.process. | 109 |
| abstract_inverted_index.proposal | 111 |
| abstract_inverted_index.proposed | 121, 203 |
| abstract_inverted_index.samples, | 103 |
| abstract_inverted_index.samples. | 39, 68, 89, 189 |
| abstract_inverted_index.semantic | 12, 34, 85, 128, 136, 174, 184 |
| abstract_inverted_index.validate | 198 |
| abstract_inverted_index.Flickr30K | 221 |
| abstract_inverted_index.HiCoS-Net | 113, 204 |
| abstract_inverted_index.analogous | 55 |
| abstract_inverted_index.approach, | 24 |
| abstract_inverted_index.attention | 147 |
| abstract_inverted_index.benchmark | 219 |
| abstract_inverted_index.combining | 167 |
| abstract_inverted_index.different | 88 |
| abstract_inverted_index.effective | 95 |
| abstract_inverted_index.embedding | 108, 161 |
| abstract_inverted_index.enhancing | 105 |
| abstract_inverted_index.inability | 30 |
| abstract_inverted_index.inference | 14 |
| abstract_inverted_index.knowledge | 50 |
| abstract_inverted_index.matching, | 3 |
| abstract_inverted_index.mechanism | 148 |
| abstract_inverted_index.programme | 192 |
| abstract_inverted_index.strengths | 186 |
| abstract_inverted_index.advantages | 200 |
| abstract_inverted_index.embeddings | 133 |
| abstract_inverted_index.explicitly | 178 |
| abstract_inverted_index.illustrate | 41 |
| abstract_inverted_index.image-text | 2, 213 |
| abstract_inverted_index.inadequate | 62 |
| abstract_inverted_index.multimodal | 8 |
| abstract_inverted_index.relational | 169 |
| abstract_inverted_index.similarity | 162 |
| abstract_inverted_index.undertaken | 196 |
| abstract_inverted_index.accommodate | 32 |
| abstract_inverted_index.constructed | 165 |
| abstract_inverted_index.differences | 99 |
| abstract_inverted_index.facilitates | 81 |
| abstract_inverted_index.inter-modal | 16, 127, 188 |
| abstract_inverted_index.interaction | 137 |
| abstract_inverted_index.region-word | 17 |
| abstract_inverted_index.substantial | 191 |
| abstract_inverted_index.topological | 181 |
| abstract_inverted_index.Fine-grained | 1 |
| abstract_inverted_index.aggregation. | 18 |
| abstract_inverted_index.associations | 35, 86, 182 |
| abstract_inverted_index.demonstrated | 209 |
| abstract_inverted_index.fine-grained | 154 |
| abstract_inverted_index.hierarchical | 126 |
| abstract_inverted_index.propagation. | 138 |
| abstract_inverted_index.Specifically, | 139 |
| abstract_inverted_index.differentiate | 65 |
| abstract_inverted_index.establishment | 77 |
| abstract_inverted_index.intelligence, | 9 |
| abstract_inverted_index.interactions; | 155 |
| abstract_inverted_index.relationships | 80 |
| abstract_inverted_index.correspondence | 13 |
| abstract_inverted_index.identification | 96 |
| abstract_inverted_index.region-lexicon | 153 |
| abstract_inverted_index.experimentation | 194 |
| abstract_inverted_index.local-to-sample | 135 |
| abstract_inverted_index.state-of-the-art | 212 |
| cited_by_percentile_year | |
| countries_distinct_count | 1 |
| institutions_distinct_count | 5 |
| citation_normalized_percentile | |
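
The `abstract_inverted_index.*` rows in the payload store the abstract as a map from each word to the positions at which it occurs. As a minimal sketch of how the readable text could be recovered from such a map, the helper below (a hypothetical name, not part of any OpenAlex client) places each word at its positions and joins them in order. The sample dictionary copies a few rows from the table, with each position list trimmed to its first entry for brevity.

```python
def rebuild_abstract(inverted_index):
    """Rebuild plain text from a {word: [positions]} inverted index."""
    positions = {}
    for word, indexes in inverted_index.items():
        for index in indexes:
            positions[index] = word        # one word per position
    # Emit the words in ascending position order.
    return " ".join(positions[i] for i in sorted(positions))


# Trimmed sample taken from the payload rows (first position of each word only).
sample_index = {
    "Abstract": [0],
    "Fine-grained": [1],
    "image-text": [2],
    "matching,": [3],
    "which": [4],
    "is": [5],
    "pivotal": [6],
    "to": [7],
    "multimodal": [8],
    "intelligence,": [9],
}

opening = rebuild_abstract(sample_index)
```

Passing the full set of rows instead of the trimmed sample would yield the complete abstract, prefixed by the literal word "Abstract" stored at position 0.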