Injecting Explainability and Lightweight Design into Weakly Supervised Video Anomaly Detection Systems
2024 · Open Access · DOI: https://doi.org/10.48550/arxiv.2412.20201
Weakly Supervised Monitoring Anomaly Detection (WSMAD) uses weakly supervised learning to identify anomalies, a critical task for smart city monitoring. However, existing multimodal approaches are often too complex to meet the real-time and interpretability requirements of edge devices. This paper presents TCVADS (Two-stage Cross-modal Video Anomaly Detection System), which leverages knowledge distillation and cross-modal contrastive learning to enable efficient, accurate, and interpretable anomaly detection on edge devices.

TCVADS operates in two stages: coarse-grained rapid classification and fine-grained detailed analysis. In the first stage, TCVADS extracts features from video frames and feeds them into a time series analysis module, which acts as the teacher model. The teacher's knowledge is then transferred via knowledge distillation to a simplified convolutional network (the student model) for binary classification. Upon detecting an anomaly in the first stage, the second stage is triggered, employing a fine-grained multi-class classification model. This stage uses CLIP for cross-modal contrastive learning over text and images, which enhances interpretability and refines classification through specially designed triplet textual relationships. Experimental results demonstrate that TCVADS significantly outperforms existing methods in model performance, detection efficiency, and interpretability, offering valuable contributions to smart city monitoring applications.
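
The two-stage flow described in the abstract can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration, not the authors' implementation: the abstract does not state the distillation loss, so a common temperature-softened KL divergence is used; `two_stage_detect`, the `student` callable, the 0.5 threshold, and the toy embeddings are hypothetical names and values; and stage two is reduced to a cosine-similarity lookup between an image embedding and per-class text embeddings, which stands in for the CLIP-based model.

```python
import math
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Vector = Sequence[float]


def log_softmax(logits: Vector, temperature: float = 1.0) -> List[float]:
    """Temperature-scaled log-softmax, shifted by the max for stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_total = peak + math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - log_total for x in scaled]


def distillation_loss(teacher_logits: Vector, student_logits: Vector,
                      temperature: float = 2.0) -> float:
    """KL(teacher || student) on softened outputs (assumed loss form)."""
    log_p = log_softmax(teacher_logits, temperature)
    log_q = log_softmax(student_logits, temperature)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def two_stage_detect(student: Callable[[Vector], float],
                     frame_features: Vector,
                     image_embedding: Vector,
                     text_embeddings: Dict[str, Vector],
                     threshold: float = 0.5) -> Tuple[float, Optional[str]]:
    """Stage 1: cheap binary score. Stage 2 runs only if the gate fires."""
    score = student(frame_features)
    if score < threshold:
        return score, None  # judged normal: skip the expensive stage
    # Stage 2 (hypothetical stand-in): pick the class whose text embedding
    # is closest to the image embedding.
    label = max(text_embeddings,
                key=lambda name: cosine(image_embedding, text_embeddings[name]))
    return score, label
```

In this sketch the loss applies only during training of the student, while `two_stage_detect` covers inference. A real system would compute the stage-two embedding only after the gate fires; it is passed in here to keep the example self-contained.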
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2412.20201 (PDF: https://arxiv.org/pdf/2412.20201)
- OA Status: green
- Cited By: 1
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4405955673
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4405955673 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2412.20201 (Digital Object Identifier)
- Title: Injecting Explainability and Lightweight Design into Weakly Supervised Video Anomaly Detection Systems (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2024 (year of publication)
- Publication date: 2024-12-28 (full publication date if available)
- Authors: Wendong Jiang, Chih‐Yung Chang, Hsiang-Chuan Chang, Jiyuan Chen, Diptendu Sinha Roy (list of authors in order)
- Landing page: https://arxiv.org/abs/2412.20201 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2412.20201 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2412.20201 (direct OA link when available)
- Concepts: Anomaly detection, Computer science, Anomaly (physics), Artificial intelligence, Physics, Condensed matter physics (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 1 (total citation count in OpenAlex)
- Citations by year (recent): 2025: 1 (per-year citation counts, last 5 years)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4405955673 |
| doi | https://doi.org/10.48550/arxiv.2412.20201 |
| ids.doi | https://doi.org/10.48550/arxiv.2412.20201 |
| ids.openalex | https://openalex.org/W4405955673 |
| fwci | |
| type | preprint |
| title | Injecting Explainability and Lightweight Design into Weakly Supervised Video Anomaly Detection Systems |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11512 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9998000264167786 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Anomaly Detection Techniques and Applications |
| topics[1].id | https://openalex.org/T10400 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9968000054359436 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1705 |
| topics[1].subfield.display_name | Computer Networks and Communications |
| topics[1].display_name | Network Security and Intrusion Detection |
| topics[2].id | https://openalex.org/T12357 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9805999994277954 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1707 |
| topics[2].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[2].display_name | Digital Media Forensic Detection |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C739882 |
| concepts[0].level | 2 |
| concepts[0].score | 0.7693709135055542 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q3560506 |
| concepts[0].display_name | Anomaly detection |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.5128910541534424 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C12997251 |
| concepts[2].level | 2 |
| concepts[2].score | 0.5120935440063477 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q567560 |
| concepts[2].display_name | Anomaly (physics) |
| concepts[3].id | https://openalex.org/C154945302 |
| concepts[3].level | 1 |
| concepts[3].score | 0.3043139576911926 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[3].display_name | Artificial intelligence |
| concepts[4].id | https://openalex.org/C121332964 |
| concepts[4].level | 0 |
| concepts[4].score | 0.1311812400817871 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q413 |
| concepts[4].display_name | Physics |
| concepts[5].id | https://openalex.org/C26873012 |
| concepts[5].level | 1 |
| concepts[5].score | 0.0 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q214781 |
| concepts[5].display_name | Condensed matter physics |
| keywords[0].id | https://openalex.org/keywords/anomaly-detection |
| keywords[0].score | 0.7693709135055542 |
| keywords[0].display_name | Anomaly detection |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.5128910541534424 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/anomaly |
| keywords[2].score | 0.5120935440063477 |
| keywords[2].display_name | Anomaly (physics) |
| keywords[3].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[3].score | 0.3043139576911926 |
| keywords[3].display_name | Artificial intelligence |
| keywords[4].id | https://openalex.org/keywords/physics |
| keywords[4].score | 0.1311812400817871 |
| keywords[4].display_name | Physics |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2412.20201 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2412.20201 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2412.20201 |
| locations[1].id | doi:10.48550/arxiv.2412.20201 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2412.20201 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5047519765 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Wendong Jiang |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Jiang, Wen-Dong |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5002582301 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-0672-5593 |
| authorships[1].author.display_name | Chih‐Yung Chang |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Chang, Chih-Yung |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5034111713 |
| authorships[2].author.orcid | https://orcid.org/0009-0003-5059-4484 |
| authorships[2].author.display_name | Hsiang-Chuan Chang |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Chang, Hsiang-Chuan |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5001942707 |
| authorships[3].author.orcid | https://orcid.org/0000-0003-4103-8294 |
| authorships[3].author.display_name | Jiyuan Chen |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Chen, Ji-Yuan |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5011943818 |
| authorships[4].author.orcid | https://orcid.org/0000-0001-9731-2534 |
| authorships[4].author.display_name | Diptendu Sinha Roy |
| authorships[4].author_position | last |
| authorships[4].raw_author_name | Roy, Diptendu Sinha |
| authorships[4].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2412.20201 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Injecting Explainability and Lightweight Design into Weakly Supervised Video Anomaly Detection Systems |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11512 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9998000264167786 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Anomaly Detection Techniques and Applications |
| related_works | https://openalex.org/W2806741695, https://openalex.org/W4290647774, https://openalex.org/W3189286258, https://openalex.org/W3207797160, https://openalex.org/W3210364259, https://openalex.org/W4300558037, https://openalex.org/W2667207928, https://openalex.org/W2912112202, https://openalex.org/W4377864969, https://openalex.org/W3120251014 |
| cited_by_count | 1 |
| counts_by_year[0].year | 2025 |
| counts_by_year[0].cited_by_count | 1 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2412.20201 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2412.20201 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2412.20201 |
| primary_location.id | pmh:oai:arXiv.org:2412.20201 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2412.20201 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2412.20201 |
| publication_date | 2024-12-28 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 13, 94, 113, 132 |
| abstract_inverted_index.In | 80 |
| abstract_inverted_index.an | 124 |
| abstract_inverted_index.as | 101 |
| abstract_inverted_index.in | 70, 170 |
| abstract_inverted_index.is | 129 |
| abstract_inverted_index.of | 33 |
| abstract_inverted_index.on | 66 |
| abstract_inverted_index.to | 10, 26, 37, 58, 112, 180 |
| abstract_inverted_index.and | 30, 54, 62, 76, 90, 147, 151, 175 |
| abstract_inverted_index.are | 106 |
| abstract_inverted_index.due | 36 |
| abstract_inverted_index.for | 16, 119, 141 |
| abstract_inverted_index.the | 28, 81, 102, 126 |
| abstract_inverted_index.two | 71 |
| abstract_inverted_index.via | 109 |
| abstract_inverted_index.CLIP | 140 |
| abstract_inverted_index.This | 40, 137 |
| abstract_inverted_index.Upon | 122 |
| abstract_inverted_index.acts | 100 |
| abstract_inverted_index.city | 18, 182 |
| abstract_inverted_index.edge | 34, 67 |
| abstract_inverted_index.fail | 25 |
| abstract_inverted_index.from | 87 |
| abstract_inverted_index.into | 93 |
| abstract_inverted_index.meet | 27 |
| abstract_inverted_index.task | 15 |
| abstract_inverted_index.text | 146 |
| abstract_inverted_index.that | 164 |
| abstract_inverted_index.them | 92 |
| abstract_inverted_index.then | 107 |
| abstract_inverted_index.time | 95 |
| abstract_inverted_index.uses | 139 |
| abstract_inverted_index.weak | 7 |
| abstract_inverted_index.with | 145 |
| abstract_inverted_index.Video | 46 |
| abstract_inverted_index.first | 82 |
| abstract_inverted_index.model | 171 |
| abstract_inverted_index.often | 24 |
| abstract_inverted_index.paper | 41 |
| abstract_inverted_index.rapid | 74 |
| abstract_inverted_index.smart | 17, 181 |
| abstract_inverted_index.stage | 128, 138 |
| abstract_inverted_index.their | 38 |
| abstract_inverted_index.video | 88 |
| abstract_inverted_index.which | 50, 99 |
| abstract_inverted_index.TCVADS | 43, 84, 165 |
| abstract_inverted_index.Weakly | 0 |
| abstract_inverted_index.binary | 120 |
| abstract_inverted_index.enable | 59 |
| abstract_inverted_index.frames | 89 |
| abstract_inverted_index.inputs | 91 |
| abstract_inverted_index.model) | 118 |
| abstract_inverted_index.model. | 104, 136 |
| abstract_inverted_index.second | 127 |
| abstract_inverted_index.series | 96 |
| abstract_inverted_index.stage, | 83 |
| abstract_inverted_index.(WSMAD) | 5 |
| abstract_inverted_index.Anomaly | 3, 47 |
| abstract_inverted_index.anomaly | 64 |
| abstract_inverted_index.devices | 35 |
| abstract_inverted_index.images, | 148 |
| abstract_inverted_index.methods | 169 |
| abstract_inverted_index.module, | 98 |
| abstract_inverted_index.network | 116 |
| abstract_inverted_index.refined | 153 |
| abstract_inverted_index.results | 162 |
| abstract_inverted_index.stages: | 72 |
| abstract_inverted_index.teacher | 103 |
| abstract_inverted_index.textual | 159 |
| abstract_inverted_index.through | 155 |
| abstract_inverted_index.triplet | 158 |
| abstract_inverted_index.(student | 117 |
| abstract_inverted_index.However, | 20 |
| abstract_inverted_index.Insights | 105 |
| abstract_inverted_index.System), | 49 |
| abstract_inverted_index.analysis | 97 |
| abstract_inverted_index.anomaly, | 125 |
| abstract_inverted_index.critical | 14 |
| abstract_inverted_index.designed | 157 |
| abstract_inverted_index.detailed | 78 |
| abstract_inverted_index.existing | 21, 168 |
| abstract_inverted_index.extracts | 85 |
| abstract_inverted_index.features | 86 |
| abstract_inverted_index.identify | 11 |
| abstract_inverted_index.learning | 9, 57, 144 |
| abstract_inverted_index.offering | 177 |
| abstract_inverted_index.operates | 69 |
| abstract_inverted_index.presents | 42 |
| abstract_inverted_index.utilizes | 6 |
| abstract_inverted_index.valuable | 178 |
| abstract_inverted_index.Detection | 4, 48 |
| abstract_inverted_index.accurate, | 61 |
| abstract_inverted_index.achieving | 152 |
| abstract_inverted_index.analysis. | 79 |
| abstract_inverted_index.detecting | 123 |
| abstract_inverted_index.detection | 65, 173 |
| abstract_inverted_index.employing | 131 |
| abstract_inverted_index.enhancing | 149 |
| abstract_inverted_index.knowledge | 52, 110 |
| abstract_inverted_index.leverages | 51 |
| abstract_inverted_index.real-time | 29 |
| abstract_inverted_index.specially | 156 |
| abstract_inverted_index.(Two-stage | 44 |
| abstract_inverted_index.Monitoring | 2 |
| abstract_inverted_index.Supervised | 1 |
| abstract_inverted_index.anomalies, | 12 |
| abstract_inverted_index.approaches | 23 |
| abstract_inverted_index.efficient, | 60 |
| abstract_inverted_index.monitoring | 183 |
| abstract_inverted_index.multimodal | 22 |
| abstract_inverted_index.simplified | 114 |
| abstract_inverted_index.triggered, | 130 |
| abstract_inverted_index.Cross-modal | 45 |
| abstract_inverted_index.complexity. | 39 |
| abstract_inverted_index.contrastive | 56, 143 |
| abstract_inverted_index.cross-modal | 55, 142 |
| abstract_inverted_index.demonstrate | 163 |
| abstract_inverted_index.efficiency, | 174 |
| abstract_inverted_index.monitoring. | 19 |
| abstract_inverted_index.multi-class | 134 |
| abstract_inverted_index.outperforms | 167 |
| abstract_inverted_index.supervision | 8 |
| abstract_inverted_index.transferred | 108 |
| abstract_inverted_index.Experimental | 161 |
| abstract_inverted_index.distillation | 53, 111 |
| abstract_inverted_index.fine-grained | 77, 133 |
| abstract_inverted_index.performance, | 172 |
| abstract_inverted_index.requirements | 32 |
| abstract_inverted_index.applications. | 184 |
| abstract_inverted_index.contributions | 179 |
| abstract_inverted_index.convolutional | 115 |
| abstract_inverted_index.interpretable | 63 |
| abstract_inverted_index.significantly | 166 |
| abstract_inverted_index.classification | 75, 135, 154 |
| abstract_inverted_index.coarse-grained | 73 |
| abstract_inverted_index.devices.TCVADS | 68 |
| abstract_inverted_index.relationships. | 160 |
| abstract_inverted_index.classification. | 121 |
| abstract_inverted_index.interpretability | 31, 150 |
| abstract_inverted_index.interpretability, | 176 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 5 |
| citation_normalized_percentile |
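
The `abstract_inverted_index.*` rows above store the abstract as a map from each token to the word positions where it occurs. A minimal sketch of turning such a map back into running text follows; `rebuild_abstract` is a hypothetical helper name, and `sample` is an excerpt restricted to the first eight positions from the table, so later positions of tokens such as "Anomaly" are deliberately left out.

```python
from typing import Dict, List


def rebuild_abstract(inverted_index: Dict[str, List[int]]) -> str:
    """Place each token at every position listed for it, then join in order."""
    positions: Dict[int, str] = {}
    for token, indexes in inverted_index.items():
        for index in indexes:
            positions[index] = token
    return " ".join(positions[i] for i in sorted(positions))


# Excerpt of the payload above, limited to positions 0 through 7.
sample = {
    "Weakly": [0],
    "Supervised": [1],
    "Monitoring": [2],
    "Anomaly": [3],
    "Detection": [4],
    "(WSMAD)": [5],
    "utilizes": [6],
    "weak": [7],
}

print(rebuild_abstract(sample))
```

Applied to the complete index, the same function should reproduce the abstract token by token, including artifacts of the original tokenization such as the fused `devices.TCVADS` entry at position 68.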