Enhanced Action Recognition through Deep Spatiotemporal Learning Using 3D CNN and GRU
2024 · Open Access · DOI: https://doi.org/10.21203/rs.3.rs-5406559/v1
The core challenge is to analyze large video data streams efficiently and in real time while keeping computational complexity low; this difficulty in turn makes it harder to react quickly to unusual actions. Recognizing events from video sequences would benefit smart homes, security systems, assisted living facilities, and health monitoring. Although sensing technology has advanced, especially for 3D video, the techniques used to analyze such data remain under constant scrutiny. We propose a new method for learning spatiotemporal features in videos that combines 3D Convolutional Neural Networks (CNNs) with Gated Recurrent Units (GRUs). On the UCF50 dataset, we found that 3D CNNs capture spatiotemporal information better than 2D CNNs, and that using smaller 3x3x3 convolution kernels in a uniform design further improves performance. Furthermore, integrating a GRU with the 3D CNN yields higher accuracy than the 3D CNN alone, and the GRU outperforms an LSTM in both accuracy (89.89%) and computation time.
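The abstract contrasts the GRU and LSTM recurrences but gives no equations. As a reminder of what the GRU half of the model computes, a single GRU time step can be sketched in plain NumPy; the dimensions, initialization, and the idea of feeding pooled 3D-CNN features are illustrative assumptions, not the paper's actual configuration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, params):
    """One GRU time step: x is the input vector, h the previous hidden state.

    params holds weight matrices (W* act on x, U* on h) and biases b*.
    """
    z = sigmoid(params["Wz"] @ x + params["Uz"] @ h + params["bz"])   # update gate
    r = sigmoid(params["Wr"] @ x + params["Ur"] @ h + params["br"])   # reset gate
    h_tilde = np.tanh(params["Wh"] @ x + params["Uh"] @ (r * h) + params["bh"])
    return (1.0 - z) * h + z * h_tilde   # interpolate old state and candidate

# Toy dimensions: 16-d input (e.g. pooled 3D-CNN features per clip), 8-d hidden state.
rng = np.random.default_rng(0)
d_in, d_h = 16, 8
params = {k + g: rng.standard_normal(s) * 0.1
          for g in "zrh"
          for k, s in (("W", (d_h, d_in)), ("U", (d_h, d_h)), ("b", (d_h,)))}
h = np.zeros(d_h)
for t in range(5):                       # run over 5 "frames" of features
    h = gru_step(rng.standard_normal(d_in), h, params)
print(h.shape)  # (8,)
```

Because the new state is a convex combination of the old state and a tanh-bounded candidate, every hidden unit stays in (-1, 1); the GRU achieves this gating with three weight blocks instead of the LSTM's four, which is the usual explanation for its lower computation time.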
Key Metadata
- Type
- preprint
- Language
- en
- Landing Page
- https://doi.org/10.21203/rs.3.rs-5406559/v1
- https://www.researchsquare.com/article/rs-5406559/latest.pdf
- OA Status
- gold
- Related Works
- 10
- OpenAlex ID
- https://openalex.org/W4404692293
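All of the metadata above comes from the OpenAlex record for this work. The same JSON can be retrieved from the public OpenAlex API at `https://api.openalex.org/works/{id}`; the helper names below are my own, and the sketch assumes only the Python standard library:

```python
import json
from urllib.request import urlopen

OPENALEX_API = "https://api.openalex.org/works/"

def work_url(openalex_id: str) -> str:
    """Accept either a bare ID (W4404692293) or the full openalex.org URL."""
    return OPENALEX_API + openalex_id.rsplit("/", 1)[-1]

def fetch_work(openalex_id: str) -> dict:
    """Download the raw JSON record for one work from the OpenAlex API."""
    with urlopen(work_url(openalex_id)) as resp:
        return json.load(resp)

print(work_url("https://openalex.org/W4404692293"))
# https://api.openalex.org/works/W4404692293
```

Calling `fetch_work("W4404692293")` would return the same payload that is flattened into the table below.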
Raw OpenAlex JSON
- OpenAlex ID
- https://openalex.org/W4404692293 (canonical identifier for this work in OpenAlex)
- DOI
- https://doi.org/10.21203/rs.3.rs-5406559/v1 (Digital Object Identifier)
- Title
- Enhanced Action Recognition through Deep Spatiotemporal Learning Using 3D CNN and GRU (work title)
- Type
- preprint (OpenAlex work type)
- Language
- en (primary language)
- Publication year
- 2024 (year of publication)
- Publication date
- 2024-11-25 (full publication date if available)
- Authors
- Neha Bansal, Atul Bansal, Manish Gupta (list of authors in order)
- Landing page
- https://doi.org/10.21203/rs.3.rs-5406559/v1 (publisher landing page)
- PDF URL
- https://www.researchsquare.com/article/rs-5406559/latest.pdf (direct link to full-text PDF)
- Open access
- Yes (whether a free full text is available)
- OA status
- gold (open access status per OpenAlex)
- OA URL
- https://www.researchsquare.com/article/rs-5406559/latest.pdf (direct OA link when available)
- Concepts
- Action recognition, Computer science, Deep learning, Action (physics), Artificial intelligence, Pattern recognition (psychology), Physics, Class (philosophy), Quantum mechanics (top concepts attached by OpenAlex)
- Cited by
- 0 (total citation count in OpenAlex)
- Related works (count)
- 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4404692293 |
|---|---|
| doi | https://doi.org/10.21203/rs.3.rs-5406559/v1 |
| ids.doi | https://doi.org/10.21203/rs.3.rs-5406559/v1 |
| ids.openalex | https://openalex.org/W4404692293 |
| fwci | 0.0 |
| type | preprint |
| title | Enhanced Action Recognition through Deep Spatiotemporal Learning Using 3D CNN and GRU |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10812 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9807000160217285 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1707 |
| topics[0].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[0].display_name | Human Pose and Action Recognition |
| topics[1].id | https://openalex.org/T14510 |
| topics[1].field.id | https://openalex.org/fields/22 |
| topics[1].field.display_name | Engineering |
| topics[1].score | 0.9778000116348267 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/2204 |
| topics[1].subfield.display_name | Biomedical Engineering |
| topics[1].display_name | Medical Imaging and Analysis |
| topics[2].id | https://openalex.org/T10036 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9190999865531921 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1707 |
| topics[2].subfield.display_name | Computer Vision and Pattern Recognition |
| topics[2].display_name | Advanced Neural Network Applications |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C2987834672 |
| concepts[0].level | 3 |
| concepts[0].score | 0.7520148754119873 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q4677630 |
| concepts[0].display_name | Action recognition |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.5804084539413452 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C108583219 |
| concepts[2].level | 2 |
| concepts[2].score | 0.5608921051025391 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q197536 |
| concepts[2].display_name | Deep learning |
| concepts[3].id | https://openalex.org/C2780791683 |
| concepts[3].level | 2 |
| concepts[3].score | 0.5505162477493286 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q846785 |
| concepts[3].display_name | Action (physics) |
| concepts[4].id | https://openalex.org/C154945302 |
| concepts[4].level | 1 |
| concepts[4].score | 0.5336827635765076 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[4].display_name | Artificial intelligence |
| concepts[5].id | https://openalex.org/C153180895 |
| concepts[5].level | 2 |
| concepts[5].score | 0.3957899212837219 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q7148389 |
| concepts[5].display_name | Pattern recognition (psychology) |
| concepts[6].id | https://openalex.org/C121332964 |
| concepts[6].level | 0 |
| concepts[6].score | 0.05587714910507202 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q413 |
| concepts[6].display_name | Physics |
| concepts[7].id | https://openalex.org/C2777212361 |
| concepts[7].level | 2 |
| concepts[7].score | 0.0 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q5127848 |
| concepts[7].display_name | Class (philosophy) |
| concepts[8].id | https://openalex.org/C62520636 |
| concepts[8].level | 1 |
| concepts[8].score | 0.0 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q944 |
| concepts[8].display_name | Quantum mechanics |
| keywords[0].id | https://openalex.org/keywords/action-recognition |
| keywords[0].score | 0.7520148754119873 |
| keywords[0].display_name | Action recognition |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.5804084539413452 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/deep-learning |
| keywords[2].score | 0.5608921051025391 |
| keywords[2].display_name | Deep learning |
| keywords[3].id | https://openalex.org/keywords/action |
| keywords[3].score | 0.5505162477493286 |
| keywords[3].display_name | Action (physics) |
| keywords[4].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[4].score | 0.5336827635765076 |
| keywords[4].display_name | Artificial intelligence |
| keywords[5].id | https://openalex.org/keywords/pattern-recognition |
| keywords[5].score | 0.3957899212837219 |
| keywords[5].display_name | Pattern recognition (psychology) |
| keywords[6].id | https://openalex.org/keywords/physics |
| keywords[6].score | 0.05587714910507202 |
| keywords[6].display_name | Physics |
| language | en |
| locations[0].id | doi:10.21203/rs.3.rs-5406559/v1 |
| locations[0].is_oa | True |
| locations[0].source | |
| locations[0].license | cc-by |
| locations[0].pdf_url | https://www.researchsquare.com/article/rs-5406559/latest.pdf |
| locations[0].version | acceptedVersion |
| locations[0].raw_type | posted-content |
| locations[0].license_id | https://openalex.org/licenses/cc-by |
| locations[0].is_accepted | True |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | https://doi.org/10.21203/rs.3.rs-5406559/v1 |
| indexed_in | crossref |
| authorships[0].author.id | https://openalex.org/A5101935374 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-6727-7181 |
| authorships[0].author.display_name | Neha Bansal |
| authorships[0].countries | IN |
| authorships[0].affiliations[0].institution_ids | https://openalex.org/I82571370 |
| authorships[0].affiliations[0].raw_affiliation_string | GLA University |
| authorships[0].institutions[0].id | https://openalex.org/I82571370 |
| authorships[0].institutions[0].ror | https://ror.org/05fnxgv12 |
| authorships[0].institutions[0].type | education |
| authorships[0].institutions[0].lineage | https://openalex.org/I82571370 |
| authorships[0].institutions[0].country_code | IN |
| authorships[0].institutions[0].display_name | GLA University |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Neha Bansal |
| authorships[0].is_corresponding | False |
| authorships[0].raw_affiliation_strings | GLA University |
| authorships[1].author.id | https://openalex.org/A5001991879 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-8012-0349 |
| authorships[1].author.display_name | Atul Bansal |
| authorships[1].countries | IN |
| authorships[1].affiliations[0].institution_ids | https://openalex.org/I82571370 |
| authorships[1].affiliations[0].raw_affiliation_string | GLA University |
| authorships[1].institutions[0].id | https://openalex.org/I82571370 |
| authorships[1].institutions[0].ror | https://ror.org/05fnxgv12 |
| authorships[1].institutions[0].type | education |
| authorships[1].institutions[0].lineage | https://openalex.org/I82571370 |
| authorships[1].institutions[0].country_code | IN |
| authorships[1].institutions[0].display_name | GLA University |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Atul Bansal |
| authorships[1].is_corresponding | False |
| authorships[1].raw_affiliation_strings | GLA University |
| authorships[2].author.id | https://openalex.org/A5046755750 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-2843-3110 |
| authorships[2].author.display_name | Manish Gupta |
| authorships[2].countries | IN |
| authorships[2].affiliations[0].institution_ids | https://openalex.org/I82571370 |
| authorships[2].affiliations[0].raw_affiliation_string | GLA University |
| authorships[2].institutions[0].id | https://openalex.org/I82571370 |
| authorships[2].institutions[0].ror | https://ror.org/05fnxgv12 |
| authorships[2].institutions[0].type | education |
| authorships[2].institutions[0].lineage | https://openalex.org/I82571370 |
| authorships[2].institutions[0].country_code | IN |
| authorships[2].institutions[0].display_name | GLA University |
| authorships[2].author_position | last |
| authorships[2].raw_author_name | Manish Gupta |
| authorships[2].is_corresponding | False |
| authorships[2].raw_affiliation_strings | GLA University |
| has_content.pdf | True |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://www.researchsquare.com/article/rs-5406559/latest.pdf |
| open_access.oa_status | gold |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Enhanced Action Recognition through Deep Spatiotemporal Learning Using 3D CNN and GRU |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T03:46:38.306776 |
| primary_topic.id | https://openalex.org/T10812 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9807000160217285 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1707 |
| primary_topic.subfield.display_name | Computer Vision and Pattern Recognition |
| primary_topic.display_name | Human Pose and Action Recognition |
| related_works | https://openalex.org/W2731899572, https://openalex.org/W3215138031, https://openalex.org/W3009238340, https://openalex.org/W4360585206, https://openalex.org/W4321369474, https://openalex.org/W4285208911, https://openalex.org/W3082895349, https://openalex.org/W4213079790, https://openalex.org/W2248239756, https://openalex.org/W4323565446 |
| cited_by_count | 0 |
| locations_count | 1 |
| best_oa_location.id | doi:10.21203/rs.3.rs-5406559/v1 |
| best_oa_location.is_oa | True |
| best_oa_location.source | |
| best_oa_location.license | cc-by |
| best_oa_location.pdf_url | https://www.researchsquare.com/article/rs-5406559/latest.pdf |
| best_oa_location.version | acceptedVersion |
| best_oa_location.raw_type | posted-content |
| best_oa_location.license_id | https://openalex.org/licenses/cc-by |
| best_oa_location.is_accepted | True |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | https://doi.org/10.21203/rs.3.rs-5406559/v1 |
| primary_location.id | doi:10.21203/rs.3.rs-5406559/v1 |
| primary_location.is_oa | True |
| primary_location.source | |
| primary_location.license | cc-by |
| primary_location.pdf_url | https://www.researchsquare.com/article/rs-5406559/latest.pdf |
| primary_location.version | acceptedVersion |
| primary_location.raw_type | posted-content |
| primary_location.license_id | https://openalex.org/licenses/cc-by |
| primary_location.is_accepted | True |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | https://doi.org/10.21203/rs.3.rs-5406559/v1 |
| publication_date | 2024-11-25 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index | (inverted-index encoding of the abstract shown above: one row per word, listing its positions; omitted here) |
| cited_by_percentile_year | |
| countries_distinct_count | 1 |
| institutions_distinct_count | 3 |
| sustainable_development_goals[0].id | https://metadata.un.org/sdg/9 |
| sustainable_development_goals[0].score | 0.4099999964237213 |
| sustainable_development_goals[0].display_name | Industry, innovation and infrastructure |
| citation_normalized_percentile.value | 0.26767128 |
| citation_normalized_percentile.is_in_top_1_percent | False |
| citation_normalized_percentile.is_in_top_10_percent | False |
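For copyright reasons, OpenAlex ships abstracts not as plain text but as the `abstract_inverted_index` field seen in the payload above: a mapping from each word to the list of positions where it occurs. The original text can be rebuilt by sorting words by position; the sample index below is a tiny illustrative stand-in, not the full record:

```python
def decode_inverted_index(inv: dict[str, list[int]]) -> str:
    """Rebuild abstract text by placing each word at its listed positions."""
    positions = sorted((pos, word) for word, locs in inv.items() for pos in locs)
    return " ".join(word for _, word in positions)

# Tiny illustrative index (not the full record above):
sample = {"Enhanced": [0], "Action": [1], "Recognition": [2], "and": [4], "3D": [3]}
print(decode_inverted_index(sample))  # Enhanced Action Recognition 3D and
```

Applied to the full `abstract_inverted_index` of this record, the same helper reproduces the abstract quoted at the top of the page (minus the leading `<title>Abstract</title>` marker at position 0).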