Text-to-Events: Synthetic Event Camera Streams from Conditional Text Input

J. C. Ott, Zuowen Wang, Shih‐Chii Liu

2024 · Open Access · DOI: https://doi.org/10.48550/arxiv.2406.03439

Event cameras are advantageous for tasks that require vision sensors with low-latency and sparse output responses. However, the development of deep network algorithms using event cameras has been slow because of the lack of large labelled event camera datasets for network training. This paper reports a method for creating new labelled event datasets by using a text-to-X model, where X is one or multiple output modalities (in the case of this work, events). Our proposed text-to-events model produces synthetic event frames directly from text prompts. It uses an autoencoder which is trained to produce sparse event frames representing event camera outputs. By combining the pretrained autoencoder with a diffusion model architecture, the new text-to-events model is able to generate smooth synthetic event streams of moving objects. The autoencoder was first trained on an event camera dataset of diverse scenes. In the combined training with the diffusion model, the DVS gesture dataset was used. We demonstrate that the model can generate realistic event sequences of human gestures prompted by different text statements. The classification accuracy of the generated sequences, using a classifier trained on the real dataset, ranges between 42% and 92%, depending on the gesture group. The results demonstrate the capability of this method in synthesizing event datasets.
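The abstract refers to sparse event frames but does not say how events are binned into a frame. As a rough illustration only, the sketch below assumes a simple per-pixel count representation with one channel per polarity; the function name `events_to_frame`, the `(x, y, polarity)` tuple layout, and the toy events are all hypothetical and are not taken from the paper.

```python
def events_to_frame(events, height, width):
    """Accumulate (x, y, polarity) events into a 2-channel count frame.

    Assumed layout: frame[polarity][y][x], with polarity 0 = OFF and 1 = ON.
    This is an illustrative representation, not the one used in the paper.
    """
    frame = [[[0] * width for _ in range(height)] for _ in range(2)]
    for x, y, polarity in events:
        frame[polarity][y][x] += 1
    return frame


# Toy input: four events on a 4x4 sensor (made-up values).
events = [(0, 0, 1), (2, 1, 0), (2, 1, 0), (3, 3, 1)]
frame = events_to_frame(events, height=4, width=4)

# Count the non-zero entries to show how sparse the resulting frame is.
active = sum(1 for channel in frame for row in channel for v in row if v)
print(f"{active} of {2 * 4 * 4} entries are non-zero")
```

Most entries stay at zero because only pixels that registered a change produce events, which is the kind of sparsity the abstract describes.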

Related Topics

Computer Science

Artificial Intelligence

Physics

Quantum Mechanics

Concepts

STREAMS, Event (particle physics), Computer science, Natural language processing, Artificial intelligence, Physics, Operating system, Quantum mechanics

Metadata

Type: preprint
Language: en
Landing Page: http://arxiv.org/abs/2406.03439
PDF: https://arxiv.org/pdf/2406.03439
OA Status: green
Related Works: 10
OpenAlex ID: https://openalex.org/W4399448161

All OpenAlex metadata

OpenAlex ID: https://openalex.org/W4399448161
Canonical identifier for this work in OpenAlex

DOI: https://doi.org/10.48550/arxiv.2406.03439
Digital Object Identifier

Title: Text-to-Events: Synthetic Event Camera Streams from Conditional Text Input
Work title

Type: preprint
OpenAlex work type

Language: en
Primary language

Publication year: 2024
Year of publication

Publication date: 2024-06-05
Full publication date if available

Authors: J. C. Ott, Zuowen Wang, Shih‐Chii Liu
List of authors in order

Landing page: https://arxiv.org/abs/2406.03439
Publisher landing page

PDF URL: https://arxiv.org/pdf/2406.03439
Direct link to full text PDF

Open access: Yes
Whether a free full text is available

OA status: green
Open access status per OpenAlex

OA URL: https://arxiv.org/pdf/2406.03439
Direct OA link when available

Concepts: STREAMS, Event (particle physics), Computer science, Natural language processing, Artificial intelligence, Physics, Operating system, Quantum mechanics
Top concepts (fields/topics) attached by OpenAlex

Cited by: 0
Total citation count in OpenAlex

Related works (count): 10
Other works algorithmically related by OpenAlex

Full payload

id	https://openalex.org/W4399448161
doi	https://doi.org/10.48550/arxiv.2406.03439
ids.doi	https://doi.org/10.48550/arxiv.2406.03439
ids.openalex	https://openalex.org/W4399448161
fwci
type	preprint
title	Text-to-Events: Synthetic Event Camera Streams from Conditional Text Input
biblio.issue
biblio.volume
biblio.last_page
biblio.first_page
topics[0].id	https://openalex.org/T11986
topics[0].field.id	https://openalex.org/fields/18
topics[0].field.display_name	Decision Sciences
topics[0].score	0.947700023651123
topics[0].domain.id	https://openalex.org/domains/2
topics[0].domain.display_name	Social Sciences
topics[0].subfield.id	https://openalex.org/subfields/1802
topics[0].subfield.display_name	Information Systems and Management
topics[0].display_name	Scientific Computing and Data Management
topics[1].id	https://openalex.org/T11719
topics[1].field.id	https://openalex.org/fields/18
topics[1].field.display_name	Decision Sciences
topics[1].score	0.9176999926567078
topics[1].domain.id	https://openalex.org/domains/2
topics[1].domain.display_name	Social Sciences
topics[1].subfield.id	https://openalex.org/subfields/1803
topics[1].subfield.display_name	Management Science and Operations Research
topics[1].display_name	Data Quality and Management
is_xpac	False
apc_list
apc_paid
concepts[0].id	https://openalex.org/C42090638
concepts[0].level	2
concepts[0].score	0.6825251579284668
concepts[0].wikidata	https://www.wikidata.org/wiki/Q4048907
concepts[0].display_name	STREAMS
concepts[1].id	https://openalex.org/C2779662365
concepts[1].level	2
concepts[1].score	0.6604136824607849
concepts[1].wikidata	https://www.wikidata.org/wiki/Q5416694
concepts[1].display_name	Event (particle physics)
concepts[2].id	https://openalex.org/C41008148
concepts[2].level	0
concepts[2].score	0.5854353904724121
concepts[2].wikidata	https://www.wikidata.org/wiki/Q21198
concepts[2].display_name	Computer science
concepts[3].id	https://openalex.org/C204321447
concepts[3].level	1
concepts[3].score	0.4090704917907715
concepts[3].wikidata	https://www.wikidata.org/wiki/Q30642
concepts[3].display_name	Natural language processing
concepts[4].id	https://openalex.org/C154945302
concepts[4].level	1
concepts[4].score	0.3806968927383423
concepts[4].wikidata	https://www.wikidata.org/wiki/Q11660
concepts[4].display_name	Artificial intelligence
concepts[5].id	https://openalex.org/C121332964
concepts[5].level	0
concepts[5].score	0.08718672394752502
concepts[5].wikidata	https://www.wikidata.org/wiki/Q413
concepts[5].display_name	Physics
concepts[6].id	https://openalex.org/C111919701
concepts[6].level	1
concepts[6].score	0.07107013463973999
concepts[6].wikidata	https://www.wikidata.org/wiki/Q9135
concepts[6].display_name	Operating system
concepts[7].id	https://openalex.org/C62520636
concepts[7].level	1
concepts[7].score	0.0
concepts[7].wikidata	https://www.wikidata.org/wiki/Q944
concepts[7].display_name	Quantum mechanics
keywords[0].id	https://openalex.org/keywords/streams
keywords[0].score	0.6825251579284668
keywords[0].display_name	STREAMS
keywords[1].id	https://openalex.org/keywords/event
keywords[1].score	0.6604136824607849
keywords[1].display_name	Event (particle physics)
keywords[2].id	https://openalex.org/keywords/computer-science
keywords[2].score	0.5854353904724121
keywords[2].display_name	Computer science
keywords[3].id	https://openalex.org/keywords/natural-language-processing
keywords[3].score	0.4090704917907715
keywords[3].display_name	Natural language processing
keywords[4].id	https://openalex.org/keywords/artificial-intelligence
keywords[4].score	0.3806968927383423
keywords[4].display_name	Artificial intelligence
keywords[5].id	https://openalex.org/keywords/physics
keywords[5].score	0.08718672394752502
keywords[5].display_name	Physics
keywords[6].id	https://openalex.org/keywords/operating-system
keywords[6].score	0.07107013463973999
keywords[6].display_name	Operating system
language	en
locations[0].id	pmh:oai:arXiv.org:2406.03439
locations[0].is_oa	True
locations[0].source.id	https://openalex.org/S4306400194
locations[0].source.issn
locations[0].source.type	repository
locations[0].source.is_oa	True
locations[0].source.issn_l
locations[0].source.is_core	False
locations[0].source.is_in_doaj	False
locations[0].source.display_name	arXiv (Cornell University)
locations[0].source.host_organization	https://openalex.org/I205783295
locations[0].source.host_organization_name	Cornell University
locations[0].source.host_organization_lineage	https://openalex.org/I205783295
locations[0].license
locations[0].pdf_url	https://arxiv.org/pdf/2406.03439
locations[0].version	submittedVersion
locations[0].raw_type	text
locations[0].license_id
locations[0].is_accepted	False
locations[0].is_published	False
locations[0].raw_source_name
locations[0].landing_page_url	http://arxiv.org/abs/2406.03439
locations[1].id	doi:10.48550/arxiv.2406.03439
locations[1].is_oa	True
locations[1].source.id	https://openalex.org/S4306400194
locations[1].source.issn
locations[1].source.type	repository
locations[1].source.is_oa	True
locations[1].source.issn_l
locations[1].source.is_core	False
locations[1].source.is_in_doaj	False
locations[1].source.display_name	arXiv (Cornell University)
locations[1].source.host_organization	https://openalex.org/I205783295
locations[1].source.host_organization_name	Cornell University
locations[1].source.host_organization_lineage	https://openalex.org/I205783295
locations[1].license
locations[1].pdf_url
locations[1].version
locations[1].raw_type	article
locations[1].license_id
locations[1].is_accepted	False
locations[1].is_published
locations[1].raw_source_name
locations[1].landing_page_url	https://doi.org/10.48550/arxiv.2406.03439
indexed_in	arxiv, datacite
authorships[0].author.id	https://openalex.org/A5090574378
authorships[0].author.orcid
authorships[0].author.display_name	J. C. Ott
authorships[0].author_position	first
authorships[0].raw_author_name	Ott, Joachim
authorships[0].is_corresponding	False
authorships[1].author.id	https://openalex.org/A5005464029
authorships[1].author.orcid	https://orcid.org/0000-0002-8051-9886
authorships[1].author.display_name	Zuowen Wang
authorships[1].author_position	middle
authorships[1].raw_author_name	Wang, Zuowen
authorships[1].is_corresponding	False
authorships[2].author.id	https://openalex.org/A5053821067
authorships[2].author.orcid	https://orcid.org/0000-0002-7557-045X
authorships[2].author.display_name	Shih‐Chii Liu
authorships[2].author_position	last
authorships[2].raw_author_name	Liu, Shih-Chii
authorships[2].is_corresponding	False
has_content.pdf	True
has_content.grobid_xml	True
is_paratext	False
open_access.is_oa	True
open_access.oa_url	https://arxiv.org/pdf/2406.03439
open_access.oa_status	green
open_access.any_repository_has_fulltext	False
created_date	2024-06-08T00:00:00
display_name	Text-to-Events: Synthetic Event Camera Streams from Conditional Text Input
has_fulltext	True
is_retracted	False
updated_date	2025-11-06T06:51:31.235846
primary_topic.id	https://openalex.org/T11986
primary_topic.field.id	https://openalex.org/fields/18
primary_topic.field.display_name	Decision Sciences
primary_topic.score	0.947700023651123
primary_topic.domain.id	https://openalex.org/domains/2
primary_topic.domain.display_name	Social Sciences
primary_topic.subfield.id	https://openalex.org/subfields/1802
primary_topic.subfield.display_name	Information Systems and Management
primary_topic.display_name	Scientific Computing and Data Management
related_works	https://openalex.org/W4391375266, https://openalex.org/W2748952813, https://openalex.org/W2010317732, https://openalex.org/W2483328176, https://openalex.org/W2061705145, https://openalex.org/W193205649, https://openalex.org/W45006177, https://openalex.org/W2016919266, https://openalex.org/W1982793386, https://openalex.org/W3204019825
cited_by_count	0
locations_count	2
best_oa_location.id	pmh:oai:arXiv.org:2406.03439
best_oa_location.is_oa	True
best_oa_location.source.id	https://openalex.org/S4306400194
best_oa_location.source.issn
best_oa_location.source.type	repository
best_oa_location.source.is_oa	True
best_oa_location.source.issn_l
best_oa_location.source.is_core	False
best_oa_location.source.is_in_doaj	False
best_oa_location.source.display_name	arXiv (Cornell University)
best_oa_location.source.host_organization	https://openalex.org/I205783295
best_oa_location.source.host_organization_name	Cornell University
best_oa_location.source.host_organization_lineage	https://openalex.org/I205783295
best_oa_location.license
best_oa_location.pdf_url	https://arxiv.org/pdf/2406.03439
best_oa_location.version	submittedVersion
best_oa_location.raw_type	text
best_oa_location.license_id
best_oa_location.is_accepted	False
best_oa_location.is_published	False
best_oa_location.raw_source_name
best_oa_location.landing_page_url	http://arxiv.org/abs/2406.03439
primary_location.id	pmh:oai:arXiv.org:2406.03439
primary_location.is_oa	True
primary_location.source.id	https://openalex.org/S4306400194
primary_location.source.issn
primary_location.source.type	repository
primary_location.source.is_oa	True
primary_location.source.issn_l
primary_location.source.is_core	False
primary_location.source.is_in_doaj	False
primary_location.source.display_name	arXiv (Cornell University)
primary_location.source.host_organization	https://openalex.org/I205783295
primary_location.source.host_organization_name	Cornell University
primary_location.source.host_organization_lineage	https://openalex.org/I205783295
primary_location.license
primary_location.pdf_url	https://arxiv.org/pdf/2406.03439
primary_location.version	submittedVersion
primary_location.raw_type	text
primary_location.license_id
primary_location.is_accepted	False
primary_location.is_published	False
primary_location.raw_source_name
primary_location.landing_page_url	http://arxiv.org/abs/2406.03439
publication_date	2024-06-05
publication_year	2024
referenced_works_count	0
abstract_inverted_index.X	59
abstract_inverted_index.a	45, 55, 107, 179
abstract_inverted_index.By	101
abstract_inverted_index.In	139
abstract_inverted_index.It	85
abstract_inverted_index.We	153
abstract_inverted_index.an	87, 132
abstract_inverted_index.by	53, 167
abstract_inverted_index.in	66, 204
abstract_inverted_index.is	60, 90, 115
abstract_inverted_index.of	19, 30, 33, 69, 123, 136, 163, 174, 201
abstract_inverted_index.on	131, 182, 192
abstract_inverted_index.or	62
abstract_inverted_index.to	92, 117, 189
abstract_inverted_index.42%	188
abstract_inverted_index.DVS	148
abstract_inverted_index.Our	73
abstract_inverted_index.The	126, 171, 196
abstract_inverted_index.and	12
abstract_inverted_index.are	2
abstract_inverted_index.can	158
abstract_inverted_index.for	4, 39, 47
abstract_inverted_index.has	26
abstract_inverted_index.new	49, 112
abstract_inverted_index.one	61
abstract_inverted_index.the	17, 31, 67, 103, 111, 140, 144, 147, 156, 175, 183, 193, 199
abstract_inverted_index.was	128, 151
abstract_inverted_index.92%,	190
abstract_inverted_index.This	42
abstract_inverted_index.able	116
abstract_inverted_index.been	27
abstract_inverted_index.case	68
abstract_inverted_index.deep	20
abstract_inverted_index.from	82
abstract_inverted_index.lack	32
abstract_inverted_index.real	184
abstract_inverted_index.slow	28
abstract_inverted_index.text	83, 169
abstract_inverted_index.that	6, 155
abstract_inverted_index.this	70, 202
abstract_inverted_index.uses	86
abstract_inverted_index.with	10, 106, 143
abstract_inverted_index.Event	0
abstract_inverted_index.event	24, 36, 51, 79, 95, 98, 121, 133, 161, 206
abstract_inverted_index.first	129
abstract_inverted_index.human	164
abstract_inverted_index.large	34
abstract_inverted_index.model	76, 109, 114, 157
abstract_inverted_index.paper	43
abstract_inverted_index.tasks	5
abstract_inverted_index.used.	152
abstract_inverted_index.using	23, 54, 178
abstract_inverted_index.where	58
abstract_inverted_index.which	89
abstract_inverted_index.work,	71
abstract_inverted_index.camera	37, 99, 134
abstract_inverted_index.frames	80, 96
abstract_inverted_index.group.	195
abstract_inverted_index.method	46, 203
abstract_inverted_index.model,	57, 146
abstract_inverted_index.moving	124
abstract_inverted_index.output	14, 64
abstract_inverted_index.ranges	186
abstract_inverted_index.smooth	119
abstract_inverted_index.sparse	13, 94
abstract_inverted_index.vision	8
abstract_inverted_index.because	29
abstract_inverted_index.between	187
abstract_inverted_index.cameras	1, 25
abstract_inverted_index.dataset	135, 150
abstract_inverted_index.diverse	137
abstract_inverted_index.events.	72
abstract_inverted_index.gesture	149, 194
abstract_inverted_index.network	21, 40
abstract_inverted_index.produce	93
abstract_inverted_index.reports	44
abstract_inverted_index.require	7
abstract_inverted_index.results	197
abstract_inverted_index.scenes.	138
abstract_inverted_index.sensors	9
abstract_inverted_index.streams	122
abstract_inverted_index.trained	91, 130, 181
abstract_inverted_index.However,	16
abstract_inverted_index.accuracy	173
abstract_inverted_index.combined	141
abstract_inverted_index.creating	48
abstract_inverted_index.dataset,	185
abstract_inverted_index.datasets	38, 52
abstract_inverted_index.directly	81
abstract_inverted_index.generate	118, 159
abstract_inverted_index.gestures	165
abstract_inverted_index.labelled	35, 50
abstract_inverted_index.multiple	63
abstract_inverted_index.objects.	125
abstract_inverted_index.outputs.	100
abstract_inverted_index.produces	77
abstract_inverted_index.prompted	166
abstract_inverted_index.prompts.	84
abstract_inverted_index.proposed	74
abstract_inverted_index.training	142
abstract_inverted_index.combining	102
abstract_inverted_index.datasets.	207
abstract_inverted_index.depending	191
abstract_inverted_index.different	168
abstract_inverted_index.diffusion	108, 145
abstract_inverted_index.generated	176
abstract_inverted_index.realistic	160
abstract_inverted_index.sequences	162
abstract_inverted_index.synthetic	78, 120
abstract_inverted_index.text-to-X	56
abstract_inverted_index.training.	41
abstract_inverted_index.algorithms	22
abstract_inverted_index.capability	200
abstract_inverted_index.classifier	180
abstract_inverted_index.pretrained	104
abstract_inverted_index.responses.	15
abstract_inverted_index.sequences,	177
abstract_inverted_index.autoencoder	88, 105, 127
abstract_inverted_index.demonstrate	154, 198
abstract_inverted_index.development	18
abstract_inverted_index.low-latency	11
abstract_inverted_index.modalities,	65
abstract_inverted_index.statements.	170
abstract_inverted_index.advantageous	3
abstract_inverted_index.representing	97
abstract_inverted_index.synthesizing	205
abstract_inverted_index.architecture,	110
abstract_inverted_index.classification	172
abstract_inverted_index.text-to-events	75, 113
cited_by_percentile_year
countries_distinct_count	0
institutions_distinct_count	3
citation_normalized_percentile
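
The `abstract_inverted_index.*` rows in the payload map each word of the abstract to the zero-based positions where it occurs. A minimal sketch of how readable text can be rebuilt from such a mapping, assuming the rows have already been loaded into a Python dict of word to position list; the function name `rebuild_abstract` and the shortened `sample` (only positions 0 to 5 are kept) are illustrative and not part of OpenAlex:

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a {word: [positions]} inverted index."""
    by_position = {}
    for word, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = word
    # Join the words in ascending position order.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Excerpt of the payload, restricted to the first six word positions.
sample = {
    "Event": [0],
    "cameras": [1],
    "are": [2],
    "advantageous": [3],
    "for": [4],
    "tasks": [5],
}

text = rebuild_abstract(sample)
print(text)
```

Sorting by position rather than by dictionary order matters here, because the payload lists words grouped by length and alphabetically, not in reading order.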