Three Ways to Improve Semantic Segmentation with Self-Supervised Depth Estimation

Lukas Hoyer, Dengxin Dai, Yuhua Chen, Adrian Köring, Suman Saha, Luc Van Gool

2020 · Open Access · DOI: https://doi.org/10.48550/arxiv.2012.10782

Training deep networks for semantic segmentation requires large amounts of labeled training data, which presents a major challenge in practice, as labeling segmentation masks is a highly labor-intensive process. To address this issue, we present a framework for semi-supervised semantic segmentation, which is enhanced by self-supervised monocular depth estimation from unlabeled image sequences. In particular, we propose three key contributions: (1) We transfer knowledge from features learned during self-supervised depth estimation to semantic segmentation, (2) we implement a strong data augmentation by blending images and labels using the geometry of the scene, and (3) we utilize the depth feature diversity as well as the level of difficulty of learning depth in a student-teacher framework to select the most useful samples to be annotated for semantic segmentation. We validate the proposed model on the Cityscapes dataset, where all three modules demonstrate significant performance gains, and we achieve state-of-the-art results for semi-supervised semantic segmentation. The implementation is available at https://github.com/lhoyer/improving_segmentation_with_selfsupervised_depth.
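The second contribution, blending images and labels using the geometry of the scene, can be illustrated with a small sketch. The abstract does not give the blending rule, so the per-pixel "nearer pixel wins" rule below is an assumption made for illustration, as are the function name depth_mix, the eps tolerance, and the nested-list image layout. It is not taken from the authors' repository.

```python
def depth_mix(img_a, lbl_a, depth_a, img_b, lbl_b, depth_b, eps=0.0):
    """Blend two samples pixel by pixel (hypothetical helper).

    Assumed rule: a pixel is copied from sample A when A is closer to the
    camera than B at that location (depth_a < depth_b + eps), else from B.
    Images, labels and depth maps are equally sized nested lists.
    """
    mixed_img, mixed_lbl = [], []
    rows = zip(img_a, lbl_a, depth_a, img_b, lbl_b, depth_b)
    for ia, la, da, ib, lb, db in rows:
        img_row, lbl_row = [], []
        for pa, ca, za, pb, cb, zb in zip(ia, la, da, ib, lb, db):
            take_a = za < zb + eps  # nearer surface occludes the farther one
            img_row.append(pa if take_a else pb)
            lbl_row.append(ca if take_a else cb)
        mixed_img.append(img_row)
        mixed_lbl.append(lbl_row)
    return mixed_img, mixed_lbl


# Toy 2x2 example: sample A is near in the left column, far in the right one.
img_a = [[10, 11], [12, 13]]
lbl_a = [[1, 1], [1, 1]]
depth_a = [[2.0, 9.0], [2.0, 9.0]]
img_b = [[20, 21], [22, 23]]
lbl_b = [[2, 2], [2, 2]]
depth_b = [[5.0, 5.0], [5.0, 5.0]]

mixed_img, mixed_lbl = depth_mix(img_a, lbl_a, depth_a, img_b, lbl_b, depth_b)
print(mixed_img)  # [[10, 21], [12, 23]]
print(mixed_lbl)  # [[1, 2], [1, 2]]
```

Because the image and the label are selected with the same mask, the mixed label stays aligned with the mixed image, which is what makes such a blend usable as training data for segmentation.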

Related Topics

Segmentation Fault

Computer Science

Artificial Intelligence

Concepts

Segmentation, Computer science, Artificial intelligence, Feature (linguistics), Process (computing), Semantics (computer science), Monocular, Pattern recognition (psychology), Scale-space segmentation, Semantic feature, Machine learning, Image segmentation, Linguistics, Operating system, Philosophy, Programming language

Metadata

Type: preprint
Language: en
Landing Page: http://arxiv.org/abs/2012.10782
PDF: https://arxiv.org/pdf/2012.10782
OA Status: green
Related Works: 10
OpenAlex ID: https://openalex.org/W4287550764

All OpenAlex metadata

OpenAlex ID: https://openalex.org/W4287550764

Canonical identifier for this work in OpenAlex
DOI: https://doi.org/10.48550/arxiv.2012.10782

Digital Object Identifier
Title: Three Ways to Improve Semantic Segmentation with Self-Supervised Depth Estimation

Work title
Type: preprint

OpenAlex work type
Language: en

Primary language
Publication year: 2020

Year of publication
Publication date: 2020-12-19

Full publication date if available
Authors: Lukas Hoyer, Dengxin Dai, Yuhua Chen, Adrian Köring, Suman Saha, Luc Van Gool

List of authors in order
Landing page: https://arxiv.org/abs/2012.10782

Publisher landing page
PDF URL: https://arxiv.org/pdf/2012.10782

Direct link to full text PDF
Open access: Yes

Whether a free full text is available
OA status: green

Open access status per OpenAlex
OA URL: https://arxiv.org/pdf/2012.10782

Direct OA link when available
Concepts: Segmentation, Computer science, Artificial intelligence, Feature (linguistics), Process (computing), Semantics (computer science), Monocular, Pattern recognition (psychology), Scale-space segmentation, Semantic feature, Machine learning, Image segmentation, Linguistics, Operating system, Philosophy, Programming language

Top concepts (fields/topics) attached by OpenAlex
Cited by: 0

Total citation count in OpenAlex
Related works (count): 10

Other works algorithmically related by OpenAlex

Full payload

id	https://openalex.org/W4287550764
doi	https://doi.org/10.48550/arxiv.2012.10782
ids.openalex	https://openalex.org/W4287550764
fwci	0.0
type	preprint
title	Three Ways to Improve Semantic Segmentation with Self-Supervised Depth Estimation
biblio.issue
biblio.volume
biblio.last_page
biblio.first_page
topics[0].id	https://openalex.org/T10531
topics[0].field.id	https://openalex.org/fields/17
topics[0].field.display_name	Computer Science
topics[0].score	0.9968000054359436
topics[0].domain.id	https://openalex.org/domains/3
topics[0].domain.display_name	Physical Sciences
topics[0].subfield.id	https://openalex.org/subfields/1707
topics[0].subfield.display_name	Computer Vision and Pattern Recognition
topics[0].display_name	Advanced Vision and Imaging
topics[1].id	https://openalex.org/T13114
topics[1].field.id	https://openalex.org/fields/22
topics[1].field.display_name	Engineering
topics[1].score	0.9915000200271606
topics[1].domain.id	https://openalex.org/domains/3
topics[1].domain.display_name	Physical Sciences
topics[1].subfield.id	https://openalex.org/subfields/2214
topics[1].subfield.display_name	Media Technology
topics[1].display_name	Image Processing Techniques and Applications
topics[2].id	https://openalex.org/T11606
topics[2].field.id	https://openalex.org/fields/22
topics[2].field.display_name	Engineering
topics[2].score	0.9872000217437744
topics[2].domain.id	https://openalex.org/domains/3
topics[2].domain.display_name	Physical Sciences
topics[2].subfield.id	https://openalex.org/subfields/2205
topics[2].subfield.display_name	Civil and Structural Engineering
topics[2].display_name	Infrastructure Maintenance and Monitoring
is_xpac	False
apc_list
apc_paid
concepts[0].id	https://openalex.org/C89600930
concepts[0].level	2
concepts[0].score	0.8734179735183716
concepts[0].wikidata	https://www.wikidata.org/wiki/Q1423946
concepts[0].display_name	Segmentation
concepts[1].id	https://openalex.org/C41008148
concepts[1].level	0
concepts[1].score	0.8016436100006104
concepts[1].wikidata	https://www.wikidata.org/wiki/Q21198
concepts[1].display_name	Computer science
concepts[2].id	https://openalex.org/C154945302
concepts[2].level	1
concepts[2].score	0.6974334716796875
concepts[2].wikidata	https://www.wikidata.org/wiki/Q11660
concepts[2].display_name	Artificial intelligence
concepts[3].id	https://openalex.org/C2776401178
concepts[3].level	2
concepts[3].score	0.5613366961479187
concepts[3].wikidata	https://www.wikidata.org/wiki/Q12050496
concepts[3].display_name	Feature (linguistics)
concepts[4].id	https://openalex.org/C98045186
concepts[4].level	2
concepts[4].score	0.5337112545967102
concepts[4].wikidata	https://www.wikidata.org/wiki/Q205663
concepts[4].display_name	Process (computing)
concepts[5].id	https://openalex.org/C184337299
concepts[5].level	2
concepts[5].score	0.4920908808708191
concepts[5].wikidata	https://www.wikidata.org/wiki/Q1437428
concepts[5].display_name	Semantics (computer science)
concepts[6].id	https://openalex.org/C65909025
concepts[6].level	2
concepts[6].score	0.4608535170555115
concepts[6].wikidata	https://www.wikidata.org/wiki/Q1945033
concepts[6].display_name	Monocular
concepts[7].id	https://openalex.org/C153180895
concepts[7].level	2
concepts[7].score	0.45716726779937744
concepts[7].wikidata	https://www.wikidata.org/wiki/Q7148389
concepts[7].display_name	Pattern recognition (psychology)
concepts[8].id	https://openalex.org/C65885262
concepts[8].level	4
concepts[8].score	0.44215235114097595
concepts[8].wikidata	https://www.wikidata.org/wiki/Q7429708
concepts[8].display_name	Scale-space segmentation
concepts[9].id	https://openalex.org/C2781122975
concepts[9].level	2
concepts[9].score	0.4139872193336487
concepts[9].wikidata	https://www.wikidata.org/wiki/Q16928266
concepts[9].display_name	Semantic feature
concepts[10].id	https://openalex.org/C119857082
concepts[10].level	1
concepts[10].score	0.4014217257499695
concepts[10].wikidata	https://www.wikidata.org/wiki/Q2539
concepts[10].display_name	Machine learning
concepts[11].id	https://openalex.org/C124504099
concepts[11].level	3
concepts[11].score	0.39340871572494507
concepts[11].wikidata	https://www.wikidata.org/wiki/Q56933
concepts[11].display_name	Image segmentation
concepts[12].id	https://openalex.org/C41895202
concepts[12].level	1
concepts[12].score	0.0
concepts[12].wikidata	https://www.wikidata.org/wiki/Q8162
concepts[12].display_name	Linguistics
concepts[13].id	https://openalex.org/C111919701
concepts[13].level	1
concepts[13].score	0.0
concepts[13].wikidata	https://www.wikidata.org/wiki/Q9135
concepts[13].display_name	Operating system
concepts[14].id	https://openalex.org/C138885662
concepts[14].level	0
concepts[14].score	0.0
concepts[14].wikidata	https://www.wikidata.org/wiki/Q5891
concepts[14].display_name	Philosophy
concepts[15].id	https://openalex.org/C199360897
concepts[15].level	1
concepts[15].score	0.0
concepts[15].wikidata	https://www.wikidata.org/wiki/Q9143
concepts[15].display_name	Programming language
keywords[0].id	https://openalex.org/keywords/segmentation
keywords[0].score	0.8734179735183716
keywords[0].display_name	Segmentation
keywords[1].id	https://openalex.org/keywords/computer-science
keywords[1].score	0.8016436100006104
keywords[1].display_name	Computer science
keywords[2].id	https://openalex.org/keywords/artificial-intelligence
keywords[2].score	0.6974334716796875
keywords[2].display_name	Artificial intelligence
keywords[3].id	https://openalex.org/keywords/feature
keywords[3].score	0.5613366961479187
keywords[3].display_name	Feature (linguistics)
keywords[4].id	https://openalex.org/keywords/process
keywords[4].score	0.5337112545967102
keywords[4].display_name	Process (computing)
keywords[5].id	https://openalex.org/keywords/semantics
keywords[5].score	0.4920908808708191
keywords[5].display_name	Semantics (computer science)
keywords[6].id	https://openalex.org/keywords/monocular
keywords[6].score	0.4608535170555115
keywords[6].display_name	Monocular
keywords[7].id	https://openalex.org/keywords/pattern-recognition
keywords[7].score	0.45716726779937744
keywords[7].display_name	Pattern recognition (psychology)
keywords[8].id	https://openalex.org/keywords/scale-space-segmentation
keywords[8].score	0.44215235114097595
keywords[8].display_name	Scale-space segmentation
keywords[9].id	https://openalex.org/keywords/semantic-feature
keywords[9].score	0.4139872193336487
keywords[9].display_name	Semantic feature
keywords[10].id	https://openalex.org/keywords/machine-learning
keywords[10].score	0.4014217257499695
keywords[10].display_name	Machine learning
keywords[11].id	https://openalex.org/keywords/image-segmentation
keywords[11].score	0.39340871572494507
keywords[11].display_name	Image segmentation
language	en
locations[0].id	pmh:oai:arXiv.org:2012.10782
locations[0].is_oa	True
locations[0].source.id	https://openalex.org/S4306400194
locations[0].source.issn
locations[0].source.type	repository
locations[0].source.is_oa	True
locations[0].source.issn_l
locations[0].source.is_core	False
locations[0].source.is_in_doaj	False
locations[0].source.display_name	arXiv (Cornell University)
locations[0].source.host_organization	https://openalex.org/I205783295
locations[0].source.host_organization_name	Cornell University
locations[0].source.host_organization_lineage	https://openalex.org/I205783295
locations[0].license
locations[0].pdf_url	https://arxiv.org/pdf/2012.10782
locations[0].version	submittedVersion
locations[0].raw_type	text
locations[0].license_id
locations[0].is_accepted	False
locations[0].is_published	False
locations[0].raw_source_name
locations[0].landing_page_url	http://arxiv.org/abs/2012.10782
indexed_in	arxiv
authorships[0].author.id	https://openalex.org/A5053328232
authorships[0].author.orcid	https://orcid.org/0000-0002-7391-0676
authorships[0].author.display_name	Lukas Hoyer
authorships[0].author_position	first
authorships[0].raw_author_name	Hoyer, Lukas
authorships[0].is_corresponding	False
authorships[1].author.id	https://openalex.org/A5078838951
authorships[1].author.orcid	https://orcid.org/0000-0001-5440-9678
authorships[1].author.display_name	Dengxin Dai
authorships[1].author_position	middle
authorships[1].raw_author_name	Dai, Dengxin
authorships[1].is_corresponding	False
authorships[2].author.id	https://openalex.org/A5100384516
authorships[2].author.orcid	https://orcid.org/0000-0002-4124-1148
authorships[2].author.display_name	Yuhua Chen
authorships[2].author_position	middle
authorships[2].raw_author_name	Chen, Yuhua
authorships[2].is_corresponding	False
authorships[3].author.id	https://openalex.org/A5016670590
authorships[3].author.orcid
authorships[3].author.display_name	Adrian Köring
authorships[3].author_position	middle
authorships[3].raw_author_name	Köring, Adrian
authorships[3].is_corresponding	False
authorships[4].author.id	https://openalex.org/A5114686140
authorships[4].author.orcid	https://orcid.org/0009-0005-9440-6785
authorships[4].author.display_name	Suman Saha
authorships[4].author_position	middle
authorships[4].raw_author_name	Saha, Suman
authorships[4].is_corresponding	False
authorships[5].author.id	https://openalex.org/A5001254143
authorships[5].author.orcid	https://orcid.org/0000-0002-3445-5711
authorships[5].author.display_name	Luc Van Gool
authorships[5].author_position	last
authorships[5].raw_author_name	Van Gool, Luc
authorships[5].is_corresponding	False
has_content.pdf	True
has_content.grobid_xml	True
is_paratext	False
open_access.is_oa	True
open_access.oa_url	https://arxiv.org/pdf/2012.10782
open_access.oa_status	green
open_access.any_repository_has_fulltext	False
created_date	2025-10-10T00:00:00
display_name	Three Ways to Improve Semantic Segmentation with Self-Supervised Depth Estimation
has_fulltext	True
is_retracted	False
updated_date	2025-11-06T03:46:38.306776
primary_topic.id	https://openalex.org/T10531
primary_topic.field.id	https://openalex.org/fields/17
primary_topic.field.display_name	Computer Science
primary_topic.score	0.9968000054359436
primary_topic.domain.id	https://openalex.org/domains/3
primary_topic.domain.display_name	Physical Sciences
primary_topic.subfield.id	https://openalex.org/subfields/1707
primary_topic.subfield.display_name	Computer Vision and Pattern Recognition
primary_topic.display_name	Advanced Vision and Imaging
related_works	https://openalex.org/W2185902295, https://openalex.org/W2103507220, https://openalex.org/W3144569342, https://openalex.org/W2945274617, https://openalex.org/W4313052709, https://openalex.org/W2022929107, https://openalex.org/W2055202857, https://openalex.org/W4205800335, https://openalex.org/W2758994127, https://openalex.org/W2386644571
cited_by_count	0
locations_count	1
best_oa_location.id	pmh:oai:arXiv.org:2012.10782
best_oa_location.is_oa	True
best_oa_location.source.id	https://openalex.org/S4306400194
best_oa_location.source.issn
best_oa_location.source.type	repository
best_oa_location.source.is_oa	True
best_oa_location.source.issn_l
best_oa_location.source.is_core	False
best_oa_location.source.is_in_doaj	False
best_oa_location.source.display_name	arXiv (Cornell University)
best_oa_location.source.host_organization	https://openalex.org/I205783295
best_oa_location.source.host_organization_name	Cornell University
best_oa_location.source.host_organization_lineage	https://openalex.org/I205783295
best_oa_location.license
best_oa_location.pdf_url	https://arxiv.org/pdf/2012.10782
best_oa_location.version	submittedVersion
best_oa_location.raw_type	text
best_oa_location.license_id
best_oa_location.is_accepted	False
best_oa_location.is_published	False
best_oa_location.raw_source_name
best_oa_location.landing_page_url	http://arxiv.org/abs/2012.10782
primary_location.id	pmh:oai:arXiv.org:2012.10782
primary_location.is_oa	True
primary_location.source.id	https://openalex.org/S4306400194
primary_location.source.issn
primary_location.source.type	repository
primary_location.source.is_oa	True
primary_location.source.issn_l
primary_location.source.is_core	False
primary_location.source.is_in_doaj	False
primary_location.source.display_name	arXiv (Cornell University)
primary_location.source.host_organization	https://openalex.org/I205783295
primary_location.source.host_organization_name	Cornell University
primary_location.source.host_organization_lineage	https://openalex.org/I205783295
primary_location.license
primary_location.pdf_url	https://arxiv.org/pdf/2012.10782
primary_location.version	submittedVersion
primary_location.raw_type	text
primary_location.license_id
primary_location.is_accepted	False
primary_location.is_published	False
primary_location.raw_source_name
primary_location.landing_page_url	http://arxiv.org/abs/2012.10782
publication_date	2020-12-19
publication_year	2020
referenced_works_count	0
abstract_inverted_index.a	14, 23, 32, 70, 101
abstract_inverted_index.In	48
abstract_inverted_index.To	27
abstract_inverted_index.We	115
abstract_inverted_index.as	91, 93
abstract_inverted_index.be	110
abstract_inverted_index.by	40, 73
abstract_inverted_index.in	17, 100
abstract_inverted_index.is	22, 38, 140
abstract_inverted_index.of	81, 96
abstract_inverted_index.on	119
abstract_inverted_index.to	64, 104, 109
abstract_inverted_index.we	30, 50, 68, 85, 131
abstract_inverted_index.(1)	55
abstract_inverted_index.(2)	67
abstract_inverted_index.The	138
abstract_inverted_index.all	124
abstract_inverted_index.and	76, 130
abstract_inverted_index.for	3, 34, 112
abstract_inverted_index.key	53
abstract_inverted_index.the	79, 82, 87, 94, 106, 117, 120
abstract_inverted_index.deep	1
abstract_inverted_index.from	45, 58
abstract_inverted_index.most	107
abstract_inverted_index.well	92
abstract_inverted_index.data,	11
abstract_inverted_index.depth	43, 88, 99
abstract_inverted_index.large	7
abstract_inverted_index.level	95
abstract_inverted_index.major	15
abstract_inverted_index.masks	21
abstract_inverted_index.three	52, 125
abstract_inverted_index.using	78
abstract_inverted_index.where	123
abstract_inverted_index.which	12
abstract_inverted_index.during	61
abstract_inverted_index.gains,	129
abstract_inverted_index.highly	24
abstract_inverted_index.images	75
abstract_inverted_index.issue,	29
abstract_inverted_index.labels	77
abstract_inverted_index.scene,	83
abstract_inverted_index.select	105
abstract_inverted_index.strong	71
abstract_inverted_index.achieve	132
abstract_inverted_index.amounts	8
abstract_inverted_index.feature	89
abstract_inverted_index.learned	60
abstract_inverted_index.modules	126
abstract_inverted_index.present	31
abstract_inverted_index.propose	51
abstract_inverted_index.results	134
abstract_inverted_index.utilize	86
abstract_inverted_index.Training	0
abstract_inverted_index.and\n(3)	84
abstract_inverted_index.blending	74
abstract_inverted_index.dataset,	122
abstract_inverted_index.enhanced	39
abstract_inverted_index.features	59
abstract_inverted_index.geometry	80
abstract_inverted_index.learning	98
abstract_inverted_index.networks	2
abstract_inverted_index.presents	13
abstract_inverted_index.process.	26
abstract_inverted_index.requires	6
abstract_inverted_index.semantic	4, 36, 65, 113, 136
abstract_inverted_index.training	10
abstract_inverted_index.validate	116
abstract_inverted_index.annotated	111
abstract_inverted_index.available	141
abstract_inverted_index.challenge	16
abstract_inverted_index.diversity	90
abstract_inverted_index.framework	33, 103
abstract_inverted_index.implement	69
abstract_inverted_index.knowledge	57
abstract_inverted_index.monocular	42
abstract_inverted_index.practice,	18
abstract_inverted_index.Cityscapes	121
abstract_inverted_index.estimation	44
abstract_inverted_index.sequences.	47
abstract_inverted_index.of\nlabeled	9
abstract_inverted_index.particular,	49
abstract_inverted_index.performance	128
abstract_inverted_index.We\ntransfer	56
abstract_inverted_index.as\nlabeling	19
abstract_inverted_index.segmentation	5, 20
abstract_inverted_index.address\nthis	28
abstract_inverted_index.segmentation,	66
abstract_inverted_index.segmentation.	114, 137
abstract_inverted_index.contributions:	54
abstract_inverted_index.difficulty\nof	97
abstract_inverted_index.implementation	139
abstract_inverted_index.labor-intensive	25
abstract_inverted_index.proposed\nmodel	118
abstract_inverted_index.self-supervised	41, 62
abstract_inverted_index.semi-supervised	35
abstract_inverted_index.student-teacher	102
abstract_inverted_index.useful\nsamples	108
abstract_inverted_index.state-of-the-art	133
abstract_inverted_index.unlabeled\nimage	46
abstract_inverted_index.depth\nestimation	63
abstract_inverted_index.data\naugmentation	72
abstract_inverted_index.for\nsemi-supervised	135
abstract_inverted_index.segmentation,\nwhich	37
abstract_inverted_index.demonstrate\nsignificant	127
abstract_inverted_index.at\nhttps://github.com/lhoyer/improving_segmentation_with_selfsupervised_depth.\n	142
cited_by_percentile_year
countries_distinct_count	0
institutions_distinct_count	6
citation_normalized_percentile.value	0.23826513
citation_normalized_percentile.is_in_top_1_percent	False
citation_normalized_percentile.is_in_top_10_percent	False
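
The abstract_inverted_index rows in the payload above store the abstract as a mapping from each token to the word positions where it occurs. A minimal sketch of turning such a mapping back into running text follows. The function name rebuild_abstract is hypothetical, the sample dictionary holds only the first twelve positions from the payload (so "for" lists position 3 alone), and treating the embedded "\n" sequences as plain spaces is an assumption about how the line breaks should be handled.

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a {token: [positions]} mapping (hypothetical helper)."""
    by_position = {}
    for token, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = token
    text = " ".join(by_position[pos] for pos in sorted(by_position))
    # Tokens may carry line breaks, either real or as a literal backslash-n.
    text = text.replace("\\n", " ").replace("\n", " ")
    return " ".join(text.split())


# Truncated sample: positions 0..11 only, copied from the payload rows.
sample = {
    "Training": [0], "deep": [1], "networks": [2], "for": [3],
    "semantic": [4], "segmentation": [5], "requires": [6], "large": [7],
    "amounts": [8], "of\\nlabeled": [9], "training": [10], "data,": [11],
}
print(rebuild_abstract(sample))
# Training deep networks for semantic segmentation requires large amounts of labeled training data,
```

Sorting by position rather than by token is what restores the original word order; a token that occurs several times simply appears under several positions.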