COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images
· 2016
· Open Access
· DOI: https://doi.org/10.48550/arxiv.1601.07140
· OA: W4295246343
This paper describes the COCO-Text dataset. In recent years, large-scale datasets like SUN and ImageNet drove the advancement of scene understanding and object recognition. The goal of COCO-Text is to advance the state of the art in text detection and recognition in natural images. The dataset is based on the MS COCO dataset, which contains images of complex everyday scenes. The images were not collected with text in mind and thus contain a broad variety of text instances. To reflect the diversity of text in natural scenes, we annotate text with (a) location in terms of a bounding box, (b) fine-grained classification into machine-printed text and handwritten text, (c) classification into legible and illegible text, (d) script of the text, and (e) transcriptions of legible text. The dataset contains over 173k text annotations in over 63k images. We provide a statistical analysis of the accuracy of our annotations. In addition, we present an analysis of three leading state-of-the-art photo Optical Character Recognition (OCR) approaches on our dataset. While scene text detection and recognition have enjoyed strong advances in recent years, we identify significant shortcomings that motivate future work.
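The five annotation types (a)–(e) above can be sketched as a simple record structure. This is a minimal illustration only; the field names and values below are assumptions, not the official COCO-Text annotation format or API.

```python
# Hedged sketch of the COCO-Text annotation schema described in the abstract:
# bounding box, machine-printed vs. handwritten class, legibility, script,
# and transcription. Field names are illustrative assumptions.

def filter_annotations(annotations, legibility="legible",
                       text_class="machine printed"):
    """Return annotations matching the requested legibility and text class."""
    return [a for a in annotations
            if a["legibility"] == legibility and a["class"] == text_class]

# Two hypothetical annotations following the five attributes (a)-(e):
annotations = [
    {"bbox": [10, 20, 50, 15],          # (a) bounding box [x, y, w, h]
     "class": "machine printed",        # (b) machine printed / handwritten
     "legibility": "legible",           # (c) legible / illegible
     "script": "latin",                 # (d) script
     "transcription": "EXIT"},          # (e) transcription (legible only)
    {"bbox": [5, 5, 30, 10],
     "class": "handwritten",
     "legibility": "illegible",
     "script": "latin",
     "transcription": None},            # illegible text has no transcription
]

legible_printed = filter_annotations(annotations)
```

A typical benchmark workflow would filter annotations this way, e.g. evaluating detectors only on legible, machine-printed instances.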