ETDPC-ETD500
Muntabir Hasan Choudhury, Lamia Salsabil, Jian Wu, William A. Ingram, Edward A. Fox
2023 · Open Access · DOI: https://doi.org/10.7910/dvn/msfvlq · OA: W4398356401
ETDPC was developed to classify the pages of electronic theses and dissertations (ETDs) into 13 categories. The model uses the ETDPC-ETD500 dataset, which contains 92,371 scanned ETD pages stored as PNG images. All pages were manually annotated. OCR was then performed on every page using AWS Textract, a cloud-based service that detects and extracts text from scanned documents. Textract converts each image into JSON containing the text, ID, type (i.e., word or line), bounding box (bbox), and confidence score.
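The fields listed above can be read from the OCR output with a few lines of Python. The sketch below is illustrative only: it assumes the JSON follows the standard AWS Textract response layout (a top-level `Blocks` list whose entries carry `BlockType`, `Id`, `Text`, `Confidence`, and `Geometry.BoundingBox`), which may differ from the files actually distributed with the dataset. The helper name `extract_blocks`, the `min_confidence` threshold, and the sample values are hypothetical.

```python
from __future__ import annotations

import json
from typing import Any


def extract_blocks(
    response: dict[str, Any],
    block_type: str = "LINE",
    min_confidence: float = 0.0,
) -> list[dict[str, Any]]:
    """Return simplified records for Textract blocks of one type.

    Assumes the standard Textract response layout; adjust the key names
    if the dataset's JSON files are structured differently.
    """
    records = []
    for block in response.get("Blocks", []):
        if block.get("BlockType") != block_type:
            continue
        confidence = block.get("Confidence", 0.0)
        if confidence < min_confidence:
            continue
        box = block.get("Geometry", {}).get("BoundingBox", {})
        records.append(
            {
                "id": block.get("Id"),
                "type": block_type,
                "text": block.get("Text", ""),
                "confidence": confidence,
                # Textract boxes are ratios of the page width and height.
                "bbox": (
                    box.get("Left"),
                    box.get("Top"),
                    box.get("Width"),
                    box.get("Height"),
                ),
            }
        )
    return records


# Hypothetical sample in Textract's response shape (not taken from the dataset).
sample_json = """
{
  "Blocks": [
    {"BlockType": "PAGE", "Id": "p1"},
    {"BlockType": "LINE", "Id": "l1", "Text": "Abstract", "Confidence": 99.5,
     "Geometry": {"BoundingBox": {"Left": 0.4, "Top": 0.1, "Width": 0.2, "Height": 0.03}}},
    {"BlockType": "WORD", "Id": "w1", "Text": "Abstract", "Confidence": 99.5,
     "Geometry": {"BoundingBox": {"Left": 0.4, "Top": 0.1, "Width": 0.2, "Height": 0.03}}},
    {"BlockType": "LINE", "Id": "l2", "Text": "smudged text", "Confidence": 42.0,
     "Geometry": {"BoundingBox": {"Left": 0.1, "Top": 0.5, "Width": 0.3, "Height": 0.03}}}
  ]
}
"""

sample = json.loads(sample_json)
confident_lines = extract_blocks(sample, "LINE", min_confidence=90.0)
words = extract_blocks(sample, "WORD")
```

Filtering by block type separates line-level from word-level text, and the confidence threshold offers one simple way to drop unreliable OCR output before using the text as a classification feature.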