Enrique Vidal
The PARES Database: Information Extraction over Historical Parish Records
Historical census records convey information that is key to performing genealogical research and demographic studies. Given the large number of documents of this type that exist, it is crucial to research methods that allow the automatic extr…
Markov Chains in Mining Ventilation Systems: A Bibliometric and Systematic Literature Analysis
Evaluation of Multi-Object Detection Models for the Real-Time Identification of Goat Behavior
Information extraction in handwritten historical logbooks
Open set classification of untranscribed handwritten text image documents
End-to-End page-Level assessment of handwritten text recognition
The evaluation of Handwritten Text Recognition (HTR) systems has traditionally used metrics based on the edit distance between HTR and ground truth (GT) transcripts, at both the character and word levels. This is very adequate when th…
Processing a large collection of historical tabular images
End-to-End Page-Level Assessment of Handwritten Text Recognition
The evaluation of Handwritten Text Recognition (HTR) systems has traditionally used metrics based on the edit distance between HTR and ground truth (GT) transcripts, at both the character and word levels. This is very adequate when the exp…
Revisiting Bag-of-Word Metrics to Assess End-To-End Text Image Recognition Results
A proxy learning curve for the Bayes classifier
Information Extraction in Handwritten Historical Logbooks
Contains the datasets with tables used in the following paper: Information Extraction in Handwritten Historical Logbooks. The pages starting with "vol003" correspond to the Jeannette corpus, while the ones starting with "Albatross" corresp…
Open Set Classification of Untranscribed Handwritten Documents
Huge amounts of digital page images of important manuscripts are preserved in archives worldwide. The amounts are so large that it is generally unfeasible for archivists to adequately tag most of the documents with the required metadata so…
PLANTAS Dataset
The "PLANTAS" dataset (“Historia de las plantas”, Vol. 1) was written with a quill pen by Bernardo de Cienfuegos, one of the most outstanding Spanish botanists of the XVII century. The book was written mainly in Spanish, but a significant…
The EU corpus
EU is a corpus extracted from the Bulletin of the European Union, which exists in all official languages of the European Union and is publicly available on the Internet. It contains three different language pairs: English–French, English…
The EUTRANS-I corpus
EUTRANS-I is a simple translation corpus which was produced and used in the EuTrans project. It corresponds to the so-called "Traveller Task", which involves human-to-human communication situations at the front desk of a hotel. Bilingual da…
Vorau Abbey library Cod. 253 dataset for Document Layout Analysis
VORAU-253 is a music manuscript referred to as Cod. 253 of the Vorau Abbey library, which was provided by the Austrian Academy of Sciences. It is written in German Gothic notation and dated to around the year 1450. This manuscript is interesting …
Evaluation of a Region Proposal Architecture for Multi-task Document Layout Analysis
Automatically recognizing the layout of handwritten documents is an important step towards useful extraction of information from those documents. The most common application is to feed downstream applications such as automatic text recogni…
A Probabilistic Framework for Lexicon-based Keyword Spotting in Handwritten Text Images
Query by String Keyword Spotting (KWS) is here considered as a key technology for indexing large collections of handwritten text images to allow fast textual access to the contents of these collections. Under this perspective, a probabilis…
Finnish Court Records-sub500. A dataset of Finnish notarial records (19th Century)
This dataset is a selection of 500 pages from the Renovated District Court Records (19th century), one of the largest collections in the National Archives of Finland. The documents consist of records of deeds, mortgages, traditional life-…
Pattern recognition techniques for provenance classification of archaeological ceramics using ultrasounds
Transforming scholarship in the archives through handwritten text recognition
Purpose: An overview of the current use of handwritten text recognition (HTR) on archival manuscript material, as provided by the EU H2020 funded Transkribus platform. It explains HTR, demonstrates Transkribus, gives examples of use cases,…
A set of benchmarks for Handwritten Text Recognition on historical documents