Optical Music Recognition with Convolutional Sequence-to-Sequence Models. Article Swipe
YOU?
·
· 2017
· Open Access
·
· DOI: https://doi.org/10.5072/zenodo.243774
Optical Music Recognition (OMR) is an important technology within Music Information Retrieval. Deep learning models show promising results on OMR tasks, but symbol-level annotated data sets of sufficient size to train such models are not available and difficult to develop. We present a deep learning architecture called a Convolutional Sequence-to-Sequence model to both move towards an end-to-end trainable OMR pipeline, and apply a learning process that trains on full sentences of sheet music instead of individually labeled symbols. The model is trained and evaluated on a human generated data set, with various image augmentations based on real-world scenarios. This data set is the first publicly available set in OMR research with sufficient size to train and evaluate deep learning models. With the introduced augmentations a pitch recognition accuracy of 81% and a duration accuracy of 94% is achieved, resulting in a note level accuracy of 80%. Finally, the model is compared to commercially available methods, showing a large improvements over these applications.
Related Topics To Compare & Contrast
- Type
- article
- Language
- en
- Landing Page
- https://arxiv.org/abs/1707.04877
- OA Status
- green
- Cited By
- 7
- References
- 7
- Related Works
- 20
- OpenAlex ID
- https://openalex.org/W2963043364