Efficient multiscale and multifont optical character recognition system based on robust feature description

Abstract : Optical Character Recognition (OCR) is the process of translating images of text into a machine-readable format. Generally, an OCR system is composed of binarization, segmentation and recognition stages. Given an extracted binary character, the recognition stage computes its description and decides its corresponding ASCII code. In this paper, we propose a new OCR system that aims at high-speed, multiscale and multifont character recognition. Our proposal is based essentially on a robust description using a new Unified Character Descriptor (UCD). In addition, typeface and font-size recognition is performed to choose the adequate template for a faster matching process. The OCR accuracy obtained by our proposed system is 1.5x higher than that reached by Tesseract on the LRDE dataset.

https://hal-upec-upem.archives-ouvertes.fr/hal-01309987
Contributor : Rostom Kachouri
Submitted on : Tuesday, May 3, 2016 - 6:59:44 PM
Last modification on : Thursday, February 7, 2019 - 5:23:56 PM
Long-term archiving on : Tuesday, May 24, 2016 - 4:48:14 PM

File

IPTA15_OCR(accepté).pdf
Files produced by the author(s)

Citation

Mahmoud Soua, Rostom Kachouri, Mohamed Akil. Efficient multiscale and multifont optical character recognition system based on robust feature description. 5th International Conference on Image Processing Theory, Tools and Applications, Nov 2015, Orléans, France. ⟨10.1109/IPTA.2015.7367214⟩. ⟨hal-01309987⟩
