The rapid growth of document digitization demands solutions that make valuable archived materials searchable and usable. Our team has developed robust and efficient solutions to meet this need.
A web framework for optical character recognition in 15 Indic scripts as well as English is available. Anyone can use it for text recognition free of charge. An API is also provided so that it can be used as a third-party tool in standalone or other applications.
Recognizing scene text is a challenging problem, more so than recognizing scanned documents. Given the rapid growth of camera-based applications readily available on mobile phones, understanding scene text is more important than ever. Our goal is to fill this gap in scene understanding.
The Optical Character Recognition Tool
Input File Format
Output File Format
The OCR system supports 12 Indic scripts and English
Figure 1: (A) The architecture of a traditional OCR, which starts with symbol/character extraction and classification. (B) Our approach. We bypass the two harder modules of the traditional OCR. We directly output a Unicode sequence, given a word image.
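To illustrate what "directly output a Unicode sequence, given a word image" can look like in practice, here is a minimal sketch of greedy CTC decoding, one common way to turn per-frame network scores into a character string without segmenting symbols first. The page does not say which model or decoder the system uses, so CTC is an assumption, and the function name, the blank index, and the toy alphabet below are all hypothetical.

```python
from itertools import groupby

BLANK = 0  # label index reserved for the CTC blank (assumption)


def ctc_greedy_decode(frame_scores, alphabet):
    """Collapse per-frame label scores into a Unicode string.

    frame_scores: list of score lists, one list per image frame,
                  one score per label index.
    alphabet:     dict mapping label index -> Unicode character
                  (hypothetical; a real system has a much larger table).
    """
    # Pick the best-scoring label for every frame.
    best = [max(range(len(scores)), key=scores.__getitem__)
            for scores in frame_scores]
    # Merge consecutive repeats, then drop the blank label.
    collapsed = [label for label, _ in groupby(best)]
    return "".join(alphabet[label] for label in collapsed if label != BLANK)


# Toy example: three Devanagari characters and five frames of scores.
toy_alphabet = {1: "क", 2: "म", 3: "ल"}
toy_scores = [
    [0.10, 0.80, 0.05, 0.05],  # frame 1 -> label 1
    [0.10, 0.70, 0.10, 0.10],  # frame 2 -> label 1 (repeat, merged)
    [0.90, 0.03, 0.03, 0.04],  # frame 3 -> blank
    [0.10, 0.10, 0.70, 0.10],  # frame 4 -> label 2
    [0.20, 0.10, 0.10, 0.60],  # frame 5 -> label 3
]
decoded = ctc_greedy_decode(toy_scores, toy_alphabet)
```

The point of the sketch is that no character boundaries are ever computed: the decoder only needs a sequence of scores over the word image, which is what lets approach (B) skip the symbol extraction and classification stages of approach (A).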