
Title: Optical character recognition
Post by: sapero on July 05, 2007, 04:27:04 PM

Optical character recognition, usually abbreviated to OCR, is a type of computer software designed to translate images of handwritten or typewritten text (usually captured by a scanner) into machine-editable text, or to translate pictures of characters into a standard encoding scheme representing them (e.g. ASCII or Unicode).

I'm writing bot software for automated banner clicking and also need an OCR module. Why not write my own? :)
This project supports only images with a single line of text (the scanner is that simple); overlapping letters are not supported.
The algorithm is very simple:
1. Convert the image colors to black and white.
2. Find the first black pixel, then flood-fill all connected pixels with another color (such as red), tracking the min/max positions (the letter's size).
3. Resize the image (the letter), writing its pixels to an 8x8 array of bytes.
4. Convert the 8x8 byte array to an int64 (each byte becomes one bit) and look the result up in the database.
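
The four steps above can be sketched roughly as follows. This is a minimal Python illustration, not the code from the attachment: the function names, the threshold of 128, the column-by-column search for the first black pixel, the 8-connected fill, the nearest-neighbour resize and the bit order are all assumptions made for the example.

```python
from collections import deque

def to_black_white(gray, threshold=128):
    """Step 1: map grayscale rows (0-255) to 1 = black (ink), 0 = white."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def find_first_black(bw):
    """Scan column by column (assumed order) and return (x, y) of the first ink pixel."""
    for x in range(len(bw[0])):
        for y in range(len(bw)):
            if bw[y][x] == 1:
                return x, y
    return None

def flood_fill_bounds(bw, start_x, start_y, mark=2):
    """Step 2: recolor every ink pixel connected to the start pixel with `mark`
    and return the letter's bounding box (min_x, min_y, max_x, max_y)."""
    height, width = len(bw), len(bw[0])
    min_x = max_x = start_x
    min_y = max_y = start_y
    queue = deque([(start_x, start_y)])
    bw[start_y][start_x] = mark
    while queue:
        x, y = queue.popleft()
        min_x, max_x = min(min_x, x), max(max_x, x)
        min_y, max_y = min(min_y, y), max(max_y, y)
        for dx in (-1, 0, 1):          # 8-connected neighbours (assumption)
            for dy in (-1, 0, 1):
                nx, ny = x + dx, y + dy
                if 0 <= nx < width and 0 <= ny < height and bw[ny][nx] == 1:
                    bw[ny][nx] = mark
                    queue.append((nx, ny))
    return min_x, min_y, max_x, max_y

def to_8x8(bw, bounds, mark=2):
    """Step 3: nearest-neighbour resize of the marked letter into 8x8 cells."""
    min_x, min_y, max_x, max_y = bounds
    w, h = max_x - min_x + 1, max_y - min_y + 1
    return [[1 if bw[min_y + gy * h // 8][min_x + gx * w // 8] == mark else 0
             for gx in range(8)] for gy in range(8)]

def pack_int64(grid):
    """Step 4: pack the 64 cells into one integer, first cell in the top bit."""
    key = 0
    for row in grid:
        for cell in row:
            key = (key << 1) | cell
    return key
```

Calling `to_black_white`, `find_first_black`, `flood_fill_bounds`, `to_8x8` and `pack_int64` in that order yields one 64-bit key per letter. Because filled pixels are recolored, repeating the search finds the next letter.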

This example also implements communication with the user: if a symbol cannot be found in the database, a dialog displays the unknown symbol and asks for its character.
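
The lookup-or-ask behaviour could look something like the sketch below. It is only an illustration in Python: `recognize`, `render_glyph` and the `ask_user` callback (standing in for the dialog) are hypothetical names, the database is assumed to be a plain dictionary keyed by the 64-bit value, and storing the user's answer for later lookups is also an assumption.

```python
def render_glyph(key):
    """Draw a 64-bit glyph key as 8 text rows so the user can see the symbol.
    Assumes the top-left cell is stored in the highest bit."""
    rows = []
    for r in range(8):
        bits = (key >> (8 * (7 - r))) & 0xFF
        rows.append("".join("#" if bits & (0x80 >> c) else "." for c in range(8)))
    return "\n".join(rows)

def recognize(key, database, ask_user):
    """Return the character stored for `key`.
    If the key is unknown, ask the user and remember the answer (assumption)."""
    char = database.get(key)
    if char is None:
        char = ask_user(key)       # stand-in for the dialog
        if char:
            database[key] = char   # learn the symbol for next time
    return char
```

A console stand-in for the dialog could be `lambda k: input(render_glyph(k) + "\nCharacter? ")`.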

Edit: fixed bugs in the Unicode version.

Title: Re: Optical character recognition
Post by: pistol350 on July 06, 2007, 04:14:19 AM

This is a great project!
Thank you for sharing your work! ;D

Title: Re: Optical character recognition
Post by: sapero on July 06, 2007, 05:45:10 AM

Thanks. I've updated the attachment; I forgot to finish converting the strings from the previous Unicode-only version.
Note: this version is very slow because it uses the slowest GDI functions, GetPixel/SetPixel. Use it only for very small images!

Title: Re: Optical character recognition
Post by: pistol350 on July 06, 2007, 06:02:13 AM

Quote
Note, this version is very slow, it uses the slowest gdi function Get/SetPixel, use it only for very small images!

Yep!
I noticed that while trying to use other images.

Title: Re: Optical character recognition
Post by: sapero on July 09, 2007, 08:51:35 AM

The second version is much, much faster: thanks to CreateDIBSection + GetDIBits, the input bitmap bits are accessible in memory.
Added the FindNextLine method, which returns the rectangle of the next row of text.
Added the GetLineTextFromRectA method, which returns the text from a rectangle, usually a text-row rectangle.
Added GetLineTextA, which finds the next line rectangle and returns its text (it calls the two methods above).
GetTextA returns the whole text, scanning the bitmap row by row and appending \n to every line.
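
How these methods fit together might be sketched like this in Python. The post does not say how FindNextLine locates a row, so the approach here (skip blank pixel rows, then collect consecutive rows that contain ink) is an assumption, and `find_next_line`, `get_text` and the `read_line` callback are hypothetical stand-ins for FindNextLine, GetTextA and GetLineTextFromRectA.

```python
def find_next_line(bw, start_y=0):
    """Return (left, top, right, bottom) of the next text row at or below
    start_y, or None if no ink is left. `bw` holds 1 for ink, 0 for white."""
    height = len(bw)
    y = start_y
    while y < height and not any(bw[y]):   # skip blank pixel rows
        y += 1
    if y >= height:
        return None
    top = y
    while y < height and any(bw[y]):       # consume rows that contain ink
        y += 1
    bottom = y - 1
    cols = [x for x in range(len(bw[0]))
            if any(bw[r][x] for r in range(top, bottom + 1))]
    return cols[0], top, cols[-1], bottom

def get_text(bw, read_line):
    """Scan the bitmap row by row, appending a newline after every line.
    `read_line(bw, rect)` recognizes the text inside one row rectangle."""
    text = ""
    y = 0
    while True:
        rect = find_next_line(bw, y)
        if rect is None:
            return text
        text += read_line(bw, rect) + "\n"
        y = rect[3] + 1                     # continue below this row
```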

todo: spaces !

IonicWind Software

Aurora Compiler => Software Projects => Topic started by: sapero on July 05, 2007, 04:27:04 PM