Optical character recognition

Started by sapero, July 05, 2007, 04:27:04 PM


sapero

July 05, 2007, 04:27:04 PM Last Edit: July 06, 2007, 05:35:35 AM by sapero
Optical character recognition, usually abbreviated to OCR, is a type of computer software designed to translate images of handwritten or typewritten text (usually captured by a scanner) into machine-editable text, or to translate pictures of characters into a standard encoding scheme representing them (e.g. ASCII or Unicode).

I'm writing bot software for automated banner clicking and also need an OCR module. Why not write my own? :)
This project supports only images with a single line of text (the scanner is that simple); overlapping letters are not supported.
The algorithm is very simple:
1. Convert the image colors to black and white.
2. Find the first black pixel, then flood-fill all connected pixels with another color (such as red), tracking the min/max positions (the letter's size).
3. Resize the image (the letter), writing its pixels to an 8x8 array of bytes.
4. Convert the 8x8 byte array to an int64 (one byte becomes one bit) and look up the result in the database.
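
The four steps above can be sketched roughly as follows. This is a minimal Python illustration, not the code from the attachment: the function names, the threshold of 128, the 8-way connectivity, the nearest-neighbour resampling, and the bit order are all assumptions made for the example.

```python
from collections import deque

def to_bw(gray, threshold=128):
    """Step 1: map grayscale rows (0..255) to 1 = black (ink), 0 = white."""
    return [[1 if p < threshold else 0 for p in row] for row in gray]

def flood_fill_bbox(bw, x0, y0, mark=2):
    """Step 2: recolor every ink pixel connected to (x0, y0) with `mark`
    and return the letter's bounding box (min_x, min_y, max_x, max_y)."""
    h, w = len(bw), len(bw[0])
    min_x = max_x = x0
    min_y = max_y = y0
    bw[y0][x0] = mark
    queue = deque([(x0, y0)])
    while queue:
        x, y = queue.popleft()
        min_x, max_x = min(min_x, x), max(max_x, x)
        min_y, max_y = min(min_y, y), max(max_y, y)
        # Assumption: diagonal neighbours count as connected (8-way fill).
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                nx, ny = x + dx, y + dy
                if 0 <= nx < w and 0 <= ny < h and bw[ny][nx] == 1:
                    bw[ny][nx] = mark
                    queue.append((nx, ny))
    return min_x, min_y, max_x, max_y

def to_8x8(bw, bbox, mark=2):
    """Step 3: nearest-neighbour resample of the marked letter to an 8x8 grid."""
    min_x, min_y, max_x, max_y = bbox
    lw, lh = max_x - min_x + 1, max_y - min_y + 1
    grid = []
    for gy in range(8):
        sy = min_y + gy * lh // 8
        grid.append([1 if bw[sy][min_x + gx * lw // 8] == mark else 0
                     for gx in range(8)])
    return grid

def pack64(grid):
    """Step 4: one bit per cell, row-major, first cell in the most significant bit."""
    key = 0
    for row in grid:
        for bit in row:
            key = (key << 1) | bit
    return key
```

For example, `pack64(to_8x8(bw, flood_fill_bbox(bw, x, y)))` gives the 64-bit key for the letter containing the black pixel at `(x, y)`; that key is what would be looked up in the database.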

This example also implements communication with the user: if a symbol cannot be found in the database, a dialog displays the unknown symbol and asks for its character.
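
A rough Python sketch of that lookup-with-fallback idea, assuming the database is a plain dictionary from 64-bit keys to characters. `recognize`, `render`, and the `ask_user` callback are hypothetical names standing in for the dialog, and storing the answer back into the database is an assumption about how the example behaves.

```python
def render(key):
    """Draw a 64-bit key as an 8x8 text pattern (most significant bit first),
    so the unknown symbol can be shown to the user."""
    rows = []
    for r in range(8):
        byte = (key >> (8 * (7 - r))) & 0xFF
        rows.append("".join("#" if byte & (0x80 >> c) else "." for c in range(8)))
    return "\n".join(rows)

def recognize(key, database, ask_user):
    """Return the character stored for `key`; ask the user when it is unknown."""
    char = database.get(key)
    if char is None:
        char = ask_user(key)   # hypothetical callback standing in for the dialog
        database[key] = char   # assumption: the answer is remembered for next time
    return char
```

A console stand-in for the dialog could be `lambda k: input(render(k) + "\nCharacter? ")`.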

Edit: fixed bugs in the Unicode version.

pistol350

This is a great project!
Thank you for sharing your work! ;D
Regards,

Peter B.

sapero

Thanks. I've updated the attachment; I forgot to finish converting the strings from the previous Unicode-only version.
Note: this version is very slow because it uses the slowest GDI functions, GetPixel/SetPixel. Use it only for very small images!

pistol350

Quote
Note: this version is very slow because it uses the slowest GDI functions, GetPixel/SetPixel. Use it only for very small images!

Yep!
I noticed that while trying to use other images.
Regards,

Peter B.

sapero

The second version is much, much faster: thanks to CreateDIBSection + GetDIBits, the input bitmap bits are accessible in memory.
Added the FindNextLine method, which returns the rectangle of the next row of text.
Added the GetLineTextFromRectA method, which returns the text from a rectangle, usually a text row rectangle.
Added GetLineTextA, which finds the next line rectangle and returns its text (it calls the two methods above).
GetTextA returns the whole text, scanning the bitmap row by row and appending \n to every line.
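
To illustrate why in-memory bits are faster: once the pixels sit in one buffer, reading a pixel is just index arithmetic instead of a GDI call per pixel. The Python sketch below assumes a 24-bit, bottom-up DIB (the post does not state the bit depth or orientation); the helper names are hypothetical. DIB scan lines are padded to a multiple of 4 bytes, and 24-bit pixels are stored in blue, green, red order.

```python
def dib_stride(width, bits_per_pixel):
    """Bytes per scan line: DIB rows are padded to a 4-byte boundary."""
    return ((width * bits_per_pixel + 31) // 32) * 4

def pixel_24bpp(bits, width, height, x, y):
    """Return (r, g, b) at (x, y), with y counted from the top,
    from a bottom-up 24-bit buffer (assumed layout)."""
    stride = dib_stride(width, 24)
    offset = (height - 1 - y) * stride + x * 3  # bottom-up: last row is stored first
    b, g, r = bits[offset], bits[offset + 1], bits[offset + 2]
    return r, g, b
```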
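
One possible way to find text rows is to look for runs of pixel rows that contain ink. The Python sketch below is only an illustration of that idea, not the attached implementation: `find_next_line` and `get_text` are hypothetical counterparts of FindNextLine and GetTextA, the image is assumed to be a list of rows with 1 for black, and `read_line` stands in for a per-line recognizer.

```python
def find_next_line(bw, start_y=0):
    """Return (left, top, right, bottom) of the next text row at or below
    start_y, or None when no ink is left."""
    h, w = len(bw), len(bw[0])
    y = start_y
    while y < h and not any(bw[y]):   # skip blank pixel rows
        y += 1
    if y == h:
        return None
    top = y
    while y < h and any(bw[y]):       # extend while rows still contain ink
        y += 1
    bottom = y - 1
    cols = [x for x in range(w)
            if any(bw[r][x] for r in range(top, bottom + 1))]
    return cols[0], top, cols[-1], bottom

def get_text(bw, read_line):
    """Scan the bitmap row by row, appending "\n" to every recognized line.
    read_line(bw, rect) is a hypothetical per-line recognizer."""
    out, y = [], 0
    while (rect := find_next_line(bw, y)) is not None:
        out.append(read_line(bw, rect) + "\n")
        y = rect[3] + 1               # continue below this line
    return "".join(out)
```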

Todo: spaces!