Optical Character Recognition (OCR) - gocr [CLI]


8 replies to this topic
daonlyfreez
  • Members
  • 995 posts
  • Last active: Jan 23 2013 08:16 AM
  • Joined: 16 Mar 2005
gocr

Command line utility that reads pnm/pgm/pbm/ppm/pcx files (a grayscale version of your picture to scan) and outputs found text in: ISO8859_1/TeX/HTML/XML/UTF8/ASCII...

Can recognize barcodes!

Optical Character Recognition --- gocr 0.40
using: gocr [options] pnm_file_name # use - for stdin
options:
-h - get this help
-i name - input image file (pnm,pgm,pbm,ppm,pcx,...)
-i - - read PNM from stdin (djpeg -gray a.jpg | gocr -)
-o name - output file (redirection of stdout)
-e name - logging file (redirection of stderr)
-x name - progress output (file or fifo)
-p name - database path (including final slash, default is ./db/)
-f fmt - output format (ISO8859_1 TeX HTML XML UTF8 ASCII)
-l num - threshold grey level 0<160<=255 (0 = autodetect)
-d num - dust_size (remove smaller clusters, -1 = autodetect)
-s num - spacewidth/dots (0 = autodetect)
-v num - verbose [summed]
1 print more info
2 list shapes of boxes (see -c)
4 list pattern of boxes (see -c)
8 print pattern after recognition
16 print line infos
32 debug outXX.pgm
-c string - list of chars (_ = not recognized chars, debug)
-C string - char filter (ex. hexdigits: 0-9A-Fx, only ASCII)
-m num - operation modes, ~ = switch off
2 use database (early development)
4 layout analysis, zoning (development)
8 ~ compare non recognized chars
16 ~ divide overlapping chars
32 ~ context correction
64 char packing (development)
130 extend database, prompts user (128+2, early development)
256 switch off the OCR engine (makes sense together with -m 2)
-n 1 only numbers
examples:
gocr -v 33 text1.pbm # some infos + out30.bmp
gocr -v 7 -c _YV text1.pbm # list unknown, Y and V chars
djpeg -pnm -gray text.jpg | gocr - # use jpeg-file via pipe
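The `-v` values in the help text are "summed", so `-v 33` in the first example is 32 + 1. A minimal Python sketch of how such a value breaks down into its individual flags is shown here; the table is copied from the help output above, while the names `VERBOSE_FLAGS` and `decode_verbose` are made up for illustration and are not part of gocr.

```python
# Flag table copied from the "-v num - verbose [summed]" section of the help text.
VERBOSE_FLAGS = {
    1: "print more info",
    2: "list shapes of boxes (see -c)",
    4: "list pattern of boxes (see -c)",
    8: "print pattern after recognition",
    16: "print line infos",
    32: "debug outXX.pgm",
}


def decode_verbose(value):
    """Return the descriptions of all flags that are set in a summed -v value."""
    return [label for bit, label in VERBOSE_FLAGS.items() if value & bit]


if __name__ == "__main__":
    for value in (33, 7):
        print(value, "=", decode_verbose(value))
```

Going the other way, the value to pass is simply the sum of the wanted flags, for example `1 + 16` for more info plus line infos.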


If you need it, you can download djpeg.exe here (direct download)...
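The `djpeg ... | gocr -` pipe from the examples can also be driven from a script. The following is a rough Python sketch, assuming `djpeg` and `gocr` are both on the PATH; it only uses options listed in the help text above (`-i`, `-o`, `-f`, `-C`, `-l`). The helper names `build_gocr_command` and `ocr_jpeg` are hypothetical, and the default output encoding of gocr is not stated in the thread, so the sketch requests `UTF8` explicitly.

```python
import subprocess


def build_gocr_command(image="-", output=None, fmt=None, char_filter=None, grey_level=None):
    """Build a gocr argument list from options shown in the help text."""
    cmd = ["gocr"]
    if fmt:
        cmd += ["-f", fmt]              # output format, e.g. UTF8 or ASCII
    if char_filter:
        cmd += ["-C", char_filter]      # char filter, e.g. 0-9A-Fx
    if grey_level is not None:
        cmd += ["-l", str(grey_level)]  # threshold grey level, 0 = autodetect
    if output:
        cmd += ["-o", output]           # write the text to a file instead of stdout
    cmd += ["-i", image]                # "-" means: read PNM from stdin
    return cmd


def ocr_jpeg(jpeg_path):
    """Equivalent of: djpeg -pnm -gray text.jpg | gocr -f UTF8 -i -"""
    djpeg = subprocess.Popen(
        ["djpeg", "-pnm", "-gray", jpeg_path], stdout=subprocess.PIPE
    )
    result = subprocess.run(
        build_gocr_command("-", fmt="UTF8"),
        stdin=djpeg.stdout,
        capture_output=True,
        check=True,
    )
    djpeg.stdout.close()
    djpeg.wait()
    return result.stdout.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Only prints the command line; call ocr_jpeg("text.jpg") to actually run it.
    print(build_gocr_command("-", fmt="UTF8", char_filter="0-9A-Fx"))
```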

8)

Chris
  • Administrators
  • 10727 posts
  • Last active:
  • Joined: 02 Mar 2004
It's good to know there's something like this out there. Now there's a chance to extract text from even the most stubborn controls, and perhaps even from some games if they use a recognizable font.

daonlyfreez
  • Members
  • 995 posts
  • Last active: Jan 23 2013 08:16 AM
  • Joined: 16 Mar 2005
This is the result of scanning a screenshot of the AutoHotkey homepage.

Far from perfect, but useable...

Notice the attempt to read 'AutoHotkey' in the top-left image (aA'uioHoiR'eý), and the quite good result on 'Automation. Hotkeys. Scripting.' in the same image (Automation, Hotkeys. ScrIpting.)

. , '',, Download Quick-start Tutorial
aA'uioHoiR'eý'''_
_ _ _ DOCUmentat i On SUPPOrt
. , , , . . . __.,_ all wordsQ,_ exak phrase
' ' ''' '' Change I og Forum I w,_, SearCh
Automation, Hotkeys. ScrIpting.
r_lews
The latest versIon Is l ,O ,35 , 12 (released June 18, 2005) __ changelog I download
Wolfgang Res_el wrote an akIcle for c't Maga_Ine to Introduce and demonstrate the benefIts of AutoHotkey , Read the EnglIsh translatIon ,
Introduction
AutoHotkey Is a free, open-source utIIIty for WIndows , WIth It, you can _,
_ Automate almost anythIng by sendIng keystrokes and mouse clIcks , You can wrIte a mouse or keyboard macro by hand or use the
macro recorder,
_ Create hotkeys for keyboard, )oystIck, and mouse , VIrtually any key, button, or combInatIon can become a hotkey ,
_ Expand abbrevIatIons as you type them , For example, typIng ''btw'' can automatIcally produce ''by the way'' ,
_ Create custom data entry forms, user Interfaces, and menu bars , See GUI for detaIIs ,
_ Remap keys and buttons on your keyboard, )oystIck, and mouse ,
_ Run exIstIng AutoIt v2 scrIpts and enhance them wIth new capabIIItIes ,
_ Convert any scrIpt Into an EXE fle that can be run on computers that don't have AutoHotkey Installed ,
GettIng started mIght be easIer than you thInk , Check out the quIck-start tutorIal ,
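To put a rough number on "far from perfect, but useable", one option is to compare an OCR'd string with the text it should have been. This is only a sketch using Python's standard `difflib`, not anything gocr provides; the two string pairs are the logo and slogan quoted in the post above, and `similarity` is a made-up helper name.

```python
from difflib import SequenceMatcher


def similarity(ocr_text, expected):
    """Return a 0..1 similarity ratio between OCR output and the expected text."""
    return SequenceMatcher(None, ocr_text, expected).ratio()


# The two samples called out in the post: the logo and the slogan.
logo = similarity("aA'uioHoiR'eý", "AutoHotkey")
slogan = similarity("Automation, Hotkeys. ScrIpting.", "Automation. Hotkeys. Scripting.")

if __name__ == "__main__":
    print(f"logo:   {logo:.2f}")
    print(f"slogan: {slogan:.2f}")
```

The comparison is case-sensitive, so the frequent `I` for `i` substitutions in the output above count as errors.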



Rajat
  • Members
  • 1904 posts
  • Last active: Jul 17 2015 07:45 AM
  • Joined: 28 Mar 2004
sounds great!!

MIA



feejo
  • Members
  • 280 posts
  • Last active: May 29 2009 06:39 PM
  • Joined: 16 Jun 2007
How does it work with AHK?

engunneer
  • Moderators
  • 9162 posts
  • Last active: Sep 12 2014 10:36 PM
  • Joined: 30 Aug 2005
As with most CLI utilities in this section of the forum, AHK can call it via the Run command. There are a few example scripts for gocr in the forum.

Edit: typo

IvanZT
  • Members
  • 5 posts
  • Last active: Sep 13 2007 09:47 AM
  • Joined: 12 Jul 2007

As with most CLI utilities in this section of the forum, AHK can call it via the Run command. There are a few example scripts for gocr in the forum.

Well, I've searched and didn't find the code examples. :(
Could you give a link, pls?

daonlyfreez
  • Members
  • 995 posts
  • Last active: Jan 23 2013 08:16 AM
  • Joined: 16 Mar 2005
Search for "gocr"...

Search found 31 matches



tachyio
  • Members
  • 1 posts
  • Last active: Jul 08 2011 04:03 PM
  • Joined: 08 Jul 2011
I was thinking about our human eyes and how they function. Actually our eyes aren't that great, resolution is limited, lighting may be bad...but somehow our brain's "software" is able to compensate for these defects using context.

Maybe we could improve OCR by providing higher-resolution pictures of the text; even computer screens have very low resolutions. So a higher-res screen might perform better by providing more pixel data? iPhone 4?

Or a contextually aware algorithm, so for instance the word "whItE" in a sentence would be auto corrected to "white". :)
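As a toy illustration of that last idea, here is a minimal Python sketch of dictionary-based correction. It is not how gocr's own context correction (`-m 32`) works, which the thread does not describe. It assumes the main confusion is a capital `I` standing in for `i` or `l`, as seen in the sample output earlier in the thread; the word list and the names `candidates` and `correct_word` are invented for the example.

```python
from itertools import product

# Tiny stand-in vocabulary; a real one would be a full word list.
VOCABULARY = {"white", "utility", "with", "is", "scripting", "details"}


def candidates(word):
    """Yield lowercase spellings where each capital I may be an i or an l."""
    options = [("i", "l") if ch == "I" else (ch.lower(),) for ch in word]
    return ("".join(chars) for chars in product(*options))


def correct_word(word, vocabulary=VOCABULARY):
    """Return the first candidate found in the vocabulary, else the word unchanged."""
    for guess in candidates(word):
        if guess in vocabulary:
            # Keep a leading capital if the OCR'd word started with one.
            return guess.capitalize() if word[0].isupper() else guess
    return word


if __name__ == "__main__":
    for sample in ("whItE", "utIIIty", "WIth", "gocr"):
        print(sample, "->", correct_word(sample))
```

Words that have no match in the vocabulary are returned untouched, so names like `gocr` pass through unchanged. The number of candidates doubles with every `I` in a word, which is fine for ordinary words but would need a smarter search for long strings.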