Using HttpRequest.ahk with an online OCR API

Ciceroids
Posts: 11
Joined: 16 Apr 2016, 11:57

23 Nov 2016, 13:57

Can anyone help me configure HttpRequest to work with a great online OCR service?

I am using version 2.49 of HttpRequest (available from https://dl.dropboxus...httpRequest.ahk) and have so far failed to get it working with the free online OCR service at https://ocr.space/OCRAPI. My use case is uploading an image file for text recognition and receiving the recognized text back in my script. No matter what parameters I pass to the OCR API through HttpRequest, the API usually returns an error message indicating that no file or URL source has been specified; occasionally I get internal errors generated by HttpRequest itself instead.

There is some example code of how to use the service’s API on their site but no example using AutoHotkey. I did test my source image file using the curl command-line utility and this worked without a hitch. Of the five parameters shown in their curl example, only three are mandatory: the service’s URL, the filename of the source image and the user’s API key.

The service’s target URL (https://api.ocr.space/Parse/Image) becomes the first parameter in the call to HttpRequest and the API key is one of the posted headers (apikey: helloworld). The only problem therefore relates to the source image file and what to put in the second, third and fourth parameters in HttpRequest.

The introductory description of the parameters at https://autohotkey.com/board/topic/6798 ... nicodex64/ gives ‘Upload: C:\My Awesome Pic.png’ as an example of what to put in the Options parameter if you want to upload a file. Since the second (Data) parameter normally specifies what is to be posted, it is unclear what should go in the Data parameter when the Upload option is used. Can it be left blank, or do you specify the source image’s file path again and, if so, do you prefix it with a key name such as ‘File:’?

The curl utility evidently does more background housekeeping than HttpRequest, for example specifying the length and type of the data being posted. In my case I included ‘Content_Length: fileSize`nContent_Type: application/octet-stream’ in the Headers parameter. The Content_Type value is the default that HttpRequest would have supplied, but I included it anyway.

The four params are thus:

URL := "https://api.ocr.space/Parse/Image"
Data := filePath
Headers := "Content_Length: " . fileSize . "`nContent_Type: application/octet-stream`napikey: helloworld"
Options := "Method: POST`nUpload: " . filePath

Using these params, HttpRequest generates an error, saying it can’t get the file size using GetFileSizeEx. OK, I comment out reference to that function in HttpRequest.ahk since I am supplying the correct fileSize param anyway. This generates another HttpRequest error, saying that the ReadFile function is failing. At this point, the full path of the source image file is verified and sanity is preserved.

Perhaps the repetition of the file path in the Data param is causing confusion. I set Data := "" and this time the OCR API returns an error saying that no source file or URL has been specified. The purpose of HttpRequest is to provide all the coding required to communicate with a Web-based API. All you have to do is provide the right parameters. Should be easy…

At this point, I start to realise that the Data, Headers and Options parameters might each be configured in different ways, e.g.

Data: "", "c:\MyAwesomePic.png" or "File: c:\MyAwesomePic.png"
Headers: include or omit Content_Length; include or omit Content_Type
Options: include "Upload: c:\MyAwesomePic.png" or exclude it.

I tried all possible combinations of these values. Same error messages. It then struck me that despite the reference to Content_Type being application/octet-stream, I might have to do something extra to the file in order to get it picked up by the API, for example, should the reference to the file in the Data parameter be a reference to a file object representing the file and not simply the file path? I looked at some Python code which successfully implements the API and noticed that a file object was indeed being sent as the data for posting. So, I created an object for the source file using FileOpen and placed the object in the Data parameter. This did not work.

I did, of course, contact the service’s support team who declined to help on the basis that no-one there had any experience with AutoHotkey. Can any of you good people help?
tmplinshi
Posts: 1604
Joined: 01 Oct 2013, 14:57

24 Nov 2016, 02:00

I tried with CreateFormData, but it didn't work and I don't know why.

Code: Select all

oForm := { apikey: "f53342743188957"
         , language: "en"
         , isOverlayRequired: "true"
         , file: ["screenshot.jpg"] }

CreateFormData_WinInet(data, contentType, oForm)

url := "https://api.ocr.space/Parse/Image"
hdr := "Content-Type: " . contentType
HttpRequest(url, data, hdr)
MsgBox, % data
I've been using CreateFormData with many sites without any problem, for example to upload images to http://uploads.im.
tmplinshi
Posts: 1604
Joined: 01 Oct 2013, 14:57

24 Nov 2016, 02:39

Solved.
language: "en"
"en" is not a valid value; it should be "eng".
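
For context, CreateFormData sends the fields as a multipart/form-data body rather than as a raw file. The following is only a minimal sketch of how the plain text fields of such a body are laid out. BuildTextParts and the boundary string are hypothetical names made up for illustration, and file parts are left out because they need binary handling.

```autohotkey
; Hypothetical helper for illustration only: builds the text parts of a
; multipart/form-data body. File parts need binary handling and are omitted.
BuildTextParts(fields, boundary) {
    CRLF := "`r`n", body := ""
    for name, value in fields
        body .= "--" . boundary . CRLF
              . "Content-Disposition: form-data; name=""" . name . """" . CRLF . CRLF
              . value . CRLF
    ; The closing delimiter is the boundary followed by two dashes.
    return body . "--" . boundary . "--" . CRLF
}

fields := { language: "eng", isOverlayRequired: "true" }
MsgBox, % BuildTextParts(fields, "example-boundary-123")
```

The Content-Type header of the request then has to name the same boundary, e.g. multipart/form-data; boundary=example-boundary-123.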
haichen
Posts: 631
Joined: 09 Feb 2014, 08:24

24 Nov 2016, 04:23

I took the code from suresh https://autohotkey.com/boards/viewtopic ... 687#p85687 because BinArr() doesn't work for me. Maybe this is helpful for others.
Perhaps you can add BinArr() directly to your CreateFormData, or a link to it.

"eng" ... ha ha.
I wouldn't have found that.

Code: Select all

oForm := { apikey: "fXXXXXXXXX7"
         , language: "eng"
         , cache: "false"
         , contentType: "false"
         , processData: "false"
         , type: "json"
         , file: ["Unbenannt.png"] }

CreateFormData(PostData, ContentType, oForm)
url := "https://api.ocr.space/Parse/Image"
whr := ComObjCreate("WinHttp.WinHttpRequest.5.1")
whr.Open("POST", url, true)
whr.SetRequestHeader("Content-Type", ContentType)
whr.SetRequestHeader("Referer", "https://api.ocr.space/")
whr.SetRequestHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko")
whr.Option(6) := False ; No auto redirect
whr.Send(PostData)
whr.WaitForResponse()
json_str:= whr.ResponseText
MsgBox, % json_str

return



; CreateFormData() by tmplinshi, AHK Topic: https://autohotkey.com/boards/viewtopic.php?t=7647
; Thanks to Coco: https://autohotkey.com/boards/viewtopic.php?p=41731#p41731
; Modified version by SKAN, 09/May/2016
;Version from Suresh https://autohotkey.com/boards/viewtopic.php?p=85687#p85687


CreateFormData(ByRef retData, ByRef retHeader, objParam) {
	New CreateFormData(retData, retHeader, objParam)
}

Class CreateFormData {

	__New(ByRef retData, ByRef retHeader, objParam) {

		Local CRLF := "`r`n", i, k, v, str, pvData
		; Create a random Boundary
		Local Boundary := this.RandomBoundary()
		Local BoundaryLine := "------------------------------" . Boundary

    this.Len := 0 ; GMEM_ZEROINIT|GMEM_FIXED = 0x40
    this.Ptr := DllCall( "GlobalAlloc", "UInt",0x40, "UInt",1, "Ptr"  )          ; allocate global memory

		; Loop over the input parameters
		For k, v in objParam
		{
			If IsObject(v) {
				For i, FileName in v
				{
					str := BoundaryLine . CRLF
					     . "Content-Disposition: form-data; name=""" . k . """; filename=""" . FileName . """" . CRLF
					     . "Content-Type: " . this.MimeType(FileName) . CRLF . CRLF
          this.StrPutUTF8( str )
          this.LoadFromFile( Filename )
          this.StrPutUTF8( CRLF )
				}
			} Else {
				str := BoundaryLine . CRLF
				     . "Content-Disposition: form-data; name=""" . k """" . CRLF . CRLF
				     . v . CRLF
        this.StrPutUTF8( str )
			}
		}

		this.StrPutUTF8( BoundaryLine . "--" . CRLF )

    ; Create a bytearray and copy data in to it.
    retData := ComObjArray( 0x11, this.Len ) ; Create SAFEARRAY = VT_ARRAY|VT_UI1
    pvData  := NumGet( ComObjValue( retData ) + 8 + A_PtrSize )
    DllCall( "RtlMoveMemory", "Ptr",pvData, "Ptr",this.Ptr, "Ptr",this.Len )

    this.Ptr := DllCall( "GlobalFree", "Ptr",this.Ptr, "Ptr" )                   ; free global memory 

    retHeader := "multipart/form-data; boundary=----------------------------" . Boundary
	}

  StrPutUTF8( str ) {
    Local ReqSz := StrPut( str, "utf-8" ) - 1
    this.Len += ReqSz                                  ; GMEM_ZEROINIT|GMEM_MOVEABLE = 0x42
    this.Ptr := DllCall( "GlobalReAlloc", "Ptr",this.Ptr, "UInt",this.len + 1, "UInt", 0x42 )   
    StrPut( str, this.Ptr + this.len - ReqSz, ReqSz, "utf-8" )
  }
  
  LoadFromFile( Filename ) {
    Local objFile := FileOpen( FileName, "r" )
    this.Len += objFile.Length                     ; GMEM_ZEROINIT|GMEM_MOVEABLE = 0x42 
    this.Ptr := DllCall( "GlobalReAlloc", "Ptr",this.Ptr, "UInt",this.len, "UInt", 0x42 )
    objFile.RawRead( this.Ptr + this.Len - objFile.length, objFile.length )
    objFile.Close()       
  }

	RandomBoundary() {
		str := "0|1|2|3|4|5|6|7|8|9|a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p|q|r|s|t|u|v|w|x|y|z"
		Sort, str, D| Random
		str := StrReplace(str, "|")
		Return SubStr(str, 1, 12)
	}

	MimeType(FileName) {
		n := FileOpen(FileName, "r").ReadUInt()
		Return (n        = 0x474E5089) ? "image/png"
		     : (n        = 0x38464947) ? "image/gif"
		     : (n&0xFFFF = 0x4D42    ) ? "image/bmp"
		     : (n&0xFFFF = 0xD8FF    ) ? "image/jpeg"
		     : (n&0xFFFF = 0x4949    ) ? "image/tiff"
		     : (n&0xFFFF = 0x4D4D    ) ? "image/tiff"
		     : "application/octet-stream"
	}

}


tmplinshi
Posts: 1604
Joined: 01 Oct 2013, 14:57

24 Nov 2016, 05:00

haichen wrote: because BinArr() doesn't work for me.
Hmm, maybe someday I'll replace it with SKAN's version. Thanks for the info.
haichen
Posts: 631
Joined: 09 Feb 2014, 08:24

24 Nov 2016, 05:15

How to get the text:

Code: Select all

;...
;https://github.com/cocobelgica/AutoHotkey-JSON
parsed := JSON.Load(json_str)
MsgBox, % parsed.ParsedResults[1].ParsedText 
return
;...
#include json.ahk
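
Since the "en" problem only showed up as an error in the response, it may help to check for an error before reading the text. This is a rough sketch, assuming the response carries IsErroredOnProcessing and ErrorMessage fields (verify these names against the API documentation) and that JSON.ahk is Coco's library as above. OcrTextFromJson is a hypothetical helper name.

```autohotkey
#Include JSON.ahk  ; Coco's JSON library, the same one used above

; Hypothetical helper: returns the recognized text, or "" with errMsg filled in.
; IsErroredOnProcessing and ErrorMessage are assumed field names.
OcrTextFromJson(json_str, ByRef errMsg := "") {
    parsed := JSON.Load(json_str)
    errMsg := ""
    if (parsed.IsErroredOnProcessing) {
        msg := parsed.ErrorMessage
        if IsObject(msg) {
            ; The message may arrive as an array of strings; join them.
            for i, line in msg
                errMsg .= (i > 1 ? "`n" : "") . line
        } else
            errMsg := msg
        if (errMsg = "")
            errMsg := "Unknown OCR error"
        return ""
    }
    return parsed.ParsedResults[1].ParsedText
}

; Usage, with json_str taken from whr.ResponseText as in the request code:
; text := OcrTextFromJson(json_str, err)
; MsgBox, % (err != "") ? "OCR failed:`n" . err : text
```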
haichen
Posts: 631
Joined: 09 Feb 2014, 08:24

28 Nov 2016, 11:40

I want to send an image URL to the OCR API. This is easy and works well, until I want to use a site where I have to log in. When I'm logged in I can get the URL of the image (right-click), but sending that URL to the OCR API doesn't work. I tried logging in with a WinHttp request, but how do I then send the image to the OCR API?
Ciceroids
Posts: 11
Joined: 16 Apr 2016, 11:57

28 Nov 2016, 13:03

Many thanks for the helpful responses.
tmplinshi
Posts: 1604
Joined: 01 Oct 2013, 14:57

28 Nov 2016, 21:33

haichen wrote: I want to send an image URL to the OCR API. This is easy and works well, until I want to use a site where I have to log in. When I'm logged in I can get the URL of the image (right-click), but sending that URL to the OCR API doesn't work. I tried logging in with a WinHttp request, but how do I then send the image to the OCR API?
If the image cannot be accessed by everyone, then you should download the image first.
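
A minimal sketch of that download-first approach, assuming the site accepts a plain form POST for login (many sites also need tokens or extra headers, which are not handled here). All URLs, form fields and the file name are hypothetical placeholders. It relies on the same WinHttpRequest object keeping its session cookies between requests.

```autohotkey
; Hypothetical values; replace with the real site's login form and image URL.
loginUrl  := "https://example.com/login"
loginBody := "username=me&password=secret"
imageUrl  := "https://example.com/private/image.png"
localFile := A_Temp . "\ocr_input.png"

whr := ComObjCreate("WinHttp.WinHttpRequest.5.1")

; 1. Log in. Reusing this whr object carries the session cookies forward.
whr.Open("POST", loginUrl, false)
whr.SetRequestHeader("Content-Type", "application/x-www-form-urlencoded")
whr.Send(loginBody)

; 2. Fetch the image with the logged-in session.
whr.Open("GET", imageUrl, false)
whr.Send()
if (whr.Status != 200) {
    MsgBox, % "Download failed, HTTP status " . whr.Status
    ExitApp
}

; 3. Save the raw response bytes to a local file.
stream := ComObjCreate("ADODB.Stream")
stream.Type := 1                 ; 1 = adTypeBinary
stream.Open()
stream.Write(whr.ResponseBody)
stream.SaveToFile(localFile, 2)  ; 2 = adSaveCreateOverWrite
stream.Close()

MsgBox, % "Saved to " . localFile
```

The saved file can then be uploaded like any other local image, e.g. by putting localFile in the file: [...] entry of oForm in the earlier post.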