Extracting info from PDF fields and copying into text fields Topic is solved

JKnight_xbt33
Posts: 99
Joined: 18 Sep 2019, 02:06

Extracting info from PDF fields and copying into text fields

05 Feb 2020, 05:17

Hi all,
In the picture below, I am trying to get info from the PDF fields on the right (blacked out) and copy it into the text fields of the window on the left.
[Image: PDF & window with text fields.PNG]
I've looked at a few threads, but they all say to use a PDF converter. I don't want to do that; AutoHotkey alone would be the most efficient.

Is there a way to get info from a PDF with AutoHotkey, e.g. a COM object for PDF files?

Appreciated
J
Chunjee
Posts: 675
Joined: 18 Apr 2014, 19:05
GitHub: Chunjee

Re: Extracting info from PDF fields and copying into text fields

05 Feb 2020, 11:19

I think I use this to convert them to .txt, then find what I want via RegEx:
https://www.pdf2txt.com/
FanaticGuru
Posts: 1451
Joined: 30 Sep 2013, 22:25

Re: Extracting info from PDF fields and copying into text fields

05 Feb 2020, 13:48

JKnight_xbt33 wrote:
05 Feb 2020, 05:17
Is there a way to get info from a PDF with AutoHotkey, e.g. a COM object for PDF files?

If you have the full paid version of Adobe Acrobat, you can do this through COM objects.

Here is some example code I already had that deals with fields:

Code:

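; Both hotkeys rely on the full (paid) Adobe Acrobat.

; F12: list the form's fields through the AFormAut COM object
F12::
	AFormAut := ComObjCreate("AFormAut.App")
	for Field in AFormAut.Fields
	{
		fNum := A_Index - 1	; A_Index starts at 1, so this shows a zero-based number
		fName := Field.Name
		fValue := Field.Value
		MsgBox % "Field Number = " fNum "`nField Name = " fName "`nField Value = " fValue
	}
	; Write a value into the field named "Your Name"
	AFormAut.Fields("Your Name").Value := "Fanatic Guru"
return

; F11: same idea through the JavaScript object of the active document
F11::
	App := ComObjCreate("AcroExch.App")
	AVDoc := App.GetActiveDoc()	; document currently shown in Acrobat
	PDDoc := AVDoc.GetPDDoc()	; its underlying PDF document
	JSO	:= PDDoc.GetJSObject	; bridge to the document's JavaScript API
	Loop % JSO.NumFields
	{
		fNum := A_Index - 1	; GetNthFieldName counts from 0, A_Index from 1
		fName := JSO.GetNthFieldName(fNum)	; name of the field at that position
		fValue := JSO.GetField(fName).Value	; look the field up by name, read its value
		MsgBox % "Field Number = " fNum "`nField Name = " fName "`nField Value = " fValue
	}
	; Write a value into the field named "Your Name"
	JSO.GetField("Your Name").Value := "Fanatic Guru"
return


Esc::ExitApp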
FG
Hotkey Help - Help Dialog for Currently Running AHK Scripts

AHK Startup - Consolidate Multiply AHK Scripts with one Tray Icon

[Function] Timer - Create and Manage Timers
JKnight_xbt33
Posts: 99
Joined: 18 Sep 2019, 02:06

Re: Extracting info from PDF fields and copying into text fields

06 Feb 2020, 07:12

Thanks for your answer @FanaticGuru

Alas I have the free version.

1. Do you know how to call the free version as a COM object?

I've looked at a few threads on using COM objects with PDFs, but they all use the pro version ("AcroExch.App").

2. Once I know how to call the free version, I'd like to adapt your F11 code. However, I don't understand what the lines below do. Would you be able to break them down?

Code:

fNum := A_Index - 1
fName := JSO.GetNthFieldName(fNum)
fValue := JSO.GetField(fName).Value

------------------------------------------------------------------------------------------------------------------------------------------------------------


@Chunjee thanks. I couldn't download your tool because of an admin block at work, but an online PDF-to-text converter worked fine.

However, it converts everything into one dense block of text, with most of the fields I want followed by a colon, e.g. Case ID:

Is there a way to extract specific pieces of text? For example, in the sample text below, how would I extract the info in red?

Case ID:2000000-000000000Serial Number Unit: xxxx-dddd-2232222Describe details of
reported issue Safe track ref.xxxxbb22The following PTC information was reported:
The unit flashed and stops working. BU SN xxxx-dddd-2232222


Thanks as always for your help
Appreciated
J
Chunjee
Posts: 675
Joined: 18 Apr 2014, 19:05
GitHub: Chunjee

Re: Extracting info from PDF fields and copying into text fields

06 Feb 2020, 10:26

Yes, you can use AHK's RegEx to capture that red text. Using the pattern Case ID\:([\d-]*) I was able to separate out that first number: https://regex101.com/r/5rujKe/1/
You would need one long pattern with several capture groups to get all three, or you can use three focused patterns to grab each one individually. You can read more about how to use RegEx in AHK here: https://www.autohotkey.com/docs/commands/RegExMatch.htm
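
A minimal sketch of the "focused patterns" approach in AutoHotkey v1 syntax, run against the first line of the sample text posted earlier. The variable names and the second (serial number) pattern are assumptions for illustration; the second pattern relies on the serial being immediately followed by the word "Describe", as it is in the sample.

```autohotkey
; First line of the converter output quoted earlier in the thread.
text := "Case ID:2000000-000000000Serial Number Unit: xxxx-dddd-2232222Describe details of"

; Pattern from the post above. In the default output mode, caseMatch holds
; the whole match and caseMatch1 holds the first capture group.
if (RegExMatch(text, "Case ID\:([\d-]*)", caseMatch))
    MsgBox % "Case ID = " caseMatch1

; Assumed pattern for the serial number: capture lazily up to "Describe".
if (RegExMatch(text, "Serial Number Unit:\s*([\w-]+?)Describe", serialMatch))
    MsgBox % "Serial = " serialMatch1
```

With the sample text, the first capture should be 2000000-000000000 and the second xxxx-dddd-2232222. Real converter output may place different text after each field, so each pattern needs to be checked against an actual file.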


FanaticGuru
Posts: 1451
Joined: 30 Sep 2013, 22:25

Re: Extracting info from PDF fields and copying into text fields

06 Feb 2020, 14:19

JKnight_xbt33 wrote:
06 Feb 2020, 07:12
Alas I have the free version.

1. Do you know how to call the free version as a COM object?

The free version does not have much in the way of a COM object interface. There is some very limited functionality that mostly has to do with navigating a PDF.

There might be a possibility of getting information about fields through the IAccessible interface, but it is quite the rabbit hole, and one I have never needed to go down as I have the full version of Adobe Acrobat.

FG
JKnight_xbt33
Posts: 99
Joined: 18 Sep 2019, 02:06

Re: Extracting info from PDF fields and copying into text fields

07 Feb 2020, 06:20

Fair enough, I've decided to think outside the box for my solution.

1. I could somehow run a file-reading loop (Loop, Read) on the PDF and store each field in a pseudo-array, then send the array elements to the edit control fields of the programme of interest.
So far my code for this isn't working:

Code:

^!w::
Loop, Read, C:\Users\JK\Documents\8-IT\BL_xxxxxxx_Txx_200000xxxx_update.pdf 
While % IDArr%A_index%
Send % "field" . IDArr%A_index% "`n"

2. Manually simulate the task: use the Tab key to move between the fields, then copy and paste the content into the edit control fields of my programme of interest.
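
For the first idea, a minimal sketch of what the loop could look like in AutoHotkey v1, assuming the PDF has first been converted to plain text (a raw PDF is often stored compressed, so reading it line by line usually does not give readable field values). The file name converted.txt and the variable names are hypothetical.

```autohotkey
; Hypothetical path to a text file produced by a PDF-to-text converter.
txtFile := A_ScriptDir "\converted.txt"

lines := []                       ; a real array instead of IDArr1, IDArr2, ...
Loop, Read, %txtFile%
    lines.Push(A_LoopReadLine)    ; one array entry per line of the file

; Show what was read; Send could replace MsgBox once the target field has focus.
for index, line in lines
    MsgBox % "Line " index ": " line
```

Using an object array keeps the count and the values together, so no While test on a pseudo-array variable is needed.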
JKnight_xbt33
Posts: 99
Joined: 18 Sep 2019, 02:06

Re: Extracting info from PDF fields and copying into text fields  Topic is solved

17 Feb 2020, 04:16

So I've managed to bootleg a manual version of this. Even though it's a simulation of what I would do by hand to get the info from the PDF file, it's still pretty damn fast.

All I had to do was grab each field in the PDF file, assign it to a variable, then send each variable to the other window of interest with SendRaw %

Code:

^#3::
WinMinimizeAll
Sleep, 800

SetTitleMatchMode 2


winTitle := "Compl - Internet Explorer"
hWnd1 := WinExist(winTitle) ; Get open PTC hWnd

winTitle2 := "<Assigned On Save> -  - Occurrence Details - Q-Pulse"
hWnd2 := WinExist(winTitle2) ; Get open complaint hWnd
ptrS := A_PtrSize ? "Ptr" : "UInt" ; pointer type passed to DllCall

if (hWnd1) {
	DllCall("ShowWindow", ptrS, hWnd1, "Int", 3) ; Maximize the PTC window
	DllCall("ShowWindow", ptrS, hWnd2, "Int", 6) ; Minimize the complaint window
}

		   
Sleep, 1000
Click, 679, 942 ;get complaint id
sleep, 100
Send ^{a}
sleep, 100
Send ^{c}
sleep, 100
caseID := clipboard 

sleep, 10
send {TAB 2} ;get country
sleep, 100
Send ^{a}
sleep, 100
Send ^{c}
sleep, 100
country := clipboard

sleep, 10
send {TAB 9} ;get complaint details
sleep, 100
Send ^{a}
sleep, 100
Send ^{c}
sleep, 100
details := clipboard

sleep, 10
send {TAB} ;get incident report details
sleep, 100
Send ^{a}
sleep, 100
Send ^{c}
sleep, 100
incident := clipboard

;2. paste variables into relevant Q-pulse fields


if (hWnd2) 
{
DllCall("ShowWindow", ptrS, hWnd2, "Int", 3) ; Maximize the complaint window
DllCall("ShowWindow", ptrS, hWnd1, "Int", 6) ; Minimize PTC window
}
Sleep, 1000

ControlClick, WindowsForms10.EDIT.app.0.2eed1ca_r9_ad18
SendRaw % caseID
sleep, 2000

ControlClick, WindowsForms10.EDIT.app.0.2eed1ca_r9_ad116
send {end}
SendRaw % caseID
sleep, 2000

ControlClick, WindowsForms10.EDIT.app.0.2eed1ca_r9_ad15
SendRaw % country
sleep, 2000

ControlClick, WindowsForms10.EDIT.app.0.2eed1ca_r9_ad110
SendRaw % incident
sleep, 2000

ControlClick, WindowsForms10.EDIT.app.0.2eed1ca_r9_ad16
SendRaw % details
sleep, 2000

return
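
The script above repeats the same select-all and copy sequence four times. One way to shorten it, sketched here with a hypothetical helper named CopyField, is to wrap the sequence in a function and let ClipWait stand in for the fixed sleeps. This is a variation on the script above under those assumptions, not the version that was actually used.

```autohotkey
; Hypothetical helper: copy the full text of the currently focused field.
CopyField() {
    Clipboard := ""      ; empty the clipboard so ClipWait can detect new data
    Send ^a              ; select everything in the field
    Send ^c              ; copy the selection
    ClipWait, 1          ; wait up to 1 second for the clipboard to fill
    return ErrorLevel ? "" : Clipboard   ; ErrorLevel is 1 if nothing arrived
}

; Usage, mirroring the first two fields in the script above:
; caseID := CopyField()
; Send {Tab 2}
; country := CopyField()
```

Waiting on the clipboard instead of sleeping for a fixed time means the script moves on as soon as the copy completes, and an empty return value signals that a field could not be copied.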
