AutoHotkey Community Let's help each other out
elchapin
Joined: 06 Mar 2007 Posts: 64 Location: Columbus, OH, USA
|
Posted: Tue Jun 10, 2008 1:37 pm Post subject: character limit with a_loopfield? |
|
|
Is there a character limit with A_LoopField?
I have 122301 characters in one field, and I'm parsing the fields from a text file. But this field gets cut off and the remaining part is treated as another field...
If there is a limit, is there a workaround?
-Thanks _________________ My startup is Telesaur - a telecommuting job site. |
|
Zippo() Guest
|
Posted: Tue Jun 10, 2008 2:23 pm Post subject: |
|
|
This works for me...
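| Code: | ; Build one 150,000-character line, terminated by a newline.
Loop, 150000
    a .= "b"
a .= "`n"
FileAppend, %a%, test.txt
a=
; Read the file back and report the length of each parsed field.
FileRead, c, test.txt
Loop, Parse, c, `n
    MsgBox % StrLen(A_LoopField) |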
Are you sure you aren't encountering an extra delimiter in that large field? |
|
elchapin
Joined: 06 Mar 2007 Posts: 64 Location: Columbus, OH, USA
|
Posted: Tue Jun 10, 2008 2:34 pm Post subject: |
|
|
Zippo, thanks for testing that...
I'm pretty sure the delimiters aren't found within the field. I'm using ÿ (alt+0255) as the delimiter, but I'm going to try to isolate the problem with your example. Since your test works, it must be something in the field.
One other thought: do new lines affect parsing at all? I'm using UltraEdit, and it wraps the line at around 4100 characters... |
|
SKAN
Joined: 26 Dec 2005 Posts: 8688
|
Posted: Tue Jun 10, 2008 2:43 pm Post subject: Re: character limit with a_loopfield? |
|
|
| elchapin wrote: | | Is there a character limit with A_LoopField? |
No! The #MaxMem directive affects it; its default value is 65MB per variable:
| Code: | #MaxMem 65 ; Default Value
VarSetCapacity( A,(32*1024*1024-1),65 ), VarSetCapacity( B,(32*1024*1024-0),66 ), L := A "`n" B
Loop, Parse, L, `n
    MsgBox, 0, % StrLen(A_LoopField), %A_LoopField% |
_________________ URLGet - Internet Explorer based Downloader |
|
Zippo() Guest
|
Posted: Tue Jun 10, 2008 2:54 pm Post subject: |
|
|
Mmm-hmm, it's something in the field. I don't think using a Unicode character as a delimiter is going to work too well.
Is it possible to swap that out with a regular ANSI character (like @ or ^ or something)?
@SKAN: But the line is only just over 120 KB. |
|
elchapin
Joined: 06 Mar 2007 Posts: 64 Location: Columbus, OH, USA
|
Posted: Tue Jun 10, 2008 2:55 pm Post subject: Re: character limit with a_loopfield? |
|
|
| SKAN wrote: |
| Code: | #MaxMem 65 ; Default Value
VarSetCapacity( A,(32*1024*1024-1),65 ), VarSetCapacity( B,(32*1024*1024-0),66 ), L := A "`n" B
Loop, Parse, L, `n
    MsgBox, 0, % StrLen(A_LoopField), %A_LoopField% |
|
Just to make sure I understand...
Does that mean that the character limit is 33554431 and MB limit is 65 for variable "A"?
Thanks for pointing out VarSetCapacity, SKAN. |
|
elchapin
Joined: 06 Mar 2007 Posts: 64 Location: Columbus, OH, USA
|
Posted: Tue Jun 10, 2008 2:58 pm Post subject: |
|
|
| Zippo() wrote: | Mmm-hmm, it's something in the field. I don't think using a Unicode character as a delimiter is going to work too well.
Is it possible to swap that out with a regular ANSI character (like @ or ^ or something)?
@SKAN: But the line is only just over 120 KB. |
Unfortunately, I don't think I can swap it out. The field contains a bunch of OCR text from emails. But it doesn't hurt to try...
Is there an incompatibility with Unicode characters? |
|
SKAN
Joined: 26 Dec 2005 Posts: 8688
|
Posted: Tue Jun 10, 2008 3:05 pm Post subject: Re: character limit with a_loopfield? |
|
|
| elchapin wrote: | Just to make sure I understand...
Does that mean that the character limit is 33554431 and MB limit is 65 for variable "A"? |
A is 32 MB - 1 byte
and
B is 32 MB
If I do not subtract 1 from A, variable L will be longer than 64 MB, because I include a linefeed when concatenating A and B. |
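The sizes above can be checked with a few lines of arithmetic. This is only a sketch of the sum; the variable names MB, LenA, LenB and Total are illustrative and do not come from the posted script. It adds the two lengths plus the one-byte linefeed and shows the result next to 64 MB (both come to 67,108,864 bytes).

```autohotkey
; Worked arithmetic for the sizes discussed above (names are illustrative).
MB := 1024*1024
LenA := 32*MB - 1          ; A: 32 MB minus one byte
LenB := 32*MB              ; B: exactly 32 MB
Total := LenA + 1 + LenB   ; the +1 is the linefeed placed between A and B
MsgBox % "L holds " Total " bytes; 64 MB is " 64*MB " bytes."
```

Without the "-1" on A, the total would be one byte over 64 MB.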
|
Zippo() Guest
|
Posted: Tue Jun 10, 2008 3:12 pm Post subject: |
|
|
I don't know how much of an incompatibility there really is now, as many workarounds have been posted. Natively, AHK has problems with Unicode.
A quick search on it might fix you up if it is too much of a pain to change the delimiters. |
|
elchapin
Joined: 06 Mar 2007 Posts: 64 Location: Columbus, OH, USA
|
Posted: Tue Jun 10, 2008 8:19 pm Post subject: |
|
|
I tried to pinpoint the problem by creating a text file with "^" for delimiters. Two fields are used: title^body^.
Here's a link to the text file named "sample6.txt": http://drop.io/7rqgfsl
If you don't want to download the text file, this is basically what it is:
title^body^
test^zzzzzzzzzz...(z 65519 times)...zzzzzzzzzz This is where the problem starts^
And here's the script:
| Code: |
Loop, read, sample6.txt
{
    If A_LoopReadLine <> 1
        FileAppend, `n, output.csv
    ; Loop, parse, current line being read, character that divides field, character to omit
    Loop, parse, A_LoopReadLine, ^
    {
        If A_Index = 1
        {
            CurrentField = `"%A_LoopField%`",
            FileAppend, %CurrentField%, output.csv
        }
        else
        If A_Index = 2
        {
            continue
        }
    }
}
return
|
It should write the first field of each line to a file named "output.csv", but this is what I get instead:
| Quote: |
"title",
"test",
"ere the problem starts", |
The end of the field on line two is added! Agh! |
|
Zippo() Guest
|
Posted: Wed Jun 11, 2008 2:00 am Post subject: |
|
|
Now the problem is the soft line breaks, I think.
Is this what you want?
| Code: | FileRead, OutputVar, sample6.txt
FileAppend, `n, output.csv
Loop, Parse, OutputVar, `n
{
    Loop, Parse, A_LoopField, ^
    {
        If A_Index > 1
            Continue
        FileAppend, %A_LoopField%`n, output.csv
    }
}
OutputVar= |
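One likely reason the FileRead approach sidesteps the problem: the AutoHotkey v1 documentation for the file-reading loop says that lines up to 65,534 characters can be read, and that the remaining characters of a longer line are read during the next loop iteration. The second line of sample6.txt is longer than that, so Loop, Read would hand the tail of the field to a separate iteration. A minimal diagnostic sketch to confirm this, assuming sample6.txt sits in the script's working directory (the variable name Report is only illustrative):

```autohotkey
; Diagnostic sketch (not from the thread): list the length of every chunk
; that the file-reading loop returns for sample6.txt.
; Assumes sample6.txt sits in the script's working directory.
Report := ""
Loop, Read, sample6.txt
    Report .= "Iteration " A_Index ": " StrLen(A_LoopReadLine) " characters`n"
MsgBox % Report
```

If the loop reports more iterations than the file has lines, the extra iterations are the overflow from the long line. FileRead followed by Loop, Parse works on the whole file in memory, so the per-line limit does not come into play.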
|
Lexikos
Joined: 17 Oct 2006 Posts: 7295 Location: Australia
|
Posted: Wed Jun 11, 2008 11:26 am Post subject: Re: character limit with a_loopfield? |
|
|
| SKAN wrote: | | elchapin wrote: | | Is there a character limit with A_LoopField? |
No! The #maxmem directive affects it |
- #MaxMem affects the automatic expansion of variables.
- Think of a built-in variable as a function: you cannot assign to it, only retrieve a value. The concept of automatic expansion does not apply, so neither does #MaxMem.
- When Loop, Parse begins, it creates a copy of the input variable's contents. I haven't delved very deeply, but it seems A_LoopField actually points to a location within this copy. It makes sense for performance: the string is copied only once, at the beginning of the loop. This also means that no restriction should be applied, since the text is already in memory.
| Code: | ; Set maximum to 1MB.
#MaxMem 1
MB:=1024*1024
; Create a 4MB variable.
VarSetCapacity(A, 4*MB, 65)
MsgBox % "StrLen(A) = " StrLen(A)
; Insert a delimiter at 2MB.
NumPut(Asc("|"), A, 2*MB, "char"), VarSetCapacity(A,-1)
; A_LoopField is not restricted by #MaxMem.
Loop, Parse, A, |
    MsgBox % "StrLen(A_LoopField) = " StrLen(A_LoopField)
; B must be expanded to fit A_LoopField, but #MaxMem causes it to fail.
Loop, Parse, A, |
    B := A_LoopField |
| SKAN wrote: | | whose default value is 65MB per variable: |
The default is 64MB. |
|
SKAN
Joined: 26 Dec 2005 Posts: 8688
|
Posted: Wed Jun 11, 2008 11:30 am Post subject: Re: character limit with a_loopfield? |
|
|
| Lexikos wrote: | | The default is 64MB. |
Uh! Sorry, that was a typo. Thanks for the clarification. |
|
elchapin
Joined: 06 Mar 2007 Posts: 64 Location: Columbus, OH, USA
|
Posted: Wed Jun 11, 2008 12:27 pm Post subject: |
|
|
| Zippo() wrote: | Now the problem is the soft line breaks, I think.
Is this what you want?
| Code: | FileRead, OutputVar, sample6.txt
FileAppend, `n, output.csv
Loop, Parse, OutputVar, `n
{
    Loop, Parse, A_LoopField, ^
    {
        If A_Index > 1
            Continue
        FileAppend, %A_LoopField%`n, output.csv
    }
}
OutputVar= |
|
Zippo(), thanks! That worked! It never occurred to me that I could use `n to parse a variable! Wow, that is beautiful!
SKAN / Lexikos - thanks for taking the time to explain things. That helps a lot. |
|