AutoHotkey Homepage AutoHotkey Community
Let's help each other out
 

character limit with a_loopfield?

 
elchapin
Joined: 06 Mar 2007
Posts: 64
Location: Columbus, OH, USA

Posted: Tue Jun 10, 2008 1:37 pm    Post subject: character limit with a_loopfield?

Is there a character limit with A_LoopField?

I have 122301 characters in one field, and I'm parsing the fields from a text file. But this field gets cut off and the remaining part is treated as another field...

If there is a limit, is there a workaround?

-Thanks
_________________
My startup is Telesaur - a telecommuting job site.

Zippo()
Guest

Posted: Tue Jun 10, 2008 2:23 pm

This works for me...
Code:
Loop, 150000
   a .= "b"

a .= "`n"

FileAppend, %a%, test.txt
a=
FileRead, c, test.txt

Loop, Parse, c, `n
   MsgBox % StrLen(A_LoopField)


Are you sure you aren't encountering an extra delimiter in that large field?
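
One way to check for a stray delimiter is to count how often the delimiter occurs in the file and compare that with the number of fields expected. The following is a minimal sketch, not part of the original replies: the file name data.txt is hypothetical, and Chr(255) assumes the delimiter is the single character ÿ (Alt+0255) mentioned later in the thread.

```autohotkey
; Sketch only: "data.txt" is a hypothetical file name, and Chr(255) assumes
; the delimiter is the single character entered with Alt+0255.
Delim := Chr(255)
FileRead, Contents, data.txt
if ErrorLevel
{
    MsgBox, Could not read data.txt
    ExitApp
}
; Replace every delimiter with nothing. With UseErrorLevel, ErrorLevel
; receives the number of replacements, which is the delimiter count.
StringReplace, Unused, Contents, %Delim%, , UseErrorLevel
MsgBox % "Delimiters found: " ErrorLevel
```

If the count is higher than the number of fields times the number of records, at least one field probably contains the delimiter character.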

elchapin

Posted: Tue Jun 10, 2008 2:34 pm

Zippo(), thanks for testing that...

I'm pretty sure the delimiters aren't found within the field. I'm using ÿ (alt+0255) as the delimiter, but I'm going to try to isolate the problem with your example... since your test works, it must be something in the field.

One other thought: do newlines affect parsing at all? I'm using UltraEdit, and it wraps the line at around 4100 characters...

SKAN
Joined: 26 Dec 2005
Posts: 8688

Posted: Tue Jun 10, 2008 2:43 pm    Post subject: Re: character limit with a_loopfield?

elchapin wrote:
Is there a character limit with A_LoopField?


No! The #maxmem directive affects it - whose default value is 65MB per variable:

Code:
#MaxMem 65 ; Default Value
VarSetCapacity( A,(32*1024*1024-1),65 ), VarSetCapacity( B,(32*1024*1024-0),66 ), L := A "`n" B
Loop, Parse, L, `n
  MsgBox, 0, % StrLen(A_LoopField), %A_LoopField%

_________________
URLGet - Internet Explorer based Downloader

Zippo()
Guest

Posted: Tue Jun 10, 2008 2:54 pm

Mmm hmm, it's something in the field. I don't think using a Unicode character as a delimiter is going to work too well.

Is it possible to swap that out with a regular ANSI character (like @ or ^ or something)?

@SKAN: But the line is only just over 120 KB :D

elchapin

Posted: Tue Jun 10, 2008 2:55 pm    Post subject: Re: character limit with a_loopfield?

SKAN wrote:

Code:
#MaxMem 65 ; Default Value
VarSetCapacity( A,(32*1024*1024-1),65 ), VarSetCapacity( B,(32*1024*1024-0),66 ), L := A "`n" B
Loop, Parse, L, `n
  MsgBox, 0, % StrLen(A_LoopField), %A_LoopField%


Just to make sure I understand...
Does that mean that the character limit is 33554431 and MB limit is 65 for variable "A"?

Thanks for pointing out VarSetCapacity, SKAN.

elchapin

Posted: Tue Jun 10, 2008 2:58 pm

Zippo() wrote:
Mmm hmm, it's something in the field. I don't think using a Unicode character as a delimiter is going to work too well.

Is it possible to swap that out with a regular ANSI character (like @ or ^ or something)?

@SKAN: But the line is only just over 120 KB :D


Unfortunately, I don't think I can swap it out. The field contains a bunch of OCR text from emails. But it doesn't hurt to try...

Is there an incompatibility with Unicode characters?

SKAN

Posted: Tue Jun 10, 2008 3:05 pm    Post subject: Re: character limit with a_loopfield?

elchapin wrote:
Just to make sure I understand...
Does that mean that the character limit is 33554431 and MB limit is 65 for variable "A"?


A is 32 MB - 1 byte
and
B is 32 MB

If I do not subtract 1 from A, variable L will be longer than 64 MB, because I am including a linefeed when concatenating A and B. :)
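
The arithmetic behind those sizes can be written out as a short sketch. It assumes one byte per character (an ANSI build), and the variable names are chosen here for illustration only: 33554431 + 1 + 33554432 = 67108864, which is exactly 64 × 1048576.

```autohotkey
; Sizes from SKAN's example, assuming one byte per character (ANSI build).
MB   := 1024 * 1024
LenA := 32 * MB - 1       ; 33554431
LenB := 32 * MB           ; 33554432
LenL := LenA + 1 + LenB   ; A, one linefeed, then B
MsgBox % "L = " LenL " bytes; 64 MB = " 64 * MB " bytes"
```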

Zippo()
Guest

Posted: Tue Jun 10, 2008 3:12 pm

I don't know how much of an incompatibility there really is now, as many workarounds have been posted. Natively, AHK has problems with Unicode.

A quick search on it might fix you up if it is too much of a pain to change the delimiters. :)

elchapin

Posted: Tue Jun 10, 2008 8:19 pm

I tried to pinpoint the problem by creating a text file with "^" for delimiters. Two fields are used: title^body^.

Here's a link to the text file named "sample6.txt": http://drop.io/7rqgfsl

If you don't want to download the text file, this is basically what it is:

title^body^
test^zzzzzzzzzz...(z 65519 times)...zzzzzzzzzz This is where the problem starts^


And here's the script:

Code:

Loop, read, sample6.txt
{
    If A_LoopReadLine <> 1
        FileAppend, `n, output.csv

    ; Loop, parse, current line being read, character that divides field, character to omit
    Loop, parse, A_LoopReadLine, ^
    {
        If A_Index = 1
        {
            ; Write the first field, quoted and followed by a comma.
            CurrentField = `"%A_LoopField%`",
            FileAppend, %CurrentField%, output.csv
        }
        else
        If A_Index = 2
        {
            continue
        }
    }
}
return


It should write the first field of each line to a file named "output.csv", but this is what I get instead:

Quote:


"title",
"test",
"ere the problem starts",


The end of the field on line two is added! Agh! :(

Zippo()
Guest

Posted: Wed Jun 11, 2008 2:00 am

Now the problem is the soft line breaks I think.

Is this what you want?
Code:
FileRead, OutputVar, sample6.txt
FileAppend, `n, output.csv

Loop, Parse, OutputVar, `n
{
   Loop, Parse, A_LoopField, ^
   {
      If A_Index > 1
         Continue

      FileAppend, %A_LoopField%`n, output.csv
   }
}

OutputVar=
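
A note on the likely cause, offered as a hedged sketch rather than as part of the original replies: the AutoHotkey v1 documentation for the file-reading loop states that lines of up to 65,534 characters can be read, and that the remainder of a longer line is returned on the next iteration. That is consistent with the cut-off seen in sample6.txt. Reading the whole file with FileRead and splitting it with Loop, Parse avoids that per-line limit. The script below compares the two approaches; the file name longline.txt and the 70,000-character line are assumptions for illustration.

```autohotkey
; Sketch only: longline.txt is a hypothetical scratch file.
FileDelete, longline.txt
Line := ""
Loop, 70000
    Line .= "z"
FileAppend, %Line%`n, longline.txt

; Approach 1: file-reading loop. A line longer than the documented limit
; is delivered in more than one iteration.
ReadReport := ""
Loop, Read, longline.txt
    ReadReport .= "Iteration " A_Index ": " StrLen(A_LoopReadLine) " characters`n"

; Approach 2: load the whole file, then split on linefeeds (dropping `r).
ParseReport := ""
FileRead, Contents, longline.txt
Loop, Parse, Contents, `n, `r
{
    if (A_LoopField = "")
        continue  ; skip the empty field after the final linefeed
    ParseReport .= "Field " A_Index ": " StrLen(A_LoopField) " characters`n"
}

MsgBox % "Loop, Read:`n" ReadReport "`nFileRead + Loop, Parse:`n" ParseReport
```

If the documented limit applies to the installed version, the first report should list more than one iteration for the single long line, while the second should list one field.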

Lexikos
Joined: 17 Oct 2006
Posts: 7295
Location: Australia

Posted: Wed Jun 11, 2008 11:26 am    Post subject: Re: character limit with a_loopfield?

SKAN wrote:
elchapin wrote:
Is there a character limit with A_LoopField?

No! The #maxmem directive affects it

  1. #MaxMem affects the automatic expansion of variables.

  2. Think of a built-in variable as a function: you cannot assign to it, only retrieve a value. The concept of automatic expansion does not apply, so neither does #MaxMem.

  3. When Loop, Parse begins, it creates a copy of the input variable's contents. I haven't delved very deeply, but it seems A_LoopField actually points to a location within this copy. It makes sense for performance: the string is copied only once, at the beginning of the loop. This also means that no restriction should be applied, since the text is already in memory.
Code:
; Set maximum to 1MB.
#MaxMem 1
MB:=1024*1024

; Create a 4MB variable.
VarSetCapacity(A, 4*MB, 65)
MsgBox % "StrLen(A) = " StrLen(A)

; Insert a delimiter at 2MB.
NumPut(Asc("|"), A, 2*MB, "char"), VarSetCapacity(A,-1)

; A_LoopField is not restricted by #MaxMem.
Loop, Parse, A, |
    MsgBox % "StrLen(A_LoopField) = " StrLen(A_LoopField)

; B must be expanded to fit A_LoopField, but #MaxMem causes it to fail.
Loop, Parse, A, |
    B := A_LoopField

SKAN wrote:
whose default value is 65MB per variable:
The default is 64MB.

SKAN

Posted: Wed Jun 11, 2008 11:30 am    Post subject: Re: character limit with a_loopfield?

Lexikos wrote:
The default is 64MB.


Uh, sorry... that was a typo. Thanks for the clarification. :)

elchapin

Posted: Wed Jun 11, 2008 12:27 pm

Zippo() wrote:
Now the problem is the soft line breaks I think.

Is this what you want?
Code:
FileRead, OutputVar, sample6.txt
FileAppend, `n, output.csv

Loop, Parse, OutputVar, `n
{
   Loop, Parse, A_LoopField, ^
   {
      If A_Index > 1
         Continue

      FileAppend, %A_LoopField%`n, output.csv
   }
}

OutputVar=


Zippo(), thanks! That worked! It never occurred to me that I could use `n to parse a variable! Wow, that is beautiful!

SKAN / Lexikos - thanks for taking the time to explain things. That helps a lot. :D