Substring / StrReplace - huge 1 GB files - inline replace?

Taurus
Posts: 89
Joined: 20 Jan 2015, 10:31

26 Apr 2020, 08:06

Hi,

I need an in-place SubStr / StrReplace for huge 1 GB files. The built-in commands copy the result into new memory, so I get an "out of memory" error.

Is there anything? ;)
:beard: Full Stack Developer > Dev for a better world | PHP for Web | AHK H for Local | with KISS (Keep IT Short and Simple) on Win 10 Pro (Version 2004) x64
BNOLI
Posts: 548
Joined: 23 Mar 2020, 03:55

Re: Substring / StrReplace - huge 1 GB files - inline replace?

26 Apr 2020, 10:47

I remember that we used Perl to deal with huge amounts of billing data because of its performance and its ability to digest everything we threw at it.
Last edited by BNOLI on 26 Apr 2020, 12:00, edited 1 time in total.
Taurus
Posts: 89
Joined: 20 Jan 2015, 10:31

Re: Substring / StrReplace - huge 1 GB files - inline replace?

26 Apr 2020, 10:57

I created code for counting, but still need substring/StrReplace.

Code: Select all

substr_count(ByRef Text, ByRef Zeichen) ; Counts occurrences without doubling RAM use; StrReplace(Text, Zeichen, Zeichen, Amount) would return a full copy
{
	i := Pos := 0
	while(Pos := InStr(Text, Zeichen,,++Pos))
		++i
	return i
}
There is only a 0.1 sec difference between this and StrReplace with 200,000 matches found.
TAC109
Posts: 595
Joined: 02 Oct 2013, 19:41
Location: New Zealand

Re: Substring / StrReplace - huge 1 GB files - inline replace?

26 Apr 2020, 18:49

Use #MaxMem 4095 in your script. Also consider using AutoHotkey U64 if you are not already.
Taurus
Posts: 89
Joined: 20 Jan 2015, 10:31

Re: Substring / StrReplace - huge 1 GB files - inline replace?

27 Apr 2020, 03:58

TAC109 wrote:
26 Apr 2020, 18:49
Use #MaxMem 4095 in your script. Also consider using AutoHotkey U64 if you are not already.
I am already using #MaxMem, but I can't use x64 because of database drivers. Loop, Parse also runs out of memory. I think I have to find a way to get onto x64...
ahk7
Posts: 318
Joined: 06 Nov 2013, 16:35

Re: Substring / StrReplace - huge 1 GB files - inline replace?

27 Apr 2020, 04:36

Have you considered using command line tools such as sed (http://gnuwin32.sourceforge.net/packages/sed.htm) or fart (http://fart-it.sourceforge.net/)? They are usually pretty fast, and you can call them with RunWait and/or capture their output as described at https://www.autohotkey.com/docs/commands/Run.htm#StdOut
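A minimal sketch of the RunWait route, assuming GNU sed is installed and reachable through PATH. The file path and the old/new strings are hypothetical placeholders, and note that sed treats the search text as a regular expression, not a literal string:

```autohotkey
; Sketch only: assumes sed.exe (GNU sed) can be found via PATH.
; InpFile and the old/new strings are hypothetical placeholders.
InpFile := "C:\data\huge.txt"

; -i edits the file in place; s/old/new/g replaces every occurrence on each line.
RunWait, sed -i "s/old/new/g" "%InpFile%", , Hide

; RunWait stores the exit code of the program in ErrorLevel.
if (ErrorLevel != 0)
    MsgBox, sed exited with code %ErrorLevel%.
```

The replacement then runs in a separate process, so the 1 GB of text never has to fit into the script's own address space.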
JosUd
Posts: 13
Joined: 26 Apr 2020, 07:32

Re: Substring / StrReplace - huge 1 GB files - inline replace?

27 Apr 2020, 05:00

Taurus wrote:
26 Apr 2020, 08:06
I need an in-place SubStr / StrReplace for huge 1 GB files. The built-in commands copy the result into new memory, so I get an "out of memory" error.
But you don't need to process the entire file at once, right? Because `FileReadLine` (https://www.autohotkey.com/docs/commands/FileReadLine.htm) can also process one line at a time.

That way you can loop through the file (https://www.autohotkey.com/docs/commands/LoopReadFile.htm), make the necessary replacements on that particular line, and then move on to the next line. All without reading 1 GB into memory.
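A rough sketch of such a file-reading loop in AutoHotkey v1.1. The paths and the search/replace strings are hypothetical placeholders. It assumes no match ever spans a line break, and it writes the result to a second file instead of changing the original in place:

```autohotkey
; Sketch only: paths and strings below are hypothetical placeholders.
InpFile   := "C:\data\huge.txt"
OutFile   := "C:\data\huge_replaced.txt"
SearchFor := "old"
ReplaceBy := "new"

; When an output file is named, FileAppend with only the text parameter
; writes to it, and the file stays open for the duration of the loop.
Loop, Read, %InpFile%, %OutFile%
{
    Line := StrReplace(A_LoopReadLine, SearchFor, ReplaceBy)
    FileAppend, %Line%`n
}
```

Only one line is held in memory at a time. Once the loop has finished, the new file could be moved over the original.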
just me
Posts: 7403
Joined: 02 Oct 2013, 08:51
Location: Germany

Re: Substring / StrReplace - huge 1 GB files - inline replace?

27 Apr 2020, 07:15

... or you can define chunks of your choice using the File Object:

Code: Select all

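#NoEnv
StringCaseSense, On             ; StrReplace ignores case by default; match the case-sensitive InStr below
InpFile   := "......"
SearchFor := "......"
ReplaceBy := "......"
ChunkLen  := 4 * 1024 * 1024    ; characters read per iteration
MatchLen  := StrLen(SearchFor)
Chunk     := ""
OutVar    := ""
; ----------------------------------------------------------------------------------------
InFile := FileOpen(InpFile, "r")         ; renamed from IF to avoid confusion with the If statement
VarSetCapacity(OutVar, InFile.Length)    ; preallocate the result buffer (capacity is in bytes)
; ----------------------------------------------------------------------------------------
While !InFile.AtEOF {
   Chunk .= InFile.Read(ChunkLen)
   ; StartingPos 0 searches backwards from the end, i.e. finds the last match in the buffer
   If !(MatchPos := InStr(Chunk, SearchFor, 1, 0)) {
      ; No match: emit all but the last MatchLen characters and carry those over,
      ; because a match may straddle the chunk boundary
      OutVar .= SubStr(Chunk, 1, -MatchLen)
      Chunk := SubStr(Chunk, 1 - MatchLen)
   }
   Else {
      ; Replace everything up to the end of the last match and carry over the rest
      OutVar .= StrReplace(SubStr(Chunk, 1, MatchPos + MatchLen - 1), SearchFor, ReplaceBy)
      Chunk := SubStr(Chunk, MatchPos + MatchLen)
   }
}
OutVar .= Chunk                 ; the remainder cannot contain a complete match
; ----------------------------------------------------------------------------------------
InFile.Close()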
*Not tested!*


Never use FileReadLine to sequentially read large files!!!
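Note that the snippet above still collects the whole result in OutVar, so a 32-bit script may hit the same memory limit. A possible variation (equally untested, AutoHotkey v1.1) writes each processed piece straight to a second file. The paths, the strings and the names Src and Dst are hypothetical, and it assumes both files use the script's default encoding:

```autohotkey
#NoEnv
StringCaseSense, On             ; make StrReplace case-sensitive, like the InStr call below
; Hypothetical placeholders: adjust paths and strings to your data.
InpFile   := "C:\data\huge.txt"
OutFile   := "C:\data\huge_replaced.txt"
SearchFor := "old"
ReplaceBy := "new"
ChunkLen  := 4 * 1024 * 1024    ; characters read per iteration
MatchLen  := StrLen(SearchFor)
Chunk     := ""

Src := FileOpen(InpFile, "r")
Dst := FileOpen(OutFile, "w")
if (!IsObject(Src) || !IsObject(Dst)) {
    MsgBox, Could not open the input or output file.
    ExitApp
}
while !Src.AtEOF {
    Chunk .= Src.Read(ChunkLen)
    if (MatchPos := InStr(Chunk, SearchFor, true, 0)) {
        ; Replace everything up to the end of the last match, keep the rest.
        Dst.Write(StrReplace(SubStr(Chunk, 1, MatchPos + MatchLen - 1), SearchFor, ReplaceBy))
        Chunk := SubStr(Chunk, MatchPos + MatchLen)
    } else if (StrLen(Chunk) > MatchLen) {
        ; No match: flush all but the last MatchLen characters, which might
        ; be the start of a match that continues in the next chunk.
        Dst.Write(SubStr(Chunk, 1, -MatchLen))
        Chunk := SubStr(Chunk, 1 - MatchLen)
    }
}
Dst.Write(Chunk)                ; whatever is left cannot contain a complete match
Src.Close()
Dst.Close()
```

Memory use then stays at roughly one chunk plus the carried-over tail, regardless of the file size.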
Taurus
Posts: 89
Joined: 20 Jan 2015, 10:31

Re: Substring / StrReplace - huge 1 GB files - inline replace?

27 Apr 2020, 09:22

just me wrote:
27 Apr 2020, 07:15
... or you can define chunks of your choice using the File Object: [...]
Never use FileReadLine to sequentially read large files!!!

Interesting. I will try to split it like you say. I thought there was another, faster built-in way, but it doesn't look like it.

And of course, never use FileReadLine. It's slow/bad.
JosUd
Posts: 13
Joined: 26 Apr 2020, 07:32

Re: Substring / StrReplace - huge 1 GB files - inline replace?

28 Apr 2020, 00:51

Taurus wrote:
27 Apr 2020, 09:22
And of course, never use FileReadLine. It's slow/bad.
Can you tell why? I ask so I can learn since I haven't experienced problems with it.
Cuadrix
Posts: 224
Joined: 07 May 2017, 08:26

Re: Substring / StrReplace - huge 1 GB files - inline replace?

28 Apr 2020, 03:25

JosUd wrote:
28 Apr 2020, 00:51
Can you tell why? I ask so I can learn since I haven't experienced problems with it.
You haven't experienced any problems because your files aren't 1GB.
Try looping through a million lines of anything and see how long it takes.
Imagine doing a conditional check for each and every line in a file containing millions of lines.

Your answer is not a solution that's efficient enough.
just me
Posts: 7403
Joined: 02 Oct 2013, 08:51
Location: Germany

Re: Substring / StrReplace - huge 1 GB files - inline replace?

28 Apr 2020, 05:26

@JosUd,

FileReadLine:
Remarks

It is strongly recommended to use this command only for small files, or in cases where only a single line of text is needed. To scan and process a large number of lines (one by one), use a file-reading loop for best performance. To read an entire file into a variable, use FileRead.
A 1 GB file contains about 10 million (ANSI) or 5 million (Unicode) lines of about 100 characters each.
JosUd
Posts: 13
Joined: 26 Apr 2020, 07:32

Re: Substring / StrReplace - huge 1 GB files - inline replace?

04 May 2020, 01:14

just me wrote:
28 Apr 2020, 05:26
FileReadLine:
Remarks

It is strongly recommended to use this command only for small files, or in cases where only a single line of text is needed. To scan and process a large number of lines (one by one), use a file-reading loop for best performance. To read an entire file into a variable, use FileRead.
A 1 GB file contains about 10 million (ANSI) or 5 million (Unicode) lines of about 100 characters each.
Thanks for the additional information and reference. I have some reading to do. :)