LineStr() : Extract any line or consecutive lines from text

SKAN

01 Apr 2020, 15:48

I was inspired to write this function while writing usage example #5 for my following function:
xStr() : for general text extraction and parsing XML / HTML

LineStr() is a wrapper for SubStr().
The difference between SubStr() and LineStr() is that the former extracts characters while the latter extracts lines.
LineStr() mimics the parameter usage of SubStr().
LineStr() auto-detects whether the line terminator is CRLF or LF.
One additional parameter, D (Delimiter), accepts custom char(s) as the delimiter:

Code: Select all

    MsgBox % LineStr("Item 1|Item 2|Item 3|Item 4", 2, 1, "|")
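
The CRLF/LF auto-detection described above can be looked at in isolation. The following is only a sketch that mirrors the check visible in the function body (a CRLF anywhere in the text selects CRLF, otherwise LF); DetectEOL is a hypothetical helper name and is not part of LineStr().

```autohotkey
; Hypothetical helper (not part of LineStr): pick the line terminator for a text.
; Assumption: one CRLF anywhere in the text means the whole text uses CRLF.
DetectEOL(ByRef S) {
    Return InStr(S, "`r`n") ? "`r`n" : "`n"
}

WinText  := "Line_1`r`nLine_2"
UnixText := "Line_1`nLine_2"
MsgBox % "CRLF text: " StrLen(DetectEOL(WinText)) " char(s), LF text: " StrLen(DetectEOL(UnixText)) " char(s)"
```

Text with mixed line endings would be treated as CRLF by this check, so a stray LF-only line would not be split off.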


The function

Code: Select all

LineStr(ByRef S, P, C:="", D:="") {   ;  LineStr v0.9c,   by SKAN on D341/D34M @ tiny.cc/linestr
Local L := StrLen(S),   DL := StrLen(D:=(D ? D : Instr(S,"`r`n") ? "`r`n" : "`n") ),   F, P1, P2 
Return SubStr(S,(P1:=L?(P!=1&&InStr(S,D,,0))?(F:=InStr(S,D,,P>0,Abs(P-1)))?F+DL:P-1<1?1:0:(F:=1)
:0),(P2:=(P1&&C!=0)?C!=""?(F:=InStr(S,D,,(C>0?F+DL:0),Abs(C)))?F-1:C>0?L:1:L:0)>P1?P2-P1+1:0)
}
The function - Readable version


Usage examples

Code: Select all

H := "Line_1`r`nLine_2`r`nLine_3`r`nLine_4`r`nLine_5`r`nLine_6`r`nLine_7`r`nLine_8`r`nLine_9"

MsgBox,,Example 1, % "Extract first line`n`n" .                      LineStr(H, 1, 1)
MsgBox,,Example 2, % "Extract first two lines`n`n" .                 LineStr(H, 1, 2)
MsgBox,,Example 3, % "Extract line 5`n`n" .                          LineStr(H, 5, 1)   
MsgBox,,Example 4, % "Extract all from fifth line `n`n" .            LineStr(H, 5)
MsgBox,,Example 5, % "Extract 3 lines from fifth line`n`n" .         LineStr(H, 5, 3)
MsgBox,,Example 6, % "Extract last line`n`n" .                       LineStr(H, 0)             
MsgBox,,Example 7, % "Extract last 2 lines`n`n" .                    LineStr(H, -1)
MsgBox,,Example 8, % "Extract 2 lines preceding last line`n`n" .     LineStr(H, -2,2)        
MsgBox,,Example 9, % "Extract all lines except first and last`n`n" . LineStr(H, 2,-1)


Code to generate 31.3 MB (2424243 lines) text file for testing..
SKAN

Re: LineStr() : Extract any line / any set of lines from text

02 Apr 2020, 07:29

@ozzii :) :thumbup:

Code updated: Changed the first parameter of LineStr() to ByRef.
About 40% increase in speed when I tested with a large text file.
I've also included a snippet to generate a 31.3 MB ANSI text file which can be used for testing.
Delta Pythagorean

Re: LineStr() : Extract any line or consecutive lines from text

03 Apr 2020, 04:02

Example:

Code: Select all

; The following adds 10 lines with the following wording: LineNum: <NUM>	| Contents: <RAND_STR_ALPHA>`n
Loop, % (10) {
	Loop, % (5) {
		; Get a random letter and list it for sh*ts and giggles.
		Random, CHR, % A := Asc("a"), % A + 25
		Rand .= Chr(CHR)
	}
	Str .= "LineNum: " . A_Index . "`t| Contents: " . Rand . "`n"
	Rand := ""
}

Contents	.= "Lines 3 and 4:"				.	"`n"
Contents	.= GetStringByLine(Str, 3, 2)	.	"`n`n"
Contents	.= "Lines 2 through 5:"			.	"`n"
Contents	.= GetStringByLine(Str, 2, 4)	.	"`n`n"
Contents	.= "Line 1:"					.	"`n"
Contents	.= GetStringByLine(Str, 1)		.	"`n`n"
Contents	.= "Last Line:"					.	"`n"
Contents	.= GetStringByLine(Str, 0)
Contents	.= "Amount of Lines:"			.	"`n"
Contents	.= GetStringByLine(Str, -1)

MsgBox, % Contents
Function:

Code: Select all

GetStringByLine(String, LineNumber, Range := 1, Delimiter := "`n", Except := "`r") {
	Split := StrSplit(String, Delimiter, Except)
	If (LineNumber == 0) {
		Out := Split[Split.MaxIndex() - 1] . "`n"
		Return, (Out)
	} Else If (LineNumber == -1) {
		; https://www.autohotkey.com/boards/viewtopic.php?t=31061#p144935
		; The following gets the amount of items in an array.
		; MaxIndex and Count provide some strange results.
		Count := NumGet(&Split + 4 * A_PtrSize)
		Return, (Count)
	}
	For LineIndex, LineString in Split {
		If ((LineIndex >= LineNumber) && (LineIndex <= (LineNumber + (Range - 1)))) {
			Out .= LineString . "`n"
		}
	}
	Return, (Out)
}
Last edited by Delta Pythagorean on 07 Apr 2020, 02:47, edited 3 times in total.

- [AHK].......: 1.1.33.02 Unicode 64-bit
- [OS].........: Windows 10.0.18362
- [GITHUB]...: github.com/DeltaPyth
- [PAYPAL]....: paypal.me/DelPyth
- [DISCORD]..: Delta#3324

Remember to use [code]CODE[/code] for your multi-line scripts.
Stay safe, stay inside, and remember to wash your hands for 20 seconds!
ozzii

Re: LineStr() : Extract any line or consecutive lines from text

04 Apr 2020, 03:47

@Delta Pythagorean
Could you give a usage example? (The number of parameters is not the same.)
rommmcek

Re: LineStr() : Extract any line or consecutive lines from text

04 Apr 2020, 09:17

@SKAN: Very fast! Retrieving lines beyond ca. 2/3 of the file/string is faster when counting backwards! But the main problem for large files remains: loading into a variable is relatively slow. FileOpen avoids this, but can't pinpoint a line in the middle of the file.

@Delta Pythagorean: You should use a different function name, not the same as the one above!
You assume every file/string ends with a linefeed, which is wrong. Use this instead:
Out := Split[Split.MaxIndex()] . "`n"
Besides, you don't have to loop through the entire array (you are wasting time). Use this instead:

Code: Select all

     loop, % Range
        Out.= Split[A_Index+LineNumber-1] "`n"
However, it's slower nonetheless. SKAN's function only counts delimiters and then retrieves a substring, which requires less work!
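
To illustrate the "count delimiters, then take a substring" idea, here is a minimal sketch using the Occurrence parameter of InStr() (AHK v1.1). It is not SKAN's code and only handles a single line with a positive line number; NthLine is a hypothetical name.

```autohotkey
; Hypothetical helper: return line N (1-based) of S without splitting S into an array.
NthLine(ByRef S, N, D := "`n") {
    If (N < 1)
        Return ""
    Start := 1
    If (N > 1) {
        F := InStr(S, D, false, 1, N - 1)   ; position of the (N-1)-th delimiter
        If (!F)
            Return ""                        ; the text has fewer than N lines
        Start := F + StrLen(D)               ; the line begins right after that delimiter
    }
    Stop := InStr(S, D, false, Start)        ; next delimiter marks the end of the line
    Return Stop ? SubStr(S, Start, Stop - Start) : SubStr(S, Start)
}

Text := "Line_1`nLine_2`nLine_3"
MsgBox % NthLine(Text, 2)   ; shows Line_2
```

With the default "`n" delimiter, a line taken from CRLF text keeps its trailing "`r"; pass "`r`n" as D in that case.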
elModo7

Re: LineStr() : Extract any line or consecutive lines from text

04 Apr 2020, 10:04

Thanks for sharing!
:beer:
SpeedMaster

Re: LineStr() : Extract any line or consecutive lines from text

04 Apr 2020, 11:51

Great ! Thanks for sharing! :D
Is it also possible to add an option to also remove them on the fly? :think:
I mean something that acts like the pop() function for objects.

LineStr(ByRef T, S, C:="", D:="`n", Remove:=false)

Cheers
SKAN

Re: LineStr() : Extract any line or consecutive lines from text

04 Apr 2020, 18:59

@rommmcek
@SpeedMaster

Thanks for the feedback. :)


rommmcek wrote: the main problem for large file remains: Loading to variable is relatively slow. FileOpen avoids this, but can't pinpoint a line in the middle of the file.
I wonder how FileReadLine operates. It would be awesome if it worked like LineStr().
I will see if I can come up with something. Even if I manage it, I don't think it would be faster. It would save memory with large text files, though.
rommmcek wrote: @Delta Pythagorean: You should use different function name not the same as the one above!
Yes. @Delta Pythagorean. Can you please rename?


SpeedMaster wrote: Is it also possible to add an option to also remove them on the fly? :think:
I mean something that acts like the pop() function for objects.
I've modeled LineStr() after SubStr() and I feel it is best to keep it that way.
I'm wondering if I should set ErrorLevel with some useful info, though.

@elModo7 :) :thumbup:
Delta Pythagorean

Re: LineStr() : Extract any line or consecutive lines from text

04 Apr 2020, 22:09

Sure. It's renamed.

SpeedMaster

Re: LineStr() : Extract any line or consecutive lines from text

05 Apr 2020, 09:10

SKAN wrote:
04 Apr 2020, 18:59
I've modeled LineStr() after SubStr() and I feel it is best to keep it that way.
You're right. I hadn't thought of that. :facepalm: This function should mimic SubStr() as much as possible. :thumbup:

I found a similar function that can solve my problem with a dumpfile.

LineDelete() by Cuadrix https://www.autohotkey.com/boards/viewtopic.php?t=46520

LineDelete(InputVar, Start_Pos [, ENd_Pos, Options := "B", DumpVar]) ; B for "Between"


:!: last line for LineDelete() is -1
:!: last line for LineStr() is 0

Usage with both functions:

Code: Select all

H := "Line_1`nLine_2`nLine_3`nLine_4`nLine_5`nLine_6`nLine_7`nLine_8`nLine_9"

																		LineDelete(H,1,1,,DumpVar)													
MsgBox,,Example 1, % "Extract first line`n`n" .							LineStr(H, 1, 1)	"`n`n" . DumpVar

																		LineDelete(H, 1,2,,DumpVar) 
MsgBox,,Example 2, % "Extract first two lines`n`n" .					LineStr(H, 1, 2)	"`n`n" . DumpVar

																		LineDelete(H, 5,5,,DumpVar) ; or LineDelete(H, 4,6,"B",DumpVar) (B = between)
MsgBox,,Example 3, % "Extract line 5`n`n" .								LineStr(H, 5, 1)	"`n`n" . DumpVar   

																		LineDelete(H, 5,-1,,DumpVar) ; (-1 = last line)
MsgBox,,Example 4, % "Extract all from fifth line `n`n" .				LineStr(H, 5)		"`n`n" . DumpVar 

																		LineDelete(H, 5,7,,DumpVar) ; or LineDelete(H, 4,8,"B",DumpVar) (lines between 4 and 8)
MsgBox,,Example 5, % "Extract 3 lines from fifth line`n`n" .         	LineStr(H, 5, 3)	"`n`n" . DumpVar 

																		LineDelete(H, -1,-1,,DumpVar) ; from last line to last line
MsgBox,,Example 6, % "Extract last line`n`n" .                       	LineStr(H, 0)		"`n`n" . DumpVar 

																		LineDelete(H, -2,-1,,DumpVar)
MsgBox,,Example 7, % "Extract last 2 lines`n`n" .                    	LineStr(H, -1)		"`n`n" . DumpVar 

																		LineDelete(H, -3,-2,,DumpVar)
MsgBox,,Example 8, % "Extract 2 lines preceding last line`n`n" .     	LineStr(H, -2,2)	"`n`n" . DumpVar

																		LineDelete(H, 1,-1,"B",DumpVar) ;  (B = Between) or LineDelete(H, 2,-2,,DumpVar)
MsgBox,,Example 9, % "Extract all lines except first and last`n`n" .	LineStr(H, 2,-1)	"`n`n" . DumpVar 

Cheers
Last edited by SpeedMaster on 05 Apr 2020, 10:08, edited 1 time in total.
rommmcek

Re: LineStr() : Extract any line or consecutive lines from text

05 Apr 2020, 09:37

@SKAN: I assume FileReadLine uses FileOpen internally, because the first lines of a file (no matter how big) are returned virtually instantly. However, for subsequent lines it seems to use the equivalent of FileOpen(file).ReadLine() one line at a time, so lines towards the end of a file are retrieved much more slowly than by LineStr().
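
For comparison, this is roughly what reading line N through a File object looks like. It is a sketch of the sequential approach assumed above, not a statement about how FileReadLine is actually implemented; ReadNthLine and the path in the usage comment are hypothetical.

```autohotkey
; Hypothetical helper: read line N of a file sequentially, without loading the whole file.
ReadNthLine(Path, N) {
    If (N < 1)
        Return ""
    File := FileOpen(Path, "r")
    If (!IsObject(File))            ; FileOpen failed
        Return ""
    Line := ""
    Loop, % N {
        If (File.AtEOF) {           ; the file has fewer than N lines
            Line := ""
            Break
        }
        Line := File.ReadLine()     ; returned text may include the line ending
    }
    File.Close()
    Return RTrim(Line, "`r`n")
}

; Usage (hypothetical path):
; MsgBox % ReadNthLine("C:\Temp\big.txt", 1000)
```

Every call starts again from the top of the file, so the cost grows with N, which matches the observation that late lines are slow to reach this way.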
ozzii

Re: LineStr() : Extract any line or consecutive lines from text

06 Apr 2020, 11:29

Delta Pythagorean wrote:
03 Apr 2020, 04:02
Example:
The first example is not OK.
It displays just line #3.
rommmcek

Re: LineStr() : Extract any line or consecutive lines from text

06 Apr 2020, 11:47

There is a typo; use GetStringByLine(Str, 3, 2) instead!
Note: "Last line" (LineNumber 0) retrieves the penultimate array element (the last element is empty)! This is a bug!
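
One way to make the "last line" case work whether or not the text ends with a linefeed is to trim trailing line breaks before splitting. This is only a sketch under that assumption (it also drops any trailing blank lines); LastLine is a hypothetical name and not part of the function posted above.

```autohotkey
; Hypothetical helper: return the last non-empty trailing line of Str.
; Trailing CR/LF characters are removed first, so a final linefeed does not
; produce an empty last array element.
LastLine(Str) {
    Lines := StrSplit(RTrim(Str, "`r`n"), "`n", "`r")
    Return Lines[Lines.MaxIndex()]
}

WithLF    := "Line_1`nLine_2`n"
WithoutLF := "Line_1`nLine_2"
MsgBox % LastLine(WithLF) " / " LastLine(WithoutLF)   ; both parts show Line_2
```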
SKAN

Re: LineStr() : Extract any line or consecutive lines from text

06 Apr 2020, 14:42

SpeedMaster wrote: I found a similar function that can solve my problem with a dumpfile.
LineDelete() by Cuadrix https://www.autohotkey.com/boards/viewtopic.php?t=46520
I wasn't aware of that function. Well written.
I like the LF/CRLF auto detect feature of that function and have implemented it in LineStr()

Code updated: LineStr() v0.9b. CRLF will be auto-detected.

@SpeedMaster : Many thanks. :thumbup:
@rommmcek : Thanks. I will let you know if I come up with something useful. :thumbup:
Cerberus

Re: LineStr() : Extract any line or consecutive lines from text

06 Apr 2020, 22:16

@SKAN Very interesting! Do you think reading a file to memory, then using your function would be faster than using FileReadLine on a line in the middle of a very large text file?
SKAN

Re: LineStr() : Extract any line or consecutive lines from text

06 Apr 2020, 22:49

Cerberus wrote:Very interesting!
Thanks. :)
Cerberus wrote:Do you think reading a file to memory, then using your function would be faster than using FileReadLine on a line in the middle of a very large text file?
It depends on how large the file is.
If it is a few tens of MB (at least not exceeding #MaxMem), then it should be okay!
When you use FileRead to load the file, AHK will convert the text to ANSI/Unicode according to the AHK version being used.
This slows down loading the contents into a variable.

I'm currently exploring the possibilities for an alternative to FileReadLine.
I will let you know if I come up with something.
Cerberus

Re: LineStr() : Extract any line or consecutive lines from text

08 Apr 2020, 21:28

@SKAN Thanks!
So, if I understand this correctly, with a 10 MB file, your method should be faster than FileReadLine? Or are both methods about equally fast?
SKAN

Re: LineStr() : Extract any line or consecutive lines from text

09 Apr 2020, 05:53

Cerberus wrote:
08 Apr 2020, 21:28
@SKAN Thanks!
So, if I understand this correctly, with a 10 MB file, your method should be faster than FileReadLine? Or are both methods about equally fast?
LineStr() should be faster and, more importantly, more flexible. For example, reading the last n lines of a file isn't possible with FileReadLine.
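
As an illustration of the "last n lines" case: going by the usage examples in the first post (LineStr(H, 0) is the last line, LineStr(H, -1) the last 2 lines), the last N lines would be LineStr(Text, 1 - N). The same idea can be sketched directly with a reverse InStr() search; TailLines is a hypothetical helper, not part of LineStr().

```autohotkey
; Hypothetical helper: return the last N lines of S.
; InStr() with StartingPos 0 searches from the end of the string (AHK v1.1),
; so Occurrence N finds the N-th delimiter counted from the end.
TailLines(ByRef S, N, D := "`n") {
    If (N < 1)
        Return ""
    F := InStr(S, D, false, 0, N)
    Return F ? SubStr(S, F + StrLen(D)) : S   ; fewer than N+1 lines: return everything
}

Text := "Line_1`nLine_2`nLine_3`nLine_4"
MsgBox % TailLines(Text, 2)   ; shows Line_3 and Line_4
```

If the text ends with a line break, the final "line" is empty and counts as one of the N lines, so trimming the trailing delimiter first may be desirable.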
Cerberus

Re: LineStr() : Extract any line or consecutive lines from text

12 Apr 2020, 23:04

SKAN wrote:
09 Apr 2020, 05:53
Cerberus wrote:
08 Apr 2020, 21:28
@SKAN Thanks!
So, if I understand this correctly, with a 10 MB file, your method should be faster than FileReadLine? Or are both methods about equally fast?
LineStr() should be faster and, more importantly, more flexible. For example, reading the last n lines of a file isn't possible with FileReadLine.
Great!
