
remove blank lines

Posted: 07 Oct 2014, 13:36
by DataLife
Using this text file https://dl.dropboxusercontent.com/u/976 ... esting.ahk

I need to remove all blank lines.

I have tried this

Code: Select all

FileRead,Var,Testing.ahk
Loop  
 {
  StringReplace, Var, Var, `r`n`r`n, `r`n, UseErrorLevel
  if ErrorLevel = 0  ; No more replacements needed.
   break
 }
I have also tried "AHK No Blank Lines" by awwannaknow found here http://www.autohotkey.com/board/topic/7 ... ank-lines/

Both approaches leave blank lines.

Also, putting the text in a var then attempting to remove the blank lines does not work.

Code: Select all

var =
(
The Egyptian pyramids are ancient pyramid-shaped masonry structures located in Egypt.

There are 138 pyramids discovered in Egypt as of 2008.[1][2] Most were built as tombs for the country's Pharaohs and their consorts during the Old and Middle Kingdom periods.[3][4][5]

The earliest known Egyptian pyramids are found at Saqqara, northwest of Memphis. The earliest among these is the Pyramid of Djoser (constructed 2630 BC–2611 BC) which was built during the third dynasty. This pyramid and its surrounding complex were designed by the architect Imhotep, and are generally considered to be the world's oldest monumental structures constructed of dressed masonry.[6] The estimate of the number of workers to build the pyramids range from a few thousand, twenty thousand, and up to 100,000

  
  
 

The most famous Egyptian pyramids are those found at Giza, on the outskirts of Cairo. Several of the Giza pyramids are counted among the largest structures ever built.[9] The Pyramid of Khufu at Giza is the largest Egyptian pyramid. It is the only one of the Seven Wonders of the Ancient World still in existence.
  
By the time of the early dynastic period of Egyptian history, those with sufficient means were buried in bench-like structures known as mastabas.[10][11]

The second historically documented Egyptian pyramid is attributed to the architect Imhotep, who planned what Egyptologists believe to be a tomb for the pharaoh Djoser. Imhotep is credited with being the first to conceive the notion of stacking mastabas on top of each other – creating an edifice composed of a number of "steps" that decreased in size towards its apex. The result was the Step Pyramid of Djoser – which was designed to serve as a gigantic stairway by which the soul of the deceased pharaoh could ascend to the heavens. Such was the importance of Imhotep's achievement that he was deified by later Egyptians
)

Loop  
 {
  StringReplace, Var, Var, `r`n`r`n, `r`n, UseErrorLevel
  if ErrorLevel = 0  ; No more replacements needed.
   break
 }
 
 MsgBox % var
 
I know I can use a parsing loop:

Code: Select all

var =
(
The Egyptian pyramids are ancient pyramid-shaped masonry structures located in Egypt.

There are 138 pyramids discovered in Egypt as of 2008.[1][2] Most were built as tombs for the country's Pharaohs and their consorts during the Old and Middle Kingdom periods.[3][4][5]

The earliest known Egyptian pyramids are found at Saqqara, northwest of Memphis. The earliest among these is the Pyramid of Djoser (constructed 2630 BC–2611 BC) which was built during the third dynasty. This pyramid and its surrounding complex were designed by the architect Imhotep, and are generally considered to be the world's oldest monumental structures constructed of dressed masonry.[6] The estimate of the number of workers to build the pyramids range from a few thousand, twenty thousand, and up to 100,000

  
  
 

The most famous Egyptian pyramids are those found at Giza, on the outskirts of Cairo. Several of the Giza pyramids are counted among the largest structures ever built.[9] The Pyramid of Khufu at Giza is the largest Egyptian pyramid. It is the only one of the Seven Wonders of the Ancient World still in existence.
  
By the time of the early dynastic period of Egyptian history, those with sufficient means were buried in bench-like structures known as mastabas.[10][11]

The second historically documented Egyptian pyramid is attributed to the architect Imhotep, who planned what Egyptologists believe to be a tomb for the pharaoh Djoser. Imhotep is credited with being the first to conceive the notion of stacking mastabas on top of each other – creating an edifice composed of a number of "steps" that decreased in size towards its apex. The result was the Step Pyramid of Djoser – which was designed to serve as a gigantic stairway by which the soul of the deceased pharaoh could ascend to the heavens. Such was the importance of Imhotep's achievement that he was deified by later Egyptians
)



Loop, Parse, Var,`n, 
 {
  if A_LoopField <> 
	var2 := (var2 "`n" A_loopfield )
}
MsgBox % var2 
but this is very slow on large variables.
Does anyone have suggestions on how to remove all blank lines from a large variable quickly?

thanks
DataLife

Re: remove blank lines

Posted: 07 Oct 2014, 13:58
by Blackholyman
Try something like this:

Code: Select all

string=
(
The Egyptian pyramids are ancient pyramid-shaped masonry structures located in Egypt.

There are 138 pyramids discovered in Egypt as of 2008.[1][2] Most were built as tombs for the country's Pharaohs and their consorts during the Old and Middle Kingdom periods.[3][4][5]

The earliest known Egyptian pyramids are found at Saqqara, northwest of Memphis. The earliest among these is the Pyramid of Djoser (constructed 2630 BC–2611 BC) which was built during the third dynasty. This pyramid and its surrounding complex were designed by the architect Imhotep, and are generally considered to be the world's oldest monumental structures constructed of dressed masonry.[6] The estimate of the number of workers to build the pyramids range from a few thousand, twenty thousand, and up to 100,000

  
  
 

The most famous Egyptian pyramids are those found at Giza, on the outskirts of Cairo. Several of the Giza pyramids are counted among the largest structures ever built.[9] The Pyramid of Khufu at Giza is the largest Egyptian pyramid. It is the only one of the Seven Wonders of the Ancient World still in existence.
  
By the time of the early dynastic period of Egyptian history, those with sufficient means were buried in bench-like structures known as mastabas.[10][11]

The second historically documented Egyptian pyramid is attributed to the architect Imhotep, who planned what Egyptologists believe to be a tomb for the pharaoh Djoser. Imhotep is credited with being the first to conceive the notion of stacking mastabas on top of each other – creating an edifice composed of a number of "steps" that decreased in size towards its apex. The result was the Step Pyramid of Djoser – which was designed to serve as a gigantic stairway by which the soul of the deceased pharaoh could ascend to the heavens. Such was the importance of Imhotep's achievement that he was deified by later Egyptians
)

newstring := RegExReplace(string, "\v+", "`n")
msgbox % newstring
return

Re: remove blank lines

Posted: 07 Oct 2014, 14:03
by kon
Removes lines that only contain whitespace characters too:
NewString := RegExReplace(String, "(^|\r?\n)\K(\s*\r?\n)+")

Re: remove blank lines

Posted: 07 Oct 2014, 14:28
by DataLife
Blackholyman wrote:try something like this

Code: Select all

newstring := RegExReplace(string, "\v+", "`n")
msgbox % newstring
return
Perfect, thanks very much

Re: remove blank lines

Posted: 07 Oct 2014, 14:30
by DataLife
kon wrote:Removes lines that only contain whitespace characters too:
NewString := RegExReplace(String, "(^|\r?\n)\K(\s*\r?\n)+")
Blackholyman's code seems to do that too. Am I correct?

Can you give me an example?

Edit 2:28pm CST
Never mind, I found an example. Thanks very much; this saved me a huge headache.

Re: remove blank lines

Posted: 07 Oct 2014, 14:34
by kon
Try Blackholyman's example, but read "string" from a file (using FileRead).
In his example, AHK automatically strips the spaces from the blank lines because the text sits in a continuation section.
If the string is read from a file, the spaces/tabs on blank lines are preserved, so lines that contain only whitespace characters will not be removed.

Edit: See Splitting a Long Line into a Series of Shorter Ones, specifically RTrim0.
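
The difference described above can be reproduced without a file. A minimal sketch, assuming AutoHotkey v1.1; the sample string is hypothetical and is built with an expression so that the spaces and tabs on the otherwise empty lines survive, as they would after FileRead:

```autohotkey
; Hypothetical sample: `r`n line endings, one line of spaces, one line with a tab.
String := "first`r`n   `r`n`t`r`nsecond`r`n`r`nthird"

; \v+ only collapses runs of vertical whitespace (line breaks);
; the lines holding spaces or a tab are left in place.
A := RegExReplace(String, "\v+", "`n")

; This pattern also consumes lines that consist solely of whitespace.
B := RegExReplace(String, "(^|\r?\n)\K(\s*\r?\n)+")

MsgBox % "\v+ result:`n" A "`n`nWhitespace-aware result:`n" B
```

With the first pattern the whitespace-only lines should still be visible in the message box, while the second should leave only the three words, each on its own line.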

Re: remove blank lines

Posted: 07 Oct 2014, 15:51
by DataLife
One more thing. When using the sort command to sort a variable alphabetically it takes into account blank spaces at the beginning of each line.

I know I can use a parsing loop to remove the spaces with

Code: Select all

var := RegExReplace(var, "(^\s*|\s*$)") 
then rebuild the variable with the new lines without leading spaces but on large files this takes a long time.

Is there a way to tell the sort command to ignore spaces at the beginning or maybe use RegExReplace to remove them?

Code: Select all

var =
(
                 The Egyptian pyramids are ancient pyramid-shaped 

   There are 138 pyramids discovered in Egypt as of 2008.[1][2] 

      The earliest known Egyptian pyramids are found at Saqqara,

  
  
 

          The most famous Egyptian pyramids are those found at Giza
  
   By the time of the early dynastic period of Egyptian history

  The second historically documented Egyptian pyramid is attributed
)

var := RegExReplace(var, "\v+", "`n")
var := RegExReplace(var, "(^|\r?\n)\K(\s*\r?\n)+")

sort, var
MsgBox % var

Re: remove blank lines

Posted: 07 Oct 2014, 21:38
by kon

Code: Select all

var =
(RTrim0
                 The Egyptian pyramids are ancient pyramid-shaped

   There are 138 pyramids discovered in Egypt as of 2008.[1][2]

      The earliest known Egyptian pyramids are found at Saqqara,

 
 
 

          The most famous Egyptian pyramids are those found at Giza
 
   By the time of the early dynastic period of Egyptian history

  The second historically documented Egyptian pyramid is attributed
)

var := RegExReplace(var, "(^|\r?\n)\K(\s*\r?\n?)+")

Sort, var
MsgBox % var

Re: remove blank lines

Posted: 07 Oct 2014, 22:12
by lexikos
DataLife wrote:Also, putting the text in a var then attempting to remove the blank lines does not work.
About continuation sections, the manual wrote:Join: Specifies how lines should be connected together. If this option is omitted, each line except the last will be followed by a linefeed character (`n).
DataLife wrote:StringReplace, Var, Var, `r`n`r`n, `r`n, UseErrorLevel
See the problem?
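
To spell the hint out: the continuation section puts only `n between lines, while the loop searches for `r`n`r`n, so nothing ever matches. A minimal sketch of one possible fix, assuming AutoHotkey v1.1 and a hypothetical short sample; it first normalizes `r`n to `n so the same loop also works on text loaded with FileRead:

```autohotkey
; Hypothetical sample; a continuation section joins its lines with `n only.
var =
(
one

two


three
)

; Normalize Windows line endings so FileRead text behaves the same way.
StringReplace, var, var, `r`n, `n, All

Loop
{
    ; Replace every pair of consecutive linefeeds with a single linefeed.
    StringReplace, var, var, `n`n, `n, UseErrorLevel
    if (ErrorLevel = 0)  ; No more replacements needed.
        break
}
MsgBox % var
```

This only removes lines that are completely empty; lines holding spaces or tabs need one of the RegExReplace patterns from this thread.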

Re: remove blank lines

Posted: 07 Oct 2014, 23:00
by ahcahc
var := RegExReplace(var,"`am)^\s+")

Re: remove blank lines

Posted: 07 Oct 2014, 23:42
by DataLife
ahcahc wrote:var := RegExReplace(var,"`am)^\s+")
Works perfectly.
That is amazing, one line of code removes all leading spaces on every line.

I am going to study RegExReplace and see if I can figure out what is going on.

thanks very much.
DataLife

Re: remove blank lines

Posted: 08 Oct 2014, 04:19
by Guest
Using https://github.com/hi5/TF:
TF_RemoveBlankLines("Testing.ahk") ; use "!Testing.ahk" to overwrite source file

Re: remove blank lines

Posted: 08 Oct 2014, 23:20
by DataLife
ahcahc wrote:var := RegExReplace(var,"`am)^\s+")
I studied this line and I understand how this works.

options....
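`a - recognizes any kind of newline (`r, `n or `r`n)
m - multiline: ^ and $ match at the start and end of every line
) - end of options

pattern...
^ - ties the pattern to the beginning of a line (because of the m option)
\s - any whitespace character, including newlines
+ - matches one or more of the preceding character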
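
A short sketch for checking the breakdown above, assuming AutoHotkey v1.1. The sample string is hypothetical; it uses `r`n line endings, indented lines and one line that holds only a tab:

```autohotkey
; Hypothetical sample with indentation and a whitespace-only line.
var := "   alpha`r`n`t`r`n  beta`r`n gamma"

; `a  -> treat `r, `n and `r`n all as newlines
; m   -> ^ matches at the start of every line, not just the start of the string
; \s+ -> one or more whitespace characters, newlines included, so a blank
;        line is swallowed together with the indentation of the next line
var := RegExReplace(var, "`am)^\s+")

MsgBox % var
```

Because \s also matches line breaks, the same pattern that strips indentation should remove the whitespace-only line as well.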


Since I am new to Regex I am struggling with something simple.

I want to also remove whitespace from the end of the string.

So I tried...
$ - ties the pattern to the end of a line (with the m option)

Code: Select all

var := RegExReplace(var,"`am)^\s$")
I think this does not work because, as the manual says about anchors, "there must be no other characters before or after it".
That would mean the whitespace could have no characters before or after it.

I know I can do this...

Code: Select all

var := RegExReplace(var,"`am)^\s+")
var := RegExReplace(var, "(^\s*|\s*$)")
but I would like to remove blank lines and all whitespace before and after each line with a single regex, if possible.

thanks
DataLife

Re: remove blank lines

Posted: 09 Oct 2014, 00:31
by ahcahc
var := RegExReplace(var,"`am)^\s+|\s+?(?=\R$)")
var := RegExReplace(RegExReplace(var,"`am)^\s+"),"\s*$")

Re: remove blank lines

Posted: 09 Oct 2014, 03:05
by DataLife
ahcahc wrote:var := RegExReplace(var,"`am)^\s+|\s+?(?=\R$)")
var := RegExReplace(RegExReplace(var,"`am)^\s+"),"\s*$")
that works.
Very interesting.

thanks very much
DataLife

Re: remove blank lines

Posted: 10 Oct 2014, 04:23
by lexikos
It's not "with one regex", though...

This should do it:

Code: Select all

t=
(

  one
  
t w o   

   three
   
)
t := RegExReplace(t, "^\s+|\s*(\R|$)\s*", "$1")

Loop Parse, t, `n
    s .= "|" A_LoopField "|`n"
MsgBox % s
^\s+ removes all whitespace at the start of the string (not each line).
\s* matches as much whitespace as possible, possibly including multiple newlines.
(\R|$) matches a newline or ensures that the space is at the very end of the string. If not at the end of the string, back-tracking is required (because the newline was already matched by \s*).
\s* matches whitespace after a newline (i.e. at the beginning of the next line).

This should also work:

Code: Select all

t := RegExReplace(t, "`am)^\s+|\h+$|\R\z")
^\s+ matches whitespace which starts at the beginning of a line. For blank lines, the match includes the line-ending and continues into the next line.
\h+$ matches horizontal whitespace at the end of a line (using \s here would remove all newlines).
\R\z removes a final newline from the very end of the string, if present. \z matches only at the end of the string, like $ when the m option isn't used.
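
The second pattern can be wrapped in a small function for text that comes from FileRead. This is only a sketch, assuming AutoHotkey v1.1; TrimLines is a hypothetical helper name (not a built-in), and the sample string stands in for file contents with `r`n line endings:

```autohotkey
; TrimLines is a hypothetical helper name, not part of AutoHotkey.
TrimLines(text)
{
    ; `a   : accept `r, `n or `r`n as line endings (FileRead keeps `r`n)
    ; m    : make ^ and $ work per line
    ; ^\s+ : leading whitespace, including whole blank lines
    ; \h+$ : trailing spaces/tabs at the end of a line
    ; \R\z : one final newline at the very end of the text
    return RegExReplace(text, "`am)^\s+|\h+$|\R\z")
}

; Hypothetical sample standing in for FileRead output.
sample := "`r`n   one  `r`n`t`r`nt w o`t`r`n`r`n  three`r`n"

; Brackets make any leftover leading or trailing whitespace visible.
MsgBox % "[" TrimLines(sample) "]"
```

Keeping the pattern in one place makes it easy to swap in the first variant, or to call the function on the result of FileRead before using Sort.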

Re: remove blank lines

Posted: 11 Oct 2014, 04:44
by DataLife
@lexikos

Both work great!

thanks for the explanation.
All of this RegEx stuff is in black and white in the help manual but putting it all together is difficult for me.

thanks
DataLife

Re: remove blank lines

Posted: 11 Oct 2014, 13:11
by garry
Thank you for the regex examples.
I was just playing with a "big MsgBox" to see the result better (it is only an Edit control in a second GUI).

Code: Select all

;========== MSGBOXx Editx 2ndGUIx ===========================
;- shows result in a big screen editfieldx ------------------
;- regex example to remove blank lines
;-http://ahkscript.org/boards/viewtopic.php?f=5&t=4810

/*
preselect:=a_desktop
FileSelectFile, F1,1,%preselect%,Select your file,*.txt;*.bas
if F1 =
   return
Fileread,string,%f1%
*/

;/*
string=
(

             Line1

Line2

Line3

)
;*/

;- regex to remove blank lines ---------
e4x := RegExReplace(string, "`am)^\s+|\h+$|\R\z")

;- show this result in a "big-msgbox" (Edit)
gosub,msgbox2
msgbox, 262208, ,%e4x%
return
;-------------------------------------------------------------

;-------------------------------------------------------------
Msgbox2:
WA=%A_screenwidth%
HA=%A_screenheight%
H1 :=(HA*95)/100
W1 :=(WA*99)/100
EH1:=(HA*94)/100
EW1:=(WA*98)/100

#NoEnv
SendMode Input
SetWorkingDir %A_ScriptDir%
Gui,2:default
GUI,2:+AlwaysOnTop
;- colorsx colorx --
edcol1=black
edcol2=yellow
c0=D4D0C8  ;- gray normal msgbox
c1=000000  ;- black
c2=FFE07B
c3=E2C577
c4=CEB88A  ;-brown
c5=D5E66D  ;-yellow
c6=BFBFBF  ;-gray

Gui,2: Color, ControlColor,%c1%
Gui,2: Font, CDefault, FixedSys

Name2=MSGBOX-2
Gui,2:add,Edit,   x10  y10 c%edcol2% h%eh1% w%ew1%  -wrap hscroll,%e4x%
Gui,2:show,x1 y1 h%h1% w%w1%,%name2%
Send,^{end}
Send,{enter}
return

2Guiclose:
exitapp
;======================== end script ==========================