Text Handling - Split a sentence into several small ones [WordWrapper]

IMEime
Posts: 750
Joined: 20 Sep 2014, 06:15

16 Dec 2018, 04:29

Yes, this is quite a strange situation.
But it was a real one for me,
and I solved it somewhat like this.

I know it is not the best (or even a good) solution, but it is my best.
If possible, I would like to learn a nicer approach.

< Situation >
-A sentence is given.
-I have to split it into several lines; in this sample code the line count is 5.
-When splitting, do your best to keep the lines even in length.
-While splitting, do not break any word, and no line should start with a space.

The given sentence is:
"The Open Web Application Security Project(OWASP) provides a very good list of the Top 10 web application security flaws, including an summary of the nature, severity and impact of each."

My attempts were as follows:

First try
-simply character by character
-exactly 37 characters on each line
-its arrangement is the most even
-but it looks silly: it splits words, and a space appears at the start of some lines

; the first number is the character count of the line, the second number is its word count

;37, 6    The Open Web Application Security Pro
;37, 6    ject(OWASP) provides a very good list
;37, 7     of the Top 10 web application securi
;37, 7    ty flaws, including an summary of the
;37, 6     nature, severity and impact of each.


;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
characterBasedSentenceArrangement(문장, 줄수)
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
{
	StringSplit, 글자, 문장
	전체글자수 := StrLen( 문장)
	줄당글자수 := Floor( 전체글자수/줄수) 				
	잔여글자수 := Mod( 전체글자수, 줄수)    ; leftover characters: remainder against the line count, not the per-line count
	문장재구성 := Array()	
	Loop % 줄수
	{
		글자한개추가 := 0
		If ( A_Index <= 잔여글자수)
			글자한개추가 := 1
		이번줄내용 := "" 
		Loop % 줄당글자수 + 글자한개추가
		{
			색인 ++		
			이번줄내용 .= 글자%색인% 
		}
		문장재구성.Insert( 이번줄내용 )
	}
	For Each, 한줄내용 in 문장재구성				
	{
		RegExReplace( RegExReplace( 한줄내용, "\s+", " "), "\S+", "", 한줄단어수 )	
		결과 .= StrLen( 한줄내용) ", " 한줄단어수 "    " 한줄내용 "`n"		
	}
	StringTrimRight, 결과, 결과, 1    
	Return 결과
}
Second try
-word by word
-exactly 6 words on each line
-the arrangement is the worst, because each word has its own length
-no broken words at all


;49, 6    The Open Web Application Security Project(OWASP) 
;29, 6    provides a very good list of 
;36, 6    the Top 10 web application security 
;35, 6    flaws, including an summary of the 
;37, 6    nature, severity and impact of each.


;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
wordBasedSentenceArrangement(문장, 줄수)
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
{
	StringSplit, 단어, 문장, % A_Space
	전체단어수 := 단어0
	줄당단어수 := 전체단어수//줄수 				    
	잔여단어수 := Mod( 전체단어수, 줄수)    ; leftover words: remainder against the line count, not the per-line count
	문장재구성 := Array()	
	Loop % 줄수
	{
		단어한개추가 := 0
		If ( A_Index <= 잔여단어수)
			단어한개추가 := 1
		이번줄내용 := "" 
		Loop % 줄당단어수 + 단어한개추가
		{
			색인 ++		
			이번줄내용 .= 단어%색인% " "
		}
		문장재구성.Insert( 이번줄내용 )
	}
	For Each, 한줄 in 문장재구성				
	{
		RegExReplace( RegExReplace( 한줄, "\s+", " "), "\S+", "", 한줄단어수 )	
		결과 .= StrLen( 한줄) ", " 한줄단어수 "    " 한줄 "`n"		
	}
	StringTrimRight, 결과, 결과, 1  
	Return 결과
}
Last try
-modification of the first
-it is my best


;34, 5    The Open Web Application Security 
;41, 6    Project(OWASP) provides a very good list 
;39, 7    of the Top 10 web application security 
;35, 6    flaws, including an summary of the 
;36, 6    nature, severity and impact of each.


;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
characterBasedSentenceArrangementModified(문장, 줄수)
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
{
	StringSplit, 글자, 문장
	쪽집게 := Array()		
	Loop % 글자0
	{
		StringLeft, 좌측전체, 문장, % A_Index 								
		StringTrimLeft,	우측전체, 문장 , % A_Index  					
		단어전반 := RegExReplace( 좌측전체, ".*?([^ ]*)$", "${1}")
		단어후반 := RegExReplace( 우측전체, "^([^ ]*).*", "${1}")
		쪽집게.Insert([단어전반, 단어후반])                                 
	}
	전체글자수 := StrLen( 문장)
	줄당글자수 := Floor( 전체글자수/줄수) 			
	잔여글자수 := Mod( 전체글자수, 줄수)    ; leftover characters: remainder against the line count, not the per-line count
	문장재구성 := Array()		
	색인 := 1
	Loop % 줄수
	{
		글자한개추가 := 0
		If ( A_Index <= 잔여글자수)
			글자한개추가 := 1
		이번줄글자개수 := 줄당글자수 + 글자한개추가
		이번줄맨끝색인 := 색인 + 이번줄글자개수
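		If ( A_Index < 줄수 )                                  ; every line except the last
		{
			맨끝단어전반길이 := StrLen( 쪽집게[이번줄맨끝색인][1])      ; part of the cut word left of the cut
			맨끝단어후반길이 := StrLen( 쪽집게[이번줄맨끝색인][2])      ; part of the cut word right of the cut
			If ( 맨끝단어전반길이 < 맨끝단어후반길이)                   ; left part is shorter: push the word to the next line
				이번줄글자개수 := 이번줄글자개수 - 맨끝단어전반길이
			Else                                                      ; otherwise keep the whole word on this line
				이번줄글자개수 := 이번줄글자개수 + 맨끝단어후반길이 + 1
		}
		Else                                                   ; last line: take everything that remains
			이번줄글자개수 := 전체글자수 - 색인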
		이번줄내용 := "" 
		Loop % 이번줄글자개수 + 1
		{
			이번줄내용 .= 글자%색인% 	
			색인 ++	
		}
		문장재구성.Insert( 이번줄내용)
	}
	For Each, 한줄 in 문장재구성				
	{
		RegExReplace( RegExReplace( 한줄, "\s+", " "), "\S+", "", 한줄단어수 )	
		결과 .= StrLen( 한줄) ", " 한줄단어수 "    " 한줄 "`n"		
	}
	StringTrimRight, 결과, 결과, 1  
	Return 결과
}
Any good tips?

Thanks

[EDIT]
Ah... I just figured something out about the last line.
Its condition is very simple:
if it has at least one word (including the final period), that is good enough.
So the solution could be somewhat easier, or more difficult? I do not know...
Last edited by IMEime on 17 Dec 2018, 05:43, edited 2 times in total.
CyL0N
Posts: 211
Joined: 27 Sep 2018, 09:58

Re: Text Handling - Split a sentence into several small ones.

17 Dec 2018, 01:46

Wordwrap then...
I use this one, it's rather simple & works perfectly... https://www.rosettacode.org/wiki/Word_wrap#AutoHotkey

Although, I don't quite understand the need to split a sentence into a specific number of lines instead of wrapping at a maximum character width, given you're dealing with a sentence... Regardless, the function I linked is just as useful: you could simply loop over the character width until the resulting word wrap has the desired number of lines.


str = The Open Web Application Security Project(OWASP) provides a very good list of the Top 10 web application security flaws, including an summary of the nature, severity and impact of each.

MsgBox % WrapText(str, 30)

;or to a specific number of lines...
maxNumberOfLines := 5
Loop
	lineWrapped := WrapText(str,A_Index)
Until StringCharCount(lineWrapped,"`n") < maxNumberOfLines
MsgBox % lineWrapped




WrapText(Text, LineLength) {
	StringReplace, Text, Text, `r`n, %A_Space%, All
	while (p := RegExMatch(Text, "(.{1," LineLength "})(\s|\R+|$)", Match, p ? p + StrLen(Match) : 1))
		Result .= Match1 ((Match2 = A_Space || Match2 = A_Tab) ? "`n" : Match2)
	return, Result
}

;Returns the number of occurrences of a character in a string
StringCharCount(string, char){
	StringReplace, string, string, %char%, %char%, UseErrorLevel
	Return ErrorLevel
}



Cheers.
Last edited by CyL0N on 17 Dec 2018, 07:19, edited 1 time in total.
ozzii
Posts: 481
Joined: 30 Oct 2013, 06:04

Re: Text Handling - Split a sentence into several small ones.

17 Dec 2018, 03:55

@CyL0N

It must be

Until StringCharCount(lineWrapped,"`n") < maxNumberOfLines
or else you have one line more :D
jeeswg
Posts: 6902
Joined: 19 Dec 2016, 01:58
Location: UK

Re: Text Handling - Split a sentence into several small ones.

17 Dec 2018, 04:02

I wrote some similar code, here:
add word wrap to a string - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=6&t=59461
IMEime

Re: Text Handling - Split a sentence into several small ones.

17 Dec 2018, 05:25

CyL0N wrote:
17 Dec 2018, 01:46
Wordwrap then...
I use this one, it's rather simple & works perfectly... https://www.rosettacode.org/wiki/Word_wrap#AutoHotkey
Wow, thanks, good info.

I had no idea about any previous work; I just wrote my code blindly.
I did not even know the name "WordWrapper".
That looks like good code.
Thanks again.
IMEime

Re: Text Handling - Split a sentence into several small ones.

17 Dec 2018, 05:27

jeeswg wrote:
17 Dec 2018, 04:02
I wrote some similar code, here:
add word wrap to a string - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=6&t=59461
Good.
I'll look into it when I have some free time.
Thanks.
IMEime

Re: Text Handling - Split a sentence into several small ones.

17 Dec 2018, 05:41

I changed the title of this post a little to include "WordWrapper".
Cheers!
IMEime

Re: Text Handling - Split a sentence into several small ones [WordWrapper]

17 Dec 2018, 06:27

Wow, nice code.

myString := "The Open Web Application Security Project(OWASP) provides a very good list of the Top 10 web application security flaws, including an summary of the nature, severity and impact of each."
myTargetLineCounts := 5
Loop 
{
	myWidth := A_Index
	myCase := WrapText(myString, myWidth)
	myLineCount := StringCharCount( myCase, "`n") + 1
	If ( myPreviousCase = myCase )  ;  remove duplicate
		Continue
	myPreviousCase := myCase
	If ( myLineCount < myTargetLineCounts)		
		Break
	If ( myLineCount > myTargetLineCounts)		
		Continue
	myCaseIndex ++
	myResults .= """" myWidth """" "  Case_" myCaseIndex  "`n" myCase "`n`n"
}
MsgBox % myResults

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
WrapText( Text, LineLength) 
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
{
;	WordWrapper
;	https://www.rosettacode.org/wiki/Word_wrap#AutoHotkey
	StringReplace, Text, Text, `r`n, % A_Space, All
	While (p := RegExMatch( Text, "(.{1," LineLength "})(\s|\R+|$)", Match, p ? p + StrLen( Match) : 1))
		Result .= Match1 ((Match2 = A_Space || Match2 = A_Tab) ? "`n" : Match2)
	Return Result
}
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
StringCharCount( string, char)
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
{
;	Thanks "CyL0N"
;	Returns the number of occurrences of a character in a string
	StringReplace, string, string, %char%, %char%, UseErrorLevel
	Return ErrorLevel
}
Now I have 5 cases!
"40" Case_1
The Open Web Application Security
Project(OWASP) provides a very good list
of the Top 10 web application security
flaws, including an summary of the
nature, severity and impact of each.

"42" Case_2
The Open Web Application Security
Project(OWASP) provides a very good list
of the Top 10 web application security
flaws, including an summary of the nature,
severity and impact of each.

"43" Case_3
The Open Web Application Security
Project(OWASP) provides a very good list of
the Top 10 web application security flaws,
including an summary of the nature,
severity and impact of each.

"44" Case_4
The Open Web Application Security
Project(OWASP) provides a very good list of
the Top 10 web application security flaws,
including an summary of the nature, severity
and impact of each.

"47" Case_5
The Open Web Application Security
Project(OWASP) provides a very good list of the
Top 10 web application security flaws,
including an summary of the nature, severity
and impact of each.
So now I am looking for a way to compare and assess these cases.
I am not good at math, statistics or distribution calculations, so I have to Google.
IMEime

Re: Text Handling - Split a sentence into several small ones [WordWrapper]

17 Dec 2018, 09:25

Finished with simple Math.
My choice should be "Case 1"


myString := "The Open Web Application Security Project(OWASP) provides a very good list of the Top 10 web application security flaws, including an summary of the nature, severity and impact of each."
myTargetLineCounts := 5
Loop 
{
	myWidth := A_Index
	myCase := WrapText(myString, myWidth)
	myLineCount := StringCharCount( myCase, "`n") + 1
	If ( myPreviousCase = myCase )  ;  remove duplicate
		Continue
	myPreviousCase := myCase
	If ( myLineCount < myTargetLineCounts)	
		Break
	If ( myLineCount > myTargetLineCounts)		
		Continue
	myLineArray := StrSplit( myCase, "`n")
	myLineArray.Remove( myLineArray.MaxIndex() )		  ;  except the last line
	myLineArrayTotalWords := 0
	myDeviationValue := 0
	For Each, x in myLineArray			
		myLineArrayTotalWords += StrLen( x)
	myLineArrayMeanWordCount := myLineArrayTotalWords/myLineArray.MaxIndex()  
	For Each, x in myLineArray			
		myDeviationValue += ( Abs( StrLen( x) - myLineArrayMeanWordCount))**2
	myCaseIndex ++
	myResults .= "Case_" myCaseIndex  "     deviationValue_"  Round( myDeviationValue) "`n" myCase "`n`n"
}
MsgBox % myResults 
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
WrapText( Text, LineLength) 
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
{
;	WordWrapper
;	https://www.rosettacode.org/wiki/Word_wrap#AutoHotkey
	StringReplace, Text, Text, `r`n, % A_Space, All
	While (p := RegExMatch( Text, "(.{1," LineLength "})(\s|\R+|$)", Match, p ? p + StrLen( Match) : 1))
		Result .= Match1 ((Match2 = A_Space || Match2 = A_Tab) ? "`n" : Match2)
	Return Result
}
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
StringCharCount( string, char)
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
{
;	Thanks "CyL0N"
;	Returns the number of occurrences of a character in a string
	StringReplace, string, string, %char%, %char%, UseErrorLevel
	Return ErrorLevel
}
Case_1 deviationValue_33
The Open Web Application Security
Project(OWASP) provides a very good list
of the Top 10 web application security
flaws, including an summary of the
nature, severity and impact of each.

Case_2 deviationValue_45
The Open Web Application Security
Project(OWASP) provides a very good list
of the Top 10 web application security
flaws, including an summary of the nature,
severity and impact of each.

Case_3 deviationValue_75
The Open Web Application Security
Project(OWASP) provides a very good list of
the Top 10 web application security flaws,
including an summary of the nature,
severity and impact of each.

Case_4 deviationValue_77
The Open Web Application Security
Project(OWASP) provides a very good list of
the Top 10 web application security flaws,
including an summary of the nature, severity
and impact of each.

Case_5 deviationValue_117
The Open Web Application Security
Project(OWASP) provides a very good list of the
Top 10 web application security flaws,
including an summary of the nature, severity
and impact of each.
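
The deviationValue printed above can be read as the sum of squared deviations of the line lengths from their mean, with the last line left out. As a worked check for Case_1, assuming the lengths are counted without trailing spaces (33, 40, 38 and 34 characters); the symbols \ell_i, \bar{\ell} and D are notation introduced here for illustration, not names from the script:

```latex
\bar{\ell} = \frac{33 + 40 + 38 + 34}{4} = 36.25
\qquad
D = \sum_{i=1}^{4} \left(\ell_i - \bar{\ell}\right)^2
  = 10.5625 + 14.0625 + 3.0625 + 5.0625
  = 32.75 \approx 33
```

The same arithmetic for Case_2 (33, 40, 38 and 42 characters) gives 44.75, which rounds to the 45 shown. A smaller value means the first four lines are closer to each other in length.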
Thanks a lot Guys !!!
IMEime

Re: Text Handling - Split a sentence into several small ones.

17 Dec 2018, 18:28

jeeswg wrote:
17 Dec 2018, 04:02
I wrote some similar code, here:
add word wrap to a string - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=6&t=59461
Well, I have just tested yours.
Yes, it works nicely. Perfect.
I like it too.
Thanks.
IMEime

Re: Text Handling - Split a sentence into several small ones [WordWrapper]

18 Dec 2018, 05:07

I have changed my code a little bit (actually, I changed it totally).
It should probably be replaced with Rosetta's, but I am done. I quit here.

This one gives me "the best" result.
I have no idea how that happened, and I do not care about it.
I just wrote my code blindly, as usual.
When it works, I am 100% satisfied with that.

Regards


;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
myWordWrapper(mySentence, myChracterCount)
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
{
	Loop
	{
		myIndex += StrLen( myModified) = 0 ? 1 : StrLen( myModified) 	
		myNotModified := SubStr( mySentence, myIndex, myChracterCount)
		If (myIndex + StrLen( myNotModified) >= StrLen( mySentence))
		{
			myResult .= myNotModified "`n"	
			Break
		}
		myModified := RegExReplace( myNotModified, "\S+$")				 
		If (myModified = "")											
			myResult .= myNotModified "`n"
		Else
			myResult .= myModified "`n" 
	}
	Return SubStr( myResult, 1, - 1)
}
Last edited by IMEime on 18 Dec 2018, 10:36, edited 1 time in total.
IMEime

Re: Text Handling - Split a sentence into several small ones [WordWrapper]

18 Dec 2018, 10:33

Some of the results
-ordered from the lowest deviation value
-ignoring the last line (5th)
-just for fun
myCode deviationValue_21
The Open Web Application Security
Project(OWASP) provides a very good
list of the Top 10 web application
security flaws, including an summary of
the nature, severity and impact of each.

myCode deviationValue_33
The Open Web Application Security
Project(OWASP) provides a very good list
of the Top 10 web application security
flaws, including an summary of the
nature, severity and impact of each.

JEE deviationValue_33
The Open Web Application Security
Project(OWASP) provides a very good list
of the Top 10 web application security
flaws, including an summary of the
nature, severity and impact of each.

Rosetta deviationValue_33
The Open Web Application Security
Project(OWASP) provides a very good list
of the Top 10 web application security
flaws, including an summary of the
nature, severity and impact of each.

myCode deviationValue_45
The Open Web Application Security
Project(OWASP) provides a very good list
of the Top 10 web application security
flaws, including an summary of the nature,
severity and impact of each.

JEE deviationValue_45
The Open Web Application Security
Project(OWASP) provides a very good list
of the Top 10 web application security
flaws, including an summary of the nature,
severity and impact of each.
-Not ignoring the last line
-the order is not so beautiful, because I'm bad at array sorting (but the trend is not so bad)
myCode deviationValue_32
The Open Web Application Security
Project(OWASP) provides a very good
list of the Top 10 web application
security flaws, including an summary of
the nature, severity and impact of each.

myCode deviationValue_34
The Open Web Application Security
Project(OWASP) provides a very good list
of the Top 10 web application security
flaws, including an summary of the
nature, severity and impact of each.

JEE deviationValue_33
The Open Web Application Security
Project(OWASP) provides a very good list
of the Top 10 web application security
flaws, including an summary of the
nature, severity and impact of each.

Rosetta deviationValue_33
The Open Web Application Security
Project(OWASP) provides a very good list
of the Top 10 web application security
flaws, including an summary of the
nature, severity and impact of each.

JEE deviationValue_129
The Open Web Application Security
Project(OWASP) provides a very good list
of the Top 10 web application security
flaws, including an summary of the nature,
severity and impact of each.

Rosetta deviationValue_129
The Open Web Application Security
Project(OWASP) provides a very good list
of the Top 10 web application security
flaws, including an summary of the nature,
severity and impact of each.
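
On the sorting point above: one way to order such results without hand-written array sorting is the built-in Sort command of AutoHotkey v1 with its N (numeric) option. This is only a minimal sketch, assuming each result is kept as one line that starts with its deviation value. The variable name myScores and the sample entries are made up for illustration, and it relies on N comparing the number at the start of each line.

```autohotkey
; Hypothetical sample entries in the form "deviationValue<TAB>label".
; In a real script these would be appended inside the loop that scores each case.
myScores := "45`tmyCode`n33`tRosetta`n129`tJEE`n21`tmyCode"

; N requests a numeric sort, so 129 is placed after 45
; instead of before 21 as a plain text sort would do.
Sort, myScores, N

MsgBox % myScores
```

Putting the score first on each line keeps the label attached to its value while sorting, so no separate index array is needed.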
