Remove repetitions from a file

 
AutoHotkey Community Forum Index -> Ask for Help
sosaited
Joined: 24 Feb 2005
Posts: 233

Posted: Tue Oct 24, 2006 8:34 am    Post subject: Remove repetitions from a file

I have a file that has entries like these:

Code:
Extension=avi
FileName=smallville
Extension=exe,zip
Extension=mpg,mpeg,avi,wmv,mov
Extension=exe,zip
FileName=bearshare
Extension=exe,zip
FileName=BSPROINSTALL.exe
Extension=
FileName=
Extension=avi
FileName=smallville
Extension=ahk,ini
FileName=index
Extension=htm,html
FileName=rc4
Extension=exe,zip
FileName=bearshare
Extension=doc
FileName=help
Extension=doc,txt
FileName=help
Extension=exe,zip
FileName=winamp
Extension=exe,zip
FileName=icq
Extension=htm,html
FileName=msdn messagebox
Extension=htm,html
FileName=messagebox
Extension=exe,zip
FileName=putty
Extension=jpg,jpeg,bmp,png
FileName=passport
Extension=ttf,fon,zip
FileName=Footlight
Extension=ttf,zip
FileName=Formal_436_BT
Extension=exe,zip
FileName=ocr
Extension=exe,zip
FileName=winamp
Extension=xls
FileName=bss
Extension=xls
FileName=
Extension=jpg,bmp
FileName=mydesk
Extension=jpg,bmp
FileName=my pictures
Extension=avi
FileName=smallville
Extension=iso,zip,rar
FileName=mafia
Extension=exe,zip
FileName=iso
Extension=Folder
FileName=mafia
Extension=exe,zip
FileName=magiciso
Extension=exe,zip
FileName=crack
Extension=exe
FileName=iso
Extension=exe,zip
FileName=iso
Extension=exe,zip
FileName=ultraiso
Extension=exe,zip


I need to create a script that will REMOVE ALL THE REPETITIONS from the file. So far, I am unable to create something working. Can anyone help me, please?
Thanks
_________________
My small "thanks" to AHK in shape of these dedicated 3d images (Topic already in "General" Forum)

SKAN
Joined: 26 Dec 2005
Posts: 6223

Posted: Tue Oct 24, 2006 8:51 am

sosaited wrote:
I am unable to create something working


Shocked

Code:
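FileRead, Text, text.txt          ; load the whole file into the variable Text
Sort, Text, U                     ; sort the lines and drop duplicates (U = unique)
FileAppend, %Text%, NewText.txt   ; write the result to a new file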



This was the output:

Quote:
Extension=
Extension=ahk,ini
Extension=avi
Extension=doc
Extension=doc,txt
Extension=exe
Extension=exe,zip
Extension=exe,zip
Extension=Folder
Extension=htm,html
Extension=iso,zip,rar
Extension=jpg,bmp
Extension=jpg,jpeg,bmp,png
Extension=mpg,mpeg,avi,wmv,mov
Extension=ttf,fon,zip
Extension=ttf,zip
Extension=xls
FileName=
FileName=bearshare
FileName=BSPROINSTALL.exe
FileName=bss
FileName=crack
FileName=Footlight
FileName=Formal_436_BT
FileName=help
FileName=icq
FileName=index
FileName=iso
FileName=mafia
FileName=magiciso
FileName=messagebox
FileName=msdn messagebox
FileName=my pictures
FileName=mydesk
FileName=ocr
FileName=passport
FileName=putty
FileName=rc4
FileName=smallville
FileName=ultraiso
FileName=winamp


Edit: Let me know if you want to retain the original order!

sosaited
Joined: 24 Feb 2005
Posts: 233

Posted: Tue Oct 24, 2006 8:57 am

I wish it was this simple. (Though, I have to admit that I never checked the U parameter of the Sort command. Thanks for that)

The Problem is that The Sequence of the lines must stay the same.

SKAN
Joined: 26 Dec 2005
Posts: 6223

Posted: Tue Oct 24, 2006 9:17 am

I earlier wrote:
Let me know if you want to retain the original order!

sosaited wrote:
The Sequence of the lines must stay the same.


Code:
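FileRead, Text, text.txt
NewText := "`n"   ; leading newline so every stored line is wrapped in `n ... `n
Loop, Parse, Text, `n, `r   ; split on LF and strip the CR of CRLF line endings
  {
   ; Match whole lines only; a bare substring test would treat "Extension=exe"
   ; as a repeat of "Extension=exe,zip".
   If (InStr(NewText, "`n" . A_LoopField . "`n") > 0)
      Continue
   NewText .= A_LoopField . "`n"
  }
StringTrimLeft, NewText, NewText, 1   ; drop the leading newline
FileAppend, %NewText%, NewText.txt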


Smile

Titan
Joined: 11 Aug 2004
Posts: 5376
Location: /b/

Posted: Tue Oct 24, 2006 10:03 am

Here's another method that will perform better and keep the list in the original order:
Code:
old = original.txt ; the original file
new = sorted.txt ; where to save the new list
FileDelete, %new% ; delete this file if it already exists
list := "`n" ; leading newline so stored lines can be matched as whole lines

Loop, Read, %old%, %new% ; Loop (read file contents)
{
   line = %A_LoopReadLine%`n ; the current line
   If !InStr(list, "`n" . line) { ; if this whole line was not written before...
      FileAppend, %line% ; save it to the file
      list = %list%%line% ; and add it to the temporary list so it's not written again
   }
}
list = ; empty the list to save memory

Run, %new% ; opens the file in the default editor
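
Both loops above search a growing string for every line, so the work per line grows with the file. On long files, a lookup table is one possible alternative. The following is only a sketch under assumptions that go beyond this thread: it needs a later AutoHotkey v1.1 build with object support, and DedupeLines and deduped.txt are made-up names. Like InStr() with default settings, the key lookup ignores case.

```autohotkey
; Sketch only: requires AutoHotkey v1.1+ (objects). DedupeLines is a hypothetical helper.
DedupeLines(text)
{
    seen := {}      ; lines already emitted, used as a lookup table
    out := ""
    Loop, Parse, text, `n, `r        ; split on LF, strip the CR of CRLF endings
    {
        if (A_LoopField = "")        ; ignore blank lines
            continue
        key := "#" . A_LoopField     ; prefix keeps keys non-numeric and clear of names such as "base"
        if (seen.HasKey(key))        ; string keys are compared case-insensitively
            continue
        seen[key] := true
        out .= A_LoopField . "`n"    ; first occurrence: keep it, in original order
    }
    return out
}

FileRead, text, original.txt
FileDelete, deduped.txt              ; hypothetical output name
FileAppend, % DedupeLines(text), deduped.txt
```

The first occurrence of each line is kept and later repeats are dropped, which matches the behaviour of the two loops above.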


SKAN
Joined: 26 Dec 2005
Posts: 6223

Posted: Tue Oct 24, 2006 10:06 am

@Titan: Wow! That is a fine piece of code, well commented! Very Happy

PhiLho
Joined: 27 Dec 2005
Posts: 6721
Location: France (near Paris)

Posted: Tue Oct 24, 2006 10:30 am

Goyyah, I believe the Extension/FileName pairs of lines should be kept together...
Here is my take:
Code:
Fileread text, TestFileInput.txt
searchPos := 1
Loop
{
   matchPos := RegExMatch(text, "Extension=.*\r\nFileName=.*\r\n", unit, searchPos)
   If (matchPos = 0)   ; No match
      Break
   searchPos := matchPos + StrLen(unit)
   If (InStr(text, unit, true, searchPos) = 0)   ; Not found later
      FileAppend %unit%, TestFileOutput.txt
}
Notes:
- File must end with `r`n. I can remove the constraint, but it is simpler this way.
- I keep the last occurrence of repetitions.
- If only keeping the Extension/FileName as unit is important, it is simpler and faster to do:
Code:
Fileread text, TestFileInput.txt
text := RegExReplace(text, "Extension=(.*)\r\nFileName=(.*)\r\n", "!$1!$2~")
Sort text, UD~
text := RegExReplace(text, "!(.*?)!(.*?)~", "Extension=$1`r`nFileName=$2`r`n")
FileAppend %text%, TestFileOutput.txt
Note: actually, I had to cheat to make them work. The given sample has three consecutive Extension lines and ends with an Extension line, which breaks my assumption above. I removed the last line and kept only one of the three consecutive lines. If these patterns must be respected, my code above has to be changed a bit.
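
One way the code might be changed to tolerate those two irregularities is sketched below. It is not the method above but a line-by-line variant built on assumptions: every Extension line opens a new unit, any following non-Extension line (normally FileName) is attached to it, and the first occurrence of a unit is kept rather than the last. DedupeUnits is a hypothetical name, and SubStr() is assumed to be available in the build used.

```autohotkey
; Sketch only. DedupeUnits is a hypothetical helper, not part of AutoHotkey.
DedupeUnits(text)
{
    seen := "`r"    ; units already kept, each stored as `r<unit>`r (CR never occurs inside a unit)
    out := ""
    unit := ""
    text .= "`nExtension="           ; sentinel line that flushes the last real unit
    Loop, Parse, text, `n, `r        ; split on LF, strip the CR of CRLF endings
    {
        if (A_LoopField = "")
            continue
        if (SubStr(A_LoopField, 1, 10) != "Extension=")
        {
            ; FileName (or any other) line: attach it to the unit that is open
            unit := (unit = "") ? A_LoopField : unit . "`n" . A_LoopField
            continue
        }
        ; A new Extension line closes the previous unit; keep it if it is new.
        if (unit != "" && !InStr(seen, "`r" . unit . "`r"))
        {
            seen .= unit . "`r"
            out .= unit . "`n"
        }
        unit := A_LoopField
    }
    return out
}

FileRead, text, TestFileInput.txt
FileDelete, TestFileOutput.txt
FileAppend, % DedupeUnits(text), TestFileOutput.txt
```

With this approach an Extension line that has no FileName after it is simply a one-line unit, so the input does not need to be edited by hand or end with `r`n.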
_________________
vPhiLho := RegExReplace("Philippe Lhoste", "^(\w{3})\w*\s+\b(\w{3})\w*$", "$1$2")

sosaited
Joined: 24 Feb 2005
Posts: 233

Posted: Tue Oct 24, 2006 10:44 pm

PhiLho wrote:
Code:
RegExMatch

I am sorry if this is an "embarrassing" question, but I don't know anything about RegExMatch in AHK, and if it is a function, you didn't define it.

SKAN
Joined: 26 Dec 2005
Posts: 6223

Posted: Wed Oct 25, 2006 4:44 am

sosaited wrote:
I am sorry if this is an "embarrassing" question, but I don't know anything about RegExMatch in AHK, and if it is a function, you didn't define it.


RegExMatch() and RegExReplace() are soon-to-be built-in functions, just like InStr(). They are in beta now, and people like PhiLho and Titan are testing them vigorously before the final release becomes available.

You should download the beta version, if you want to test PhiLho's code!
Smile
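
For anyone who has not met these functions yet, here is a minimal sketch of how they are called, assuming a build that includes them. The sample line and the variable names line, pos, m and stripped are only for illustration.

```autohotkey
line := "Extension=exe,zip"

; RegExMatch returns the position of the match (0 if there is none).
; The third parameter names the output variable: m receives the whole match
; and m1 the first parenthesised subpattern.
pos := RegExMatch(line, "^Extension=(.*)$", m)
if (pos > 0)
    MsgBox, Whole match: %m%`nExtensions: %m1%

; RegExReplace returns a copy of the text with every match replaced.
stripped := RegExReplace(line, "^Extension=", "")
MsgBox, %stripped%
```

Here m1 ends up holding exe,zip, and so does stripped. Neither function needs to be defined in the script, which is why no definition appears in the code earlier in the thread.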

PhiLho
Joined: 27 Dec 2005
Posts: 6721
Location: France (near Paris)

Posted: Wed Oct 25, 2006 11:47 am

Goyyah wrote:
You should download the beta version, if you want to test PhiLho's code!
Smile
THERE... Wink