AutoHotkey Community Let's help each other out
sosaited
Joined: 24 Feb 2005 Posts: 233
|
Posted: Tue Oct 24, 2006 8:34 am Post subject: Remove repetitions from a file |
|
|
I have a file that has entries like these:
| Code: | Extension=avi
FileName=smallville
Extension=exe,zip
Extension=mpg,mpeg,avi,wmv,mov
Extension=exe,zip
FileName=bearshare
Extension=exe,zip
FileName=BSPROINSTALL.exe
Extension=
FileName=
Extension=avi
FileName=smallville
Extension=ahk,ini
FileName=index
Extension=htm,html
FileName=rc4
Extension=exe,zip
FileName=bearshare
Extension=doc
FileName=help
Extension=doc,txt
FileName=help
Extension=exe,zip
FileName=winamp
Extension=exe,zip
FileName=icq
Extension=htm,html
FileName=msdn messagebox
Extension=htm,html
FileName=messagebox
Extension=exe,zip
FileName=putty
Extension=jpg,jpeg,bmp,png
FileName=passport
Extension=ttf,fon,zip
FileName=Footlight
Extension=ttf,zip
FileName=Formal_436_BT
Extension=exe,zip
FileName=ocr
Extension=exe,zip
FileName=winamp
Extension=xls
FileName=bss
Extension=xls
FileName=
Extension=jpg,bmp
FileName=mydesk
Extension=jpg,bmp
FileName=my pictures
Extension=avi
FileName=smallville
Extension=iso,zip,rar
FileName=mafia
Extension=exe,zip
FileName=iso
Extension=Folder
FileName=mafia
Extension=exe,zip
FileName=magiciso
Extension=exe,zip
FileName=crack
Extension=exe
FileName=iso
Extension=exe,zip
FileName=iso
Extension=exe,zip
FileName=ultraiso
Extension=exe,zip |
I need to create a script that will remove all the repetitions from the file. So far, I am unable to create something working. Can anyone help me, please?
Thanks. _________________ My small "thanks" to AHK in the shape of these dedicated 3D images (topic already in the "General" forum)
SKAN
Joined: 26 Dec 2005 Posts: 6223
|
Posted: Tue Oct 24, 2006 8:51 am Post subject: |
|
|
| sosaited wrote: | | I am unable to create something working |
| Code: | Fileread, Text, text.txt
Sort text, U
FileAppend, %Text%, NewText.txt |
This was the output:
| Quote: | Extension=
Extension=ahk,ini
Extension=avi
Extension=doc
Extension=doc,txt
Extension=exe
Extension=exe,zip
Extension=exe,zip
Extension=Folder
Extension=htm,html
Extension=iso,zip,rar
Extension=jpg,bmp
Extension=jpg,jpeg,bmp,png
Extension=mpg,mpeg,avi,wmv,mov
Extension=ttf,fon,zip
Extension=ttf,zip
Extension=xls
FileName=
FileName=bearshare
FileName=BSPROINSTALL.exe
FileName=bss
FileName=crack
FileName=Footlight
FileName=Formal_436_BT
FileName=help
FileName=icq
FileName=index
FileName=iso
FileName=mafia
FileName=magiciso
FileName=messagebox
FileName=msdn messagebox
FileName=my pictures
FileName=mydesk
FileName=ocr
FileName=passport
FileName=putty
FileName=rc4
FileName=smallville
FileName=ultraiso
FileName=winamp |
Edit: Let me know if you want to retain the original order!
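For readers who have not used the U option before, here is a minimal in-memory sketch of what the Sort command does with it. The variable name `list` and its contents are made up for illustration; the point is that duplicates disappear but the original order is lost.

```autohotkey
; Hypothetical sample data: five items, two of them repeated.
list := "b`na`nb`nc`na"
Sort, list, U          ; sort alphabetically and remove duplicate items
MsgBox, %list%         ; the list is now a, b, c (one item per line)
```

Because the items are reordered, this only suits files where the sequence of lines does not matter.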
sosaited
Joined: 24 Feb 2005 Posts: 233
|
Posted: Tue Oct 24, 2006 8:57 am Post subject: |
|
|
I wish it were this simple. (Though I have to admit that I never checked the U parameter of the Sort command. Thanks for that.)
The problem is that the sequence of the lines must stay the same.
SKAN
Joined: 26 Dec 2005 Posts: 6223
|
Posted: Tue Oct 24, 2006 9:17 am Post subject: |
|
|
| I earlier wrote: | | Let me know if you want to retain the original order! |
| sosaited wrote: | | The sequence of the lines must stay the same. |
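| Code: | FileRead, Text, text.txt
NewText := "`n" ; leading delimiter, so every kept line is wrapped in `n on both sides
Loop, Parse, Text, `n, `r ; split on LF and strip any trailing CR
{
    IfInString, NewText, `n%A_LoopField%`n ; this whole line was already kept
        Continue
    NewText = %NewText%%A_LoopField%`n
}
StringTrimLeft, NewText, NewText, 1 ; drop the leading delimiter
FileAppend, %NewText%, NewText.txt |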
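A note on a pitfall with substring tests such as IfInString or InStr(): they report a hit whenever the needle occurs anywhere in the haystack, not only when it equals a whole line. The sample file contains an `Extension=exe` line after several `Extension=exe,zip` lines, so depending on the line endings a plain substring check can drop a line that was never seen before. The sketch below uses made-up values in a hypothetical variable `seen` to show the difference between a plain test and one that wraps both sides in `n delimiters.

```autohotkey
; Hypothetical list of lines already kept, separated by linefeeds.
seen := "Extension=exe,zip`nFileName=magiciso"

; Plain substring test: non-zero, because the needle sits inside "Extension=exe,zip".
MsgBox % InStr(seen, "Extension=exe")

; Delimited test: 0, because no complete line equals "Extension=exe".
MsgBox % InStr("`n" . seen . "`n", "`nExtension=exe`n")
```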
Titan
Joined: 11 Aug 2004 Posts: 5376 Location: /b/
|
Posted: Tue Oct 24, 2006 10:03 am Post subject: |
|
|
Here's another method that will perform better and keep the list in the original order: | Code: | old = original.txt ; the original file
new = sorted.txt ; where to save the new list
FileDelete, %new% ; delete this file if it already exists
Loop, Read, %old%, %new% ; Loop (read file contents)
{
    line = %A_LoopReadLine%`n ; the current line
    If !InStr("`n" . list, "`n" . line) { ; if this whole line was not written before...
        FileAppend, %line% ; save it to the file
        list = %list%%line% ; and add it to the temporary list so it's not written again
    }
}
list = ; empty the list to save memory
Run, %new% ; opens the file in the default editor |
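For larger files, an alternative to searching a growing string is to remember the lines already written in an associative array. The following is only a sketch under stated assumptions: it needs AutoHotkey v1.1 or later (objects were not available in the builds discussed in this thread), `DedupeLines` is a hypothetical helper name, blank lines are skipped, and keys are compared case-insensitively, as they are with InStr() by default.

```autohotkey
; Hypothetical helper: returns text with repeated lines removed, first occurrence kept.
DedupeLines(text)
{
    seen := {}                       ; associative array used as a set of kept lines
    out := ""
    Loop, Parse, text, `n, `r        ; split on LF and strip any trailing CR
    {
        if (A_LoopField = "")        ; assumption: blank lines are not wanted
            continue
        key := "|" . A_LoopField     ; prefix keeps keys non-numeric and away from special names such as "base"
        if (seen.HasKey(key))        ; this line was already kept
            continue
        seen[key] := true
        out .= A_LoopField . "`n"
    }
    return out
}

; Usage, with the same file names as in the post above:
FileRead, text, original.txt
FileDelete, sorted.txt
FileAppend, % DedupeLines(text), sorted.txt
```

The lookup cost no longer grows with the amount of text already kept, which is the main reason to prefer it on long lists.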
SKAN
Joined: 26 Dec 2005 Posts: 6223
|
Posted: Tue Oct 24, 2006 10:06 am Post subject: |
|
|
@Titan: Wow! That is a fine piece of code, and well commented!
PhiLho
Joined: 27 Dec 2005 Posts: 6721 Location: France (near Paris)
|
Posted: Tue Oct 24, 2006 10:30 am Post subject: |
|
|
Goyyah, I believe the Extension/FileName pairs of lines should be kept together...
Here is my take:
| Code: | FileRead text, TestFileInput.txt
searchPos := 1
Loop
{
    matchPos := RegExMatch(text, "Extension=.*\r\nFileName=.*\r\n", unit, searchPos)
    If (matchPos = 0) ; No match
        Break
    searchPos := matchPos + StrLen(unit)
    If (InStr(text, unit, true, searchPos) = 0) ; Not found later
        FileAppend %unit%, TestFileOutput.txt
}
| Notes:
- File must end with `r`n. I can remove the constraint, but it is simpler this way.
- I keep the last occurrence of each repeated pair.
- If only keeping the Extension/FileName as unit is important, it is simpler and faster to do:
| Code: | Fileread text, TestFileInput.txt
text := RegExReplace(text, "Extension=(.*)\r\nFileName=(.*)\r\n", "!$1!$2~")
Sort text, UD~
text := RegExReplace(text, "!(.*?)!(.*?)~", "Extension=$1`r`nFileName=$2`r`n")
FileAppend %text%, TestFileOutput.txt
| Note: actually, I had to cheat to make them work. The given sample has three consecutive Extension lines and ends with an Extension line, breaking my assumption above. I removed the last line and kept only one of the three lines. If these are patterns to be respected, my code above has to be changed a bit. _________________
vPhiLho := RegExReplace("Philippe Lhoste", "^(\w{3})\w*\s+\b(\w{3})\w*$", "$1$2") |
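One possible way to keep Extension/FileName pairs together while tolerating the irregular spots in the sample (consecutive Extension lines, a trailing Extension line) is sketched below. It is not PhiLho's code: it assumes AutoHotkey v1.1 or later for objects, the function names `DedupePairs` and `EmitUnit` are hypothetical, blank lines are skipped, and it keeps the first occurrence of each unit rather than the last.

```autohotkey
; Hypothetical helper: groups each Extension line with the FileName line that
; follows it, then removes repeated units while preserving the original order.
DedupePairs(text)
{
    seen := {}
    out := ""
    pending := ""                              ; an Extension line waiting for its FileName line
    Loop, Parse, text, `n, `r                  ; split on LF and strip any trailing CR
    {
        line := A_LoopField
        if (line = "")
            continue
        if (SubStr(line, 1, 10) = "Extension=")
        {
            if (pending != "")                 ; previous Extension line had no FileName: it is a unit by itself
                out .= EmitUnit(seen, pending)
            pending := line
            continue
        }
        unit := (pending = "" ? "" : pending . "`r`n") . line
        pending := ""
        out .= EmitUnit(seen, unit)
    }
    if (pending != "")                         ; file ended with an unpaired Extension line
        out .= EmitUnit(seen, pending)
    return out
}

; Hypothetical helper: returns the unit followed by CR+LF the first time it is seen, "" afterwards.
EmitUnit(seen, unit)
{
    key := "|" . unit                          ; prefix keeps keys away from special names such as "base"
    if (seen.HasKey(key))
        return ""
    seen[key] := true
    return unit . "`r`n"
}
```

Calling `FileAppend, % DedupePairs(text), TestFileOutput.txt` after a FileRead into `text` would write the result; unlike the regular-expression versions, the input does not have to end with `r`n.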
sosaited
Joined: 24 Feb 2005 Posts: 233
|
Posted: Tue Oct 24, 2006 10:44 pm Post subject: |
|
|
I am sorry if this is an "embarrassing" question, but I don't know anything about RegExMatch in AHK, and if it is a function, you didn't define it.
SKAN
Joined: 26 Dec 2005 Posts: 6223
|
Posted: Wed Oct 25, 2006 4:44 am Post subject: |
|
|
| sosaited wrote: | | I am sorry if this is an "embarrassing" question, but I don't know anything about RegExMatch in AHK, and if it is a function, you didn't define it. |
RegExMatch() and RegExReplace() are soon-to-be built-in functions, just like InStr(). They are in the beta stage now, and people like PhiLho and Titan are testing them vigorously before the final release becomes available.
You should download the beta version if you want to test PhiLho's code!
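As a rough orientation for anyone new to these two functions, here is a small sketch on a single line from the sample file. The variable names `line`, `part` and `swapped` are made up for illustration, and the behaviour described is that of the released AutoHotkey v1 functions, which may differ in detail from the beta mentioned here.

```autohotkey
line := "Extension=exe,zip"

; RegExMatch() returns the position of the match (0 if there is none).
; The whole match goes into "part"; the subpatterns go into part1, part2, ...
if (RegExMatch(line, "^(\w+)=(.*)$", part))
    MsgBox, Key: %part1%`nValue: %part2%

; RegExReplace() returns a new string; $1 and $2 refer to the subpatterns.
swapped := RegExReplace(line, "^(\w+)=(.*)$", "$2 <- $1")
MsgBox, %swapped%
```

The first message box shows the key `Extension` and the value `exe,zip`; the second shows `exe,zip <- Extension`.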
PhiLho
Joined: 27 Dec 2005 Posts: 6721 Location: France (near Paris)
|
Posted: Wed Oct 25, 2006 11:47 am Post subject: |
|
|
| Goyyah wrote: | You should download the beta version if you want to test PhiLho's code! |
THERE...