Remove String from Line


7 replies to this topic

#1 jmk123

jmk123
  • Members
  • 14 posts

Posted 13 October 2012 - 04:47 AM

I am trying to create an AHK script that will filter through a text file and keep only the data I want. Each line of the text file I want to filter has content that I do not want, and each line is repeated. So far, I have written the code to remove the duplicate lines, but I'm not sure how to get rid of the part at the start of each line that I don't want to keep.

Each line looks like this (with the "x"s being any number or letter):
[xx] xx.xxxxx [xx]: part I want to keep is here

I want to keep everything after the second colon. Is there any easy way to do this?

Here's what I have so far, for removing duplicate lines:
#MaxMem 256
SendMode Input  ; Recommended for new scripts due to its superior speed and reliability.
SetWorkingDir %A_ScriptDir%  ; Ensures a consistent starting directory.

FileSelectFile, file_name, 3, , File to remove duplicates from
FileSelectFile, save_name, S3, Save.txt, File to save to

FileRead, Text, %file_name%
Sort, Text, U
FileAppend, %Text%`n, %save_name%

FileDelete, %file_name%


#2 MasterFocus

MasterFocus
  • Moderators
  • 4126 posts

Posted 13 October 2012 - 04:54 AM

You can try using TF_RegExReplace() from the TF library - http://www.autohotke... ... gExReplace
Probably something like TF_RegExReplace( "File.txt" , "^.*?\[\w+\]:\s*(.+)$" , "$1" )
AHK's RegEx Quick Reference: http://www.autohotke...Ex-QuickRef.htm

#3 dmg

dmg
  • Members
  • 1737 posts

Posted 13 October 2012 - 06:05 AM

As long as your formatting remains consistent, i.e. you always want what is after the second colon (that is in fact a colon in your example), you can use StringSplit with a colon as the delimiter:
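string := "[xx] xx.xxxxx [xx]: part I want to keep is here"

; Split on ":" into part1, part2, ... (part0 holds the number of pieces).
stringsplit, part, string, :

; part3 is the text after the second colon. The sample above has only one
; colon, so there the wanted text ends up in part2 (with a leading space).
msgbox, % part3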
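This will give you the text between the second and third colons (or up to the end of the string if there is no third colon) and will ignore the rest of the string, so it only works if the part you want to keep never contains a colon itself.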
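If the text to keep can itself contain colons, cutting by position may be safer than splitting. A minimal sketch, assuming AHK v1.1 syntax and that the unwanted prefix always ends with "]: "; the sample line and the variable names line, pos and kept are made up for illustration:

```autohotkey
; Hypothetical sample line; the kept part deliberately contains a colon.
line := "[xx] xx.xxxxx [xx]: note: part I want to keep is here"

; InStr returns the 1-based position of the first "]: ", or 0 if it is absent.
pos := InStr(line, "]: ")

; Skip the 3 characters of "]: " itself; leave the line unchanged if not found.
kept := pos ? SubStr(line, pos + 3) : line

MsgBox, % kept   ; shows: note: part I want to keep is here
```

Because only the first "]: " is used as the cut point, any later colons stay in the kept text.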

#4 Coco

Coco
  • Members
  • 590 posts

Posted 13 October 2012 - 06:21 AM

It really depends on the formatting, just as dmg said. Here is another option:

str := "[12] 56.7xF0L [q3]: part I want to keep is here"
MsgBox, % RegExReplace(str, "^\[\w+?(?:\:\w+?)?\]\s\w+?\.\w+?\s\[\w+?\]\:\s", "")
return


#5 jmk123

jmk123
  • Members
  • 14 posts

Posted 13 October 2012 - 06:53 AM

Thanks for the responses guys. The formatting of every line is the same as the example I provided.

Coco, I tried to implement your solution, but I'm getting an error. The code you've written works fine on its own, but I'm having trouble working it into my script.

Here's the code I'm trying to use:
#NoEnv  ; Recommended for performance and compatibility with future AutoHotkey releases.
#MaxMem 256
SendMode Input  ; Recommended for new scripts due to its superior speed and reliability.
SetWorkingDir %A_ScriptDir%  ; Ensures a consistent starting directory.

FileSelectFile, file_name, 3, , File to remove duplicates from
FileSelectFile, save_name, S3, Chat.txt, File to save to

FileRead, Text, %file_name%
Sort, Text, U
FileAppend, %Text%`n, Intermediate.txt

Loop, read, Intermediate.txt, %save_name%
{
	NewStr := RegExReplace(%A_LoopReadLine%, "^\[\w+?\:\w+?\]\s\w+?\.\w+?\s\[\w+?\]\:\s", "")
	FileAppend, NewStr
}

FileDelete, %file_name%
FileDelete, Intermediate.txt

When I try to run this script, I get the following error:
Error: The following variable name contains an illegal character:
"[xx] xx.xxxxx [xx]: part I want to keep is here"

The current thread will exit.


#6 TLM

TLM
  • Members
  • 3586 posts

Posted 13 October 2012 - 07:30 AM

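A couple of things, one of which is no fault of anyone here (you may have stumbled onto a bug).

First, don't put percent signs around the loop variable A_LoopReadLine. Inside an expression, %A_LoopReadLine% is a double dereference: the contents of the line are treated as a variable name, which is what raises the error.
Also, that RegEx needle is unnecessarily long.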
Loop, read, Intermediate.txt, %save_name%
{
   NewStr := RegExReplace( A_LoopReadLine, "\[.*:\s" )
   FileAppend, %NewStr%`n   ; no filename given, so this writes to the loop's output file
}
However, by pulling in the value as a variable, you inadvertently stumbled onto an anomaly.

" [ " and its variations are accepted as a variable name.
Has anyone else noticed this? It seems to work only in AHK_L; it returns an error in AHK Basic.
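To illustrate why the percent signs matter inside an expression, here is a small sketch assuming AHK v1.1 syntax. The variable names fruit, apple, plain and deref are hypothetical and exist only for this example:

```autohotkey
fruit := "apple"
apple := "red"

; A bare name in an expression yields that variable's own contents.
plain := fruit       ; plain now holds "apple"

; %fruit% in an expression is a double dereference: the contents of fruit
; ("apple") are used as the NAME of another variable, and that variable's
; value is returned.
deref := %fruit%     ; deref now holds "red"

MsgBox, % plain . " / " . deref   ; shows: apple / red
```

In the loop above, the contents of A_LoopReadLine are a whole line of text rather than a valid variable name, which is why the double dereference fails.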

#7 Gogo

Gogo
  • Guests

Posted 13 October 2012 - 09:13 AM

I'd use this (no Intermediate.txt file needed):
Sort, text, U
FileAppend % RegExReplace(text, "m`a)^.*\]: ") . "`n" , %save_name%
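
Combined with the file-selection code from the first post, the whole script might look roughly like this. It is only a sketch, assuming AHK v1.1 (AHK_L) and that the unwanted prefix on every line ends with "]: ". The cancel checks are an addition, and the source file is deliberately not deleted here:

```autohotkey
#NoEnv
SetWorkingDir %A_ScriptDir%

FileSelectFile, file_name, 3, , File to remove duplicates from
if (file_name = "")    ; dialog was cancelled
    ExitApp
FileSelectFile, save_name, S3, Save.txt, File to save to
if (save_name = "")    ; dialog was cancelled
    ExitApp

FileRead, Text, %file_name%
Sort, Text, U          ; sort the lines and drop duplicates

; m  = ^ matches at the start of every line, not just the start of the text
; `a = treat `r, `n and `r`n all as line breaks
; The greedy .* removes everything up to the last "]: " on each line.
Text := RegExReplace(Text, "m`a)^.*\]: ")

FileAppend, %Text%`n, %save_name%
```

Note that duplicates are removed before the prefix is stripped, so two lines count as duplicates only if their prefixes match as well.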


#8 jmk123

jmk123
  • Members
  • 14 posts

Posted 13 October 2012 - 09:23 AM

Thanks TLM and Gogo, both of your suggestions work perfectly. :)