How to remove the word by regular expressions? Topic is solved

Get help with using AutoHotkey (v2 or newer) and its commands and hotkeys
hancre
Posts: 251
Joined: 02 Jul 2021, 20:51

How to remove the word by regular expressions?

Post by hancre » 31 Mar 2024, 22:06

The following code removes only English words. It doesn't work for non-English words such as 시작, こんにちは, 時作, or for characters such as é è ? ! ".

How can I remove any letters and special characters (? ! " .)? Thanks in advance for any help.

Code: Select all

; case-sensitive
MyMenu := Menu()
MyMenu.Add '1. Remove inputword', RemoveWord

CapsLock & f:: {
    global IB, textClipboard
    IB := '', textClipboard := ''
    A_Clipboard := ''
    Send('^c')
    if (ClipWait(1)) {
        textClipboard := A_Clipboard
        txt := SubStr(textClipboard, 1, 65)
        IB := InputBox('Clipboard:`n' txt '`n`nEnter text to change.', 'Change text', 'w530 h150')
        ; Braces added so that the paste only runs after a confirmed, non-empty input
        if (IB.Result = 'OK' && IB.Value != '') {
            MyMenu.Show()
            Send('^v')
        }
    } else {
        MsgBox 'An error occurred while waiting for the clipboard.', 'Error', 'Icon!'
    }
}

RemoveWord(ItemName, ItemPos, MyMenu) {
    ; The pattern being asked about: removes the typed word plus trailing spaces
    A_Clipboard := RegExReplace(A_Clipboard, '\b' IB.Value '\b\h*', "")
}
    

boiler
Posts: 17387
Joined: 21 Dec 2014, 02:44

Re: How to remove the word by regular expressions?

Post by boiler » 31 Mar 2024, 23:46

Do you mean you want to remove everything but numbers?

hancre
Posts: 251
Joined: 02 Jul 2021, 20:51

Re: How to remove the word by regular expressions?

Post by hancre » 01 Apr 2024, 00:08

boiler wrote:
31 Mar 2024, 23:46
Do you mean you want to remove everything but numbers?
Ah, I want to remove whatever is typed into the input box (as a whole word, not as an arbitrary substring).
If I type a number, it should be removed too.
Thanks for your interest. ^^

Seven0528
Posts: 412
Joined: 23 Jan 2023, 04:52
Location: South Korea
Contact:

Re: How to remove the word by regular expressions?

Post by Seven0528 » 01 Apr 2024, 00:36

Code: Select all

#Requires AutoHotkey v2.0
#SingleInstance Force
haystack := "그러니까 이런 거 말이죠?"
needle := "이런 거"
newStr := regExReplace(haystack, "(?<=\h|^)\Q" needle "\E(?=\h|$)\h*")
msgbox newStr ;  "그러니까 말이죠?"
  • English is not my native language. Please forgive any awkward expressions.
  • 영어는 제 모국어가 아닙니다. 어색한 표현이 있어도 양해해 주세요.

Seven0528
Posts: 412
Joined: 23 Jan 2023, 04:52
Location: South Korea
Contact:

Re: How to remove the word by regular expressions?

Post by Seven0528 » 01 Apr 2024, 01:00

Code: Select all

haystack := "漢字が含まれていると困ります。"
needle := "漢字が"
if (haystack ~= "\p{Han}") ;  if (regExMatch(haystack, "\p{Han}"))
    newStr := regExReplace(haystack, "\Q" needle "\E")
else
    newStr := regExReplace(haystack, "(?<=\h|^)\Q" needle "\E(?=\h|$)\h*")
msgbox newStr ;  "含まれていると困ります。"
Ah, I think it becomes much more complex once we have to consider Chinese characters (漢字), especially in Japanese, where 漢字 and ひらがな run together without spaces. There isn't an easy way to determine word boundaries.
漢字 can be detected using \p{Han}.
If 漢字 are present, perhaps dropping the lookaround assertions would help.

(For reference, 한글 can be specified as [ㄱ-ㅣ가-힣], and as for ひらがな, カタカナ, and 半角カタカナ, their Unicode ranges tend to be somewhat scattered, so I won't list them here.)
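As a rough sketch of the script check described above: instead of listing scattered Unicode ranges, the same \p{...} syntax that works for \p{Han} can be tried with other PCRE script names. HasScript is a hypothetical helper name, and the sketch assumes the PCRE build bundled with AutoHotkey accepts Hiragana, Katakana and Hangul the same way it accepts Han.

```autohotkey
#Requires AutoHotkey v2.0

; HasScript is a hypothetical helper, not a built-in function.
; "script" is assumed to be a PCRE script name such as Han, Hiragana, Katakana or Hangul.
HasScript(text, script) {
    ; RegExMatch returns the position of the first match, or 0 if there is none.
    return RegExMatch(text, "\p{" script "}") > 0
}

sample := "漢字とひらがなとカタカナ"
for script in ["Han", "Hiragana", "Katakana", "Hangul"]
    MsgBox script ": " HasScript(sample, script)
```

A check like this only tells you which scripts occur in the text. It does not solve the word-boundary problem for Japanese mentioned above.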

Rohwedder
Posts: 7768
Joined: 04 Jun 2014, 08:33
Location: Germany

Re: How to remove the word by regular expressions?

Post by Rohwedder » 01 Apr 2024, 03:10

@hancre, please write exactly what you want this input text to become:

Code: Select all

#Requires AutoHotkey v2.0
InputText := " Hello 'World' 1 12 123 42 4 2 𝟒𝟐 𝟰 𝟮 시작, こんにちは, 時作 é è ? ! `" . "
MsgBox InputText

Seven0528
Posts: 412
Joined: 23 Jan 2023, 04:52
Location: South Korea
Contact:

Re: How to remove the word by regular expressions?

Post by Seven0528 » 01 Apr 2024, 03:48

Code: Select all

haystack := 'Hello Nice To Meet You'
needle := 'Meet'
msgbox regExReplace(haystack, '\b' needle '\b\h*') ;  'Hello Nice To You'
haystack := '안녕 만나서 반가워'
needle := '만나서'
msgbox regExReplace(haystack, '\b' needle '\b\h*') ;  '안녕 만나서 반가워'
 @hancre is asking why this regular expression only works in English sentences.
They're also asking how to handle special characters like \.*?+[{|()^$ in the needle part of the regular expression so that they are treated literally.
(I don't think @hancre's question was vague; rather, they were quite clear about what they wanted to ask.)

To answer this: \b marks the boundary between \w and \W, and by default \w covers only ASCII letters, digits and the underscore, so \b does not detect boundaries around non-ASCII words.
As for treating the whole needle literally, you can wrap it in \Q and \E.
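To illustrate the \Q...\E point, here is a small sketch comparing a needle inserted raw with the same needle wrapped in \Q and \E. The sample haystack and the variable names unquoted and quoted are made up for this example.

```autohotkey
#Requires AutoHotkey v2.0

haystack := "price a.b axb"
needle := "a.b"

; Raw needle: "." means "any character", so "a.b" and "axb" are both removed.
unquoted := RegExReplace(haystack, "\b" needle "\b\h*")

; Quoted needle: \Q...\E makes "." literal, so only "a.b" is removed.
quoted := RegExReplace(haystack, "\b\Q" needle "\E\b\h*")

MsgBox "unquoted: '" unquoted "'`nquoted: '" quoted "'"
; unquoted: 'price '
; quoted: 'price axb'
```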

Rohwedder
Posts: 7768
Joined: 04 Jun 2014, 08:33
Location: Germany

Re: How to remove the word by regular expressions?  Topic is solved

Post by Rohwedder » 01 Apr 2024, 04:04

Try:

Code: Select all

haystack := '안녕 만나서 반가워'
needle := '만나서'
msgbox regExReplace(haystack, '(*UCP)\b' needle '\b\h*') ;  '안녕 반가워'

Seven0528
Posts: 412
Joined: 23 Jan 2023, 04:52
Location: South Korea
Contact:

Re: How to remove the word by regular expressions?

Post by Seven0528 » 01 Apr 2024, 04:06

Holy moly...

Seven0528
Posts: 412
Joined: 23 Jan 2023, 04:52
Location: South Korea
Contact:

Re: How to remove the word by regular expressions?

Post by Seven0528 » 01 Apr 2024, 04:12

This is the perfect answer, @hancre.
Thank you for letting me know about (*UCP), @Rohwedder!
https://www.autohotkey.com/docs/v2/misc/RegEx-QuickRef.htm#Common
The documentation says: "For performance, \d, \D, \s, \S, \w, \W, \b and \B recognize only ASCII characters by default. If the pattern begins with (*UCP), Unicode properties will be used to determine which characters match. For example, \w becomes equivalent to [\p{L}\p{N}_] and \d becomes equivalent to \p{Nd}."

Code: Select all

newStr := regExReplace(haystack, "(*UCP)\b\Q" needle "\E\b\h*")
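The combined pattern could be wrapped in a small function along these lines. This is only a sketch: RemoveWholeWord is a hypothetical name, and the extra StrReplace step is an addition not discussed in the thread. It covers the case where the needle itself contains a literal \E, which would otherwise end the \Q...\E quoting early.

```autohotkey
#Requires AutoHotkey v2.0

; RemoveWholeWord is a hypothetical helper name, not part of AutoHotkey.
RemoveWholeWord(haystack, needle) {
    ; A literal "\E" in the needle would close the quoting too soon, so
    ; close the quote, match a literal backslash followed by E, and reopen it.
    ; The final "true" makes StrReplace case-sensitive, so "\e" is left alone.
    safeNeedle := StrReplace(needle, "\E", "\E\\E\Q", true)
    ; (*UCP) makes \b use Unicode properties instead of ASCII only.
    return RegExReplace(haystack, "(*UCP)\b\Q" safeNeedle "\E\b\h*")
}

MsgBox RemoveWholeWord("안녕 만나서 반가워", "만나서")      ; "안녕 반가워"
MsgBox RemoveWholeWord("Hello Nice To Meet You", "Meet")  ; "Hello Nice To You"
```

Note that \b needs a word character on one side of the boundary, so a needle consisting only of punctuation such as ? will not match with this pattern. The lookaround form from the earlier post, (?<=\h|^)...(?=\h|$), does not have that restriction.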

hancre
Posts: 251
Joined: 02 Jul 2021, 20:51

Re: How to remove the word by regular expressions?

Post by hancre » 02 Apr 2024, 02:53

@Rohwedder @Seven0528

Thank you for your help. Wow, it works.

I have an additional question. How can I keep the blank space left by the removed input data, as in the attached image?
haystack := '안녕 만나서 반가워'
needle := '만나서'

The text here doesn't show the result correctly,
so I attached an image file.

Sorry, I tried adding \K(?=\S) to the current regex, but it doesn't work.
Attachments
temp_20240402_001.jpg (6.04 KiB)

Seven0528
Posts: 412
Joined: 23 Jan 2023, 04:52
Location: South Korea
Contact:

Re: How to remove the word by regular expressions?

Post by Seven0528 » 02 Apr 2024, 03:07

Are you asking how to replace the matched characters with whitespace characters?
If so, how should we account for whether the characters are half-width or full-width?

Code: Select all

"Hello Nice To Meet You"
"Hello      To Meet You"

Code: Select all

"안녕 만나서 반가워"
"안녕     반가워"

hancre
Posts: 251
Joined: 02 Jul 2021, 20:51

Re: How to remove the word by regular expressions?

Post by hancre » 02 Apr 2024, 03:24

@Seven0528
Oh, I prefer the same width. The widths in the samples look different.
Sorry for confusing you.
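One possible way to keep a blank of matching width is sketched below. BlankOutWord and its fill parameter are hypothetical names. Using U+3000 (ideographic space) for full-width text is an assumption about how the target font renders it, not something confirmed in this thread.

```autohotkey
#Requires AutoHotkey v2.0

; BlankOutWord and "fill" are hypothetical names for this sketch.
; "fill" is assumed to be a single character that is not "$".
BlankOutWord(haystack, needle, fill := " ") {
    ; Build one fill character per character of the needle.
    pad := ""
    Loop StrLen(needle)
        pad .= fill
    ; Same \E guard as for any \Q...\E quoting of user input.
    safeNeedle := StrReplace(needle, "\E", "\E\\E\Q", true)
    ; No trailing \h* here, so the spaces around the word stay in place.
    return RegExReplace(haystack, "(*UCP)\b\Q" safeNeedle "\E\b", pad)
}

; Half-width text: one ASCII space per removed letter.
MsgBox '"' BlankOutWord("Hello Nice To Meet You", "Nice") '"'
; Full-width text: one ideographic space (U+3000) per removed character.
MsgBox '"' BlankOutWord("안녕 만나서 반가워", "만나서", Chr(0x3000)) '"'
```

Whether the blank lines up exactly still depends on the font. In a proportional font a space is usually narrower than a letter, so the widths may only match in a monospaced or East Asian full-width font.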
