
RegEx to Match Expression With Quotation Marks

Posted: 20 Nov 2017, 17:06
by adamas
Guys, I'm struggling with RegEx a bit (yet again). I'm trying to put together a regex to apply to an expression like this:

Code: Select all

"videoId": "KhmrdFsY6Ls"
and match (so RegExMatch) whatever's in between the "videoId": " part and the last ". So in the example above I need the regex to match the KhmrdFsY6Ls portion. I think I'm struggling mainly because there's a bunch of quotation marks in the string, so no matter how I try, AutoHotkey gives me an error.

Here's the string I have so far that totally does not work:

Code: Select all

RegExMatch(VideoLengthRaw, "O)"videoId": "(\d+)"", RegExResultHoldover), VideoLength := RegExResultHoldover[1]
Any ideas? :headwall:
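
Editorial note on the quoting problem described above: in an AutoHotkey v1 expression, a literal quotation mark inside a quoted string is written as two quotation marks. Note also that `\d+` only matches digits, while the sample ID contains letters. The following is a minimal sketch under those two observations, not the only way to do it; the names `sample` and `match` are illustrative, and the pattern assumes the ID itself contains no quotation marks.

```autohotkey
; AutoHotkey v1 sketch. Inside an expression string, "" stands for one literal ".
sample := """videoId"": ""KhmrdFsY6Ls"""   ; holds: "videoId": "KhmrdFsY6Ls"

; [^""]+ means "one or more characters that are not a quotation mark".
; The O) option stores the result as a match object, as in the question.
if (RegExMatch(sample, "O)""videoId"":\s*""([^""]+)""", match))
    MsgBox % match[1]
else
    MsgBox % "No match found."
```

With the sample line above, the message box should show only the ID portion.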

Re: RegEx to Match Expression With Quotation Marks

Posted: 20 Nov 2017, 17:14
by User

Code: Select all

text = "videoId": "KhmrdFsY6Ls"

RegExMatch(text, """.*?"".*?""(.*?)""", Matched)

msgbox, % Matched1

Re: RegEx to Match Expression With Quotation Marks

Posted: 21 Nov 2017, 03:35
by adamas
User wrote:

Code: Select all

text = "videoId": "KhmrdFsY6Ls"

RegExMatch(text, """.*?"".*?""(.*?)""", Matched)

msgbox, % Matched1
Hey! Is there any way to make it narrow and specific to the string? The issue is that there is a ton of information in the text I'm trying to regex that has the same pattern as the string I need. Example:
"nextPageToken": "CAEQAA",
"regionCode": "US",
"pageInfo": {
"totalResults": 245,
"resultsPerPage": 1
So I specifically need to match the text between "videoId": " and the last ", because otherwise it catches some other string instead of the one I need (since many similar ones come before the one I'm looking to match).

Re: RegEx to Match Expression With Quotation Marks

Posted: 21 Nov 2017, 04:30
by Odlanir
Something like this?

Code: Select all

text =
(
"nextPageToken": "CAEQAA",
"regionCode": "US",
"videoId": "KhmrdFsY6Ls"
"pageInfo": {
"totalResults": 245,
"resultsPerPage": 1
)

RegExMatch(text, "OmU).*videoId"":\s*?""(.*)"".*?", out)
MsgBox % out[1]

Re: RegEx to Match Expression With Quotation Marks

Posted: 21 Nov 2017, 05:15
by adamas
Odlanir wrote: Something like this?

Code: Select all

text =
(
"nextPageToken": "CAEQAA",
"regionCode": "US",
"videoId": "KhmrdFsY6Ls"
"pageInfo": {
"totalResults": 245,
"resultsPerPage": 1
)

RegExMatch(text, "OmU).*videoId"":\s*?""(.*)"".*?", out)
MsgBox % out[1]
This does exactly what I need, THANK YOU so very, very, very much! You live you learn, as they say : )))))

Re: RegEx to Match Expression With Quotation Marks

Posted: 21 Nov 2017, 09:31
by jeeswg
- That text looks like JSON format.
- If you end up doing more with JSON text from the YouTube API, you will find it a lot easier to use a JSON library to convert the string into an object, and refer to object keys. Something like var := obj.videoID.
- It might sound complicated, but it actually makes things a lot easier.
- I had done a lot with InStr/SubStr previously, and it was a relief using arrays instead.

help converting text to CSV - AutoHotkey Community
https://autohotkey.com/boards/viewtopic ... 19#p155919
Help using json? - AutoHotkey Community
https://autohotkey.com/boards/viewtopic ... 34#p157734
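
Editorial note: the JSON route suggested above could look roughly like the sketch below. It assumes Coco's JSON library (mentioned later in this thread) is saved as `JSON.ahk` next to the script, and the nesting `items[1].id.videoId` is a hypothetical response shape used for illustration only; check the actual response before relying on it.

```autohotkey
#Include JSON.ahk   ; assumption: Coco's JSON library saved next to this script

; Hypothetical, minimal JSON text; a real API response would be much larger.
jsonText := "{""items"": [{""id"": {""videoId"": ""KhmrdFsY6Ls""}}]}"

data := JSON.Load(jsonText)          ; parse the string into AHK objects and arrays
MsgBox % data.items[1].id.videoId    ; AHK arrays are 1-based
```

The benefit is that the key is addressed by its position in the structure, so a similarly named key elsewhere in the text cannot be picked up by accident.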

Re: RegEx to Match Expression With Quotation Marks

Posted: 21 Nov 2017, 10:39
by User
@adamas, this seems to work too:

Code: Select all

text =
(
"nextPageToken": "CAEQAA",
"regionCode": "US",
"videoId": "KhmrdFsY6Ls"
"pageInfo": {
"totalResults": 245,
"resultsPerPage": 1
)

RegExMatch(text, """videoId"".*?""(.*?)""", Matched)

MsgBox % Matched1

Re: RegEx to Match Expression With Quotation Marks

Posted: 21 Nov 2017, 10:50
by User
@adamas,

The @Odlanir example is not very suitable because regex must match all the text in order to get (KhmrdFsY6Ls)!
(if your original text is too large, it may cause regex limit errors!)

On the other hand, in my example, regex just needs to match ("videoId": "KhmrdFsY6Ls") in order to get (KhmrdFsY6Ls)!

Re: RegEx to Match Expression With Quotation Marks

Posted: 21 Nov 2017, 11:08
by User
the code below explains what I mentioned in the post above:

Code: Select all

text =
(
"nextPageToken": "CAEQAA",
"regionCode": "US",
"videoId": "KhmrdFsY6Ls"
"pageInfo": {
"totalResults": 245,
"resultsPerPage": 1
)

RegExMatch(text, "OmU).*videoId"":\s*?""(.*)"".*?", out)
MsgBox % "adamas Example: `n`n" out[0]
MsgBox % "adamas Example: `n`n" out[1]

RegExMatch(text, """videoId"".*?""(.*?)""", Matched)
MsgBox % "My example: `n`n" Matched
MsgBox % "My example: `n`n" Matched1

Re: RegEx to Match Expression With Quotation Marks

Posted: 21 Nov 2017, 13:20
by Odlanir
User wrote:
The @Odlanir example is not very suitable because regex must match all the text in order to get (KhmrdFsY6Ls)!
(if your original text is too large, it may cause regex limit errors!)
@User: Are you sure? I've tested my RegEx with a 10 MB file and haven't found any problem; it performed exactly like your RegEx. They're just two different approaches.
Cheers.

Re: RegEx to Match Expression With Quotation Marks

Posted: 21 Nov 2017, 13:51
by User
Odlanir wrote:.
So do you really think that forcing regex to match all the 10 MB text only to get "KhmrdFsY6Ls" string is the best way to go?

What if the file is 1 GB? 2, 10 or more? The entire content of a 1 GB or larger file would be stored in the "out[0]" var (unnecessary RAM usage, and it may sometimes cause a regex limit error!!!)

On the other hand, using my approach, regex just needs to match and store in the "Matched" variable ("videoId": "KhmrdFsY6Ls") in order to get the (KhmrdFsY6Ls) string, even if the file is 1 TB or more!

Re: RegEx to Match Expression With Quotation Marks

Posted: 21 Nov 2017, 18:14
by Odlanir
Your sample is not correct. Try with a file.

Code: Select all

FileRead, text, test99.txt ; <--- 10MB file
RegExMatch(text, """videoId"".*?""(.*?)""", Matched)
strx := "User example :   `n`nMatched`t" Matched "`nMatched1`t" Matched1 "`n`n"

RegExMatch(text, "OmU).*videoId"":\s*?""(.*)"".*?", out)
strx .= "Odlanir example : `n`nout[0]`t" out[0] "`nout[1]`t"  out[1]
clipboard := strx
the clipboard contents:
User example :

Matched "videoId": "KhmrdFsY6Ls"
Matched1 KhmrdFsY6Ls

Odlanir example :

out[0] "videoId": "KhmrdFsY6Ls"
out[1] KhmrdFsY6Ls

Re: RegEx to Match Expression With Quotation Marks

Posted: 21 Nov 2017, 19:14
by User
Odlanir wrote:.
What sample is not correct my friend?

FileRead does not translate `r`n to `n by default, and since ".*" can't match newlines by default (in this case `r`n), your approach did not match the whole 10 MB text file!

But listen, some people choose to translate every `r`n to `n while reading files, and in those situations your approach would match all 10 MB of text from the file!

Code: Select all

text =
(
"nextPageToken": "CAEQAA",
"regionCode": "US",
"videoId": "KhmrdFsY6Ls"
"pageInfo": {
"totalResults": 245,
"resultsPerPage": 1
)
from the code above, any `r`n is translated to `n before the string is stored in "text" variable, and that's why your approach matches all the "text" string!

If you can't clearly see that my approach is more suitable than yours, well my friend, I can't do anything about that!
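
Editorial note: the newline point debated above can be checked with a small comparison. The sketch below runs Odlanir's pattern against the same text twice, once with `r`n line endings and once with `n only, and reports how long the overall match (index 0) is in each case. The sample text and the variable names are illustrative assumptions, and no particular numbers are claimed here; run it to see the difference.

```autohotkey
; AutoHotkey v1 sketch: same pattern, two newline conventions.
crlfText := """regionCode"": ""US"",`r`n""videoId"": ""KhmrdFsY6Ls""`r`n""pageInfo"": {"
lfText := StrReplace(crlfText, "`r`n", "`n")   ; similar to a continuation section or FileRead's *t option

pattern := "OmU).*videoId"":\s*?""(.*)"".*?"
RegExMatch(crlfText, pattern, fromCrlf)
RegExMatch(lfText, pattern, fromLf)

; Compare how much text each overall match covers.
MsgBox % "CRLF input, overall match length: " StrLen(fromCrlf[0])
    . "`nLF input, overall match length: " StrLen(fromLf[0])
```

If the two lengths differ, the leading and trailing `.*` are crossing line boundaries in one of the inputs. Dropping those two parts, or stating the newline convention explicitly in the regex options, are two ways to make the result independent of how the text was loaded.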

Re: RegEx to Match Expression With Quotation Marks

Posted: 15 Feb 2018, 17:58
by adamas
Guys, I use the following code string:

Code: Select all

RegExMatch(VideoData, "OmU).*videoId"":\s*?""(.*)"".*?", RegExResultHoldover), VideoId := RegExResultHoldover[1]
To strip text between the second quotes from here:

Code: Select all

"videoId": "GlUrz6NYXRE"
Could anyone please be so kind to help me modify the code string above to strip the date (in YYYY-MM-DD format, here meaning 2018-02-15) from this example:

Code: Select all

"publishedAt": "2018-02-15T04:00:01.000Z",
I tried a ton of options, but the darn regex is still quite elusive for me :/ So - a ton to learn!


Thanks a million in advance for any help on this! :)

Re: RegEx to Match Expression With Quotation Marks

Posted: 16 Feb 2018, 09:43
by adamas
Good people, could anyone please be so kind to lend a hand with this little puzzle? : )))

Re: RegEx to Match Expression With Quotation Marks

Posted: 22 Feb 2018, 14:20
by adamas
Anyone, guys? :roll:

Re: RegEx to Match Expression With Quotation Marks

Posted: 22 Feb 2018, 14:45
by gregster
I have no clue concerning RegEx. But jeeswg's suggestion above to look into JSON makes sense (because it looks pretty much like that, although you provide only parts of the data, I suppose). I have myself worked with a lot of different APIs and their JSON responses. Coco's JSON library loads the JSON string as an AHK object and you can access the values quite easily (and apply easier string operations, if necessary). No need to fiddle around with RegEx.

Re: RegEx to Match Expression With Quotation Marks

Posted: 23 Feb 2018, 10:13
by Odlanir

Code: Select all

line = "publishedAt": "2018-02-15T04:00:01.000Z"
MsgBox % substr(RegExReplace(line, ".*publishedAt"":\s*?""(.*)"".*?", "$1") ,1,10)
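
Editorial note: an alternative sketch that captures the date directly with RegExMatch, so no SubStr step is needed. It assumes the value always starts with a YYYY-MM-DD date right after the opening quotation mark; the variable names are illustrative.

```autohotkey
; AutoHotkey v1 sketch.
line := """publishedAt"": ""2018-02-15T04:00:01.000Z"","

; \d{4}-\d{2}-\d{2} captures only the YYYY-MM-DD part that follows the opening quote.
if (RegExMatch(line, "O)""publishedAt"":\s*""(\d{4}-\d{2}-\d{2})", m))
    MsgBox % m[1]
else
    MsgBox % "No publishedAt date found."
```

One design difference: when nothing matches, RegExReplace returns the input unchanged, so the SubStr version would silently return the first 10 characters of the input. The RegExMatch version lets the script tell a missing date apart from a found one.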

Re: RegEx to Match Expression With Quotation Marks

Posted: 24 Feb 2018, 12:38
by adamas
Odlanir wrote:

Code: Select all

line = "publishedAt": "2018-02-15T04:00:01.000Z"
MsgBox % substr(RegExReplace(line, ".*publishedAt"":\s*?""(.*)"".*?", "$1") ,1,10)
THANK YOU so very, very, very much! This really helps a ton and, first and foremost, allows me to learn and better understand AutoHotkey's regex!

I modified the code a bit to fit my needs and here's what works for my case:

Code: Select all

MyVariableNameHere := SubStr(RegExReplace(VariableThatHasMyStringToProcess, ".*publishedAt"":\s*?""(.*)"".*?", "$1") ,1,10)

Pure awesomeness! THANK YOU!!! :D

Re: RegEx to Match Expression With Quotation Marks

Posted: 24 Feb 2018, 12:51
by jeeswg
- One concern is, if this is from the YouTube API for example, that there could be multiple lines containing 'publishedAt', so you could end up retrieving text from the wrong line. This is why going down the JSON route is a good idea.
- JSON is a bit like the ini format, for storing data, but it can handle more complicated hierarchies. I wrote some YouTube scripts handling all of the data manually, but everything was far easier when done via an object that handles the JSON text, see the links above. Similarly, handling html via the HTMLFile object, or via an InternetExplorer.Application object, was far better than parsing the text manually.
- Btw there are some RegEx tips here.
jeeswg's RegEx tutorial (RegExMatch, RegExReplace) - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=7&t=28031
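
Editorial note: if staying with regex, the "multiple lines containing publishedAt" concern above can at least be made visible by collecting every match instead of only the first. This is a sketch, not a substitute for proper JSON parsing; the second date in the sample text is made up for illustration, and the variable names are assumptions.

```autohotkey
; AutoHotkey v1 sketch: collect every publishedAt date in the text.
text := """publishedAt"": ""2018-02-15T04:00:01.000Z"",`n""publishedAt"": ""2018-02-14T09:30:00.000Z"","

dates := []
pos := 1
; RegExMatch returns the position of the match (0 if none), so the loop ends by itself.
while (pos := RegExMatch(text, "O)""publishedAt"":\s*""(\d{4}-\d{2}-\d{2})", m, pos))
{
    dates.Push(m[1])
    pos += m.Len(0)     ; continue searching after the current match
}

list := ""
for index, date in dates
    list .= index ": " date "`n"
MsgBox % list
```

Seeing all candidates makes it easier to decide which occurrence is the one that is actually wanted.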