This is useful if you have an HTML file or variable and want to extract a block of data, for example the CONTENT or MAIN div, or a UL that is used for a menu. By default it gets the first occurrence of a tag, so to get more than one tag you have to call it in a loop, for example to get all <p> tags from the HTML data (see Example 5).
The examples below will hopefully illustrate its purpose. And of course it only works with properly formatted HTML.
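For the "HTML file" case, a minimal sketch could look like the following. It assumes the GetNestedTag() function from this post is pasted into the same script, and page.html is only a hypothetical file name used for illustration.

```autohotkey
; Assumption: page.html is a hypothetical file in the script's folder,
; and GetNestedTag() from this post is included elsewhere in the same script.
FileRead, html, %A_ScriptDir%\page.html
If ErrorLevel ; FileRead sets ErrorLevel when the file cannot be read
{
    MsgBox, Could not read page.html
    ExitApp
}
; In an expression, a literal quote inside a string is written as two quotes.
content := GetNestedTag(html, "<div id=""content"">")
If (content = "")
    MsgBox, Tag not found.
Else
    MsgBox, % content
```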
Documentation
http://www.autohotke... ... edTag.html
Function
; GetNestedTag() v1
; AHK Forum Topic : http://www.autohotkey.com/forum/viewtopic.php?t=77653
; Documentation : http://www.autohotkey.net/~hugov/functions/GetNestedTag.html
GetNestedTag(data,tag,occurrence="1")
{
    Start:=InStr(data,tag,false,1,occurrence) ; position of the requested occurrence of the opening tag
    RegExMatch(tag,"i)<([a-z]*)",basetag) ; basetag1 = tag name, letters only (so "<h1>" yields "h")
    Loop
    {
        ; extend the candidate block up to and including the A_Index-th closing tag found from Start
        Until:=InStr(data, "</" basetag1 ">", false, Start, A_Index) + StrLen(basetag1) + 3
        Strng:=SubStr(data, Start, Until - Start)
        StringReplace, strng, strng, <%basetag1%, <%basetag1%, UseErrorLevel ; count opening tags in the block
        OpenCount:=ErrorLevel
        StringReplace, strng, strng, </%basetag1%, </%basetag1%, UseErrorLevel ; count closing tags in the block
        CloseCount:=ErrorLevel
        If (OpenCount = CloseCount) ; balanced, so the block is complete
            Break
        If (A_Index > 250) ; for safety so it won't get stuck in an endless loop,
        { ; it is unlikely to have over 250 nested tags
            strng=
            Break
        }
    }
    If (StrLen(strng) < StrLen(tag)) ; something went wrong/can't find it
        strng=
    Return strng
}
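The function decides whether a block is complete by counting opening and closing tags: StringReplace with UseErrorLevel replaces each match with itself and leaves the number of matches in ErrorLevel. A small stand-alone sketch of just that counting step, using a made-up snippet (not taken from the example page):

```autohotkey
; Illustrative snippet only: an outer <ul> that is cut off after the inner </ul>.
snippet := "<ul><li>a<ul><li>b</li></ul>"

; Replacing "<ul" with itself changes nothing, but ErrorLevel receives the match count.
StringReplace, snippet, snippet, <ul, <ul, UseErrorLevel
OpenCount := ErrorLevel

StringReplace, snippet, snippet, </ul, </ul, UseErrorLevel
CloseCount := ErrorLevel

; The counts differ here, so GetNestedTag() would extend the block to the next </ul> and count again.
MsgBox, Open: %OpenCount% - Close: %CloseCount%
```

Note that the search text is a plain prefix, so a base tag of p would also count tags such as <pre> as openings.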
Examples (the function is included so the script runs out of the box if you simply want to try it)
Gosub, GetHtml
; Example 1 - get DIV header
; Pretty simple as it is not a nested DIV
get=<div id="header">
MsgBox,,Example 1 - get DIV header, % GetNestedTag(html,get)
Result:
<div id="header"><!-- start header -->
<h1>GetNestedTag(data,tag)</h1>
<!-- /end header --></div>
; Example 2 - get content
; More complex as it is a nested DIV
get=<div id="content">
MsgBox,,Example 2 - get DIV content, % GetNestedTag(html,get)
Result:
<div id="content"><!-- start content -->
.... all lines in between ...
<!-- /end content --></div>
; Example 3a - get table data
; Nested TABLE
get=<table id="data">
MsgBox,,Example 3a - get table data, % GetNestedTag(html,get)
; Example 3b - get table subdata
get=<table id="subdata">
MsgBox,,Example 3b - get table subdata, % GetNestedTag(html,get)
; Example 4a - get UL menu
; Nested UL, 3 levels
get=<ul id="menu">
MsgBox,,Example 4a - get UL Menu, % GetNestedTag(html,get)
; Example 4b - get the first UL
get=<ul>
MsgBox,,Example 4b - get UL, % GetNestedTag(html,get)
; Example 5 - get all paragraphs
Loop
{
    tag:=GetNestedTag(html,"<p",A_Index)
    If (tag = "")
        Break
    MsgBox,,Example 5 - get P (%A_Index%/5), % tag
}
ExitApp
GetHtml:
html=
(
<!DOCTYPE html>
<html lang="en-us">
<head>
<title>Example HTML</title>
</head>
<body>
<div id="wrapper"><!-- start wrapper -->
<div id="header"><!-- start header -->
<h1>GetNestedTag(data,tag)</h1>
<!-- /end header --></div>
<div id="navigation"><!-- start navigation -->
<ul id="menu">
<li>Menu option 1</li>
<li>Menu option 2
<ul class="submenu">
<li>Submenu 2.1</li>
<li>Submenu 2.2
<ul><!-- ul1 -->
<li>Submenu 2.2.1</li>
<li>Submenu 2.2.2</li>
<li>Submenu 2.2.3</li>
<li>Submenu 2.2.4</li>
</ul>
</li>
<li>Submenu 2.3</li>
<li>Submenu 2.4</li>
</ul>
</li>
<li>Menu option 3</li>
<li>Menu option 4</li>
<li>Menu option 5
<ul class="submenu">
<li>Submenu 5.1</li>
<li>Submenu 5.2</li>
</ul>
</li>
<li>Menu option 6</li>
</ul>
</div><!-- /end navigation -->
<div id="leftcolumn"><!-- start leftcol -->
<ol>
<li>Scripting</li>
<li>Hotkeys</li>
<li>Automation</li>
</ol>
<!-- /end leftcol --></div>
<div id="content"><!-- start content -->
<div class="intro"><p>AutoHotkey is a free, open-source utility for Windows. With it, you can:</p></div>
<div style="clear:both;"></div>
<ul><!-- ul2 -->
<li>Automate almost anything by sending keystrokes and mouse clicks.</li>
<li>Create hotkeys for keyboard, joystick, and mouse.</li>
<li>Expand abbreviations as you type them.</li>
<li>Create custom data-entry forms, user interfaces, and menu bars.</li>
<li>Remap keys and buttons on your keyboard, joystick, and mouse.</li>
<li>Respond to signals from hand-held remote controls via the WinLIRC client script.</li>
<li>Run existing AutoIt v2 scripts and enhance them with new capabilities.</li>
</ul>
<p>Getting started might be easier than you think. Check out the quick-start tutorial.</p>
<div style="clear:both;"></div>
<p>Here is a nice table with some data:</p>
<table id="data">
<tr>
<th>1</th>
<th>2</th>
</tr>
<tr>
<td>
<table id="subdata">
<tr>
<td>3a</td>
<td>3b</td>
</tr>
</table>
</td>
<td>4</td>
</tr>
<tr>
<td>5</td>
<td>6</td>
</tr>
</table>
<p>Nothing more to report in content.</p>
<!-- /end content --></div>
<div id="rightcolumn"><!-- start rightcol -->
<ol>
<li>Automation</li>
<li>Hotkeys</li>
<li>Scripting</li>
</ol>
<!-- /end rightcol --></div>
<div id="footer">
<p><a href='http://www.autohotkey.com/'>http://www.autohotkey.com/</a></p>
</div>
<!-- /end wrapper --></div>
</body>
</html>
)
Return
; GetNestedTag() v1
; AHK Forum Topic : http://www.autohotkey.com/forum/viewtopic.php?t=77653
; Documentation : http://www.autohotkey.net/~hugov/functions/GetNestedTag.html
GetNestedTag(data,tag,occurrence="1")
{
    Start:=InStr(data,tag,false,1,occurrence) ; position of the requested occurrence of the opening tag
    RegExMatch(tag,"i)<([a-z]*)",basetag) ; basetag1 = tag name, letters only (so "<h1>" yields "h")
    Loop
    {
        ; extend the candidate block up to and including the A_Index-th closing tag found from Start
        Until:=InStr(data, "</" basetag1 ">", false, Start, A_Index) + StrLen(basetag1) + 3
        Strng:=SubStr(data, Start, Until - Start)
        StringReplace, strng, strng, <%basetag1%, <%basetag1%, UseErrorLevel ; count opening tags in the block
        OpenCount:=ErrorLevel
        StringReplace, strng, strng, </%basetag1%, </%basetag1%, UseErrorLevel ; count closing tags in the block
        CloseCount:=ErrorLevel
        If (OpenCount = CloseCount) ; balanced, so the block is complete
            Break
        If (A_Index > 250) ; for safety so it won't get stuck in an endless loop,
        { ; it is unlikely to have over 250 nested tags
            strng=
            Break
        }
    }
    If (StrLen(strng) < StrLen(tag)) ; something went wrong/can't find it
        strng=
    Return strng
}
(I don't even know whether this is possible with COM. It very well might be, but I needed it, so I wrote it.)
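On the COM question, here is a rough sketch of one way it might be done. It is not part of GetNestedTag() and has not been checked against the example page. It assumes AutoHotkey_L (v1.1+), where ComObjCreate() is available, and the Windows "HTMLFile" COM object. The HTML string is made up for illustration.

```autohotkey
; Assumption: AutoHotkey_L / v1.1+ on Windows, with the "HTMLFile" COM object available.
; The markup below is a made-up sample, not the example page from this post.
html := "<html><body><div id=""content""><div class=""intro""><p>Hello</p></div></div></body></html>"

doc := ComObjCreate("HTMLFile") ; in-memory HTML document
doc.write(html)                 ; let the HTML parser build the element tree
doc.close()

; outerHTML returns the element including its nested children.
result := doc.getElementById("content").outerHTML
MsgBox, % result
```

Because the markup is re-serialized by the parser, the returned text may differ from the source (for example in tag case or attribute quoting), whereas GetNestedTag() returns the original text unchanged.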




