What is the fastest way to retrieve all text nodes of an HTMLDocument in C#?

Question

I need to perform some logic on all the text nodes of an HTMLDocument. This is how I currently do it:

HTMLDocument pageContent = (HTMLDocument)_webBrowser2.Document;
IHTMLElementCollection myCol = pageContent.all;
foreach (IHTMLDOMNode myElement in myCol)
{
    foreach (IHTMLDOMNode child in (IHTMLDOMChildrenCollection)myElement.childNodes)
    {
        if (child.nodeType == 3)
        {
            // Do something with the text node!
        }
    }
}

Since some of the elements in myCol also have children that are themselves in myCol, I visit some nodes more than once. Is there a better way to do this?

Recommended answer

It might be best to iterate over the childNodes (direct descendants) within a recursive function, starting at the top level, so that each node is reached exactly once through its parent. Something like:

// pageContent is the mshtml HTMLDocument from the question; documentElement is its root <html> element.
IHTMLDOMNode htmlNode = (IHTMLDOMNode)pageContent.documentElement;
ProcessChildNodes(htmlNode);

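private void ProcessChildNodes(IHTMLDOMNode node)
{
    // childNodes is typed as object by the interop assembly, so cast it before iterating.
    foreach (IHTMLDOMNode childNode in (IHTMLDOMChildrenCollection)node.childNodes)
    {
        if (childNode.nodeType == 3) // 3 = text node
        {
            // ...
        }
        else
        {
            // Text nodes have no children, so only other node types are descended into.
            ProcessChildNodes(childNode);
        }
    }
}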
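
If very deep documents are a concern, the same walk can be written without recursion. The following is only a sketch under assumptions that are not part of the original answer: the class name TextNodeWalker and the method TextNodes are hypothetical, and the project is assumed to reference the Microsoft.mshtml interop assembly. It uses the lastChild and previousSibling members of IHTMLDOMNode instead of the childNodes collection. Whether this is actually faster than the recursive version would need to be measured.

```csharp
using System.Collections.Generic;
using mshtml;

// Hypothetical helper (not from the original answer): yields every text node
// below 'root' in document order, using an explicit stack instead of recursion.
static class TextNodeWalker
{
    public static IEnumerable<IHTMLDOMNode> TextNodes(IHTMLDOMNode root)
    {
        var pending = new Stack<IHTMLDOMNode>();
        pending.Push(root);

        while (pending.Count > 0)
        {
            IHTMLDOMNode node = pending.Pop();

            if (node.nodeType == 3) // 3 = text node
            {
                yield return node;
                continue; // text nodes have no children
            }

            // Push children last-to-first so they are popped first-to-last.
            for (IHTMLDOMNode child = node.lastChild; child != null; child = child.previousSibling)
            {
                pending.Push(child);
            }
        }
    }
}
```

Assuming pageContent is the HTMLDocument from the question, it could be used as foreach (IHTMLDOMNode textNode in TextNodeWalker.TextNodes((IHTMLDOMNode)pageContent.documentElement)) { ... }.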
