Extracting HTML source code that has no class or div (Python Selenium)

Question

I have a list that does not contain any unique div or class.

I want to copy the HTML source code for each row below, but I cannot find a class for the row.

When I open 'Edit HTML', I see the following code:

<tr style="font-size: 11px">
              <td class="center"><a href="/countries/1"><img alt="" src="/assets/flags/flag_1-1db156e1884c1b3d5614b55996cf96cd38843b290c7c43bdd5abbdb944b4075c.gif"></a></td>
              <td><a href="/employees/9526577">Bernard Aarslev</a></td>
              <td><a href="/clubs/1200094">Kirslev FC</a></td>
              <td align="right" style="padding-right: 5px;">69</td>
              <td>Talentspejder</td>
              <td>Talentspejder</td>
              <td align="right" style="padding-right: 5px;">24.000 C</td>
              <td class="center" style="width: 120px;">
                <div class="relative">
                  <div id="stats9526577" style="z-index: 99; position: absolute; top: -80px; right: 80px; display: none;" class="dark"></div>
                  <img src="/assets/detaljer-c83987d00da87f2fa8810793cc815a1659249440edee3c0d084333bc69323384.gif" alt="stats" onmouseout="hide_stats(9526577);" onmouseover="view_stats(9526577, 14, 13, 4, 7, 10, 8, 6, 3);">
                </div>
              </td>
            </tr>

How do I write the correct find_element_by_xpath call to make this work?

Answer

You can use:

node = driver.find_element_by_xpath("//table[@class='stretch']//tr[@style][1]")

First we look for the table element with a specific @class attribute (here 'stretch'; the table itself is not shown in the snippet above). Then we look for the first tr element in that table that has a @style attribute.
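
To see what this expression selects without starting a browser, here is a minimal offline sketch using Python's standard xml.etree.ElementTree, which supports only a subset of XPath but covers the attribute predicates used here. The sample markup is a simplified, well-formed stand-in for the real page: the 'other' table, the header row, and the "Second employee" row are hypothetical and exist only to show what is and is not matched.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for the page (hypothetical markup, not the real site).
SAMPLE = (
    "<html><body>"
    "<table class='other'><tr style='x'><td>ignored</td></tr></table>"
    "<table class='stretch'>"
    "<tr><th>Name</th></tr>"  # header row without a style attribute
    "<tr style='font-size: 11px'><td>Bernard Aarslev</td></tr>"
    "<tr style='font-size: 11px'><td>Second employee</td></tr>"
    "</table>"
    "</body></html>"
)

root = ET.fromstring(SAMPLE)

# Same logic as the answer: rows with a style attribute, inside the 'stretch' table.
rows = root.findall(".//table[@class='stretch']//tr[@style]")

# Serialize each row with its tags, comparable to reading outerHTML in Selenium.
rows_html = [ET.tostring(row, encoding="unicode") for row in rows]

for html in rows_html:
    print(html)
```

Only the two styled rows of the 'stretch' table are returned; the header row and the row in the other table are skipped.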

EDIT: More details, since this was downvoted. Combine the preceding expression with element.get_attribute('outerHTML') (to keep the tags). So:

data = node.get_attribute('outerHTML')

If you need this for all the rows of the table, then:

rows = driver.find_elements_by_xpath("//table[@class='stretch']//tr[@style]")
# Collect the outerHTML of every matching row instead of overwriting one variable
data = [row.get_attribute('outerHTML') for row in rows]
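
Note that the find_element_by_xpath and find_elements_by_xpath helpers belong to the older Selenium API and were removed in later Selenium 4 releases. The sketch below shows the same lookups with the generic find_element / find_elements methods. The function names first_row_html and all_rows_html are made up for illustration, and driver is assumed to be an already started WebDriver that has loaded the page.

```python
# XPath from the answer: styled rows inside the table with class 'stretch'.
ROW_XPATH = "//table[@class='stretch']//tr[@style]"

# In Selenium, By.XPATH (selenium.webdriver.common.by) is the plain string
# "xpath". The literal is used here so the sketch needs no imports; with
# Selenium installed you would normally write By.XPATH instead.
XPATH = "xpath"


def first_row_html(driver):
    """Return the outerHTML of the first matching row (hypothetical helper)."""
    row = driver.find_element(XPATH, ROW_XPATH)
    return row.get_attribute("outerHTML")


def all_rows_html(driver):
    """Return the outerHTML of every matching row as a list (hypothetical helper)."""
    rows = driver.find_elements(XPATH, ROW_XPATH)
    return [row.get_attribute("outerHTML") for row in rows]
```

With a live session this would be called as all_rows_html(driver) after driver.get(...) has loaded the page.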
