Problem description
I have an HTML document, and somewhere inside it is the table shown below. I can get the table rows as Java DOM objects. What is not clear to me is how to extract the value of a table cell, both when the value is a string and when it is a binary resource such as an image.
I am using code like:
XPath xpath;
XPathExpression expr;
NodeList nodes=null;
// Use XPath to obtain whatever you want from the (X)HTML
try{
xpath = XPathFactory.newInstance().newXPath();
//<table class="data">
NodeList list = doc.getElementsByTagName("table");
// Node node = list.item(0);
//System.out.println(node.getTextContent());
//String textContent=node.getTextContent();
expr = xpath.compile("//table/tr/td");
nodes = (NodeList)expr.evaluate(doc, XPathConstants.NODESET);
and looping like:
for (int i = 0; i < nodes.getLength(); i++) {
Node ln = list.item(i);
String lnText=ln.toString();
NodeList rowElements=ln.getChildNodes();
Node one=rowElements.item(0);
String oneText=one.toString();
String nodeName=one.getNodeName();
String valOne = one.getNodeValue();
But I am not seeing the values in the table.
<table class="data">
<tr><td>ImageName1</td><td width="50"></td><td><img src="/images/036000291452" alt="036000291452" /></td></tr>
<tr><td>ImageName2</td><td width="50"></td><td><img src="/images/36000291452" alt="36000291452" /></td></tr>
<tr><td>Description</td><td></td><td>Time Magazine</td></tr>
<tr><td>Size/Weight</td><td></td><td>14 Issues</td></tr>
<tr><td>Issuing Country</td><td></td><td>United States</td></tr>
</table>
This XPath expression:
/*/tr[1]/td[1]
selects the td element (in no namespace) that is the first child of the first tr child of the top element (table) of the provided XML document.
The XPath expression:
/*/tr[1]/td[2]
selects the td element (in no namespace) that is the second child of the first tr child of the top element (table) of the provided XML document.
In general:
/*/tr[$m]/td[$n]
selects the td element (in no namespace) that is the $n-th child of the $m-th tr child of the top element (table) of the provided XML document. Just replace $m and $n with the desired integer values.
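As a minimal sketch of the general expression in Java, the snippet below builds /*/tr[m]/td[n] for given integers and evaluates it with the standard javax.xml.xpath API. The class name CellSelector, the helper selectCell, and the TABLE constant are hypothetical names introduced only for illustration, and the sketch assumes, as the expression does, that table is the top element of the document. It reads the cell with getTextContent(), because getNodeValue() returns null for element nodes.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class CellSelector {
    // The table from the question, used as a standalone XML document (assumption).
    static final String TABLE =
        "<table class=\"data\">"
        + "<tr><td>ImageName1</td><td width=\"50\"></td>"
        + "<td><img src=\"/images/036000291452\" alt=\"036000291452\" /></td></tr>"
        + "<tr><td>ImageName2</td><td width=\"50\"></td>"
        + "<td><img src=\"/images/36000291452\" alt=\"36000291452\" /></td></tr>"
        + "<tr><td>Description</td><td></td><td>Time Magazine</td></tr>"
        + "<tr><td>Size/Weight</td><td></td><td>14 Issues</td></tr>"
        + "<tr><td>Issuing Country</td><td></td><td>United States</td></tr>"
        + "</table>";

    // Hypothetical helper: returns the td element in row m, column n (both 1-based),
    // or null when no such cell exists.
    static Node selectCell(String xml, int m, int n) {
        String expression = "/*/tr[" + m + "]/td[" + n + "]";
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return (Node) xpath.evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NODE);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        Node cell = selectCell(TABLE, 3, 3);
        // getTextContent() concatenates the text of the element's descendants.
        System.out.println(cell.getTextContent()); // prints "Time Magazine"
    }
}
```

Returning null for a missing cell mirrors what the XPath API does for an empty NODE result, so callers should check for null before reading the text.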
You can use the standard XPath function string() to obtain their string value:
string(/*/tr[$m]/td[$n])
evaluates to the string value of the td element (in no namespace) that is the $n-th child of the $m-th tr child of the top element (table) of the provided XML document.
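The string() approach could look roughly like this in Java. This is a sketch under the same assumption that table is the top element; CellValues, cellText, and imageSource are hypothetical names. The second helper goes beyond what the answer states: for a cell holding an image, it reads the img element's src attribute, which is only the resource's address. Fetching the binary data itself would be a separate step not shown here.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class CellValues {
    // The table from the question, used as a standalone XML document (assumption).
    static final String TABLE =
        "<table class=\"data\">"
        + "<tr><td>ImageName1</td><td width=\"50\"></td>"
        + "<td><img src=\"/images/036000291452\" alt=\"036000291452\" /></td></tr>"
        + "<tr><td>ImageName2</td><td width=\"50\"></td>"
        + "<td><img src=\"/images/36000291452\" alt=\"36000291452\" /></td></tr>"
        + "<tr><td>Description</td><td></td><td>Time Magazine</td></tr>"
        + "<tr><td>Size/Weight</td><td></td><td>14 Issues</td></tr>"
        + "<tr><td>Issuing Country</td><td></td><td>United States</td></tr>"
        + "</table>";

    // Evaluates an XPath expression against TABLE and returns the result as a string.
    static String evaluate(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(TABLE)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate " + expression, e);
        }
    }

    // String value of the td in row m, column n (both 1-based); empty if there is none.
    static String cellText(int m, int n) {
        return evaluate("string(/*/tr[" + m + "]/td[" + n + "])");
    }

    // Value of the src attribute of an img inside that cell; empty if there is none.
    static String imageSource(int m, int n) {
        return evaluate("string(/*/tr[" + m + "]/td[" + n + "]/img/@src)");
    }

    public static void main(String[] args) {
        System.out.println(cellText(5, 3));    // prints "United States"
        System.out.println(imageSource(1, 3)); // prints "/images/036000291452"
    }
}
```

Because string() yields an empty string when nothing matches, an empty result can mean either an empty cell or a missing one; select the node first if the two cases must be told apart.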