Problem description
I've created my own DefaultHandler to parse RSS feeds, and for most feeds it works fine. For ESPN, however, it cuts off part of the article URL because of the way ESPN formats its URLs. An example of a full article URL from ESPN:
http://sports.espn.go.com/nba/news/story?id=5189101&campaign=rss&source=ESPNHeadlines
The problem is that, for some reason, the DefaultHandler characters() method only receives this from the tag that contains the above URL:
http://sports.espn.go.com/nba/news/story?id=5189101
As you can see, everything from the ampersand escape code onward is cut off the URL. How can I get the SAX parser to not cut my string off at this escape code? For reference, here is my characters() method:
public void characters(char ch[], int start, int length) {
    String chars = (new String(ch).substring(start, start + length));
    try {
        // If not in item, then title/link refers to feed
        if (!inItem) {
            if (inTitle)
                currentFeed.title = chars;
        } else {
            if (inLink)
                currentArticle.url = new URL(chars);
            if (inTitle)
                currentArticle.title = chars;
            if (inDescription)
                currentArticle.description = chars;
            if (inPubDate)
                currentArticle.pubDate = chars;
            if (inEnclosure) {
            }
        }
    } catch (MalformedURLException e) {
        Log.e("RSSReader", e.toString());
    }
}
Rob W.
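Solution
From the documentation of the characters() method: SAX parsers may return all contiguous character data in a single chunk, or they may split it into several chunks.
When I write SAX parsers, I use a StringBuilder to append everything passed to characters():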
public void characters(char ch[], int start, int length) {
    if (buf != null) {
        for (int i = start; i < start + length; i++) {
            buf.append(ch[i]);
        }
    }
}
Then in endElement(), I take the contents of the StringBuilder and do something with it. That way, if the parser calls characters() several times, I don't miss anything.
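To make the buffering approach concrete, here is a minimal sketch of a complete handler, not the asker's actual code. The class name LinkCollectingHandler, the parseLinks helper, and the tiny in-memory feed are all assumptions made up for illustration. The handler starts a fresh StringBuilder in startElement(), appends every chunk in characters(), and only reads the result in endElement(), so a URL containing &amp; comes through whole even if the parser delivers it in several pieces.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler for illustration: collects the text of every <link> element.
class LinkCollectingHandler extends DefaultHandler {
    private final List<String> links = new ArrayList<>();
    private StringBuilder buf; // non-null only while inside a <link> element

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if ("link".equals(qName)) {
            buf = new StringBuilder(); // start accumulating for this element
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times per element; append each chunk.
        if (buf != null) {
            buf.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("link".equals(qName) && buf != null) {
            links.add(buf.toString().trim()); // the full text is only known here
            buf = null;
        }
    }

    // Hypothetical helper: parses an XML string and returns the collected link texts.
    static List<String> parseLinks(String xml) throws Exception {
        LinkCollectingHandler handler = new LinkCollectingHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.links;
    }

    public static void main(String[] args) throws Exception {
        // Made-up single-item feed using the URL from the question.
        String xml = "<rss><channel><item><link>"
                + "http://sports.espn.go.com/nba/news/story?id=5189101"
                + "&amp;campaign=rss&amp;source=ESPNHeadlines"
                + "</link></item></channel></rss>";
        System.out.println(parseLinks(xml));
    }
}
```

Note that this sketch collects every <link> element, including a feed-level one if present; a real reader would still track whether it is inside an item, as the question's code does with inItem.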