I'm trying to parse an RSS feed using the XOM Java library. Each entry's image URL is stored as an attribute of an <img> element, as shown below.
<rss version="2.0">
<channel>
<item>
<title>Decision Paralysis</title>
<link>https://xkcd.com/1801/</link>
<description>
<img src="https://imgs.xkcd.com/comics/decision_paralysis.png"/>
</description>
<pubDate>Mon, 20 Feb 2017 05:00:00 -0000</pubDate>
<guid>https://xkcd.com/1801/</guid>
</item>
</channel>
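</rss>
Trying to parse <img src=""> with .getFirstChildElement("img") only returns null, so my code crashes when it tries to retrieve <img src= ...>. Why can't my program read the <img> element, and how do I read it correctly?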
import nu.xom.*;
public class RSSParser {
public static void main(String[] args) {
try {
Builder parser = new Builder();
Document doc = parser.build ( "https://xkcd.com/rss.xml" );
Element rootElement = doc.getRootElement();
Element channelElement = rootElement.getFirstChildElement("channel");
Elements itemList = channelElement.getChildElements("item");
// Iterate through itemList
for (int i = 0; i < itemList.size(); i++) {
Element item = itemList.get(i);
Element descElement = item.getFirstChildElement("description");
Element imgElement = descElement.getFirstChildElement("img");
// Crashes with NullPointerException
String imgSrc = imgElement.getAttributeValue("src");
}
}
catch (Exception error) {
error.printStackTrace();
System.exit(1);
}
}
}
Posted on 2016-12-01 07:11:42
The item contains no img element. Try a null check:
if (imgElement != null) {
String imgSrc = imgElement.getAttributeValue("src");
}
The item actually contains the following:
<description><img
src="http://imgs.xkcd.com/comics/us_state_names.png"
title="Technically DC isn't a state, but no one is too
pedantic about it because they don't want to disturb the snakes
."
alt="Technically DC isn't a state, but no one is too pedantic about it because they don't want to disturb the snakes." />
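</description>
That is not an img element. It is plain text: the markup inside <description> is escaped in the feed (for example &lt;img ... /&gt;), so the XML parser delivers it as a text node rather than as a child element, and getFirstChildElement("img") returns null.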
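Since the description holds markup as text, one option is to parse that text a second time as its own small XML fragment. The sketch below is only an illustration under assumptions: it uses the JDK's built-in DOM parser instead of XOM so that it runs without extra dependencies, the class name DescriptionImage and method extractSrc are hypothetical, and it assumes the description text is well-formed XML (a self-closing <img .../>). Real-world HTML in a description is often not well-formed, in which case the parse fails.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of XOM or of the original question.
public class DescriptionImage {

    /**
     * Parses the text of a <description> as an XML fragment and returns the
     * src attribute of the first <img>, or null if there is no <img>.
     * Throws IllegalArgumentException if the text is not well-formed XML.
     */
    public static String extractSrc(String descriptionText) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Wrap the fragment in a dummy root so it forms a complete document.
            String wrapped = "<root>" + descriptionText + "</root>";
            Document doc = builder.parse(new InputSource(new StringReader(wrapped)));
            NodeList imgs = doc.getElementsByTagName("img");
            if (imgs.getLength() == 0) {
                return null;
            }
            Element img = (Element) imgs.item(0);
            return img.getAttribute("src");
        } catch (Exception e) {
            throw new IllegalArgumentException("Description is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        // In the XOM code above, this string would come from descElement.getValue().
        String text = "<img src=\"https://imgs.xkcd.com/comics/decision_paralysis.png\"/>";
        System.out.println(extractSrc(text));
    }
}
```

With XOM, the string passed to extractSrc would be the value of the description element, which is the unescaped text.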
Posted on 2017-02-18 00:20:21
I managed to come up with a somewhat hacky solution using regular expressions and pattern matching.
// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
// Iterate through itemList
for (int i = 0; i < itemList.size(); i++) {
Element item = itemList.get(i);
String descString = item.getFirstChildElement("description").getValue();
// Parse image URL (hacky)
String imgSrc = "";
Pattern pattern = Pattern.compile("src=\"[^\"]*\"");
Matcher matcher = pattern.matcher(descString);
if (matcher.find()) {
imgSrc = descString.substring( matcher.start()+5, matcher.end()-1 );
}
}
https://stackoverflow.com/questions/40687122
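The regex approach above can be tightened slightly by compiling the pattern once and using a capturing group instead of the start()+5 / end()-1 offsets. This is a minimal sketch, not the answerer's code: the class name ImgSrcRegex and method findSrc are hypothetical, and it assumes the src value is always wrapped in double quotes, as in the feed excerpt.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper illustrating the same regex idea with a capturing group.
public class ImgSrcRegex {

    // Compiled once and reused; group 1 captures the text between the quotes.
    private static final Pattern SRC = Pattern.compile("src=\"([^\"]*)\"");

    /** Returns the first src="..." value found in the text, if any. */
    public static Optional<String> findSrc(String descriptionText) {
        Matcher matcher = SRC.matcher(descriptionText);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // In the XOM loop, this string would be the description's getValue().
        String text = "<img src=\"http://imgs.xkcd.com/comics/us_state_names.png\" alt=\"x\" />";
        System.out.println(findSrc(text).orElse("(no image)"));
    }
}
```

Returning an Optional makes the "no image in this item" case explicit, instead of leaving imgSrc as an empty string.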