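I am trying to fetch the content of a Chinese web page using Python and BeautifulSoup. However, when I print the result, the output is not what I expect. Can anyone tell me why? (PS: I have also tried some other web pages; sometimes the code works and sometimes it does not.) Here is my code: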
# _*_ coding:utf-8 _*_
from bs4 import BeautifulSoup
import urllib
import urllib2

url = 'http://finance.sina.com.cn/chanjing/cyxw/2015-12-17/doc-ifxmttcn4893506.shtml'
user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
headers = {'User-Agent': user_agent}
try:
    request = urllib2.Request(url)
    response = urllib2.urlopen(request)
    html = response.read()
    content = BeautifulSoup(response)
    print content
except urllib2.URLError, e:
    if hasattr(e, "code"):
        print e.code
    if hasattr(e, "reason"):
        print e.reason

This is my result: [image]
Posted on 2015-12-17 06:07:37
Try this:
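import requests
from bs4 import BeautifulSoup

page = requests.get('http://finance.sina.com.cn/chanjing/cyxw/2015-12-17/doc-ifxmttcn4893506.shtml')
print page.text
soup = BeautifulSoup(page.text)
# prettify() returns the formatted markup; it does not modify soup in place
print soup.prettify()

Posted on 2015-12-17 06:01:55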
Try:
content = BeautifulSoup(html)

https://stackoverflow.com/questions/34327553
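
A likely reason that passing `html` instead of `response` helps: the question's code calls `response.read()` first, which consumes the response body, so a parser that later reads from `response` receives nothing. The following is only a minimal sketch of that behaviour, written in Python 3 syntax. It uses `io.BytesIO` as a stand-in for the HTTP response object (an assumption for illustration; no network request is made), and `read_twice` is a hypothetical helper, not part of any library.

```python
import io


def read_twice(stream):
    """Read a file-like object twice and return both results."""
    first = stream.read()   # consumes everything up to end of stream
    second = stream.read()  # the stream is already at its end
    return first, second


# Stand-in for the object returned by urlopen(); an assumption for illustration.
fake_response = io.BytesIO(b"<html><body><p>hello</p></body></html>")

html, leftover = read_twice(fake_response)
print(html)      # the full markup
print(leftover)  # empty bytes: nothing is left for a second reader
```

Under that assumption, keeping the bytes returned by the first `read()` in a variable and parsing that variable avoids handing the parser an exhausted stream.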