python - urllib2.HTTPError: HTTP Error 404: Not Found for valid url -




I'm using the Python opengraph library to parse a website's Open Graph tags: https://github.com/erikriver/opengraph.

    import opengraph

    url = 'http://www.foxnews.com/world/2014/10/20/uk-gun-owners-now-subject-to-warrantless-home-searches/'
    og = opengraph.OpenGraph(url=url)
    print og.to_json()

When I run the script, I get the following error:

    Traceback (most recent call last):
      File "test.py", line 16, in <module>
        raw = urllib2.urlopen(url)
      File "/usr/lib/python2.7/urllib2.py", line 127, in urlopen
        return _opener.open(url, data, timeout)
      File "/usr/lib/python2.7/urllib2.py", line 410, in open
        response = meth(req, response)
      File "/usr/lib/python2.7/urllib2.py", line 523, in http_response
        'http', request, response, code, msg, hdrs)
      File "/usr/lib/python2.7/urllib2.py", line 448, in error
        return self._call_chain(*args)
      File "/usr/lib/python2.7/urllib2.py", line 382, in _call_chain
        result = func(*args)
      File "/usr/lib/python2.7/urllib2.py", line 531, in http_error_default
        raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
    urllib2.HTTPError: HTTP Error 404: Not Found

urllib2 is used under the hood to grab the HTML before it is parsed: https://github.com/erikriver/opengraph/blob/master/opengraph/opengraph.py#L50-L52

Why am I receiving a 404 error? I can access the URL in a browser, and I can retrieve the Open Graph tags for the URL using the PHP library https://github.com/scottmac/opengraph.

The Python library is able to retrieve Open Graph tags for other URLs, so this URL seems to be an anomaly.

Updated:

You get a 404 response because the request doesn't pass a User-Agent header. I installed opengraph in a virtualenv to test it, and it works after adding the missing User-Agent to the headers:

    url = 'http://www.foxnews.com/world/2014/10/20/uk-gun-owners-now-subject-to-warrantless-home-searches/'
    req = opengraph.opengraph.urllib2.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
    og = opengraph.OpenGraph()
    og.parser(opengraph.opengraph.urllib2.urlopen(req).read())
    og.to_json()
    '{"site_name": "Fox News", "description": "Registered gun owners in the United Kingdom are now subject to unannounced visits to their homes under new guidance that allows police to inspect firearms storage without a warrant.", "title": "UK gun owners now subject to warrantless home searches", "url": "http://www.foxnews.com/world/2014/10/20/uk-gun-owners-now-subject-to-warrantless-home-searches/", "image": "http://global.fncstatic.com/static/v/all/img/fn_128x128.png", "scrape": false, "_url": null, "type": "article"}'
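The same idea can be written without reaching into the library's internals. The following is only a minimal sketch: the helper names default_user_agent, build_request and fetch_html are hypothetical and not part of opengraph, and the import falls back to urllib2 so it should also run on Python 2.7 as used in the question. It also shows what the default opener sends when no User-Agent is set explicitly, which is presumably the value this particular server rejects.

```python
try:
    # Python 3
    from urllib.request import Request, build_opener, urlopen
except ImportError:
    # Python 2, as used in the question
    from urllib2 import Request, build_opener, urlopen


def default_user_agent():
    """Return the User-Agent the default opener sends when none is set."""
    return dict(build_opener().addheaders).get("User-agent")


def build_request(url, user_agent="Mozilla/5.0"):
    """Build a request that carries an explicit User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})


def fetch_html(url, user_agent="Mozilla/5.0", timeout=10):
    """Download the page body; urlopen raises HTTPError on a 4xx/5xx reply."""
    response = urlopen(build_request(url, user_agent), timeout=timeout)
    try:
        return response.read()
    finally:
        response.close()
```

The bytes returned by fetch_html(url) could then be passed to og.parser(...) as in the snippet above. Whether a given site accepts 'Mozilla/5.0' is up to that site, so treat the default value as an assumption.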

python python-2.7 facebook-opengraph urllib2 opengraph
