python - AttributeError: 'int' object has no attribute 'rindex'
setup
I'm using Scrapy 0.24.4 and scrapy-elasticsearch 0.5 to scrape a website and store the results in an Elasticsearch instance I have running.
I used this blog post to set it up, with a minor modification documented here.
settings.py
BOT_NAME = 'blah'
SPIDER_MODULES = ['blah.spiders']
NEWSPIDER_MODULE = 'blah.spiders'
ITEM_PIPELINES = [
    'scrapyelasticsearch.scrapyelasticsearch.ElasticSearchPipeline', 100
]
ELASTICSEARCH_SERVER = 'localhost'
ELASTICSEARCH_PORT = 9200
ELASTICSEARCH_INDEX = 'scrapy'
ELASTICSEARCH_TYPE = 'items'
problem
If I run the following command to scrape the website:
scrapy crawl wiki -o wiki.json
With ITEM_PIPELINES commented out, it works correctly and exports the results to the wiki.json file.
With ITEM_PIPELINES uncommented (i.e. set to enable piping the results to Elasticsearch), I get the following error:
File "/usr/local/lib/python2.7/dist-packages/scrapy/utils/misc.py", line 34, in load_object
    dot = path.rindex('.')
AttributeError: 'int' object has no attribute 'rindex'
notes
This may or may not be relevant: I had to alter my local copy of the ElasticSearchPipeline Python file to comment out this block, which was causing syntax errors at the point of indexing using uniq_id.
Any help would be hugely appreciated.
Stupid, stupid, stupid. It was a syntax error!
Having ITEM_PIPELINES as a list is deprecated; it needs to be a dictionary, and my attempt at converting it to a dictionary was terribly mangled:
ITEM_PIPELINES = [ 'scrapyelasticsearch.scrapyelasticsearch.ElasticSearchPipeline', 100 ]
That's not a valid value. It is still a list, now with the priority 100 as a second element, and Scrapy tries to load that integer as a pipeline path. It should have been:
ITEM_PIPELINES = { 'scrapyelasticsearch.scrapyelasticsearch.ElasticSearchPipeline': 100 }
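To see why the list form produces this exact error, here is a minimal sketch that does not use Scrapy at all. The helpers load_object_sketch, pipeline_paths and try_load are hypothetical stand-ins written for illustration, not Scrapy's actual code, and 'collections.OrderedDict' is only a placeholder for a real pipeline path so the example runs without any third-party packages.

```python
from importlib import import_module


def load_object_sketch(path):
    """Simplified stand-in for a dotted-path loader (hypothetical, not Scrapy's code)."""
    dot = path.rindex('.')  # raises AttributeError when path is an int
    module_name, attr = path[:dot], path[dot + 1:]
    return getattr(import_module(module_name), attr)


def pipeline_paths(setting):
    """Return the entries that would be treated as pipeline class paths."""
    if isinstance(setting, dict):
        # Dict form: keys are paths, values are priorities used for ordering.
        return [path for path, _ in sorted(setting.items(), key=lambda kv: kv[1])]
    # List form: every element is treated as a path, including a stray 100.
    return list(setting)


def try_load(setting):
    """Load every path in the setting, or return the error message on failure."""
    try:
        return [load_object_sketch(path) for path in pipeline_paths(setting)]
    except AttributeError as exc:
        return str(exc)


# Placeholder path standing in for the real pipeline class (assumption).
broken = ['collections.OrderedDict', 100]
fixed = {'collections.OrderedDict': 100}

broken_result = try_load(broken)
fixed_result = try_load(fixed)
print(broken_result)
print(fixed_result)
```

In the list form the integer is handed to the loader as if it were a path, which is where the rindex call fails; in the dict form it is only ever used as a priority.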
python scrapy