I am using Scrapy with Python, and I have this code in my Scrapy project's pipeline:
- def process_item(self,item,spider):
-     import pdb; pdb.set_trace()
-     ID = str(uuid.uuid5(uuid.NAMESPACE_DNS,item['link']))
I got this error:
- Traceback (most recent call last):
-   File "C:\Python27\lib\site-packages\scrapy-0.20.2-py2.7.egg\scrapy\middleware.py", line 62, in _process_chain
-     return process_chain(self.methods[methodname],obj,*args)
-   File "C:\Python27\lib\site-packages\scrapy-0.20.2-py2.7.egg\scrapy\utils\defer.py", line 65, in process_chain
-     d.callback(input)
-   File "C:\Python27\lib\site-packages\twisted\internet\defer.py", line 382, in callback
-     self._startRunCallbacks(result)
-   File "C:\Python27\lib\site-packages\twisted\internet\defer.py", line 490, in _startRunCallbacks
-     self._runCallbacks()
- --- <exception caught here> ---
-   File "C:\Python27\lib\site-packages\twisted\internet\defer.py", line 577, in _runCallbacks
-     current.result = callback(current.result,*args,**kw)
-   File "General_Spider_code_version_2\pipelines.py", line 7, in process_item
-     ID = str(uuid.uuid5(uuid.NAMESPACE_DNS,item['link']))
-   File "C:\Python27\lib\uuid.py", line 549, in uuid5
-     hash = sha1(namespace.bytes + name).digest()
- exceptions.UnicodeDecodeError: 'ascii' codec can't decode byte 0xa7 in position 1: ordinal not in range(128)
I tried debugging item['link'], and this is the result:
- -> ID = str(uuid.uuid5(uuid.NAMESPACE_DNS,item['link']))
- (Pdb) item['link']
- u'http://dubai.dubizzle.com/property-for-rent/residential/apartmentflat/2014/4/6/palm-jumeirah-abu-keibal-3-br-maid-partial-2/?back=ZHViYWkuZHViaXp6bGUuY29tL3Byb3BlcnR5LWZvci1yZW50L3Jlc2lkZW50aWFsL2FwYXJ0bWVudGZsYXQv&pos=1'
- (Pdb)
As you can see, item['link'] is unicode.
EDIT1
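Solution
In Python 2, uuid5 computes namespace.bytes + name. When name is unicode, Python implicitly decodes namespace.bytes as ASCII before concatenating, and the second byte of uuid.NAMESPACE_DNS is 0xa7, which is the byte at position 1 that the error reports. Encode the unicode string to a byte string with .encode('utf-8') and it should work: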
- str(uuid.uuid5(uuid.NAMESPACE_DNS,item['link'].encode('utf-8')))
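If the link can arrive as either text or bytes, the encoding step can be wrapped in a small helper. The following is only a sketch: make_item_id is a hypothetical name, not part of Scrapy or the uuid module. Instead of calling uuid.uuid5, it repeats the hashing line shown in the traceback, so that the same code runs on Python 2 and Python 3.

```python
import hashlib
import uuid


def make_item_id(link):
    """Return a version-5 UUID string for a link given as text or bytes."""
    # Hypothetical helper: make sure the name is a byte string before hashing.
    if not isinstance(link, bytes):
        link = link.encode('utf-8')
    # Same computation as the uuid.py line in the traceback,
    # but both operands are byte strings, so no implicit decoding happens.
    digest = hashlib.sha1(uuid.NAMESPACE_DNS.bytes + link).digest()
    return str(uuid.UUID(bytes=digest[:16], version=5))
```

Inside process_item this would be used as ID = make_item_id(item['link']).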