I have designed a crawler that contains two spiders, both built with Scrapy.
These spiders run independently, each fetching its input data from a database.
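We run these spiders with the Twisted reactor, and we know that the reactor cannot be started more than once.
We give the second spider about 500 links to crawl.
When we do this, we run into a port error; it looks as though Scrapy uses only a single port: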
Error caught on signal handler: <bound method ?.start_listening of <scrapy.telnet.TelnetConsole instance at 0x0467B440>>
Traceback (most recent call last):
  File "C:\Python27\lib\site-packages\twisted\internet\defer.py", line 1070, in _inlineCallbacks
    result = g.send(result)
  File "C:\Python27\lib\site-packages\scrapy-0.16.5-py2.7.egg\scrapy\core\engine.py", line 75, in start
    yield self.signals.send_catch_log_deferred(signal=signals.engine_started)
  File "C:\Python27\lib\site-packages\scrapy-0.16.5-py2.7.egg\scrapy\signalmanager.py", line 23, in send_catch_log_deferred
    return signal.send_catch_log_deferred(*a, **kw)
  File "C:\Python27\lib\site-packages\scrapy-0.16.5-py2.7.egg\scrapy\utils\signal.py", line 53, in send_catch_log_deferred
    *arguments, **named)
--- <exception caught here> ---
  File "C:\Python27\lib\site-packages\twisted\internet\defer.py", line 137, in maybeDeferred
    result = f(*args, **kw)
  File "C:\Python27\lib\site-packages\scrapy-0.16.5-py2.7.egg\scrapy\xlib\pydispatch\robustapply.py", line 47, in robustApply
    return receiver(*arguments, **named)
  File "C:\Python27\lib\site-packages\scrapy-0.16.5-py2.7.egg\scrapy\telnet.py", in start_listening
    self.port = listen_tcp(self.portrange, self.host, self)
  File "C:\Python27\lib\site-packages\scrapy-0.16.5-py2.7.egg\scrapy\utils\reactor.py", line 14, in listen_tcp
    return reactor.listenTCP(x, factory, interface=host)
  File "C:\Python27\lib\site-packages\twisted\internet\posixbase.py", line 489, in listenTCP
    p.startListening()
  File "C:\Python27\lib\site-packages\twisted\internet\tcp.py", line 980, in startListening
    raise CannotListenError(self.interface, self.port, le)
twisted.internet.error.CannotListenError: Couldn't listen on 0.0.0.0:6073: [Errno 10048] Only one usage of each socket address (protocol/network address/port) is normally permitted.
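The traceback suggests that the Telnet console asks listen_tcp for a port out of a range (self.portrange) and that the failure happens on the last port tried. As far as I can tell, the default range ends at 6073, which would mean every port in it was already taken by an earlier crawler's console. Below is a minimal sketch of that failure mode using only the standard library, written for Python 3 rather than the Python 2.7 in the traceback. listen_in_range is a hypothetical helper for illustration, not Scrapy's code.

```python
import socket


def listen_in_range(first_port, last_port, host="127.0.0.1"):
    """Return a listening socket on the first free port in the range.

    Hypothetical helper: it imitates a try-each-port loop with plain
    sockets and re-raises the error once the last port has failed too.
    """
    for port in range(first_port, last_port + 1):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
            sock.listen(1)
            return sock
        except OSError:
            sock.close()
            if port == last_port:
                raise  # every port in the range is already in use
    raise ValueError("empty port range")


# Let the OS pick a free port and keep it occupied.
first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
first.bind(("127.0.0.1", 0))
first.listen(1)
port = first.getsockname()[1]

# Treat that single occupied port as the whole "range".
try:
    listen_in_range(port, port)
    exhausted = False
except OSError:
    exhausted = True  # analogous to CannotListenError in the traceback
finally:
    first.close()
```

With one listener per crawler and a range of a few dozen ports, starting hundreds of crawlers in the same process would be expected to hit this condition.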
Solution
The simplest fix is to disable the Telnet console by adding the following to settings.py:
EXTENSIONS = { 'scrapy.telnet.TelnetConsole': None }
For the list of extensions that are enabled by default, see also http://doc.scrapy.org/en/latest/topics/settings.html#extensions.
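An alternative that should have the same effect is to switch the console off through its own setting instead of removing the extension. This is only a sketch: it assumes the installed Scrapy version honors TELNETCONSOLE_ENABLED, which I have not checked against 0.16.5 specifically.

```python
# settings.py (sketch)
# Assumption: the installed Scrapy version reads TELNETCONSOLE_ENABLED.
# With the console off, no crawler should try to open a Telnet port.
TELNETCONSOLE_ENABLED = False
```

Either approach should be enough on its own; there is no need to apply both.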