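I'm writing a rake task that would be called every minute (possibly every 30 seconds in the future) by Whenever, and it contacts a polling API endpoint (once per user in our database). Obviously, this is not efficient to run as a single thread, but is it possible to multithread it? If not, is there a good event-based HTTP library that would be able to get the job done?
Solution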
I’m writing a rake task that would be called every minute (possibly every 30 seconds in the future) by Whenever
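Be careful about Rails boot time: a rake task launched from cron loads the whole application on every run. You would do better with a long-running worker such as Resque (which forks) or Sidekiq (which uses threads). For Resque, https://github.com/bvandenbos/resque-scheduler should be able to do what you need. I can't speak for Sidekiq, but I'm sure it has something similar available (Sidekiq is much newer than Resque).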
Obviously, this is not efficient run as a single thread, but is it possible to multithread? If not, is there a good event-based HTTP library that would be able to get the job done?
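I suggest you look at ActiveRecord's find_each and find_in_batches for tips on making the finder step more efficient. find_in_batches hands you one batch of records at a time, and once you have a batch you can easily do something with threads like the following: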
    # find_in_batches yields arrays of records (find_each would yield one
    # record at a time). The default batch size is 1000; pass batch_size
    # to tune it up or down depending on your available RAM and on how
    # many threads you want running at once.
    User.find_in_batches(batch_size: 50) do |batch_of_users|
      # batch_of_users is an Array of at most batch_size users.
      #
      # Collect one new thread per user in the batch.
      batch_threads = batch_of_users.collect do |user|
        # Passing the user into the thread is a good habit for shared
        # variables; in this case it doesn't make much difference.
        Thread.new(user) do |u|
          # Do the API call here, using `u` (not `user`) to access
          # the user instance.
          #
          # We shouldn't need an evented HTTP library: Ruby threads
          # yield control while they wait on IO and resume when the
          # scheduler decides. HTTP and network IO are about the best
          # case for threads in Ruby.
        end
      end

      # Joining the threads means waiting for them all to finish
      # before moving on to the next batch.
      batch_threads.each(&:join)
    end
This starts at most batch_size threads at a time and waits for each batch to finish before starting the next one.
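To see the batching behaviour without a Rails app, here is a minimal sketch that assumes a hypothetical `fake_poll` method standing in for the real API call (it just sleeps, which releases control the same way blocking network IO does) and plain integers standing in for users. With six "users" in batches of three, the whole run should take roughly two sleeps rather than six.

```ruby
# Hypothetical stand-in for the per-user API call; sleep yields control
# to other threads much like waiting on a socket does.
def fake_poll(user_id)
  sleep 0.1
  user_id * 2
end

user_ids = (1..6).to_a
started = Process.clock_gettime(Process::CLOCK_MONOTONIC)

batch_results = []
user_ids.each_slice(3) do |batch|
  # One thread per item in the batch.
  threads = batch.map { |id| Thread.new(id) { |i| fake_poll(i) } }
  # Thread#value joins the thread and returns the block's result.
  batch_results.concat(threads.map(&:value))
end

elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
```

`Thread#value` is used here instead of `join` only because the sketch wants the return values; if the threads just write to the database, `join` is enough.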
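It would be possible to do something like this without batching, but then you would have an uncontrollable number of threads. There is an alternative you might benefit from, although it gets a lot more complicated: a ThreadPool and a shared list of work to do. I've posted it on GitHub so as not to spam Stack Overflow: https://gist.github.com/6767fbad1f0a66fa90ac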
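For readers who cannot open the gist, the general shape of a thread pool with a shared work list can be sketched as follows. This is not the code from the gist; it is a minimal illustration using Ruby's built-in `Queue`, and the method name `process_with_pool`, the `pool_size` parameter, and the fake work inside the block are all assumptions made for the example.

```ruby
# Runs the given block once per item on a fixed number of worker threads.
# Returns the block results in the same order as the input items.
def process_with_pool(items, pool_size: 4, &block)
  queue = Queue.new
  items.each_with_index { |item, index| queue << [item, index] }
  # After close, pop returns nil once the queue is empty, which ends
  # each worker's loop (Queue#close needs Ruby 2.3 or newer).
  queue.close

  # Preallocated so each worker writes to its own distinct slot.
  results = Array.new(items.size)

  workers = Array.new(pool_size) do
    Thread.new do
      while (job = queue.pop)
        item, index = job
        results[index] = block.call(item)
      end
    end
  end

  workers.each(&:join)
  results
end

# Usage: integers stand in for users, sleep stands in for the HTTP request.
user_ids = (1..10).to_a
responses = process_with_pool(user_ids, pool_size: 3) do |id|
  sleep 0.05
  "polled user #{id}"
end
```

Unlike the batch approach, a slow request here only holds up one worker; the others keep pulling work from the queue, and the number of live threads never exceeds `pool_size`.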