Example: requesting the Zhihu topic plaza
1. Observe the data
2. Analyze
- Async request URL: https://www.zhihu.com/node/TopicsPlazzaListV2
- Form data:
method: next
params: {"topic_id": xxx, "offset": 0, "hash_id": ""}
Build the request
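import json
import requests

# Endpoint taken from the analysis above. The headers and proxies are placeholders; adjust them to your setup.
SUB_TOPIC_HTTP = 'https://www.zhihu.com/node/TopicsPlazzaListV2'
REQUEST_HEADERS = {'User-Agent': 'Mozilla/5.0'}
proxies = None

def get_sub_topics(url, topic_id, offset):
    """Fetch the sub-topics under a topic.

    topic_id: id of the parent topic
    offset: number of items fetched by pull-to-refresh; each pull increases offset by 20
    (url is not used by this snippet)
    """
    # Serialize params to the same compact JSON string seen in the form data.
    params = json.dumps({"topic_id": topic_id, "offset": offset, "hash_id": ""}, separators=(',', ':'))
    data = {"method": "next", "params": params}
    print(data['params'])
    req = requests.post(SUB_TOPIC_HTTP, data=data, proxies=proxies, headers=REQUEST_HEADERS)
    json_data = req.json()
    sub_topics = json_data['msg']  # list of HTML fragments, one per sub-topic
    print(sub_topics[0])
    print(sub_topics[2])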
Result
https://www.zhihu.com/topics#美食 304
{"topic_id":304,"offset":20,"hash_id":""}
<div class="item"><div class="blk">
<a target="_blank" href="/topic/19579555">
<img src="https://pic3.zhimg.com/4ab91208a_xs.jpg" alt="川菜">
<strong>川菜</strong>
</a>
<p>川菜作为中国汉族四大菜系之一,取材广泛,调味多变,菜式多样,口…</p>
<a id="t::-9766" href="javascript:;" class="follow Meta-item zg-follow"><i class="z-icon-follow"></i>关注</a>
</div></div>
<div class="item"><div class="blk">
<a target="_blank" href="/topic/19552371">
<img src="https://pic1.zhimg.com/183ecd6c8_xs.jpg" alt="巧克力">
<strong>巧克力</strong>
</a>
<p>巧克力(Chocolate)是以可可做为主料的一种混合型食品,…</p>
<a id="t::-721" href="javascript:;" class="follow Meta-item zg-follow"><i class="z-icon-follow"></i>关注</a>
</div></div>
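
Each entry in `msg` is an HTML fragment, so the topic id and name still have to be pulled out of it, and further pages require repeating the request with a larger offset. The following is a minimal sketch of both steps, not a definitive implementation. The names `parse_topic`, `iter_sub_topics`, and `fetch_page` are hypothetical; `fetch_page(topic_id, offset)` stands for any callable that returns the `msg` list for one request. It also assumes that an empty list means there are no more sub-topics, which the output above does not confirm.

```python
import re
from typing import Callable, Iterator, List, Tuple

# Matches the fragment layout shown in the output above:
# <a ... href="/topic/<id>"> ... <strong><name></strong>
TOPIC_RE = re.compile(r'href="/topic/(\d+)">.*?<strong>(.*?)</strong>', re.S)

# One fragment copied from the output above, used as sample input.
SAMPLE = (
    '<div class="item"><div class="blk">\n'
    '<a target="_blank" href="/topic/19579555">\n'
    '<img src="https://pic3.zhimg.com/4ab91208a_xs.jpg" alt="川菜">\n'
    '<strong>川菜</strong>\n'
    '</a>\n'
    '</div></div>'
)


def parse_topic(fragment: str) -> Tuple[str, str]:
    """Return (topic_id, name) extracted from one HTML fragment."""
    match = TOPIC_RE.search(fragment)
    if match is None:
        raise ValueError("fragment does not match the expected layout")
    return match.group(1), match.group(2)


def iter_sub_topics(
    fetch_page: Callable[[int, int], List[str]],
    topic_id: int,
    step: int = 20,
) -> Iterator[Tuple[str, str]]:
    """Yield (topic_id, name) pairs page by page.

    fetch_page is a hypothetical callable returning the 'msg' list for
    a given topic_id and offset. The loop stops on the first empty page,
    which is an assumption about how the endpoint signals the end.
    """
    offset = 0
    while True:
        fragments = fetch_page(topic_id, offset)
        if not fragments:
            break
        for fragment in fragments:
            yield parse_topic(fragment)
        offset += step  # one pull-to-refresh adds 20 to the offset


print(parse_topic(SAMPLE))
```

Passing the fetcher in as an argument keeps the parsing and paging logic testable without network access; a regular expression is enough here only because the fragments are small and uniform, and a real HTML parser would be the safer choice if the layout varies.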