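# -*- coding: utf-8 -*-
"""
Lagou responds with the anti-scraping message "您操作太频繁,请稍后再访问"
("You are operating too frequently, please try again later"), and adding
headers alone does not help. You must log in, and the login URL must be
chosen correctly; otherwise the generated cookie is invalid.
"""
import requests

# Ajax endpoint that returns the job data (the city parameter is URL-encoded "北京")
url = 'https://www.lagou.com/jobs/positionAjax.json?city=%E5%8C%97%E4%BA%AC&needAddtionalResult=false'
# Listing page, requested first so that the session receives valid cookies
urls = 'https://www.lagou.com/jobs/list_python'

# Form data sent with the POST request
data = {
    'first': 'true',
    'pn': '1',       # page number
    'kd': 'python',  # job keyword
}

# Request headers
headers = {
    'Host': 'www.lagou.com',
    'Origin': 'https://www.lagou.com',
    'Referer': 'https://www.lagou.com/jobs/list_python',
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; ) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/82.0.4068.4 Safari/537.36',
}

# Have the user "log in" first, then keep that state with a session
session = requests.Session()
session.get(url=urls, headers=headers)

# Request the data with the user's cookies; the session sends them automatically,
# so passing cookies=session.cookies explicitly is not needed
response = session.post(url=url, headers=headers, data=data).json()
result = response['content']['positionResult']['result']

# Print each company name and append it to lagou.csv (the file is opened once)
with open('lagou.csv', 'a', encoding='utf-8') as f:
    for i in result:
        companyName = i['companyFullName']
        print(companyName)
        f.write('{}\n'.format(companyName))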
Scraping Lagou job listings with Python (login required, anti-scraping in place), using a session to keep the login state
Views: 2855 Posted: 2020-03-24 15:36:02
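The script above indexes straight into `response['content']['positionResult']['result']`, which raises a `KeyError` whenever Lagou returns the "too frequent" notice instead of data. The sketch below shows one possible way to make that step tolerant of a blocked response. The helper names `extract_company_names` and `write_names` are hypothetical, and the shape of the blocked payload (a dict with a `msg` key and no `content` key) is an assumption for illustration, not something confirmed by the original post.

```python
import csv
import io


def extract_company_names(payload):
    """Return the companyFullName values from a positionAjax-style payload.

    Returns an empty list when the expected nesting is missing, for example
    when the site answers with an anti-scraping notice instead of job data.
    """
    if not isinstance(payload, dict):
        return []
    content = payload.get('content')
    if not isinstance(content, dict):
        return []
    position_result = content.get('positionResult')
    if not isinstance(position_result, dict):
        return []
    rows = position_result.get('result') or []
    return [
        row['companyFullName']
        for row in rows
        if isinstance(row, dict) and 'companyFullName' in row
    ]


def write_names(names, fileobj):
    """Write one company name per row using the csv module."""
    writer = csv.writer(fileobj)
    for name in names:
        writer.writerow([name])


# Hypothetical payloads, used only to illustrate the two cases.
sample_ok = {
    'content': {
        'positionResult': {
            'result': [
                {'companyFullName': 'A Co'},
                {'companyFullName': 'B Co'},
            ]
        }
    }
}
sample_blocked = {'msg': '您操作太频繁,请稍后再访问'}  # assumed shape

names_ok = extract_company_names(sample_ok)
names_blocked = extract_company_names(sample_blocked)

buffer = io.StringIO()
write_names(names_ok, buffer)
```

In the real script, `extract_company_names(response)` could replace the direct indexing, and an empty list would be a hint that the cookies were rejected and the listing page should be requested again.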