requests follow-up, part 1
Published: 2019-06-19

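Sending a POST request with data
import requests

# Placeholder values (not in the original); replace with your own target and credentials
url = 'http://example.com/'
user, pwd = 'user', 'password'

# Send a POST request with form data
data = {}
response = requests.post(url, data=data)

# Intranet sites that require authentication
auth = (user, pwd)
response = requests.get(url, auth=auth)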
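A `(user, pwd)` tuple passed as `auth=` is sent as HTTP Basic authentication, meaning an `Authorization` header that carries the Base64 form of `user:pwd`. The sketch below rebuilds that header value with the standard library only, to show what goes over the wire; the helper name `basic_auth_header` and the sample credentials are hypothetical.

```python
import base64


def basic_auth_header(user, pwd):
    """Build the value of an HTTP Basic 'Authorization' header (illustrative helper)."""
    raw = f'{user}:{pwd}'.encode('latin1')
    token = base64.b64encode(raw).decode('ascii')
    return f'Basic {token}'


header_value = basic_auth_header('user', 'pass')
print(header_value)

# Base64 is an encoding, not encryption: anyone can reverse it.
decoded = base64.b64decode(header_value.split(' ', 1)[1]).decode('latin1')
print(decoded)
```

Because the credentials are only encoded, not encrypted, this kind of authentication is best used over HTTPS or on a trusted internal network.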
Sending a request through a proxy
import requests

# 1. Request URL
url = 'http://www.baidu.com'
headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.67 Safari/537.36'
}
# Free proxy; the 'http' key applies to http:// URLs only
free_proxy = {
    'http': 'http://27.17.45.90:43411'
}
response = requests.get(url=url, headers=headers, proxies=free_proxy)
print(response.status_code)
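Sending a request when CA certificate verification fails
import requests

url = 'https://www.12306.cn/mormhweb/'
headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.67 Safari/537.36'
}
# HTTPS relies on a certificate signed by a third-party CA.
# 12306 is served over HTTPS, but its certificate is self-issued, not CA-signed.
# Workaround: tell requests to skip certificate verification
# (this switches off a security check, so use it only for sites you trust).
response = requests.get(url=url, headers=headers, verify=False)
data = response.content.decode()
with open('03-ssl.html', 'w', encoding='utf-8') as f:
    f.write(data)
# Without verify=False the request raises:
# requests.exceptions.SSLError: HTTPSConnectionPool(host=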
Sending a request with cookies (from a string)
import requests

# URL of the target data
member_url = 'https://www.yaozh.com/member/'
headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.67 Safari/537.36'
}
# The cookies as one string, copied from the browser
cookies = '_ga=GA1.2.1820447474.1535025127; MEIQIA_EXTRA_TRACK_ID=199Tty9OyANCXtHaSobJs67FU7J; WAF_SESSION_ID=7d88ae0fc48bffa022729657cf09807d; PHPSESSID=70kadg2ahpv7uuc8docd09iat4; _gid=GA1.2.133568065.1540383729; _gat=1; MEIQIA_VISIT_ID=1C1OdtdqpgpGeJ5A2lCKLMGiR4b; yaozh_logintime=1540383753; yaozh_user=381740%09xiaomaoera12; yaozh_userId=381740; db_w_auth=368675%09xiaomaoera12; UtzD_f52b_saltkey=ylH82082; UtzD_f52b_lastvisit=1540380154; UtzD_f52b_lastact=1540383754%09uc.php%09; UtzD_f52b_auth=f958AVKmmdzQ2CWwmr6GMrIS5oKlW%2BkP5dWz3SNLzr%2F1b6tOE6vzf7ssgZDjhuXa2JsO%2FIWtqd%2FZFelWpPHThohKQho; yaozh_uidhas=1; yaozh_mylogin=1540383756; MEIQIA_EXTRA_TRACK_ID=199Tty9OyANCXtHaSobJs67FU7J; WAF_SESSION_ID=7d88ae0fc48bffa022729657cf09807d; Hm_lvt_65968db3ac154c3089d7f9a4cbb98c94=1535025126%2C1535283389%2C1535283401%2C1539351081%2C1539512967%2C1540209934%2C1540383729; MEIQIA_VISIT_ID=1C1OdtdqpgpGeJ5A2lCKLMGiR4b; Hm_lpvt_65968db3ac154c3089d7f9a4cbb98c94=1540383761'

# requests expects a dict, so convert the string
cook_dict = {}
cookies_list = cookies.split('; ')
for cookie in cookies_list:
    # Split on the first '=' only, in case a value contains '='
    name, value = cookie.split('=', 1)
    cook_dict[name] = value

# The same conversion as a dict comprehension
cook_dict = {name: value for name, value in (cookie.split('=', 1) for cookie in cookies.split('; '))}

response = requests.get(member_url, headers=headers, cookies=cook_dict)
data = response.content.decode()
with open('05-cookie.html', 'w', encoding='utf-8') as f:
    f.write(data)
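A cookie value can itself contain `=`, so it is safest to split each `name=value` pair on the first `=` only. Below is a minimal sketch of the string-to-dict conversion as a reusable helper; the name `parse_cookie_string` and the sample cookie string are hypothetical, not part of the original post.

```python
def parse_cookie_string(cookie_string):
    """Convert a 'name=value; name2=value2' string into a dict (illustrative helper)."""
    cookies = {}
    for pair in cookie_string.split(';'):
        pair = pair.strip()
        if not pair:
            continue  # skip empty segments such as a trailing ';'
        # partition() splits on the first '=' only, so '=' inside a value survives
        name, _, value = pair.partition('=')
        cookies[name] = value
    return cookies


example = 'PHPSESSID=abc123; token=a=b=c; _gat=1;'
parsed = parse_cookie_string(example)
print(parsed)
```

The resulting dict can be passed directly as the `cookies=` argument of `requests.get`.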
Sending a POST request (the session carries cookies automatically)
import requests

# URL of the target data
member_url = 'https://www.yaozh.com/member/'
headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.67 Safari/537.36'
}
# A Session object stores cookies automatically (it keeps a cookie jar)
session = requests.Session()

# 1. Log in from code
login_url = 'https://www.yaozh.com/login'
login_form_data = {
    'username': 'aoa1',
    'pwd': 'l812',
    'formhash': '54AEE419',
    'backurl': 'https%3AF%2Fwww.yaozh.com%2F',
}
login_response = session.post(login_url, data=login_form_data, headers=headers)
print(login_response.content.decode())

# 2. After a successful login, request the target data with the now-valid cookies
data = session.get(member_url, headers=headers).content.decode()
with open('05-cookie2.html', 'w', encoding='utf-8') as f:
    f.write(data)
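One detail worth checking in a login form like this: a dict passed as `data=` is form-encoded before it is sent, so a value that is already percent-encoded gets encoded a second time. The sketch below illustrates this with the standard library's `urllib.parse.urlencode`, on the assumption that it mirrors how a dict body is encoded; the `backurl` values are illustrative only.

```python
from urllib.parse import urlencode

# A plain value is percent-encoded once
raw = {'backurl': 'https://www.yaozh.com/'}
# A value copied from the browser's network panel may already be encoded
pre_encoded = {'backurl': 'https%3A%2F%2Fwww.yaozh.com%2F'}

encoded_once = urlencode(raw)
encoded_twice = urlencode(pre_encoded)

print(encoded_once)
print(encoded_twice)  # every '%' has become '%25'
```

If a login keeps failing, comparing the body the browser sends with the body the script sends is a quick way to spot this kind of double encoding.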

Regular expressions

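import re

# Greedy mode (the default) matches as far as possible: 'm(.*)n'
# Non-greedy mode stops at the first possible match: 'm(.*?)n'
one = 'mdfsdsfffdsn12345656n'
two = r"a\d"  # raw string: the literal characters a, \, d
# In a normal string, 'a\b' is "a" followed by a backspace character and finds
# nothing in `two`; write regex patterns as raw strings and escape the backslash.
pattern = re.compile(r'a\\d')
# pattern = re.compile('m(.*?)n')
result = pattern.findall(two)
print(result)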
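To make the greedy and non-greedy difference concrete, here is a small sketch that runs both patterns against the same `one` string used above.

```python
import re

one = 'mdfsdsfffdsn12345656n'

# Greedy: .* runs to the LAST 'n' in the string
greedy = re.findall(r'm(.*)n', one)
# Non-greedy: .*? stops at the FIRST 'n' after 'm'
lazy = re.findall(r'm(.*?)n', one)

print(greedy)  # ['dfsdsfffdsn12345656']
print(lazy)    # ['dfsdsfffds']
```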
. matches any character except the newline \n
import re

# . matches any character except the newline \n
#   re.S makes . match \n as well
#   re.I ignores case
one = """
    msfdsdffdsdfsn
    1234567778888N
"""
pattern = re.compile('m(.*)n', re.S | re.I)
result = pattern.findall(one)
print(result)
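The effect of `re.S` is easiest to see by running the same pattern with and without it. A minimal sketch on a short two-line string (the sample text is made up for illustration):

```python
import re

text = 'mab\ncdN'

# Without re.S, '.' cannot cross the newline, so no 'm...n' span exists
without_dotall = re.findall(r'm(.*)n', text, re.I)
# With re.S, '.' also matches '\n'; re.I lets 'n' match the final 'N'
with_dotall = re.findall(r'm(.*)n', text, re.S | re.I)

print(without_dotall)  # []
print(with_dotall)     # ['ab\ncd']
```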

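Matching digits

import re

# Digits-only regex; \d matches a single digit 0-9
pattern = re.compile(r'^\d+$')
one = '234'
# How to test for a match:
# match() tries once, starting from the beginning of the string
result = pattern.match(one)
print(result.group())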
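Two details are easy to miss with this pattern: `match` returns `None` when nothing matches (so calling `.group()` would raise), and `$` also matches just before a trailing newline. A hedged sketch of a stricter check using `re.fullmatch`; the helper name `is_all_digits` is hypothetical.

```python
import re


def is_all_digits(text):
    """Return True only if the whole string consists of one or more digits."""
    return re.fullmatch(r'\d+', text) is not None


# '$' tolerates a trailing newline, so this still matches:
loose = re.match(r'^\d+$', '234\n') is not None

print(is_all_digits('234'))    # True
print(is_all_digits('234\n'))  # False
print(is_all_digits('12a'))    # False
print(loose)                   # True
```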

Range matching

import re

# Character classes: [123] matches 1, 2 or 3; [1-9] matches any digit from 1 to 9
one = '7893452'
pattern = re.compile('[1-9]')
result = pattern.findall(one)
print(result)
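A few more variations on character classes, as a small sketch: an explicit set, a range that excludes `0`, and a range with `+` to collect runs of characters. The sample strings other than `one` are made up for illustration.

```python
import re

one = '7893452'

# [123] matches only the listed characters
explicit_set = re.findall(r'[123]', one)
# [1-9] does not include 0
no_zero = re.findall(r'[1-9]', '10203')
# Adding + groups consecutive matches into runs
runs = re.findall(r'[1-9]+', '120034')

print(explicit_set)  # ['3', '2']
print(no_zero)       # ['1', '2', '3']
print(runs)          # ['12', '34']
```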

 

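import re

one = 'abc 123'
pattern = re.compile(r'\d+')
# match: tries once, from the start of the string
result = pattern.match(one)       # None, because the string starts with 'abc'
# search: tries once, from any position
result = pattern.search(one)      # match object for '123'
# findall: returns every match as a list
result = pattern.findall(one)     # ['123']
# sub: replaces matches in the string
result = pattern.sub('#', one)    # 'abc #'
# split: splits the string on matches
pattern = re.compile(' ')
result = pattern.split(one)       # ['abc', '123']
print(result)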

 

Reposted from: https://www.cnblogs.com/sunBinary/p/10624070.html