Requests：人性化的 HTTP

原文： https://pythonspot.com/requests-http-for-humans/

如果要从 Web 服务器请求数据，则在 Python 中执行此操作的传统方法是使用urllib库。尽管此库有效，但您可以轻松地构建比构建某些东西所需的复杂性。还有另一种方法吗？

Requests 是用 Python 编写的 Apache2 许可 HTTP 库。它由httplib和urllib3支持，但为您完成了所有艰苦的工作。

要安装类型：

git clone https://github.com/kennethreitz/requests.git
cd requests
sudo python setup.py install

现在已安装 Requests 库。我们将在下面列出一些示例：

使用 HTTP / HTTPS 请求获取原始 html

现在，我们可以查询网站：

import requests
r = requests.get('http://pythonspot.com/')
print r.content

保存并运行：

python website.py

它将输出原始 HTML 代码。

使用 Python 下载二进制映像

from PIL import Image
from StringIO import StringIO
import requests

r = requests.get('http://1.bp.blogspot.com/_r-MQun1PKUg/SlnHnaLcw6I/AAAAAAAAA_U$
i = Image.open(StringIO(r.content))
i.show()

使用 python 检索的图像

网站状态代码（网站是否在线？）

import requests
r = requests.get('http://pythonspot.com/')
print r.status_code

这将返回 200（确定）。可以在此处找到状态代码列表：https://en.wikipedia.org/wiki/List_of_HTTP_status_codes

从 Web 服务器检索 JSON

您可以轻松地从 Web 服务器获取 JSON 对象。

import requests

import requests
r = requests.get('https://api.github.com/events')
print r.json()

使用 Python 的 HTTP 发布请求

from StringIO import StringIO
import requests

payload = {'key1': 'value1', 'key2': 'value2'}
r = requests.post("http://httpbin.org/post", data=payload)
print(r.text)

SSL 验证，使用 Python 验证证书

from StringIO import StringIO
import requests
print requests.get('https://github.com', verify=True)

从 HTTP 响应标头中提取数据

对于发送到 HTTP 服务器的每个请求，服务器都会向您发送一些其他数据。您可以使用以下命令从 HTTP 响应中提取数据：

#!/usr/bin/env python
import requests
r = requests.get('http://pythonspot.com/')
print r.headers

这将以 JSON 格式返回数据。我们可以将以 JSON 格式编码的数据解析为 Python 字典。

#!/usr/bin/env python
import requests
import json
r = requests.get('http://pythonspot.com/')

jsondata = str(r.headers).replace('\'','"')
headerObj = json.loads(jsondata)
print headerObj['server']
print headerObj['content-length']
print headerObj['content-encoding']
print headerObj['content-type']
print headerObj['date']
print headerObj['x-powered-by']

从 HTML 响应中提取数据

从服务器获取数据后，您可以使用 python 字符串函数或使用库来对其进行解析。经常使用BeautifulSoup。获取页面标题和链接的示例代码：

from bs4 import BeautifulSoup
import requests

# get html data
r = requests.get('http://stackoverflow.com/')
html_doc = r.content

# create a beautifulsoup object
soup = BeautifulSoup(html_doc)

# get title
print soup.title

# print all links
for link in soup.find_all('a'):
    print(link.get('href'))

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Files

73.md

73.md

Requests：人性化的 HTTP

使用 HTTP / HTTPS 请求获取原始 html

网站状态代码（网站是否在线？）

从 Web 服务器检索 JSON

使用 Python 的 HTTP 发布请求

从 HTTP 响应标头中提取数据

从 HTML 响应中提取数据

Collapse file tree

Files

73.md

Latest commit

History

73.md

File metadata and controls

Requests：人性化的 HTTP

使用 HTTP / HTTPS 请求获取原始 html

网站状态代码（网站是否在线？）

从 Web 服务器检索 JSON

使用 Python 的 HTTP 发布请求

从 HTTP 响应标头中提取数据

从 HTML 响应中提取数据