urllib3 Documentation

ConnectionPools

A connection pool is a container for a collection of connections to a specific host.

If you need to make requests to the same host repeatedly, then you should use an HTTPConnectionPool.

>>> from urllib3 import HTTPConnectionPool
>>> pool = HTTPConnectionPool('ajax.googleapis.com', maxsize=1)
>>> r = pool.request('GET', '/ajax/services/search/web',
...                  fields={'q': 'urllib3', 'v': '1.0'})
>>> r.status
200
>>> r.headers['content-type']
'text/javascript; charset=utf-8'
>>> 'data: ' + r.data # Content of the response
'data: ...'
>>> r = pool.request('GET', '/ajax/services/search/web',
...                  fields={'q': 'python', 'v': '1.0'})
>>> 'data: ' + r.data # Content of the response
'data: ...'
>>> pool.num_connections
1
>>> pool.num_requests
2

By default, the pool will cache just one connection. If you’re planning on using such a pool in a multithreaded environment, you should set the maxsize of the pool to a higher number, such as the number of threads. You can also control many other variables like timeout, blocking, and default headers.
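
If the pool will be shared across threads, a minimal sketch might look like the following (the host and thread count here are assumptions):

import threading
from urllib3 import HTTPConnectionPool

# One cached connection per worker thread; block=True caps concurrent
# connections at maxsize instead of creating extras.
pool = HTTPConnectionPool('example.com', maxsize=4, block=True)

def worker():
    r = pool.request('GET', '/')
    print(r.status)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()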

A ConnectionPool can be used as a context manager to automatically clear the pool after usage.

>>> from urllib3 import HTTPConnectionPool
>>> with HTTPConnectionPool('ajax.googleapis.com', maxsize=1) as pool:
...     r = pool.request('GET', '/ajax/services/search/web',
...                      fields={'q': 'urllib3', 'v': '1.0'})
...     print(pool.pool)
...
<queue.LifoQueue object at 0x7f67367dfcf8>
>>> print(pool.pool)
None

Helpers

There are various helper functions provided for instantiating these ConnectionPools more easily:

urllib3.connectionpool.connection_from_url(url, **kw)

Given a url, return a ConnectionPool instance of its host.

This is a shortcut for not having to parse out the scheme, host, and port of the url before creating a ConnectionPool instance.

Parameters:
  • url – Absolute URL string that must include the scheme. Port is optional.
  • **kw – Passes additional parameters to the constructor of the appropriate ConnectionPool. Useful for specifying things like timeout, maxsize, headers, etc.

Example:

>>> conn = connection_from_url('http://google.com/')
>>> r = conn.request('GET', '/')

API

urllib3.connectionpool comes with two connection pools:

class urllib3.connectionpool.HTTPConnectionPool(host, port=None, strict=False, timeout=Timeout.DEFAULT_TIMEOUT, maxsize=1, block=False, headers=None, retries=None, _proxy=None, _proxy_headers=None, **conn_kw)

Thread-safe connection pool for one host.

Parameters:
  • host – Host used for this HTTP Connection (e.g. “localhost”), passed into httplib.HTTPConnection.
  • port – Port used for this HTTP Connection (None is equivalent to 80), passed into httplib.HTTPConnection.
  • strict

    Causes BadStatusLine to be raised if the status line can’t be parsed as a valid HTTP/1.0 or 1.1 status line, passed into httplib.HTTPConnection.

    Note

    Only works in Python 2. This parameter is ignored in Python 3.

  • timeout – Socket timeout in seconds for each individual connection. This can be a float or integer, which sets the timeout for the HTTP request, or an instance of urllib3.util.Timeout which gives you more fine-grained control over request timeouts. Once the constructor has run, this is always a urllib3.util.Timeout object.
  • maxsize – Number of connections to save that can be reused. More than 1 is useful in multithreaded situations. If block is set to false, more connections will be created but they will not be saved once they’ve been used.
  • block – If set to True, no more than maxsize connections will be used at a time. When no free connections are available, the call will block until a connection has been released. This is a useful side effect for particular multithreaded situations where one does not want to use more than maxsize connections per host to prevent flooding.
  • headers – Headers to include with all requests, unless other headers are given explicitly.
  • retries – Retry configuration to use by default with requests in this pool.
  • _proxy – Parsed proxy URL, should not be used directly, instead, see urllib3.connectionpool.ProxyManager
  • _proxy_headers – A dictionary with proxy headers, should not be used directly, instead, see urllib3.connectionpool.ProxyManager
  • **conn_kw – Additional parameters are used to create fresh urllib3.connection.HTTPConnection, urllib3.connection.HTTPSConnection instances.
ConnectionCls

alias of HTTPConnection

QueueCls

alias of LifoQueue

close()

Close all pooled connections and disable the pool.

is_same_host(url)

Check if the given url is a member of the same host as this connection pool.

request(method, url, fields=None, headers=None, **urlopen_kw)

Make a request using urlopen() with the appropriate encoding of fields based on the method used.

This is a convenience method that requires the least amount of manual effort. It can be used in most situations, while still having the option to drop down to more specific methods when necessary, such as request_encode_url(), request_encode_body(), or even the lowest level urlopen().

request_encode_body(method, url, fields=None, headers=None, encode_multipart=True, multipart_boundary=None, **urlopen_kw)

Make a request using urlopen() with the fields encoded in the body. This is useful for request methods like POST, PUT, PATCH, etc.

When encode_multipart=True (default), then urllib3.filepost.encode_multipart_formdata() is used to encode the payload with the appropriate content type. Otherwise urllib.urlencode() is used with the ‘application/x-www-form-urlencoded’ content type.

Multipart encoding must be used when posting files, and it’s reasonably safe to use it at other times too. However, it may break request signing, such as with OAuth.

Supports an optional fields parameter of key/value strings AND key/filetuple. A filetuple is a (filename, data, MIME type) tuple where the MIME type is optional. For example:

fields = {
    'foo': 'bar',
    'fakefile': ('foofile.txt', 'contents of foofile'),
    'realfile': ('barfile.txt', open('realfile').read()),
    'typedfile': ('bazfile.bin', open('bazfile').read(),
                  'image/jpeg'),
    'nonamefile': 'contents of nonamefile field',
}

When uploading a file, providing a filename (the first parameter of the tuple) is optional but recommended to best mimic the behavior of browsers.

Note that if headers are supplied, the ‘Content-Type’ header will be overwritten because it depends on the dynamic random boundary string which is used to compose the body of the request. The random boundary string can be explicitly set with the multipart_boundary parameter.
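
For example, a multipart POST through this method might look like the following sketch (the host, path, and field names are assumptions):

from urllib3 import HTTPConnectionPool

pool = HTTPConnectionPool('example.com')  # hypothetical host
r = pool.request_encode_body(
    'POST', '/upload',
    fields={'upload': ('report.txt', 'contents of report', 'text/plain')},
)
print(r.status)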

request_encode_url(method, url, fields=None, **urlopen_kw)

Make a request using urlopen() with the fields encoded in the url. This is useful for request methods like GET, HEAD, DELETE, etc.

urlopen(method, url, body=None, headers=None, retries=None, redirect=True, assert_same_host=True, timeout=_Default, pool_timeout=None, release_conn=None, **response_kw)

Get a connection from the pool and perform an HTTP request. This is the lowest level call for making a request, so you’ll need to specify all the raw details.

Note

More commonly, it’s appropriate to use a convenience method provided by RequestMethods, such as request().

Note

release_conn will only behave as expected if preload_content=False because we want to make preload_content=False the default behaviour someday soon without breaking backwards compatibility.

Parameters:
  • method – HTTP request method (such as GET, POST, PUT, etc.)
  • body – Data to send in the request body (useful for creating POST requests; see request_encode_body() for a more convenient interface).
  • headers – Dictionary of custom headers to send, such as User-Agent, If-None-Match, etc. If None, pool headers are used. If provided, these headers completely replace any pool-specific headers.
  • retries (Retry, False, or an int.) –

    Configure the number of retries to allow before raising a MaxRetryError exception.

    Pass None to retry until you receive a response. Pass a Retry object for fine-grained control over different types of retries. Pass an integer number to retry connection errors that many times, but no other types of errors. Pass zero to never retry.

    If False, then retries are disabled and any exception is raised immediately. Also, instead of raising a MaxRetryError on redirects, the redirect response will be returned.

  • redirect – If True, automatically handle redirects (status codes 301, 302, 303, 307, 308). Each redirect counts as a retry. Disabling retries will disable redirect, too.
  • assert_same_host – If True, will make sure that the host of each request matches the pool’s host, raising HostChangedError otherwise. When False, you can use the pool on an HTTP proxy and request foreign hosts.
  • timeout – If specified, overrides the default timeout for this one request. It may be a float (in seconds) or an instance of urllib3.util.Timeout.
  • pool_timeout – If set and the pool is set to block=True, then this method will block for pool_timeout seconds and raise EmptyPoolError if no connection is available within the time period.
  • release_conn – If False, then the urlopen call will not release the connection back into the pool once a response is received (but will release if you read the entire contents of the response such as when preload_content=True). This is useful if you’re not preloading the response’s content immediately. You will need to call r.release_conn() on the response r to return the connection back into the pool. If None, it takes the value of response_kw.get('preload_content', True).
  • **response_kw – Additional parameters are passed to urllib3.response.HTTPResponse.from_httplib()
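
As a sketch of the release_conn behaviour described above (the host is an assumption), a streaming request might look like:

from urllib3 import HTTPConnectionPool

pool = HTTPConnectionPool('example.com')  # hypothetical host
r = pool.urlopen('GET', '/', preload_content=False)
try:
    for chunk in r.stream(1024):
        pass  # consume the body incrementally
finally:
    r.release_conn()  # hand the connection back to the pool
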
class urllib3.connectionpool.HTTPSConnectionPool(host, port=None, strict=False, timeout=Timeout.DEFAULT_TIMEOUT, maxsize=1, block=False, headers=None, retries=None, _proxy=None, _proxy_headers=None, key_file=None, cert_file=None, cert_reqs=None, ca_certs=None, ssl_version=None, assert_hostname=None, assert_fingerprint=None, **conn_kw)

Same as HTTPConnectionPool, but HTTPS.

When Python is compiled with the ssl module, then VerifiedHTTPSConnection is used, which can verify certificates, instead of HTTPSConnection.

VerifiedHTTPSConnection uses one of assert_fingerprint, assert_hostname and host in this order to verify connections. If assert_hostname is False, no verification is done.

The key_file, cert_file, cert_reqs, ca_certs and ssl_version are only used if ssl is available and are fed into urllib3.util.ssl_wrap_socket() to upgrade the connection socket into an SSL socket.
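
A verified pool might be constructed like this sketch, assuming the certifi package is installed and the host is a placeholder:

import certifi
from urllib3 import HTTPSConnectionPool

pool = HTTPSConnectionPool(
    'example.com', 443,             # hypothetical host
    cert_reqs='CERT_REQUIRED',      # verify the server certificate
    ca_certs=certifi.where(),       # against the certifi bundle
)
r = pool.request('GET', '/')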

All of these pools inherit from a common base class:

class urllib3.connectionpool.ConnectionPool(host, port=None)

Base class for all connection pools, such as HTTPConnectionPool and HTTPSConnectionPool.

PoolManager

A pool manager is an abstraction for a collection of ConnectionPools.

If you need to make requests to multiple hosts, then you can use a PoolManager, which takes care of maintaining your pools so you don’t have to.

>>> from urllib3 import PoolManager
>>> manager = PoolManager(10)
>>> r = manager.request('GET', 'http://example.com')
>>> r.headers['server']
'ECS (iad/182A)'
>>> r = manager.request('GET', 'http://httpbin.org/')
>>> r.headers['server']
'gunicorn/18.0'
>>> r = manager.request('POST', 'http://httpbin.org/headers')
>>> r = manager.request('HEAD', 'http://httpbin.org/cookies')
>>> len(manager.pools)
2
>>> conn = manager.connection_from_host('httpbin.org')
>>> conn.num_requests
3

The API of a PoolManager object is similar to that of a ConnectionPool, so they can be passed around interchangeably.

The PoolManager uses a Least Recently Used (LRU) policy for discarding old pools. That is, if you set the PoolManager num_pools to 10, then after making requests to 11 or more different hosts, the least recently used pools will be cleaned up eventually.

Cleanup of stale pools does not happen immediately but can be forced when used as a context manager.

>>> from urllib3 import PoolManager
>>> with PoolManager(10) as manager:
...     r = manager.request('GET', 'http://example.com')
...     r = manager.request('GET', 'http://httpbin.org/')
...     len(manager.pools)
...
2
>>> len(manager.pools)
0

You can read more about the implementation and the various adjustable variables within RecentlyUsedContainer.

API

class urllib3.poolmanager.PoolManager(num_pools=10, headers=None, **connection_pool_kw)

Allows for arbitrary requests while transparently keeping track of necessary connection pools for you.

Parameters:
  • num_pools – Number of connection pools to cache before discarding the least recently used pool.
  • headers – Headers to include with all requests, unless other headers are given explicitly.
  • **connection_pool_kw – Additional parameters are used to create fresh urllib3.connectionpool.ConnectionPool instances.

Example:

>>> manager = PoolManager(num_pools=2)
>>> r = manager.request('GET', 'http://google.com/')
>>> r = manager.request('GET', 'http://google.com/mail')
>>> r = manager.request('GET', 'http://yahoo.com/')
>>> len(manager.pools)
2
clear()

Empty our store of pools and direct them all to close.

This will not affect in-flight connections, but they will not be re-used after completion.

connection_from_host(host, port=None, scheme='http')

Get a ConnectionPool based on the host, port, and scheme.

If port isn’t given, it will be derived from the scheme using urllib3.connectionpool.port_by_scheme.
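
For example (the host is an assumption):

>>> from urllib3 import PoolManager
>>> manager = PoolManager()
>>> pool = manager.connection_from_host('example.com', scheme='https')
>>> pool.port
443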

connection_from_url(url)

Similar to urllib3.connectionpool.connection_from_url() but doesn’t pass any additional parameters to the urllib3.connectionpool.ConnectionPool constructor.

Additional parameters are taken from the PoolManager constructor.

request(method, url, fields=None, headers=None, **urlopen_kw)

Make a request using urlopen() with the appropriate encoding of fields based on the method used.

This is a convenience method that requires the least amount of manual effort. It can be used in most situations, while still having the option to drop down to more specific methods when necessary, such as request_encode_url(), request_encode_body(), or even the lowest level urlopen().

request_encode_body(method, url, fields=None, headers=None, encode_multipart=True, multipart_boundary=None, **urlopen_kw)

Make a request using urlopen() with the fields encoded in the body. This is useful for request methods like POST, PUT, PATCH, etc.

When encode_multipart=True (default), then urllib3.filepost.encode_multipart_formdata() is used to encode the payload with the appropriate content type. Otherwise urllib.urlencode() is used with the ‘application/x-www-form-urlencoded’ content type.

Multipart encoding must be used when posting files, and it’s reasonably safe to use it at other times too. However, it may break request signing, such as with OAuth.

Supports an optional fields parameter of key/value strings AND key/filetuple. A filetuple is a (filename, data, MIME type) tuple where the MIME type is optional. For example:

fields = {
    'foo': 'bar',
    'fakefile': ('foofile.txt', 'contents of foofile'),
    'realfile': ('barfile.txt', open('realfile').read()),
    'typedfile': ('bazfile.bin', open('bazfile').read(),
                  'image/jpeg'),
    'nonamefile': 'contents of nonamefile field',
}

When uploading a file, providing a filename (the first parameter of the tuple) is optional but recommended to best mimic the behavior of browsers.

Note that if headers are supplied, the ‘Content-Type’ header will be overwritten because it depends on the dynamic random boundary string which is used to compose the body of the request. The random boundary string can be explicitly set with the multipart_boundary parameter.

request_encode_url(method, url, fields=None, **urlopen_kw)

Make a request using urlopen() with the fields encoded in the url. This is useful for request methods like GET, HEAD, DELETE, etc.

urlopen(method, url, redirect=True, **kw)

Same as urllib3.connectionpool.HTTPConnectionPool.urlopen() with custom cross-host redirect logic and only sends the request-uri portion of the url.

The given url parameter must be absolute, such that an appropriate urllib3.connectionpool.ConnectionPool can be chosen for it.

ProxyManager

ProxyManager is an HTTP proxy-aware subclass of PoolManager. It produces a single HTTPConnectionPool instance for all HTTP connections and individual per-server:port HTTPSConnectionPool instances for tunnelled HTTPS connections.

API

class urllib3.poolmanager.ProxyManager(proxy_url, num_pools=10, headers=None, proxy_headers=None, **connection_pool_kw)

Behaves just like PoolManager, but sends all requests through the defined proxy, using the CONNECT method for HTTPS URLs.

Parameters:
  • proxy_url – The URL of the proxy to be used.
  • proxy_headers – A dictionary containing headers that will be sent to the proxy. For HTTP they are sent with each request, while for HTTPS/CONNECT they are sent only once. Can be used for proxy authentication.
Example:
>>> proxy = urllib3.ProxyManager('http://localhost:3128/')
>>> r1 = proxy.request('GET', 'http://google.com/')
>>> r2 = proxy.request('GET', 'http://httpbin.org/')
>>> len(proxy.pools)
1
>>> r3 = proxy.request('GET', 'https://httpbin.org/')
>>> r4 = proxy.request('GET', 'https://twitter.com/')
>>> len(proxy.pools)
3

Security: Verified HTTPS with SSL/TLS

Very important fact: By default, urllib3 does not verify HTTPS requests.

The historic reason for this is that we rely on httplib for some of the HTTP protocol implementation, and httplib does not verify requests out of the box. This is not a good reason, but here we are.

Luckily, it’s not too hard to enable verified HTTPS requests and there are a few ways to do it.

Python with SSL enabled

First we need to make sure your Python installation has SSL enabled. The easiest way to check is to open a Python shell and type import ssl:

>>> import ssl
Traceback (most recent call last):
  ...
ImportError: No module named _ssl

If you got an ImportError, then your Python is not compiled with SSL support and you’ll need to re-install it. Read this StackOverflow thread for details.

Otherwise, if ssl imported cleanly, then we’re ready to set up our certificates: Using Certifi with urllib3.

Enabling SSL on Google AppEngine

If you’re using Google App Engine, you’ll need to add ssl as a library dependency to your yaml file, like this:

libraries:
- name: ssl
  version: latest

If it’s still not working, you may need to enable billing on your account to enable using sockets.

Using Certifi with urllib3

Certifi is a package which ships with Mozilla’s root certificates for easy programmatic access.

  1. Install the Python certifi package:

    $ pip install certifi
    
  2. Setup your pool to require a certificate and provide the certifi bundle:

    import urllib3
    import certifi
    
    http = urllib3.PoolManager(
        cert_reqs='CERT_REQUIRED', # Force certificate check.
        ca_certs=certifi.where(),  # Path to the Certifi bundle.
    )
    
    # You're ready to make verified HTTPS requests.
    try:
        r = http.request('GET', 'https://example.com/')
    except urllib3.exceptions.SSLError as e:
        # Handle incorrect certificate error.
        ...
    

Make sure to update your certifi package regularly to get the latest root certificates.

Using your system’s root certificates

Your system’s root certificates may be more up-to-date than maintaining your own, but the trick is finding where they live. Different operating systems have them in different places.

For example, on most Linux distributions they’re at /etc/ssl/certs/ca-certificates.crt. On Windows and OS X? It’s not so simple.

Once you find your root certificate file:

import urllib3

ca_certs = "/etc/ssl/certs/ca-certificates.crt"  # Or wherever it lives.

http = urllib3.PoolManager(
    cert_reqs='CERT_REQUIRED', # Force certificate check.
    ca_certs=ca_certs,         # Path to your certificate bundle.
)

# You're ready to make verified HTTPS requests.
try:
    r = http.request('GET', 'https://example.com/')
except urllib3.exceptions.SSLError as e:
    # Handle incorrect certificate error.
    ...

OpenSSL / PyOpenSSL

By default, we use the standard library’s ssl module. Unfortunately, there are several limitations which are addressed by PyOpenSSL:

  • (Python 2.x) SNI support.
  • (Python 2.x-3.2) Disabling compression to mitigate CRIME attack.

To use the Python OpenSSL bindings instead, you’ll need to install the required packages:

$ pip install pyopenssl ndg-httpsclient pyasn1

Once the packages are installed, you can tell urllib3 to switch the ssl backend to PyOpenSSL with inject_into_urllib3():

import urllib3.contrib.pyopenssl
urllib3.contrib.pyopenssl.inject_into_urllib3()

Now you can continue using urllib3 as you normally would.

For more details, check the pyopenssl module.

InsecureRequestWarning

New in version 1.9.

Unverified HTTPS requests will trigger a warning via Python’s warnings module:

urllib3/connectionpool.py:736: InsecureRequestWarning: Unverified HTTPS
request is being made. Adding certificate verification is strongly advised.
See: https://urllib3.readthedocs.org/en/latest/security.html

This would be a great time to enable HTTPS verification: Using Certifi with urllib3.

If you know what you’re doing and would like to disable this and other warnings, you can use disable_warnings():

import urllib3
urllib3.disable_warnings()

Making unverified HTTPS requests is strongly discouraged. ˙ ͜ʟ˙

Alternatively, if you are using Python’s logging module, you can capture the warnings to your own log:

import logging
logging.captureWarnings(True)

Capturing the warnings to your own log is much preferred over simply disabling the warnings.

InsecurePlatformWarning

New in version 1.11.

Certain Python platforms (specifically, versions of Python earlier than 2.7.9) have restrictions in their ssl module that limit the configuration that urllib3 can apply. In particular, this can cause HTTPS requests that would succeed on more featureful platforms to fail, and can cause certain security features to be unavailable.

If you encounter this warning, it is strongly recommended you upgrade to a newer Python version, or that you use pyOpenSSL as described in the OpenSSL / PyOpenSSL section.

If you know what you are doing and would like to disable this and other warnings, please consult the InsecureRequestWarning section for instructions on how to handle the warnings.

Helpers

Useful methods for working with httplib, completely decoupled from code specific to urllib3.

Timeouts

class urllib3.util.timeout.Timeout(total=None, connect=_Default, read=_Default)

Timeout configuration.

Timeouts can be defined as a default for a pool:

timeout = Timeout(connect=2.0, read=7.0)
http = PoolManager(timeout=timeout)
response = http.request('GET', 'http://example.com/')

Or per-request (which overrides the default for the pool):

response = http.request('GET', 'http://example.com/', timeout=Timeout(10))

Timeouts can be disabled by setting all the parameters to None:

no_timeout = Timeout(connect=None, read=None)
response = http.request('GET', 'http://example.com/', timeout=no_timeout)

Parameters:
  • total (integer, float, or None) –

    This combines the connect and read timeouts into one; the read timeout will be set to the time leftover from the connect attempt. In the event that both a connect timeout and a total are specified, or a read timeout and a total are specified, the shorter timeout will be applied.

    Defaults to None.

  • connect (integer, float, or None) – The maximum amount of time to wait for a connection attempt to a server to succeed. Omitting the parameter will default the connect timeout to the system default, probably the global default timeout in socket.py. None will set an infinite timeout for connection attempts.
  • read (integer, float, or None) –

    The maximum amount of time to wait between consecutive read operations for a response from the server. Omitting the parameter will default the read timeout to the system default, probably the global default timeout in socket.py. None will set an infinite timeout.

Note

Many factors can affect the total amount of time for urllib3 to return an HTTP response.

For example, Python’s DNS resolver does not obey the timeout specified on the socket. Other factors that can affect total request time include high CPU load, high swap, the program running at a low priority level, or other behaviors.

In addition, the read and total timeouts only measure the time between read operations on the socket connecting the client and the server, not the total amount of time for the request to return a complete response. For most requests, the timeout is raised because the server has not sent the first byte in the specified time. This is not always the case; if a server streams one byte every fifteen seconds, a timeout of 20 seconds will not trigger, even though the request will take several minutes to complete.

If your goal is to cut off any request after a set amount of wall clock time, consider having a second “watcher” thread to cut off a slow request.
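
A minimal sketch of such a watcher pattern (the URL and time budget are assumptions; the worker thread is abandoned, not killed):

import threading
import urllib3

http = urllib3.PoolManager()
result = {}

def fetch():
    result['response'] = http.request('GET', 'http://example.com/')

t = threading.Thread(target=fetch)
t.daemon = True      # don't keep the process alive for a stuck request
t.start()
t.join(5.0)          # wall-clock budget in seconds
if t.is_alive():
    print('request exceeded the budget; abandoning it')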

DEFAULT_TIMEOUT

A sentinel object representing the default timeout value

clone()

Create a copy of the timeout object

Timeout properties are stored per-pool but each request needs a fresh Timeout object to ensure each one has its own start/stop configured.

Returns:a copy of the timeout object
Return type:Timeout
connect_timeout

Get the value to use when setting a connection timeout.

This will be a positive float or integer, the value None (never timeout), or the default system timeout.

Returns:Connect timeout.
Return type:int, float, Timeout.DEFAULT_TIMEOUT or None
classmethod from_float(timeout)

Create a new Timeout from a legacy timeout value.

The timeout value used by httplib.py sets the same timeout on the connect() and recv() socket operations. This creates a Timeout object that sets the individual timeouts to the timeout value passed to this function.

Parameters:timeout (integer, float, sentinel default object, or None) – The legacy timeout value.
Returns:Timeout object
Return type:Timeout
get_connect_duration()

Gets the time elapsed since the call to start_connect().

Returns:Elapsed time.
Return type:float
Raises urllib3.exceptions.TimeoutStateError:
 if you attempt to get duration for a timer that hasn’t been started.
read_timeout

Get the value for the read timeout.

This assumes some time has elapsed in the connection timeout and computes the read timeout appropriately.

If self.total is set, the read timeout is dependent on the amount of time taken by the connect timeout. If the connection time has not been established, a TimeoutStateError will be raised.

Returns:Value to use for the read timeout.
Return type:int, float, Timeout.DEFAULT_TIMEOUT or None
Raises urllib3.exceptions.TimeoutStateError:
 If start_connect() has not yet been called on this object.
start_connect()

Start the timeout clock, used during a connect() attempt

Raises urllib3.exceptions.TimeoutStateError:
 if you attempt to start a timer that has been started already.
urllib3.util.timeout.current_time()

Retrieve the current time. This function is mocked out in unit testing.

Retries

class urllib3.util.retry.Retry(total=10, connect=None, read=None, redirect=None, method_whitelist=frozenset(['HEAD', 'TRACE', 'GET', 'PUT', 'OPTIONS', 'DELETE']), status_forcelist=None, backoff_factor=0, raise_on_redirect=True, _observed_errors=0)

Retry configuration.

Each retry attempt will create a new Retry object with updated values, so they can be safely reused.

Retries can be defined as a default for a pool:

retries = Retry(connect=5, read=2, redirect=5)
http = PoolManager(retries=retries)
response = http.request('GET', 'http://example.com/')

Or per-request (which overrides the default for the pool):

response = http.request('GET', 'http://example.com/', retries=Retry(10))

Retries can be disabled by passing False:

response = http.request('GET', 'http://example.com/', retries=False)

Errors will be wrapped in MaxRetryError unless retries are disabled, in which case the causing exception will be raised.

Parameters:
  • total (int) –

    Total number of retries to allow. Takes precedence over other counts.

    Set to None to remove this constraint and fall back on other counts. It’s a good idea to set this to some sensibly-high value to account for unexpected edge cases and avoid infinite retry loops.

    Set to 0 to fail on the first retry.

    Set to False to disable and imply raise_on_redirect=False.

  • connect (int) –

    How many connection-related errors to retry on.

    These are errors raised before the request is sent to the remote server, which we assume has not triggered the server to process the request.

    Set to 0 to fail on the first retry of this type.

  • read (int) –

    How many times to retry on read errors.

    These errors are raised after the request was sent to the server, so the request may have side-effects.

    Set to 0 to fail on the first retry of this type.

  • redirect (int) –

    How many redirects to perform. Limit this to avoid infinite redirect loops.

    A redirect is a HTTP response with a status code 301, 302, 303, 307 or 308.

    Set to 0 to fail on the first retry of this type.

    Set to False to disable and imply raise_on_redirect=False.

  • method_whitelist (iterable) –

    Set of uppercased HTTP method verbs that we should retry on.

    By default, we only retry on methods which are considered to be idempotent (multiple requests with the same parameters end with the same state). See Retry.DEFAULT_METHOD_WHITELIST.

  • status_forcelist (iterable) –

    A set of HTTP status codes that we should force a retry on.

    By default, this is disabled with None.

  • backoff_factor (float) –

    A backoff factor to apply between attempts. urllib3 will sleep for:

    {backoff factor} * (2 ^ ({number of total retries} - 1))
    

    seconds. If the backoff_factor is 0.1, then sleep() will sleep for [0.1s, 0.2s, 0.4s, ...] between retries. It will never be longer than Retry.BACKOFF_MAX.

    By default, backoff is disabled (set to 0).

  • raise_on_redirect (bool) – Whether, if the number of redirects is exhausted, to raise a MaxRetryError, or to return a response with a response code in the 3xx range.
BACKOFF_MAX = 120

Maximum backoff time.
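
Putting the pieces above together, a sketch of a retry policy with backoff and forced status retries (the URL is an assumption):

from urllib3 import PoolManager
from urllib3.util.retry import Retry

# Retry up to 3 times, also forcing retries on 502/503 responses,
# sleeping 0.2s, 0.4s, 0.8s between attempts per the formula above.
retry = Retry(total=3, backoff_factor=0.2, status_forcelist=[502, 503])
http = PoolManager(retries=retry)
r = http.request('GET', 'http://example.com/')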

classmethod from_int(retries, redirect=True, default=None)

Backwards-compatibility for the old retries format.

get_backoff_time()

Formula for computing the current backoff

Return type:float
increment(method=None, url=None, response=None, error=None, _pool=None, _stacktrace=None)

Return a new Retry object with incremented retry counters.

Parameters:
  • response (HTTPResponse) – A response object, or None, if the server did not return a response.
  • error (Exception) – An error encountered during the request, or None if the response was received successfully.
Returns:

A new Retry object.

is_exhausted()

Are we out of retries?

is_forced_retry(method, status_code)

Is this method/status code retryable? (Based on method/codes whitelists)

sleep()

Sleep between retry attempts using an exponential backoff.

By default, the backoff factor is 0 and this method will return immediately.

URL Helpers

class urllib3.util.url.Url

Datastructure for representing an HTTP URL. Used as a return value for parse_url().

hostname

For backwards-compatibility with urlparse. We’re nice like that.

netloc

Network location including host and port

request_uri

Absolute path including the query string.

url

Convert self into a url

This function should more or less round-trip with parse_url(). The returned url may not be exactly the same as the url inputted to parse_url(), but it should be equivalent by the RFC (e.g., urls with a blank port will have : removed).

Example:

>>> U = parse_url('http://google.com/mail/')
>>> U.url
'http://google.com/mail/'
>>> Url('http', 'username:password', 'host.com', 80,
... '/path', 'query', 'fragment').url
'http://username:password@host.com:80/path?query#fragment'
urllib3.util.url.get_host(url)

Deprecated. Use parse_url() instead.

urllib3.util.url.parse_url(url)

Given a url, return a parsed Url namedtuple. Best-effort is performed to parse incomplete urls. Fields not provided will be None.

Partly backwards-compatible with urlparse.

Example:

>>> parse_url('http://google.com/mail/')
Url(scheme='http', host='google.com', port=None, path='/mail/', ...)
>>> parse_url('google.com:80')
Url(scheme=None, host='google.com', port=80, path=None, ...)
>>> parse_url('/foo?bar')
Url(scheme=None, host=None, port=None, path='/foo', query='bar', ...)
urllib3.util.url.split_first(s, delims)

Given a string and an iterable of delimiters, split on the first found delimiter. Return two split parts and the matched delimiter.

If not found, then the first part is the full input string.

Example:

>>> split_first('foo/bar?baz', '?/=')
('foo', 'bar?baz', '/')
>>> split_first('foo/bar?baz', '123')
('foo/bar?baz', '', None)

Scales linearly with number of delims. Not ideal for large number of delims.

Filepost

urllib3.filepost.choose_boundary()

Our embarrassingly-simple replacement for mimetools.choose_boundary.

urllib3.filepost.encode_multipart_formdata(fields, boundary=None)

Encode a dictionary of fields using the multipart/form-data MIME format.

Parameters:
  • fields – Dictionary of fields or list of (key, RequestField) pairs.
  • boundary – If not specified, a random boundary will be generated using choose_boundary().
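
For example (the field names and contents are assumptions):

from urllib3.filepost import encode_multipart_formdata

body, content_type = encode_multipart_formdata({
    'field': 'value',
    'file': ('report.txt', 'contents of report', 'text/plain'),
})
# content_type looks like 'multipart/form-data; boundary=...'
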
urllib3.filepost.iter_field_objects(fields)

Iterate over fields.

Supports list of (k, v) tuples and dicts, and lists of RequestField.

urllib3.filepost.iter_fields(fields)

Deprecated since version 1.6.

Iterate over fields.

The addition of RequestField makes this function obsolete. Instead, use iter_field_objects(), which returns RequestField objects.

Supports list of (k, v) tuples and dicts.

class urllib3.fields.RequestField(name, data, filename=None, headers=None)

A data container for request body parameters.

Parameters:
  • name – The name of this request field.
  • data – The data/value body.
  • filename – An optional filename of the request field.
  • headers – An optional dict-like object of headers to initially use for the field.
classmethod from_tuples(fieldname, value)

A RequestField factory from old-style tuple parameters.

Supports constructing RequestField from parameter of key/value strings AND key/filetuple. A filetuple is a (filename, data, MIME type) tuple where the MIME type is optional. For example:

'foo': 'bar',
'fakefile': ('foofile.txt', 'contents of foofile'),
'realfile': ('barfile.txt', open('realfile').read()),
'typedfile': ('bazfile.bin', open('bazfile').read(), 'image/jpeg'),
'nonamefile': 'contents of nonamefile field',

Field names and filenames must be unicode.

make_multipart(content_disposition=None, content_type=None, content_location=None)

Makes this request field into a multipart request field.

This method sets the “Content-Disposition”, “Content-Type” and “Content-Location” headers of the request field.

Parameters:
  • content_disposition – The ‘Content-Disposition’ of the request body, defaults to ‘form-data’.
  • content_type – The ‘Content-Type’ of the request body.
  • content_location – The ‘Content-Location’ of the request body.
render_headers()

Renders the headers for this request field.

urllib3.fields.format_header_param(name, value)

Helper function to format and quote a single header parameter.

Particularly useful for header parameters which might contain non-ASCII values, like file names. This follows RFC 2231, as suggested by RFC 2388 Section 4.4.

Parameters:
  • name – The name of the parameter, a string expected to be ASCII only.
  • value – The value of the parameter, provided as a unicode string.
urllib3.fields.guess_content_type(filename, default='application/octet-stream')

Guess the “Content-Type” of a file.

Parameters:
  • filename – The filename to guess the “Content-Type” of using mimetypes.
  • default – If no “Content-Type” can be guessed, default to default.
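
For example (the filenames are illustrative):

>>> from urllib3.fields import guess_content_type
>>> guess_content_type('example.jpg')
'image/jpeg'
>>> guess_content_type('mystery-file')
'application/octet-stream'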

Request

class urllib3.request.RequestMethods(headers=None)

Convenience mixin for classes that implement a urlopen() method, such as HTTPConnectionPool and PoolManager.

Provides behavior for making common types of HTTP request methods and decides which type of request field encoding to use.

Specifically,

request_encode_url() is for sending requests whose fields are encoded in the URL (such as GET, HEAD, DELETE).

request_encode_body() is for sending requests whose fields are encoded in the body of the request using multipart or www-form-urlencoded (such as for POST, PUT, PATCH).

request() is for making any kind of request; it will look up the appropriate encoding format and use one of the above two methods to make the request.

Initializer parameters:

Parameters:headers – Headers to include with all requests, unless other headers are given explicitly.
request(method, url, fields=None, headers=None, **urlopen_kw)

Make a request using urlopen() with the appropriate encoding of fields based on the method used.

This is a convenience method that requires the least amount of manual effort. It can be used in most situations, while still having the option to drop down to more specific methods when necessary, such as request_encode_url(), request_encode_body(), or even the lowest level urlopen().

request_encode_body(method, url, fields=None, headers=None, encode_multipart=True, multipart_boundary=None, **urlopen_kw)

Make a request using urlopen() with the fields encoded in the body. This is useful for request methods like POST, PUT, PATCH, etc.

When encode_multipart=True (default), then urllib3.filepost.encode_multipart_formdata() is used to encode the payload with the appropriate content type. Otherwise urllib.urlencode() is used with the ‘application/x-www-form-urlencoded’ content type.

Multipart encoding must be used when posting files, and it’s reasonably safe to use it at other times too. However, it may break request signing, such as with OAuth.

Supports an optional fields parameter of key/value strings AND key/filetuple. A filetuple is a (filename, data, MIME type) tuple where the MIME type is optional. For example:

fields = {
    'foo': 'bar',
    'fakefile': ('foofile.txt', 'contents of foofile'),
    'realfile': ('barfile.txt', open('realfile').read()),
    'typedfile': ('bazfile.bin', open('bazfile').read(),
                  'image/jpeg'),
    'nonamefile': 'contents of nonamefile field',
}

When uploading a file, providing a filename (the first parameter of the tuple) is optional but recommended to best mimic the behavior of browsers.

Note that if headers are supplied, the ‘Content-Type’ header will be overwritten because it depends on the dynamic random boundary string which is used to compose the body of the request. The random boundary string can be explicitly set with the multipart_boundary parameter.

request_encode_url(method, url, fields=None, **urlopen_kw)

Make a request using urlopen() with the fields encoded in the url. This is useful for request methods like GET, HEAD, DELETE, etc.

urllib3.util.request.make_headers(keep_alive=None, accept_encoding=None, user_agent=None, basic_auth=None, proxy_basic_auth=None, disable_cache=None)

Shortcuts for generating request headers.

Parameters:
  • keep_alive – If True, adds ‘connection: keep-alive’ header.
  • accept_encoding – Can be a boolean, list, or string. True translates to ‘gzip,deflate’. List will get joined by comma. String will be used as provided.
  • user_agent – String representing the user-agent you want, such as “python-urllib3/0.6”
  • basic_auth – Colon-separated username:password string for ‘authorization: basic ...’ auth header.
  • proxy_basic_auth – Colon-separated username:password string for ‘proxy-authorization: basic ...’ auth header.
  • disable_cache – If True, adds ‘cache-control: no-cache’ header.

Example:

>>> make_headers(keep_alive=True, user_agent="Batman/1.0")
{'connection': 'keep-alive', 'user-agent': 'Batman/1.0'}
>>> make_headers(accept_encoding=True)
{'accept-encoding': 'gzip,deflate'}

Response

class urllib3.response.DeflateDecoder
decompress(data)
class urllib3.response.GzipDecoder
decompress(data)
class urllib3.response.HTTPResponse(body='', headers=None, status=0, version=0, reason=None, strict=0, preload_content=True, decode_content=True, original_response=None, pool=None, connection=None)

HTTP Response container.

Backwards-compatible to httplib’s HTTPResponse but the response body is loaded and decoded on-demand when the data property is accessed. This class is also compatible with the Python standard library’s io module, and can hence be treated as a readable object in the context of that framework.

Extra parameters for behaviour not present in httplib.HTTPResponse:

Parameters:
  • preload_content – If True, the response’s body will be preloaded during construction.
  • decode_content – If True, the body will be decoded based on the ‘content-encoding’ header (such as ‘gzip’ and ‘deflate’) as it is read; if False, the raw bytes are returned.
  • original_response – When this HTTPResponse wrapper is generated from an httplib.HTTPResponse object, it’s convenient to include the original for debug purposes. It’s otherwise unused.
CONTENT_DECODERS = ['gzip', 'deflate']
REDIRECT_STATUSES = [301, 302, 303, 307, 308]
close()
closed
data
fileno()
flush()
classmethod from_httplib(ResponseCls, r, **response_kw)

Given an httplib.HTTPResponse instance r, return a corresponding urllib3.response.HTTPResponse object.

Remaining parameters are passed to the HTTPResponse constructor, along with original_response=r.

get_redirect_location()

Should we redirect and where to?

Returns:Truthy redirect location string if we got a redirect status code and valid location. None if redirect status and no location. False if not a redirect status code.
getheader(name, default=None)
getheaders()
read(amt=None, decode_content=None, cache_content=False)

Similar to httplib.HTTPResponse.read(), but with two additional parameters: decode_content and cache_content.

Parameters:
  • amt – How much of the content to read. If specified, caching is skipped because it doesn’t make sense to cache partial content as the full response.
  • decode_content – If True, will attempt to decode the body based on the ‘content-encoding’ header.
  • cache_content – If True, will save the returned data such that the same result is returned regardless of the state of the underlying file object. This is useful if you want the .data property to continue working after having .read() the file object. (Overridden if amt is set.)
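
A sketch of incremental reads with preload_content=False (the URL is an assumption):

import urllib3

http = urllib3.PoolManager()
r = http.request('GET', 'http://example.com/', preload_content=False)
first = r.read(1024)   # up to 1 KiB; caching is skipped because amt is set
rest = r.read()        # remainder of the body
r.release_conn()
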
read_chunked(amt=None, decode_content=None)

Similar to HTTPResponse.read(), but with an additional parameter: decode_content.

Parameters:decode_content – If True, will attempt to decode the body based on the ‘content-encoding’ header.
readable()
readinto(b)
release_conn()
stream(amt=65536, decode_content=None)

A generator wrapper for the read() method. A call will block until amt bytes have been read from the connection or until the connection is closed.

Parameters:
  • amt – How much of the content to read. The generator will return up to amt bytes of data per iteration, but may return less. This is particularly likely when using compressed data. However, the empty string will never be returned.
  • decode_content – If True, will attempt to decode the body based on the ‘content-encoding’ header.
tell()

Obtain the number of bytes pulled over the wire so far. May differ from the amount of content returned by HTTPResponse.read() if bytes are encoded on the wire (e.g., compressed).

SSL/TLS Helpers

urllib3.util.ssl_.assert_fingerprint(cert, fingerprint)

Checks if given fingerprint matches the supplied certificate.

Parameters:
  • cert – Certificate as bytes object.
  • fingerprint – Fingerprint as string of hexdigits, can be interspersed by colons.
urllib3.util.ssl_.create_urllib3_context(ssl_version=None, cert_reqs=None, options=None, ciphers=None)

All arguments have the same meaning as ssl_wrap_socket.

By default, this function does a lot of the same work that ssl.create_default_context does on Python 3.4+. It:

  • Disables SSLv2, SSLv3, and compression
  • Sets a restricted set of server ciphers

If you wish to enable SSLv3, you can do:

from urllib3.util import ssl_
context = ssl_.create_urllib3_context()
context.options &= ~ssl_.OP_NO_SSLv3

You can do the same to enable compression (substituting COMPRESSION for SSLv3 in the last line above).

Parameters:
  • ssl_version – The desired protocol version to use. This will default to PROTOCOL_SSLv23 which will negotiate the highest protocol that both the server and your installation of OpenSSL support.
  • cert_reqs – Whether to require the certificate verification. This defaults to ssl.CERT_REQUIRED.
  • options – Specific OpenSSL options. These default to ssl.OP_NO_SSLv2, ssl.OP_NO_SSLv3, ssl.OP_NO_COMPRESSION.
  • ciphers – Which cipher suites to allow the server to select.
Returns:

Constructed SSLContext object with specified options

Return type:

SSLContext

urllib3.util.ssl_.resolve_cert_reqs(candidate)

Resolves the argument to a numeric constant, which can be passed to the wrap_socket function/method from the ssl module. Defaults to ssl.CERT_NONE. If given a string, it is assumed to be the name of the constant in the ssl module or its abbreviation. (So you can specify REQUIRED instead of CERT_REQUIRED.) If it’s neither None nor a string, we assume it is already the numeric constant which can directly be passed to wrap_socket.

urllib3.util.ssl_.resolve_ssl_version(candidate)

Like resolve_cert_reqs(), but resolves the SSL/TLS protocol version.

urllib3.util.ssl_.ssl_wrap_socket(sock, keyfile=None, certfile=None, cert_reqs=None, ca_certs=None, server_hostname=None, ssl_version=None, ciphers=None, ssl_context=None)

All arguments except for server_hostname and ssl_context have the same meaning as they do when using ssl.wrap_socket().

Parameters:
  • server_hostname – When SNI is supported, the expected hostname of the certificate
  • ssl_context – A pre-made SSLContext object. If none is provided, one will be created using create_urllib3_context().
  • ciphers – A string of ciphers we wish the client to support. This is not supported on Python 2.6 as the ssl module does not support it.

Collections

These datastructures are used to implement the behaviour of various urllib3 components in a decoupled and application-agnostic design.

class urllib3._collections.RecentlyUsedContainer(maxsize=10, dispose_func=None)

Provides a thread-safe dict-like container which maintains up to maxsize keys while throwing away the least-recently-used keys beyond maxsize.

Parameters:
  • maxsize – Maximum number of recent elements to retain.
  • dispose_func – Callback invoked as dispose_func(value) every time an item is evicted from the container.
ContainerCls

alias of OrderedDict
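
For example, a container of two slots with a dispose callback (the keys and values are illustrative):

from urllib3._collections import RecentlyUsedContainer

def on_evict(value):
    print('disposing of', value)

cache = RecentlyUsedContainer(maxsize=2, dispose_func=on_evict)
cache['a'] = 1
cache['b'] = 2
cache['c'] = 3   # evicts the least recently used key, 'a'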

class urllib3._collections.HTTPHeaderDict(headers=None, **kwargs)
Parameters:
  • headers – An iterable of field-value pairs. Must not contain multiple field names when compared case-insensitively.
  • kwargs – Additional field-value pairs to pass in to dict.update.

A dict like container for storing HTTP Headers.

Field names are stored and compared case-insensitively in compliance with RFC 7230. Iteration provides the first case-sensitive key seen for each case-insensitive pair.

Using __setitem__ syntax overwrites fields that compare equal case-insensitively in order to maintain dict’s API. For fields that compare equal, instead create a new HTTPHeaderDict and use .add in a loop.

If multiple fields that are equal case-insensitively are passed to the constructor or .update, the behavior is undefined and some will be lost.

>>> headers = HTTPHeaderDict()
>>> headers.add('Set-Cookie', 'foo=bar')
>>> headers.add('set-cookie', 'baz=quxx')
>>> headers['content-length'] = '7'
>>> headers['SET-cookie']
'foo=bar, baz=quxx'
>>> headers['Content-Length']
'7'
add(key, val)

Adds a (name, value) pair, doesn’t overwrite the value if it already exists.

>>> headers = HTTPHeaderDict(foo='bar')
>>> headers.add('Foo', 'baz')
>>> headers['foo']
'bar, baz'
extend(*args, **kwargs)

Generic import function for any type of header-like object. Adapted version of MutableMapping.update in order to insert items with self.add instead of self.__setitem__

classmethod from_httplib(message)

Read headers from a Python 2 httplib message object.

get(k[, d]) → D[k] if k in D, else d. d defaults to None.
getallmatchingheaders(key)

Returns a list of all the values for the named field. Returns an empty list if the key doesn’t exist.

getheaders(key)

Returns a list of all the values for the named field. Returns an empty list if the key doesn’t exist.

getlist(key)

Returns a list of all the values for the named field. Returns an empty list if the key doesn’t exist.

iget(key)

Returns a list of all the values for the named field. Returns an empty list if the key doesn’t exist.

iteritems()

Iterate over all header lines, including duplicate ones.

iterkeys() → an iterator over the keys of D
itermerged()

Iterate over all headers, merging duplicate ones together.

itervalues() → an iterator over the values of D
pop(k[, d]) → v, remove specified key and return the corresponding value.

If key is not found, d is returned if given, otherwise KeyError is raised.

update([E, ]**F) → None. Update D from mapping/iterable E and F.

If E present and has a .keys() method, does: for k in E: D[k] = E[k] If E present and lacks .keys() method, does: for (k, v) in E: D[k] = v In either case, this is followed by: for k, v in F.items(): D[k] = v

values() → list of D's values

Contrib Modules

These modules implement various extra features that may not be ready for prime time.

SNI-support for Python 2

SSL with SNI support for Python 2. Follow these instructions if you would like to verify SSL certificates in Python 2. Note that the default libraries do not do certificate checking; you need to do additional work to validate certificates yourself.

This needs the following packages installed:

  • pyOpenSSL (tested with 0.13)
  • ndg-httpsclient (tested with 0.3.2)
  • pyasn1 (tested with 0.1.6)

You can install them with the following command:

pip install pyopenssl ndg-httpsclient pyasn1

To activate certificate checking, call inject_into_urllib3() from your Python code before you begin making HTTP requests. This can be done in a sitecustomize module, or at any other time before your application begins using urllib3, like this:

try:
    import urllib3.contrib.pyopenssl
    urllib3.contrib.pyopenssl.inject_into_urllib3()
except ImportError:
    pass

Now you can use urllib3 as you normally would, and it will support SNI when the required modules are installed.

Activating this module also has the positive side effect of disabling SSL/TLS compression in Python 2 (see CRIME attack).

If you want to configure the default list of supported cipher suites, you can set the urllib3.contrib.pyopenssl.DEFAULT_SSL_CIPHER_LIST variable.

Module Variables

var DEFAULT_SSL_CIPHER_LIST:
 The list of supported SSL/TLS cipher suites.

Highlights

  • Re-use the same socket connection for multiple requests, with optional client-side certificate verification. See: HTTPConnectionPool and HTTPSConnectionPool
  • File posting. See: encode_multipart_formdata()
  • Built-in redirection and retries (optional).
  • Supports gzip and deflate decoding. See: decode_gzip() and decode_deflate()
  • Thread-safe and sanity-safe.
  • Tested on Python 2.6+ and Python 3.2+, 100% unit test coverage.
  • Works with AppEngine, gevent, eventlet, and the standard library io module.
  • Small and easy to understand codebase perfect for extending and building upon. For a more comprehensive solution, have a look at Requests which is also powered by urllib3.

Getting Started

Installing

pip install urllib3 or fetch the latest source from github.com/shazow/urllib3.

Usage

>>> import urllib3
>>> http = urllib3.PoolManager()
>>> r = http.request('GET', 'http://example.com/')
>>> r.status
200
>>> r.headers['server']
'ECS (iad/182A)'
>>> 'data: ' + r.data
'data: ...'

By default, urllib3 does not verify your HTTPS requests. You’ll need to supply a root certificate bundle, or use certifi:

>>> import urllib3, certifi
>>> http = urllib3.PoolManager(cert_reqs='CERT_REQUIRED', ca_certs=certifi.where())
>>> r = http.request('GET', 'https://insecure.com/')
Traceback (most recent call last):
  ...
SSLError: hostname 'insecure.com' doesn't match 'svn.nmap.org'

For more on making secure SSL/TLS HTTPS requests, read the Security section.

urllib3’s responses respect the io framework from Python’s standard library, allowing use of these standard objects for purposes like buffering:

>>> import io
>>> http = urllib3.PoolManager()
>>> r = http.urlopen('GET','http://example.com/', preload_content=False)
>>> b = io.BufferedReader(r, 2048)
>>> firstpart = b.read(100)
>>> # ... your internet connection fails momentarily ...
>>> secondpart = b.read()

Components

urllib3 tries to strike a fine balance between power, extendability, and sanity. To achieve this, the codebase is a collection of small reusable utilities and abstractions composed together in a few helpful layers.

PoolManager

The highest level is the PoolManager(...).

The PoolManager will take care of reusing connections for you whenever you request the same host. This should cover most scenarios without significant loss of efficiency, but you can always drop down to a lower level component for more granular control.

>>> import urllib3
>>> http = urllib3.PoolManager(10)
>>> r1 = http.request('GET', 'http://example.com/')
>>> r2 = http.request('GET', 'http://httpbin.org/')
>>> r3 = http.request('GET', 'http://httpbin.org/get')
>>> len(http.pools)
2

A PoolManager is a proxy for a collection of ConnectionPool objects. They both inherit from RequestMethods to make sure that their API is similar, so that instances of either can be passed around interchangeably.

ProxyManager

The ProxyManager is an HTTP proxy-aware subclass of PoolManager. It produces a single HTTPConnectionPool instance for all HTTP connections and individual per-server:port HTTPSConnectionPool instances for tunnelled HTTPS connections:

>>> proxy = urllib3.ProxyManager('http://localhost:3128/')
>>> r1 = proxy.request('GET', 'http://google.com/')
>>> r2 = proxy.request('GET', 'http://httpbin.org/')
>>> len(proxy.pools)
1
>>> r3 = proxy.request('GET', 'https://httpbin.org/')
>>> r4 = proxy.request('GET', 'https://twitter.com/')
>>> len(proxy.pools)
3

ConnectionPool

The next layer is the ConnectionPool(...).

The HTTPConnectionPool and HTTPSConnectionPool classes allow you to define a pool of connections to a single host and make requests against this pool with automatic connection reusing and thread safety.

When the ssl module is available, then HTTPSConnectionPool objects can be configured to check SSL certificates against specific provided certificate authorities.

>>> import urllib3
>>> conn = urllib3.connection_from_url('http://httpbin.org/')
>>> r1 = conn.request('GET', 'http://httpbin.org/')
>>> r2 = conn.request('GET', '/user-agent')
>>> r3 = conn.request('GET', 'http://example.com')
Traceback (most recent call last):
  ...
urllib3.exceptions.HostChangedError: HTTPConnectionPool(host='httpbin.org', port=None): Tried to open a foreign host with url: http://example.com

Again, a ConnectionPool is a pool of connections to a specific host. Trying to access a different host through the same pool will raise a HostChangedError exception unless you specify assert_same_host=False. Do this at your own risk as the outcome is completely dependent on the behaviour of the host server.
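For example, continuing with the pool above (again, at your own risk):

>>> # Bypass the same-host check for a single request.
>>> r4 = conn.urlopen('GET', 'http://example.com', assert_same_host=False)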

If you need to access multiple hosts and don’t want to manage your own collection of ConnectionPool objects, then you should use a PoolManager.

A ConnectionPool is composed of a collection of httplib.HTTPConnection objects.

Timeout

A timeout can be set to abort socket operations on individual connections after the specified duration. The timeout can be defined as a float or an instance of Timeout which gives more granular configuration over how much time is allowed for different stages of the request. This can be set for the entire pool or per-request.

>>> from urllib3 import PoolManager, Timeout

>>> # Manager with 3 seconds combined timeout.
>>> http = PoolManager(timeout=3.0)
>>> r = http.request('GET', 'http://httpbin.org/delay/1')

>>> # Manager with 2 second timeout for the read phase, no limit for the rest.
>>> http = PoolManager(timeout=Timeout(read=2.0))
>>> r = http.request('GET', 'http://httpbin.org/delay/1')

>>> # Manager with no timeout but a request with a timeout of 1 second for
>>> # the connect phase and 2 seconds for the read phase.
>>> http = PoolManager()
>>> r = http.request('GET', 'http://httpbin.org/delay/1', timeout=Timeout(connect=1.0, read=2.0))

>>> # Same Manager but request with a 5 second total timeout.
>>> r = http.request('GET', 'http://httpbin.org/delay/1', timeout=Timeout(total=5.0))

See the Timeout definition for more details.

Retry

Retries can be configured by passing an instance of Retry to the retries parameter, or disabled by passing False.

Redirects are also considered to be a subset of retries but can be configured or disabled individually.

>>> from urllib3 import PoolManager, Retry

>>> # Allow 3 retries total for all requests in this pool. These are the same:
>>> http = PoolManager(retries=3)
>>> http = PoolManager(retries=Retry(3))
>>> http = PoolManager(retries=Retry(total=3))

>>> r = http.request('GET', 'http://httpbin.org/redirect/2')
>>> # r.status -> 200

>>> # Disable redirects for this request.
>>> r = http.request('GET', 'http://httpbin.org/redirect/2', retries=Retry(3, redirect=False))
>>> # r.status -> 302

>>> # No total limit, but only do 5 connect retries, for this request.
>>> r = http.request('GET', 'http://httpbin.org/', retries=Retry(connect=5))

See the Retry definition for more details.

Stream

You may also stream your response and get data as it arrives (e.g. when using transfer-encoding: chunked). In this case, the stream() method will return a generator.

>>> from urllib3 import PoolManager
>>> http = PoolManager()

>>> r = http.request("GET", "http://httpbin.org/stream/3", preload_content=False)
>>> r.getheader("transfer-encoding")
'chunked'

>>> for chunk in r.stream():
...     print(chunk)
{"url": "http://httpbin.org/stream/3", ..., "id": 0, ...}
{"url": "http://httpbin.org/stream/3", ..., "id": 1, ...}
{"url": "http://httpbin.org/stream/3", ..., "id": 2, ...}
>>> r.closed
True

Completely consuming the stream will auto-close the response and release the connection back to the pool. If you’re only partially consuming a stream, make sure to manually call r.close() on the response.
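For example, a sketch of partial consumption with an explicit close (the URL is illustrative):

>>> r = http.request("GET", "http://httpbin.org/stream/20", preload_content=False)
>>> first_chunk = next(r.stream())  # Consume only the first chunk...
>>> r.close()                       # ...then close the response manually.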

Foundation

At the very core, just like its predecessors, urllib3 is built on top of httplib – the lowest level HTTP library included in the Python standard library.

To aid the limited functionality of the httplib module, urllib3 provides various helper methods which are used with the higher level components but can also be used independently.

Helpers

Useful methods for working with httplib, completely decoupled from code specific to urllib3.

Timeouts
class urllib3.util.timeout.Timeout(total=None, connect=<object object at 0x7f42c1d233f0>, read=<object object at 0x7f42c1d233f0>)

Timeout configuration.

Timeouts can be defined as a default for a pool:

timeout = Timeout(connect=2.0, read=7.0)
http = PoolManager(timeout=timeout)
response = http.request('GET', 'http://example.com/')

Or per-request (which overrides the default for the pool):

response = http.request('GET', 'http://example.com/', timeout=Timeout(10))

Timeouts can be disabled by setting all the parameters to None:

no_timeout = Timeout(connect=None, read=None)
response = http.request('GET', 'http://example.com/', timeout=no_timeout)
Parameters:
  • total (integer, float, or None) –

    This combines the connect and read timeouts into one; the read timeout will be set to the time leftover from the connect attempt. In the event that both a connect timeout and a total are specified, or a read timeout and a total are specified, the shorter timeout will be applied.

    Defaults to None.

  • connect (integer, float, or None) – The maximum amount of time to wait for a connection attempt to a server to succeed. Omitting the parameter will default the connect timeout to the system default, probably the global default timeout in socket.py. None will set an infinite timeout for connection attempts.
  • read (integer, float, or None) –

    The maximum amount of time to wait between consecutive read operations for a response from the server. Omitting the parameter will default the read timeout to the system default, probably the global default timeout in socket.py. None will set an infinite timeout.

Note

Many factors can affect the total amount of time for urllib3 to return an HTTP response.

For example, Python’s DNS resolver does not obey the timeout specified on the socket. Other factors that can affect total request time include high CPU load, high swap, the program running at a low priority level, or other behaviors.

In addition, the read and total timeouts only measure the time between read operations on the socket connecting the client and the server, not the total amount of time for the request to return a complete response. For most requests, the timeout is raised because the server has not sent the first byte in the specified time. This is not always the case; if a server streams one byte every fifteen seconds, a timeout of 20 seconds will not trigger, even though the request will take several minutes to complete.

If your goal is to cut off any request after a set amount of wall clock time, consider having a second “watcher” thread to cut off a slow request.

DEFAULT_TIMEOUT = <object object at 0x7f42c1d23240>

A sentinel object representing the default timeout value

clone()

Create a copy of the timeout object

Timeout properties are stored per-pool but each request needs a fresh Timeout object to ensure each one has its own start/stop configured.

Returns:a copy of the timeout object
Return type:Timeout
connect_timeout

Get the value to use when setting a connection timeout.

This will be a positive float or integer, the value None (never timeout), or the default system timeout.

Returns:Connect timeout.
Return type:int, float, Timeout.DEFAULT_TIMEOUT or None
classmethod from_float(timeout)

Create a new Timeout from a legacy timeout value.

The timeout value used by httplib.py sets the same timeout on the connect() and recv() socket requests. This creates a Timeout object that sets the individual timeouts to the timeout value passed to this function.

Parameters:timeout (integer, float, sentinel default object, or None) – The legacy timeout value.
Returns:Timeout object
Return type:Timeout
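For instance (a sketch; the attribute access uses the properties documented in this section):

>>> t = Timeout.from_float(5.0)
>>> (t.connect_timeout, t.read_timeout)
(5.0, 5.0)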
get_connect_duration()

Gets the time elapsed since the call to start_connect().

Returns:Elapsed time.
Return type:float
Raises urllib3.exceptions.TimeoutStateError:
 if you attempt to get duration for a timer that hasn’t been started.
read_timeout

Get the value for the read timeout.

This assumes some time has elapsed in the connection timeout and computes the read timeout appropriately.

If self.total is set, the read timeout is dependent on the amount of time taken by the connect timeout. If the connection time has not been established, a TimeoutStateError will be raised.

Returns:Value to use for the read timeout.
Return type:int, float, Timeout.DEFAULT_TIMEOUT or None
Raises urllib3.exceptions.TimeoutStateError:
 If start_connect() has not yet been called on this object.
start_connect()

Start the timeout clock, used during a connect() attempt

Raises urllib3.exceptions.TimeoutStateError:
 if you attempt to start a timer that has been started already.
urllib3.util.timeout.current_time()

Retrieve the current time. This function is mocked out in unit testing.

Retries
class urllib3.util.retry.Retry(total=10, connect=None, read=None, redirect=None, method_whitelist=frozenset(['HEAD', 'TRACE', 'GET', 'PUT', 'OPTIONS', 'DELETE']), status_forcelist=None, backoff_factor=0, raise_on_redirect=True, _observed_errors=0)

Retry configuration.

Each retry attempt will create a new Retry object with updated values, so they can be safely reused.

Retries can be defined as a default for a pool:

retries = Retry(connect=5, read=2, redirect=5)
http = PoolManager(retries=retries)
response = http.request('GET', 'http://example.com/')

Or per-request (which overrides the default for the pool):

response = http.request('GET', 'http://example.com/', retries=Retry(10))

Retries can be disabled by passing False:

response = http.request('GET', 'http://example.com/', retries=False)

Errors will be wrapped in MaxRetryError unless retries are disabled, in which case the causing exception will be raised.

Parameters:
  • total (int) –

    Total number of retries to allow. Takes precedence over other counts.

    Set to None to remove this constraint and fall back on other counts. It’s a good idea to set this to some sensibly-high value to account for unexpected edge cases and avoid infinite retry loops.

    Set to 0 to fail on the first retry.

    Set to False to disable and imply raise_on_redirect=False.

  • connect (int) –

    How many connection-related errors to retry on.

    These are errors raised before the request is sent to the remote server, which we assume has not triggered the server to process the request.

    Set to 0 to fail on the first retry of this type.

  • read (int) –

    How many times to retry on read errors.

    These errors are raised after the request was sent to the server, so the request may have side-effects.

    Set to 0 to fail on the first retry of this type.

  • redirect (int) –

    How many redirects to perform. Limit this to avoid infinite redirect loops.

    A redirect is an HTTP response with a status code of 301, 302, 303, 307 or 308.

    Set to 0 to fail on the first retry of this type.

    Set to False to disable and imply raise_on_redirect=False.

  • method_whitelist (iterable) –

    Set of uppercased HTTP method verbs that we should retry on.

    By default, we only retry on methods which are considered to be idempotent (multiple requests with the same parameters end with the same state). See Retry.DEFAULT_METHOD_WHITELIST.

  • status_forcelist (iterable) –

    A set of HTTP status codes that we should force a retry on.

    By default, this is disabled with None.

  • backoff_factor (float) –

    A backoff factor to apply between attempts. urllib3 will sleep for:

    {backoff factor} * (2 ^ ({number of total retries} - 1))
    

    seconds. If the backoff_factor is 0.1, then sleep() will sleep for [0.1s, 0.2s, 0.4s, ...] between retries. It will never be longer than Retry.BACKOFF_MAX (see the sketch below).

    By default, backoff is disabled (set to 0).

  • raise_on_redirect (bool) – Whether, if the number of redirects is exhausted, to raise a MaxRetryError, or to return a response with a response code in the 3xx range.
BACKOFF_MAX = 120

Maximum backoff time.
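As a quick illustration of the documented backoff formula (a sketch only; the function below is ours, not urllib3’s internal implementation):

def backoff_time(backoff_factor, num_retries, backoff_max=120):
    # {backoff factor} * (2 ^ ({number of total retries} - 1)), capped at BACKOFF_MAX.
    return min(backoff_max, backoff_factor * (2 ** (num_retries - 1)))

# With backoff_factor=0.1, sleeps grow as 0.1s, 0.2s, 0.4s, 0.8s, ...
assert [backoff_time(0.1, n) for n in range(1, 5)] == [0.1, 0.2, 0.4, 0.8]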

classmethod from_int(retries, redirect=True, default=None)

Backwards-compatibility for the old retries format.

get_backoff_time()

Formula for computing the current backoff

Return type:float
increment(method=None, url=None, response=None, error=None, _pool=None, _stacktrace=None)

Return a new Retry object with incremented retry counters.

Parameters:
  • response (HTTPResponse) – A response object, or None, if the server did not return a response.
  • error (Exception) – An error encountered during the request, or None if the response was received successfully.
Returns:

A new Retry object.

is_exhausted()

Are we out of retries?

is_forced_retry(method, status_code)

Is this method/status code retryable? (Based on method/codes whitelists)

sleep()

Sleep between retry attempts using an exponential backoff.

By default, the backoff factor is 0 and this method will return immediately.

URL Helpers
class urllib3.util.url.Url

Datastructure for representing an HTTP URL. Used as a return value for parse_url().

hostname

For backwards-compatibility with urlparse. We’re nice like that.

netloc

Network location including host and port

request_uri

Absolute path including the query string.

url

Convert self into a url

This function should more or less round-trip with parse_url(). The returned url may not be exactly the same as the url inputted to parse_url(), but it should be equivalent by the RFC (e.g., urls with a blank port will have : removed).

Example:

>>> U = parse_url('http://google.com/mail/')
>>> U.url
'http://google.com/mail/'
>>> Url('http', 'username:password', 'host.com', 80,
... '/path', 'query', 'fragment').url
'http://username:password@host.com:80/path?query#fragment'
urllib3.util.url.get_host(url)

Deprecated. Use parse_url() instead.

urllib3.util.url.parse_url(url)

Given a url, return a parsed Url namedtuple. Best-effort is performed to parse incomplete urls. Fields not provided will be None.

Partly backwards-compatible with urlparse.

Example:

>>> parse_url('http://google.com/mail/')
Url(scheme='http', host='google.com', port=None, path='/mail/', ...)
>>> parse_url('google.com:80')
Url(scheme=None, host='google.com', port=80, path=None, ...)
>>> parse_url('/foo?bar')
Url(scheme=None, host=None, port=None, path='/foo', query='bar', ...)
urllib3.util.url.split_first(s, delims)

Given a string and an iterable of delimiters, split on the first found delimiter. Return two split parts and the matched delimiter.

If not found, then the first part is the full input string.

Example:

>>> split_first('foo/bar?baz', '?/=')
('foo', 'bar?baz', '/')
>>> split_first('foo/bar?baz', '123')
('foo/bar?baz', '', None)

Scales linearly with the number of delims. Not ideal for a large number of delims.

Filepost
urllib3.filepost.choose_boundary()

Our embarrassingly-simple replacement for mimetools.choose_boundary.

urllib3.filepost.encode_multipart_formdata(fields, boundary=None)

Encode a dictionary of fields using the multipart/form-data MIME format.

Parameters:
  • fields – Dictionary of fields or list of (key, RequestField).
  • boundary – If not specified, a random boundary will be generated using choose_boundary().
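A quick sketch of typical usage (the field name and value are illustrative):

>>> from urllib3.filepost import encode_multipart_formdata
>>> body, content_type = encode_multipart_formdata({'field': 'value'})
>>> content_type
'multipart/form-data; boundary=...'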
urllib3.filepost.iter_field_objects(fields)

Iterate over fields.

Supports list of (k, v) tuples and dicts, and lists of RequestField.

urllib3.filepost.iter_fields(fields)

Deprecated since version 1.6.

Iterate over fields.

The addition of RequestField makes this function obsolete. Instead, use iter_field_objects(), which returns RequestField objects.

Supports list of (k, v) tuples and dicts.

class urllib3.fields.RequestField(name, data, filename=None, headers=None)

A data container for request body parameters.

Parameters:
  • name – The name of this request field.
  • data – The data/value body.
  • filename – An optional filename of the request field.
  • headers – An optional dict-like object of headers to initially use for the field.
classmethod from_tuples(fieldname, value)

A RequestField factory from old-style tuple parameters.

Supports constructing RequestField from parameters of key/value strings AND key/filetuple. A filetuple is a (filename, data, MIME type) tuple where the MIME type is optional. For example:

{
    'foo': 'bar',
    'fakefile': ('foofile.txt', 'contents of foofile'),
    'realfile': ('barfile.txt', open('realfile').read()),
    'typedfile': ('bazfile.bin', open('bazfile').read(), 'image/jpeg'),
    'nonamefile': 'contents of nonamefile field',
}

Field names and filenames must be unicode.
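For instance (a sketch; the 'typedfile' tuple reuses the example above with inline data):

>>> from urllib3.fields import RequestField
>>> rf = RequestField.from_tuples('typedfile', ('bazfile.bin', 'data', 'image/jpeg'))
>>> rf.data
'data'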

make_multipart(content_disposition=None, content_type=None, content_location=None)

Makes this request field into a multipart request field.

This method sets the “Content-Disposition”, “Content-Type” and “Content-Location” headers of this request field.

Parameters:
  • content_disposition – The ‘Content-Disposition’ of the request body, defaults to ‘form-data’.
  • content_type – The ‘Content-Type’ of the request body.
  • content_location – The ‘Content-Location’ of the request body.
render_headers()

Renders the headers for this request field.

urllib3.fields.format_header_param(name, value)

Helper function to format and quote a single header parameter.

Particularly useful for header parameters which might contain non-ASCII values, like file names. This follows RFC 2231, as suggested by RFC 2388 Section 4.4.

Parameters:
  • name – The name of the parameter, a string expected to be ASCII only.
  • value – The value of the parameter, provided as a unicode string.
urllib3.fields.guess_content_type(filename, default='application/octet-stream')

Guess the “Content-Type” of a file.

Parameters:
  • filename – The filename to guess the “Content-Type” of using mimetypes.
  • default – If no “Content-Type” can be guessed, default to default.
Request
class urllib3.request.RequestMethods(headers=None)

Convenience mixin for classes that implement a urlopen() method, such as HTTPConnectionPool and PoolManager.

Provides behavior for making common types of HTTP request methods and decides which type of request field encoding to use.

Specifically,

request_encode_url() is for sending requests whose fields are encoded in the URL (such as GET, HEAD, DELETE).

request_encode_body() is for sending requests whose fields are encoded in the body of the request using multipart or www-form-urlencoded (such as for POST, PUT, PATCH).

request() is for making any kind of request; it will look up the appropriate encoding format and use one of the above two methods to make the request.

Initializer parameters:

Parameters:headers – Headers to include with all requests, unless other headers are given explicitly.
request(method, url, fields=None, headers=None, **urlopen_kw)

Make a request using urlopen() with the appropriate encoding of fields based on the method used.

This is a convenience method that requires the least amount of manual effort. It can be used in most situations, while still having the option to drop down to more specific methods when necessary, such as request_encode_url(), request_encode_body(), or even the lowest level urlopen().

request_encode_body(method, url, fields=None, headers=None, encode_multipart=True, multipart_boundary=None, **urlopen_kw)

Make a request using urlopen() with the fields encoded in the body. This is useful for request methods like POST, PUT, PATCH, etc.

When encode_multipart=True (default), then urllib3.filepost.encode_multipart_formdata() is used to encode the payload with the appropriate content type. Otherwise urllib.urlencode() is used with the ‘application/x-www-form-urlencoded’ content type.

Multipart encoding must be used when posting files, and it’s reasonably safe to use it at other times too. However, it may break request signing, such as with OAuth.

Supports an optional fields parameter of key/value strings AND key/filetuple. A filetuple is a (filename, data, MIME type) tuple where the MIME type is optional. For example:

fields = {
    'foo': 'bar',
    'fakefile': ('foofile.txt', 'contents of foofile'),
    'realfile': ('barfile.txt', open('realfile').read()),
    'typedfile': ('bazfile.bin', open('bazfile').read(),
                  'image/jpeg'),
    'nonamefile': 'contents of nonamefile field',
}

When uploading a file, providing a filename (the first parameter of the tuple) is optional but recommended to best mimic the behavior of browsers.

Note that if headers are supplied, the ‘Content-Type’ header will be overwritten because it depends on the dynamic random boundary string which is used to compose the body of the request. The random boundary string can be explicitly set with the multipart_boundary parameter.

request_encode_url(method, url, fields=None, **urlopen_kw)

Make a request using urlopen() with the fields encoded in the url. This is useful for request methods like GET, HEAD, DELETE, etc.

urllib3.util.request.make_headers(keep_alive=None, accept_encoding=None, user_agent=None, basic_auth=None, proxy_basic_auth=None, disable_cache=None)

Shortcuts for generating request headers.

Parameters:
  • keep_alive – If True, adds ‘connection: keep-alive’ header.
  • accept_encoding – Can be a boolean, list, or string. True translates to ‘gzip,deflate’. List will get joined by comma. String will be used as provided.
  • user_agent – String representing the user-agent you want, such as “python-urllib3/0.6”
  • basic_auth – Colon-separated username:password string for ‘authorization: basic ...’ auth header.
  • proxy_basic_auth – Colon-separated username:password string for ‘proxy-authorization: basic ...’ auth header.
  • disable_cache – If True, adds ‘cache-control: no-cache’ header.

Example:

>>> make_headers(keep_alive=True, user_agent="Batman/1.0")
{'connection': 'keep-alive', 'user-agent': 'Batman/1.0'}
>>> make_headers(accept_encoding=True)
{'accept-encoding': 'gzip,deflate'}
Response
class urllib3.response.DeflateDecoder
decompress(data)
class urllib3.response.GzipDecoder
decompress(data)
class urllib3.response.HTTPResponse(body='', headers=None, status=0, version=0, reason=None, strict=0, preload_content=True, decode_content=True, original_response=None, pool=None, connection=None)

HTTP Response container.

Backwards-compatible to httplib’s HTTPResponse but the response body is loaded and decoded on-demand when the data property is accessed. This class is also compatible with the Python standard library’s io module, and can hence be treated as a readable object in the context of that framework.

Extra parameters for behaviour not present in httplib.HTTPResponse:

Parameters:
  • preload_content – If True, the response’s body will be preloaded during construction.
  • decode_content – If True, will attempt to decode the body based on the ‘content-encoding’ header (e.g. ‘gzip’ and ‘deflate’).
  • original_response – When this HTTPResponse wrapper is generated from an httplib.HTTPResponse object, it’s convenient to include the original for debug purposes. It’s otherwise unused.
CONTENT_DECODERS = ['gzip', 'deflate']
REDIRECT_STATUSES = [301, 302, 303, 307, 308]
close()
closed
data
fileno()
flush()
classmethod from_httplib(ResponseCls, r, **response_kw)

Given an httplib.HTTPResponse instance r, return a corresponding urllib3.response.HTTPResponse object.

Remaining parameters are passed to the HTTPResponse constructor, along with original_response=r.

get_redirect_location()

Should we redirect and where to?

Returns:Truthy redirect location string if we got a redirect status code and valid location. None if redirect status and no location. False if not a redirect status code.
getheader(name, default=None)
getheaders()
read(amt=None, decode_content=None, cache_content=False)

Similar to httplib.HTTPResponse.read(), but with two additional parameters: decode_content and cache_content.

Parameters:
  • amt – How much of the content to read. If specified, caching is skipped because it doesn’t make sense to cache partial content as the full response.
  • decode_content – If True, will attempt to decode the body based on the ‘content-encoding’ header.
  • cache_content – If True, will save the returned data such that the same result is returned despite of the state of the underlying file object. This is useful if you want the .data property to continue working after having .read() the file object. (Overridden if amt is set.)
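A sketch of partial reads using these parameters (the URL is illustrative):

>>> r = http.request('GET', 'http://example.com/', preload_content=False)
>>> head = r.read(100)                 # First 100 bytes; caching is skipped.
>>> rest = r.read(cache_content=True)  # Remainder; .data keeps working afterwards.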
read_chunked(amt=None, decode_content=None)

Similar to HTTPResponse.read(), but with an additional parameter: decode_content.

Parameters:decode_content – If True, will attempt to decode the body based on the ‘content-encoding’ header.
readable()
readinto(b)
release_conn()
stream(amt=65536, decode_content=None)

A generator wrapper for the read() method. A call will block until amt bytes have been read from the connection or until the connection is closed.

Parameters:
  • amt – How much of the content to read. The generator will return up to amt bytes of data per iteration, but may return less. This is particularly likely when using compressed data. However, the empty string will never be returned.
  • decode_content – If True, will attempt to decode the body based on the ‘content-encoding’ header.
tell()

Obtain the number of bytes pulled over the wire so far. May differ from the amount of content returned by HTTPResponse.read() if bytes are encoded on the wire (e.g., compressed).

SSL/TLS Helpers
urllib3.util.ssl_.assert_fingerprint(cert, fingerprint)

Checks if given fingerprint matches the supplied certificate.

Parameters:
  • cert – Certificate as bytes object.
  • fingerprint – Fingerprint as string of hexdigits, can be interspersed by colons.
urllib3.util.ssl_.create_urllib3_context(ssl_version=None, cert_reqs=None, options=None, ciphers=None)

All arguments have the same meaning as ssl_wrap_socket.

By default, this function does a lot of the same work that ssl.create_default_context does on Python 3.4+. It:

  • Disables SSLv2, SSLv3, and compression
  • Sets a restricted set of server ciphers

If you wish to enable SSLv3, you can do:

from urllib3.util import ssl_
context = ssl_.create_urllib3_context()
context.options &= ~ssl_.OP_NO_SSLv3

You can do the same to enable compression (substituting COMPRESSION for SSLv3 in the last line above).
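That is, reusing the same import:

from urllib3.util import ssl_
context = ssl_.create_urllib3_context()
context.options &= ~ssl_.OP_NO_COMPRESSION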

Parameters:
  • ssl_version – The desired protocol version to use. This will default to PROTOCOL_SSLv23 which will negotiate the highest protocol that both the server and your installation of OpenSSL support.
  • cert_reqs – Whether to require the certificate verification. This defaults to ssl.CERT_REQUIRED.
  • options – Specific OpenSSL options. These default to ssl.OP_NO_SSLv2, ssl.OP_NO_SSLv3, ssl.OP_NO_COMPRESSION.
  • ciphers – Which cipher suites to allow the server to select.
Returns:

Constructed SSLContext object with specified options

Return type:

SSLContext

urllib3.util.ssl_.resolve_cert_reqs(candidate)

Resolves the argument to a numeric constant, which can be passed to the wrap_socket function/method from the ssl module. Defaults to ssl.CERT_NONE. If given a string, it is assumed to be the name of the constant in the ssl module or its abbreviation (so you can specify REQUIRED instead of CERT_REQUIRED). If it’s neither None nor a string, we assume it is already the numeric constant which can be passed directly to wrap_socket.
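For example (a sketch; the return values compare against the ssl module’s constants):

>>> import ssl
>>> from urllib3.util.ssl_ import resolve_cert_reqs
>>> resolve_cert_reqs('REQUIRED') == ssl.CERT_REQUIRED
True
>>> resolve_cert_reqs(None) == ssl.CERT_NONE
True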

urllib3.util.ssl_.resolve_ssl_version(candidate)

Like resolve_cert_reqs(), but for SSL protocol version constants.

urllib3.util.ssl_.ssl_wrap_socket(sock, keyfile=None, certfile=None, cert_reqs=None, ca_certs=None, server_hostname=None, ssl_version=None, ciphers=None, ssl_context=None)

All arguments except for server_hostname and ssl_context have the same meaning as they do when using ssl.wrap_socket().

Parameters:
  • server_hostname – When SNI is supported, the expected hostname of the certificate
  • ssl_context – A pre-made SSLContext object. If none is provided, one will be created using create_urllib3_context().
  • ciphers – A string of ciphers we wish the client to support. This is not supported on Python 2.6 as the ssl module does not support it.

Exceptions

Custom exceptions defined by urllib3

exception urllib3.exceptions.ClosedPoolError(pool, message)

Raised when a request enters a pool after the pool has been closed.

exception urllib3.exceptions.ConnectTimeoutError

Raised when a socket timeout occurs while connecting to a server.

urllib3.exceptions.ConnectionError

Renamed to ProtocolError but aliased for backwards compatibility.

alias of ProtocolError

exception urllib3.exceptions.DecodeError

Raised when automatic decoding based on Content-Type fails.

exception urllib3.exceptions.EmptyPoolError(pool, message)

Raised when a pool runs out of connections and no more are allowed.

exception urllib3.exceptions.HTTPError

Base exception used by this module.

exception urllib3.exceptions.HTTPWarning

Base warning used by this module.

exception urllib3.exceptions.HostChangedError(pool, url, retries=3)

Raised when an existing pool gets a request for a foreign host.

exception urllib3.exceptions.InsecurePlatformWarning

Warned when certain SSL configuration is not available on a platform.

exception urllib3.exceptions.InsecureRequestWarning

Warned when making an unverified HTTPS request.

exception urllib3.exceptions.LocationParseError(location)

Raised when get_host or similar fails to parse the URL input.

exception urllib3.exceptions.LocationValueError

Raised when there is something wrong with a given URL input.

exception urllib3.exceptions.MaxRetryError(pool, url, reason=None)

Raised when the maximum number of retries is exceeded.

Parameters:
  • pool – The connection pool.
  • url – The requested URL.
  • reason – The underlying error, if any.
exception urllib3.exceptions.PoolError(pool, message)

Base exception for errors caused within a pool.

exception urllib3.exceptions.ProtocolError

Raised when something unexpected happens mid-request/response.

exception urllib3.exceptions.ProxyError

Raised when the connection to a proxy fails.

exception urllib3.exceptions.ReadTimeoutError(pool, url, message)

Raised when a socket timeout occurs while receiving data from a server.

exception urllib3.exceptions.RequestError(pool, url, message)

Base exception for PoolErrors that have associated URLs.

exception urllib3.exceptions.ResponseError

Used as a container for an error reason supplied in a MaxRetryError.

exception urllib3.exceptions.ResponseNotChunked

Response needs to be chunked in order to read it as chunks.

exception urllib3.exceptions.SSLError

Raised when SSL certificate fails in an HTTPS connection.

exception urllib3.exceptions.SecurityWarning

Warned when performing security reducing actions.

exception urllib3.exceptions.SystemTimeWarning

Warned when system time is suspected to be wrong.

exception urllib3.exceptions.TimeoutError

Raised when a socket timeout error occurs.

Catching this error will catch both ReadTimeoutErrors and ConnectTimeoutErrors.

exception urllib3.exceptions.TimeoutStateError

Raised when passing an invalid state to a timeout.

Contrib Modules

These modules implement various extra features that may not be ready for prime time.

SNI-support for Python 2

SSL with SNI-support for Python 2. Follow these instructions if you would like to verify SSL certificates in Python 2. Note, the default libraries do not do certificate checking; you need to do additional work to validate certificates yourself.

This needs the following packages installed:

  • pyOpenSSL (tested with 0.13)
  • ndg-httpsclient (tested with 0.3.2)
  • pyasn1 (tested with 0.1.6)

You can install them with the following command:

pip install pyopenssl ndg-httpsclient pyasn1

To activate certificate checking, call inject_into_urllib3() from your Python code before you begin making HTTP requests. This can be done in a sitecustomize module, or at any other time before your application begins using urllib3, like this:

try:
    import urllib3.contrib.pyopenssl
    urllib3.contrib.pyopenssl.inject_into_urllib3()
except ImportError:
    pass

Now you can use urllib3 as you normally would, and it will support SNI when the required modules are installed.

Activating this module also has the positive side effect of disabling SSL/TLS compression in Python 2 (see CRIME attack).

If you want to configure the default list of supported cipher suites, you can set the urllib3.contrib.pyopenssl.DEFAULT_SSL_CIPHER_LIST variable.

Module Variables
var DEFAULT_SSL_CIPHER_LIST:
 The list of supported SSL/TLS cipher suites.

Contributing

  1. Check for open issues or open a fresh issue to start a discussion around a feature idea or a bug. There is a Contributor Friendly tag for issues that should be ideal for people who are not very familiar with the codebase yet.
  2. Fork the urllib3 repository on Github to start making your changes.
  3. Write a test which shows that the bug was fixed or that the feature works as expected.
  4. Send a pull request and bug the maintainer until it gets merged and published. :) Make sure to add yourself to CONTRIBUTORS.txt.

Sponsorship

Please consider sponsoring urllib3 development, especially if your company benefits from this library.

  • Project Grant: A grant for contiguous full-time development has the biggest impact for progress. Periods of 3 to 10 days allow a contributor to tackle substantial complex issues which are otherwise left to linger until somebody can’t afford to not fix them.

    Contact @shazow to arrange a grant for a core contributor.

  • One-off: Development will continue regardless of funding, but donations help move things further along quicker as the maintainer can allocate more time off to work on urllib3 specifically.

  • Recurring: You’re welcome to support the maintainer on Gittip.

Recent Sponsors

Huge thanks to all the companies and individuals who financially contributed to the development of urllib3. Please send a PR if you’ve donated and would like to be listed.