geoplot.sankey(*args, projection=None, start=None, end=None, path=None, hue=None, categorical=False, scheme=None, k=5, cmap='viridis', vmin=None, vmax=None, legend=False, legend_kwargs=None, legend_labels=None, legend_values=None, legend_var=None, extent=None, figsize=(8, 6), ax=None, scale=None, limits=(1, 5), scale_func=None, **kwargs)
Spatial Sankey or flow map.
Returns: The plot axis
Examples
A Sankey diagram is a simple visualization demonstrating flow through a network. It is useful when you wish to
show the volume of things moving between points or spaces: traffic load on a road network, for example, or
inter-airport travel volumes. The geoplot sankey adds spatial context to this plot type by laying out the points
in meaningful locations: airport locations, say, or road intersections.
A basic sankey specifies data, start points, end points, and, optionally, a projection. The df argument is
optional; if geometries are provided as independent iterables it is ignored. We overlay world geometry to aid
interpretability.
ax = gplt.sankey(la_flights, start='start', end='end', projection=gcrs.PlateCarree())
ax.set_global(); ax.coastlines()
The lines appear curved because they are great circle paths, which are the shortest routes between points on a sphere.
ax = gplt.sankey(la_flights, start='start', end='end', projection=gcrs.Orthographic())
ax.set_global(); ax.coastlines(); ax.outline_patch.set_visible(True)
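To make the great circle idea concrete, here is a minimal sketch in plain Python that interpolates points along the shortest path between two coordinates on a unit sphere. It is not how geoplot or cartopy draws these lines; the function name great_circle_points is hypothetical and the coordinates are purely illustrative.

```python
import math


def great_circle_points(lon1, lat1, lon2, lat2, n=5):
    """Return n (lon, lat) points in degrees, evenly spaced along the
    great circle between two locations (hypothetical helper, n >= 2).

    Assumes a perfect sphere and endpoints that are not antipodal.
    """
    def to_xyz(lon, lat):
        # Convert degrees of longitude/latitude to a unit vector.
        lam, phi = math.radians(lon), math.radians(lat)
        return (math.cos(phi) * math.cos(lam),
                math.cos(phi) * math.sin(lam),
                math.sin(phi))

    a, b = to_xyz(lon1, lat1), to_xyz(lon2, lat2)
    dot = max(-1.0, min(1.0, sum(p * q for p, q in zip(a, b))))
    omega = math.acos(dot)  # central angle between the endpoints
    if omega == 0.0:
        return [(lon1, lat1)] * n

    points = []
    for i in range(n):
        t = i / (n - 1)
        # Spherical linear interpolation between the two unit vectors.
        wa = math.sin((1 - t) * omega) / math.sin(omega)
        wb = math.sin(t * omega) / math.sin(omega)
        x, y, z = (wa * p + wb * q for p, q in zip(a, b))
        z = max(-1.0, min(1.0, z))  # guard against rounding error
        points.append((math.degrees(math.atan2(y, x)),
                       math.degrees(math.asin(z))))
    return points


# Two illustrative points on the 45th parallel, 90 degrees of longitude apart.
for lon, lat in great_circle_points(0.0, 45.0, 90.0, 45.0, n=5):
    print(round(lon, 2), round(lat, 2))
```

Although both endpoints sit at 45°N, the midpoint of the path lies at roughly 54.7°N, which is why the shortest route looks bowed when drawn on a PlateCarree map.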
To plot using a different distance metric, pass a cartopy crs object (not a geoplot one) to the path parameter.
import cartopy.crs as ccrs
ax = gplt.sankey(la_flights, start='start', end='end', projection=gcrs.PlateCarree(), path=ccrs.PlateCarree())
ax.set_global(); ax.coastlines()
If your data has custom paths, you can use those instead, via the path parameter.
gplt.sankey(dc, path=dc.geometry, projection=gcrs.AlbersEqualArea(), scale='aadt')
hue parameterizes the color, and cmap controls the colormap. legend adds a legend. Keyword arguments can be
passed to the legend using the legend_kwargs argument. These arguments will be passed to the underlying
matplotlib Legend. The loc and bbox_to_anchor parameters are particularly useful for positioning the legend.
ax = gplt.sankey(network, projection=gcrs.PlateCarree(),
                 start='from', end='to',
                 hue='mock_variable', cmap='RdYlBu',
                 legend=True, legend_kwargs={'bbox_to_anchor': (1.4, 1.0)})
ax.set_global()
ax.coastlines()
Change the number of bins by specifying an alternative k value. To use a continuous colormap, explicitly
specify k=None. You can change the binning scheme with the scheme parameter. The default is quantile, which bins
observations into classes of different widths but equal numbers of observations. equal_interval creates bins of
the same width that may contain different numbers of observations. The more complicated fisher_jenks scheme is
an intermediate between the two.
ax = gplt.sankey(network, projection=gcrs.PlateCarree(),
                 start='from', end='to',
                 hue='mock_variable', cmap='RdYlBu',
                 legend=True, legend_kwargs={'bbox_to_anchor': (1.25, 1.0)},
                 k=3, scheme='equal_interval')
ax.set_global()
ax.coastlines()
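The difference between the quantile and equal_interval schemes is easiest to see on skewed data. The following is a rough sketch in plain Python, not the classifier geoplot actually uses; the helper names and the sample values are made up for illustration, and the real classifiers may place bin edges slightly differently.

```python
def equal_interval_edges(values, k):
    """Upper edges of k bins that all have the same width (hypothetical helper)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    # Use hi itself for the last edge to avoid floating point shortfall.
    return [lo + width * i for i in range(1, k)] + [hi]


def quantile_edges(values, k):
    """Upper edges of k bins holding (roughly) equal numbers of observations."""
    ordered = sorted(values)
    n = len(ordered)
    # Index of the last observation belonging to each class.
    return [ordered[(n * i + k - 1) // k - 1] for i in range(1, k + 1)]


def counts_per_bin(values, edges):
    """Count how many values fall at or below each successive upper edge."""
    counts = [0] * len(edges)
    for v in values:
        for i, edge in enumerate(edges):
            if v <= edge:
                counts[i] += 1
                break
    return counts


# Made-up, heavily skewed sample data.
data = [1, 2, 3, 4, 5, 6, 10, 20, 50, 100, 200, 1000]

for name, edges in [('equal_interval', equal_interval_edges(data, 3)),
                    ('quantile', quantile_edges(data, 3))]:
    print(name, edges, counts_per_bin(data, edges))
```

On this sample the equal-width bins leave almost every observation in the first class, while the quantile bins hold four observations each at the cost of very uneven widths.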
If your variable of interest is already categorical, specify categorical=True to use the labels in your dataset
directly.
ax = gplt.sankey(network, projection=gcrs.PlateCarree(),
                 start='from', end='to',
                 hue='above_meridian', cmap='RdYlBu',
                 legend=True, legend_kwargs={'bbox_to_anchor': (1.2, 1.0)},
                 categorical=True)
ax.set_global()
ax.coastlines()
scale can be used to enable linewidth as a visual variable. Adjust the upper and lower bound with the limits
parameter.
ax = gplt.sankey(la_flights, projection=gcrs.PlateCarree(),
                 extent=(-125.0011, -66.9326, 24.9493, 49.5904),
                 start='start', end='end',
                 scale='Passengers',
                 limits=(0.1, 5),
                 legend=True, legend_kwargs={'bbox_to_anchor': (1.1, 1.0)})
ax.coastlines()
The default scaling function is linear: an observation at the midpoint of two others will be exactly midway
between them in size. To specify an alternative scaling function, use the scale_func parameter. This should
be a factory function of two variables which, when given the minimum and maximum of the dataset,
returns a scaling function which will be applied to the rest of the data. A demo is available in
the example gallery.
# Factory: receives the data minimum and maximum, returns the scaling function.
def trivial_scale(minval, maxval):
    return lambda v: 1

ax = gplt.sankey(la_flights, projection=gcrs.PlateCarree(),
                 extent=(-125.0011, -66.9326, 24.9493, 49.5904),
                 start='start', end='end',
                 scale='Passengers', scale_func=trivial_scale,
                 legend=True, legend_kwargs={'bbox_to_anchor': (1.1, 1.0)})
ax.coastlines()
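As a slightly less trivial illustration of the factory pattern described above, the sketch below defines a linear and a logarithmic scale factory in plain Python. The names linear_scale and log_scale and the width bounds LO_WIDTH and HI_WIDTH are hypothetical, and the sketch assumes the value returned by the inner function is used directly as the relative line width; the text here does not say whether limits is still applied when scale_func is given.

```python
import math

# Hypothetical width bounds, chosen to mirror the default limits=(1, 5).
LO_WIDTH, HI_WIDTH = 1.0, 5.0


def linear_scale(minval, maxval):
    """Factory returning a linear map from [minval, maxval] to the width bounds."""
    span = maxval - minval

    def scale(v):
        if span == 0:
            return LO_WIDTH
        return LO_WIDTH + (v - minval) / span * (HI_WIDTH - LO_WIDTH)
    return scale


def log_scale(minval, maxval):
    """Factory returning a base-10 logarithmic map; assumes minval > 0."""
    log_min, log_max = math.log10(minval), math.log10(maxval)

    def scale(v):
        if log_max == log_min:
            return LO_WIDTH
        frac = (math.log10(v) - log_min) / (log_max - log_min)
        return LO_WIDTH + frac * (HI_WIDTH - LO_WIDTH)
    return scale


# Made-up values spanning two orders of magnitude.
lin, log = linear_scale(10, 1000), log_scale(10, 1000)
for v in (10, 100, 1000):
    print(v, round(lin(v), 2), round(log(v), 2))
```

A logarithmic factory like this gives small flows more visible width than the linear one does, which can help when a few very large values dominate the data. Either function could be passed as scale_func=log_scale in the same way as trivial_scale.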
hue and scale can co-exist. In case more than one visual variable is used, control which one appears in
the legend using legend_var.
ax = gplt.sankey(network, projection=gcrs.PlateCarree(),
                 start='from', end='to',
                 scale='mock_data',
                 legend=True, legend_kwargs={'bbox_to_anchor': (1.1, 1.0)},
                 hue='mock_data', legend_var='hue')
ax.set_global()
ax.coastlines()