hypertools.tools.describe_pca

hypertools.tools.describe_pca(x, show=True)

Create a plot describing covariance as a function of the number of dimensions

This function correlates the raw data with PCA-reduced data to get a sense of how well the data can be summarized with n dimensions. It is useful for evaluating the quality of PCA-reduced plots.
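
The idea described above can be illustrated with a minimal NumPy sketch. It is not the hypertools implementation: it assumes that "correlating" raw and reduced data means correlating the pairwise Euclidean distances between observations before and after reduction, and the helper names (pca_reduce, pairwise_distances, correlation_by_dims) are hypothetical.

```python
import numpy as np


def pca_reduce(x, ndims):
    """Project x (samples x features) onto its first ndims principal axes."""
    centered = x - x.mean(axis=0)
    # Rows of vt are the principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:ndims].T


def pairwise_distances(x):
    """Return the Euclidean distance for every unordered pair of rows."""
    diffs = x[:, None, :] - x[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    rows, cols = np.triu_indices(len(x), k=1)
    return dists[rows, cols]


def correlation_by_dims(x, max_dims=None):
    """Correlate raw-data distances with distances after reducing to 1..max_dims."""
    x = np.asarray(x, dtype=float)
    if max_dims is None:
        max_dims = min(x.shape)
    raw = pairwise_distances(x)
    return [
        float(np.corrcoef(raw, pairwise_distances(pca_reduce(x, n)))[0, 1])
        for n in range(1, max_dims + 1)
    ]


# Synthetic data (an assumption for illustration): 50 samples, 5 features
# with decreasing variance, so the first few components carry most structure.
rng = np.random.default_rng(0)
data = rng.normal(size=(50, 5)) * np.array([5.0, 3.0, 1.0, 0.5, 0.1])
curve = correlation_by_dims(data)
```

Here curve holds one correlation value per number of retained dimensions. A curve that approaches 1 after only a few dimensions suggests that a low-dimensional plot preserves most of the structure in the raw data.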

Parameters:

x : Numpy array, DataFrame or list of arrays/dfs

A NumPy array, a Pandas DataFrame, or a list of NumPy arrays or Pandas DataFrames

show : bool

If True (the default), fig, ax, and attr are returned. If False, only attr is returned.

Returns:

fig, ax, attr : matplotlib.Figure, matplotlib.Axes, dict

By default, a matplotlib figure, an axis handle, and a data dictionary are returned. The dictionary contains two entries, PCA_summary : dict and average : list. average is a list of the average (over input lists) correlation between the raw data and the dimensionality-reduced data. Its length is determined by the number of components that explain the most data. Note: the length is typically shorter than the number of features because the PCA model is whitened. If show=False, only attr is returned.