autots.tools package¶
Submodules¶
autots.tools.cpu_count module¶
CPU counter for multiprocessing.
-
autots.tools.cpu_count.
cpu_count
(modifier: float = 1)¶ Find available CPU count, running on both Windows/Linux.
- Attempts to be very conservative:
Remove Intel Hyperthreading logical cores
Find max cores allowed to the process, if less than machine has total
Runs best with psutil installed; falls back to mkl, then os core count/2
- Parameters:
modifier (float) – multiply CPU count by this value
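A minimal sketch of the conservative counting idea, using only the standard library. It is not the autots implementation: the name conservative_cpu_count is hypothetical, and halving the logical core count stands in for the psutil and mkl lookups described above.

```python
import os


def conservative_cpu_count(modifier: float = 1.0) -> int:
    """Estimate usable cores; hypothetical helper, not the autots function."""
    logical = os.cpu_count() or 1
    # Assumption: half of the logical cores are hyperthreads ("os core count/2").
    physical_guess = max(1, logical // 2)
    # Respect a process affinity limit where the platform exposes one (e.g. Linux).
    if hasattr(os, "sched_getaffinity"):
        allowed = len(os.sched_getaffinity(0))
    else:
        allowed = logical
    count = min(physical_guess, allowed)
    # Scale by the modifier, but never report fewer than one core.
    return max(1, int(count * modifier))


print(conservative_cpu_count())
```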
autots.tools.hierarchial module¶
-
class
autots.tools.hierarchial.
hierarchial
(grouping_method: str = 'tile', n_groups: int = 5, reconciliation: str = 'mean', grouping_ids: dict = None)¶ Bases:
object
Create hierarchical series, then reconcile.
Currently only performs one-level groupings.
- Parameters:
grouping_method (str) – method to create groups. ‘User’ requires hier_id input of groupings.
n_groups (int) – number of groups, if grouping_method is not ‘User’
reconciliation (str) – None, or ‘mean’ method to combine top and bottom forecasts.
grouping_ids (dict) – dict of series_id: group_id to use if grouping is ‘User’
-
fit
(df)¶ Construct and save object info.
-
reconcile
(df)¶ Apply to forecasted data containing bottom and top levels.
-
transform
(df)¶ Apply hierarchy to existing data with bottom levels only.
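To illustrate a one-level grouping with user-supplied ids, the sketch below builds one top-level series per group. It assumes that a top-level series is the sum of its members, which the text above does not state; group_totals is a hypothetical helper and the reconciliation step is not shown.

```python
def group_totals(bottom, grouping_ids):
    """Sum bottom-level series into one top-level series per group (assumed rule)."""
    totals = {}
    for series_id, values in bottom.items():
        group = grouping_ids[series_id]
        if group not in totals:
            totals[group] = list(values)
        else:
            totals[group] = [a + b for a, b in zip(totals[group], values)]
    return totals


bottom = {"store_a": [1, 2, 3], "store_b": [10, 20, 30], "store_c": [5, 5, 5]}
grouping_ids = {"store_a": "east", "store_b": "east", "store_c": "west"}
print(group_totals(bottom, grouping_ids))
```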
-
autots.tools.holiday module¶
Manage holiday features.
-
autots.tools.holiday.
holiday_flag
(DTindex, country: str = 'US', encode_holiday_type: bool = False)¶ Create a 0/1 flag for given datetime index.
- Parameters:
DTindex (pandas.DatetimeIndex) – DatetimeIndex of dates to create flags
country (str) – to pass through to python package Holidays
encode_holiday_type (bool) – if True, each holiday gets a unique integer, if False, 0/1 for all holidays
- Returns:
pandas.Series() with DatetimeIndex and name ‘HolidayFlag’
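The difference between the two encodings can be sketched without the holidays package. The holiday table and the name holiday_flag_sketch are hypothetical, and the integer codes chosen here (alphabetical order, starting at 1) are an assumption rather than the library's scheme.

```python
from datetime import date

# Hypothetical fixed holiday table; the real function queries the holidays package.
HOLIDAYS = {
    date(2024, 1, 1): "New Year's Day",
    date(2024, 7, 4): "Independence Day",
}


def holiday_flag_sketch(dates, encode_holiday_type=False):
    # Give each distinct holiday name a stable integer code starting at 1.
    codes = {name: i for i, name in enumerate(sorted(set(HOLIDAYS.values())), start=1)}
    flags = []
    for d in dates:
        name = HOLIDAYS.get(d)
        if name is None:
            flags.append(0)  # not a holiday
        else:
            flags.append(codes[name] if encode_holiday_type else 1)
    return flags


days = [date(2024, 1, 1), date(2024, 1, 2), date(2024, 7, 4)]
print(holiday_flag_sketch(days))                            # [1, 0, 1]
print(holiday_flag_sketch(days, encode_holiday_type=True))  # [2, 0, 1]
```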
-
autots.tools.holiday.
query_holidays
(DTindex, country: str, encode_holiday_type: bool = False)¶ Query holidays package for dates.
- Parameters:
DTindex (pandas.DatetimeIndex) – DatetimeIndex of dates to create flags
country (str) – to pass through to python package Holidays
encode_holiday_type (bool) – if True, each holiday gets a unique integer, if False, 0/1 for all holidays
autots.tools.impute module¶
Fill NA.
-
autots.tools.impute.
FillNA
(df, method: str = 'ffill', window: int = 10)¶ Fill NA values using different methods.
- Parameters:
method (str) – fill method, one of:
‘ffill’ - fill most recent non-na value forward until another non-na value is reached
‘zero’ - fill with zero. Useful for sales and other data where NA usually means $0.
‘mean’ - fill all missing values with the series’ overall average value
‘median’ - fill all missing values with the series’ overall median value
‘rolling mean’ - fill with mean of last n (window) values
‘ffill mean biased’ - simple avg of ffill and mean
‘fake date’ - shifts forward data over nan, thus values will have incorrect timestamps
also most method values of pd.DataFrame.interpolate()
window (int) – length of rolling windows for filling na, for rolling methods
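A few of these fill methods can be sketched on a plain list of floats. These helpers are illustrative stand-ins written for this note, not the autots functions, and they ignore DataFrame handling entirely.

```python
import math


def ffill(values):
    """Carry the most recent non-NaN value forward; leading NaN stay NaN."""
    out, last = [], math.nan
    for v in values:
        if not math.isnan(v):
            last = v
        out.append(last)
    return out


def fill_mean(values):
    """Replace NaN with the mean of the observed values."""
    valid = [v for v in values if not math.isnan(v)]
    mean = sum(valid) / len(valid)
    return [mean if math.isnan(v) else v for v in values]


def ffill_mean_biased(values):
    """Simple average of the forward fill and the mean fill."""
    return [(a + b) / 2 for a, b in zip(ffill(values), fill_mean(values))]


series = [2.0, math.nan, 4.0, math.nan]
print(ffill(series))              # [2.0, 2.0, 4.0, 4.0]
print(fill_mean(series))          # [2.0, 3.0, 4.0, 3.0]
print(ffill_mean_biased(series))  # [2.0, 2.5, 4.0, 3.5]
```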
-
autots.tools.impute.
biased_ffill
(df, mean_weight: float = 1)¶ Fill NaN with average of last value and mean.
-
autots.tools.impute.
fake_date_fill
(df, back_method: str = 'slice')¶ Numpy vectorized version. Return a dataframe where na values are removed and values shifted forward.
Warning
Thus, values will have incorrect timestamps!
- Parameters:
back_method (str) – how to deal with tails left by shifting NaN:
‘bfill’ - back fill the last value
‘slice’ - drop any rows above threshold where half are nan, then bfill remainder
‘slice_all’ - drop any rows with any na
‘keepna’ - keep the lagging na
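The shifting step can be pictured on a single series. This sketch assumes the observed values are packed toward the most recent end, so the leftover NaN sit at the start (roughly the ‘keepna’ outcome); fake_date_shift is a hypothetical name and the back_method options are not implemented.

```python
import math


def fake_date_shift(values):
    """Drop NaN and pack the remaining values against the end of the series."""
    valid = [v for v in values if not math.isnan(v)]
    padding = [math.nan] * (len(values) - len(valid))
    # Values keep their order but no longer sit on their true timestamps.
    return padding + valid


shifted = fake_date_shift([1.0, math.nan, 2.0, math.nan, 3.0])
print(shifted)  # two NaN followed by 1.0, 2.0, 3.0
```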
-
autots.tools.impute.
fake_date_fill_old
(df, back_method: str = 'slice')¶ Return a dataframe where na values are removed and values shifted forward.
Warning
Thus, values will have incorrect timestamps!
- Parameters:
back_method (str) – how to deal with tails left by shifting NaN:
‘bfill’ - back fill the last value
‘slice’ - drop any rows above threshold where half are nan, then bfill remainder
‘slice_all’ - drop any rows with any na
‘keepna’ - keep the lagging na
-
autots.tools.impute.
fill_forward
(df)¶ Fill NaN with previous values.
-
autots.tools.impute.
fill_forward_alt
(df)¶ Fill NaN with previous values.
-
autots.tools.impute.
fill_mean
(df)¶
-
autots.tools.impute.
fill_mean_old
(df)¶ Fill NaN with mean.
-
autots.tools.impute.
fill_median
(df)¶ Fill nan with median values. Does not work with non-numeric types.
-
autots.tools.impute.
fill_median_old
(df)¶ Fill NaN with median.
-
autots.tools.impute.
fill_zero
(df)¶ Fill NaN with zero.
-
autots.tools.impute.
fillna_np
(array, values)¶
-
autots.tools.impute.
rolling_mean
(df, window: int = 10)¶ Fill NaN with mean of last window values.
autots.tools.percentile module¶
Faster percentile and quantile for numpy
Entirely from: https://krstn.eu/np.nanpercentile()-there-has-to-be-a-faster-way/
-
autots.tools.percentile.
nan_percentile
(in_arr, q, method='linear', axis=0, errors='raise')¶ Given a 3D array, return the given percentiles as input by q. Beware this is only tested for the limited case required here, and will not match np fully. Args more limited. If errors=”rollover” passes to np.nanpercentile where args are not supported.
-
autots.tools.percentile.
nan_quantile
(arr, q, method='linear', axis=0, errors='raise')¶ Same as nan_percentile but accepts q in range [0, 1]. Args more limited. If errors=”rollover” passes to np.nanpercentile where not supported.
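The relationship between the two functions can be shown on a one-dimensional list. This is a plain-Python sketch of a NaN-skipping, linearly interpolated percentile, not the vectorized 3D routine documented here; both helper names are hypothetical.

```python
import math


def nan_percentile_1d(values, q):
    """Linear-interpolated percentile (q in [0, 100]) that skips NaN."""
    valid = sorted(v for v in values if not math.isnan(v))
    position = q / 100 * (len(valid) - 1)
    lower, upper = math.floor(position), math.ceil(position)
    weight = position - lower
    return valid[lower] * (1 - weight) + valid[upper] * weight


def nan_quantile_1d(values, q):
    """Same calculation with q given in [0, 1]."""
    return nan_percentile_1d(values, q * 100)


data = [1.0, math.nan, 3.0, 5.0]
print(nan_percentile_1d(data, 50))  # 3.0
print(nan_quantile_1d(data, 0.25))  # 2.0
```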
autots.tools.probabilistic module¶
Point to Probabilistic
-
autots.tools.probabilistic.
Point_to_Probability
(train, forecast, prediction_interval=0.9, method: str = 'historic_quantile')¶ Data driven placeholder for model error estimation.
Catlin Point to Probability method (‘a mixture of dark magic and gum disease’)
- Parameters:
train (pandas.DataFrame) – DataFrame of time series where index is DatetimeIndex
forecast (pandas.DataFrame) – DataFrame of forecast time series in which the index is a DatetimeIndex and columns/series aligned with train. Forecast must be > 1 in length.
prediction_interval (float) – confidence or perhaps credible interval
method (str) – spell to cast to create dark magic. ‘historic_quantile’, ‘inferred_normal’, ‘variable_pct_change’ gum disease available separately upon request.
- Returns:
upper_error, lower_error (two pandas.DataFrames for upper and lower bound respectively)
-
autots.tools.probabilistic.
Variable_Point_to_Probability
(train, forecast, alpha=0.3, beta=1)¶ Data driven placeholder for model error estimation.
ErrorRange = beta * (En + alpha * En-1 [cum sum of En])
En = abs(0.5 - QTP) * D
D = abs(Xn - ((Avg % Change of Train * Xn-1) + Xn-1))
Xn = Forecast Value
QTP = Percentile of Score in All Percent Changes of Train
Score = Percent Change (from Xn-1 to Xn)
- Parameters:
train (pandas.DataFrame) – DataFrame of time series where index is DatetimeIndex
forecast (pandas.DataFrame) – DataFrame of forecast time series in which the index is a DatetimeIndex and columns/series aligned with train. Forecast must be > 1 in length.
alpha (float) – parameter which affects the broadening of the error range over time. Usually 0 < alpha < 1 (although it can be larger than 1)
beta (float) – parameter which affects the general width of the error bar. Usually 0 < beta < 1 (although it can be larger than 1)
- Returns:
error width for each value of forecast.
- Return type:
ErrorRange (pandas.DataFrame)
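One possible reading of the ErrorRange formula, for a single series held in plain lists. Several choices here are assumptions, not confirmed by the text: "En-1 [cum sum of En]" is taken as the running sum of earlier E values, the first forecast is compared with the last training value, and QTP is the fraction of training percent changes at or below the score. variable_error_range is a hypothetical name.

```python
from bisect import bisect_right


def pct_changes(values):
    """Percent change between consecutive values (values must be non-zero)."""
    return [(b - a) / a for a, b in zip(values, values[1:])]


def variable_error_range(train, forecast, alpha=0.3, beta=1.0):
    changes = sorted(pct_changes(train))
    avg_change = sum(changes) / len(changes)
    errors = []
    cumulative = 0.0      # running sum of earlier E values (assumed meaning)
    previous = train[-1]  # X(n-1) for the first forecast step (assumed)
    for x in forecast:
        score = (x - previous) / previous
        qtp = bisect_right(changes, score) / len(changes)  # percentile in [0, 1]
        d = abs(x - (avg_change * previous + previous))
        e = abs(0.5 - qtp) * d
        errors.append(beta * (e + alpha * cumulative))
        cumulative += e
        previous = x
    return errors


print(variable_error_range([100.0, 110.0, 121.0], [133.1, 150.0]))
```

In this reading a forecast step that simply continues the average training growth gets an error width near zero, and the width grows as steps depart from it.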
-
autots.tools.probabilistic.
historic_quantile
(df_train, prediction_interval: float = 0.9, nan_flag=None)¶ Computes the difference between the median and the prediction interval range in historic data.
- Parameters:
df_train (pd.DataFrame) – a dataframe of training data
prediction_interval (float) – the desired forecast interval range
- Returns:
two 1D arrays
- Return type:
lower, upper (np.array)
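For one series, the idea can be sketched as the distance from the median to the lower and upper quantiles that bound the prediction interval. The helper names are hypothetical, the interval is assumed to be centered (equal tails), and NaN handling is omitted.

```python
import math
import statistics


def quantile(sorted_values, q):
    """Linear-interpolated quantile of an already sorted list, q in [0, 1]."""
    position = q * (len(sorted_values) - 1)
    lower, upper = math.floor(position), math.ceil(position)
    weight = position - lower
    return sorted_values[lower] * (1 - weight) + sorted_values[upper] * weight


def historic_quantile_sketch(values, prediction_interval=0.9):
    ordered = sorted(values)
    median = statistics.median(ordered)
    tail = (1 - prediction_interval) / 2  # assumed: equal mass in each tail
    lower = median - quantile(ordered, tail)
    upper = quantile(ordered, 1 - tail) - median
    return lower, upper


lower, upper = historic_quantile_sketch(range(101), prediction_interval=0.9)
print(lower, upper)  # both close to 45
```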
-
autots.tools.probabilistic.
inferred_normal
(train, forecast, n: int = 5, prediction_interval: float = 0.9)¶ A corruption of Bayes’ theorem. It will be sensitive to the transformations of the data.
-
autots.tools.probabilistic.
percentileofscore_appliable
(x, a, kind='rank')¶
autots.tools.profile module¶
Profiling
-
autots.tools.profile.
data_profile
(df)¶ Input: a pd DataFrame of columns which are time series, and a datetime index
Output: a pd DataFrame of column per time series, with rows which are statistics
autots.tools.regressor module¶
-
autots.tools.regressor.
create_lagged_regressor
(df, forecast_length: int, frequency: str = 'infer', scale: bool = True, summarize: str = None, backfill: str = 'bfill', n_jobs: str = 'auto', fill_na: str = 'ffill')¶ Create a regressor of features lagged by forecast length. Useful to some models that don’t otherwise use such information.
It is recommended that the .head(forecast_length) of both regressor_train and the df for training are dropped. df = df.iloc[forecast_length:]
- Parameters:
df (pd.DataFrame) – training data
forecast_length (int) – length of forecasts, to shift data by
frequency (str) – the ever necessary frequency for datetime things. Default ‘infer’
scale (bool) – if True, use the StandardScaler to standardize the features
summarize (str) – options to summarize the features, if large: ‘pca’, ‘median’, ‘mean’, ‘mean+std’, ‘feature_agglomeration’, ‘gaussian_random_projection’, “auto”
backfill (str) – method to deal with the NaNs created by shifting:
“bfill” - backfill with last values
“ETS” - backfill with ETS backwards forecast
“DatepartRegression” - backfill with DatepartRegression
fill_na (str) – method to prefill NAs in data, same methods as available elsewhere
- Returns:
regressor_train, regressor_forecast
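The lagging step can be sketched on a single list. This covers only the shift and a ‘bfill’-style back fill; scaling, summarizing and frequency handling are left out, and lagged_regressor is a hypothetical name.

```python
def lagged_regressor(values, forecast_length):
    """Shift a series forward by forecast_length and back fill the gap.

    Assumes 1 <= forecast_length < len(values).
    """
    # Values known at time t become the feature for time t + forecast_length.
    shifted = values[:-forecast_length]
    # 'bfill': the first shifted value fills the rows that have no history yet.
    regressor_train = [shifted[0]] * forecast_length + shifted
    # The last forecast_length observations line up with the forecast horizon.
    regressor_forecast = values[-forecast_length:]
    return regressor_train, regressor_forecast


train, future = lagged_regressor([1, 2, 3, 4, 5, 6], forecast_length=2)
print(train)   # [1, 1, 1, 2, 3, 4]
print(future)  # [5, 6]
```

The first forecast_length rows of the training regressor hold only back-filled values, which is why dropping them is recommended.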
-
autots.tools.regressor.
create_regressor
(df, forecast_length, frequency: str = 'infer', holiday_countries: list = ['US'], datepart_method: str = 'recurring', drop_most_recent: int = 0, scale: bool = True, summarize: str = 'auto', backfill: str = 'bfill', n_jobs: str = 'auto', fill_na: str = 'ffill', aggfunc: str = 'first')¶ Create a regressor from information available in the existing dataset. Components are lagged data, datepart information, and holidays.
All of this info and more is already created by the ~Regression models, but this may help some other models (GLM, WindowRegression)
It is recommended that the .head(forecast_length) of both regressor_train and the df for training are dropped. df = df.iloc[forecast_length:]
If you don’t want the lagged features, set summarize=”median”, which will only give one column of such, which can then be easily dropped.
- Parameters:
df (pd.DataFrame) – WIDE style dataframe (use long_to_wide if the data isn’t already). Categorical features will be discarded for this, if present
forecast_length (int) – time ahead that will be forecast
frequency (str) – those annoying offset codes you have to always use for time series
holiday_countries (list) – list of countries to pull holidays for. Reqs holidays pkg
datepart_method (str) – see date_part from seasonal
scale (bool) – if True, use the StandardScaler to standardize the features
summarize (str) – options to summarize the features, if large: ‘pca’, ‘median’, ‘mean’, ‘mean+std’, ‘feature_agglomeration’, ‘gaussian_random_projection’
backfill (str) – method to deal with the NaNs created by shifting:
“bfill” - backfill with last values
“ETS” - backfill with ETS backwards forecast
“DatepartRegression” - backfill with DatepartRegression
fill_na (str) – method to prefill NAs in data, same methods as available elsewhere
aggfunc (str) – str or func, used if frequency is resampled
- Returns:
regressor_train, regressor_forecast
autots.tools.seasonal module¶
seasonal
@author: Colin
-
autots.tools.seasonal.
date_part
(DTindex, method: str = 'simple', set_index: bool = True, polynomial_degree: int = None)¶ Create date part columns from pd.DatetimeIndex.
- Parameters:
DTindex (pd.DatetimeIndex) – datetime index to provide dates
method (str) – expanded, recurring, or simple:
simple - just day, year, month, weekday
expanded - all available features
recurring - all features that should commonly repeat without aging
set_index (bool) – if True, return DTindex as index of df
polynomial_degree (int) – add this degree of sklearn polynomial features if not None
- Returns:
pd.DataFrame with DTindex
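The ‘simple’ method can be pictured with the standard library alone. This sketch returns plain dictionaries rather than a DataFrame; the column names and simple_date_part are assumptions made for illustration.

```python
from datetime import date


def simple_date_part(dates):
    """Return one row of day, year, month and weekday per date."""
    return [
        {"day": d.day, "year": d.year, "month": d.month, "weekday": d.weekday()}
        for d in dates
    ]


rows = simple_date_part([date(2024, 1, 15), date(2024, 2, 29)])
print(rows[0])  # {'day': 15, 'year': 2024, 'month': 1, 'weekday': 0}
```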
-
autots.tools.seasonal.
seasonal_int
(include_one: bool = False, small=False)¶ Generate a random integer of typical seasonalities.
autots.tools.shaping module¶
Reshape data.
-
class
autots.tools.shaping.
NumericTransformer
(na_strings: list = ['', ' '], categorical_fillna: str = 'ffill', handle_unknown: str = 'use_encoded_value', verbose: int = 0)¶ Bases:
object
General purpose numeric conversion for pandas dataframes.
All categorical data and levels must be passed to .fit(). If new categorical series or levels are present in .transform() it won’t work!
Currently datetimes cannot be inverse_transformed back to datetime
- Parameters:
na_strings (list) – list of strings to replace as pd.NA
categorical_fillna (str) – how to fill NaN for categorical variables (numeric NaN are unaltered):
“ffill” - uses forward and backward filling to supply na values
“indicator” or anything else currently results in all missing replaced with str “missing_value”
handle_unknown (str) – passed through to scikit-learn OrdinalEncoder
verbose (int) – greater than 0 to print some messages
-
fit
(df)¶ Learn behavior of data to change.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
fit_transform
(df)¶ Fits and Returns Magical DataFrame.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
inverse_transform
(df, convert_dtypes: bool = False)¶ Convert numeric back to categorical.
- Parameters:
df (pandas.DataFrame) – df
convert_dtypes (bool) – whether to use pd.convert_dtypes after inverse
-
transform
(df)¶ Convert categorical dataset to numeric.
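Why every level must be present at fit time becomes clear from a small ordinal encoder. OrdinalSketch is a hypothetical stand-in written for this note; the class documented here passes the work to scikit-learn's OrdinalEncoder and handles whole DataFrames.

```python
class OrdinalSketch:
    """Hypothetical stand-in for the categorical-to-numeric step."""

    def fit(self, values):
        # Every level must be seen here; unseen levels fail in transform().
        self.levels = sorted(set(values))
        self.codes = {level: i for i, level in enumerate(self.levels)}
        return self

    def transform(self, values):
        return [self.codes[v] for v in values]  # KeyError for unknown levels

    def inverse_transform(self, codes):
        return [self.levels[int(c)] for c in codes]


enc = OrdinalSketch().fit(["red", "blue", "red", "green"])
print(enc.transform(["green", "red"]))  # [1, 2]
print(enc.inverse_transform([0, 2]))    # ['blue', 'red']
```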
-
autots.tools.shaping.
clean_weights
(weights, series, verbose=0)¶ Polish up series weighting information
- Parameters:
weights (dict) – dictionary of series_id: weight (float or int)
series (iterable) – list of series_ids in the dataset
-
autots.tools.shaping.
df_cleanup
(df_wide, frequency: str = 'infer', prefill_na: str = None, na_tolerance: float = 0.999, drop_data_older_than_periods: int = 100000, drop_most_recent: int = 0, aggfunc: str = 'first', verbose: int = 1)¶ Pass cleaning functions through to dataframe.
- Parameters:
df_wide (pd.DataFrame) – input dataframe to clean.
frequency (str, optional) – frequency in string of alias for DateOffset object, normally “1D” -daily, “MS” -month start etc. Currently, aliases are listed somewhere in here: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html. Defaults to “infer”.
na_tolerance (float, optional) – allow up to this percent of values to be NaN, else drop the entire series. For example, 0.95 means a series can be 95% NaN values and still be included. Defaults to 0.999.
drop_data_older_than_periods (int, optional) – cut off older data because eventually you just get too much. Defaults to 100000.
drop_most_recent (int, optional) – number of most recent data points to remove. Useful if you pull monthly data before month end, and you don’t want an incomplete month appearing complete. Defaults to 0.
aggfunc (str, optional) – passed to pd.pivot_table, determines how to aggregate duplicates for upsampling. Other options include “mean” and other numpy functions, beware data must already be input as numeric type for these to work. If categorical data is provided, aggfunc=’first’ is recommended. Defaults to ‘first’.
verbose (int, optional) – 0 for silence, higher values for more noise. Defaults to 1.
- Returns:
original dataframe, now possibly shorter.
- Return type:
pd.DataFrame
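The na_tolerance rule can be sketched on a dictionary of series. Treating the threshold as inclusive is an assumption, and drop_sparse_series is a hypothetical helper that covers none of the other cleaning steps.

```python
import math


def drop_sparse_series(wide, na_tolerance=0.999):
    """Keep only series whose share of NaN values is within na_tolerance."""
    kept = {}
    for name, values in wide.items():
        nan_share = sum(math.isnan(v) for v in values) / len(values)
        if nan_share <= na_tolerance:  # assumed inclusive comparison
            kept[name] = values
    return kept


wide = {
    "mostly_missing": [1.0, math.nan, math.nan, math.nan],
    "mostly_present": [1.0, 2.0, 3.0, math.nan],
}
print(list(drop_sparse_series(wide, na_tolerance=0.5)))  # ['mostly_present']
```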
-
autots.tools.shaping.
infer_frequency
(df_wide, warn=True, **kwargs)¶ Infer the frequency in a slightly more robust way.
- Parameters:
df_wide (pd.DataFrame or pd.DatetimeIndex) – input to pull frequency from
warn (bool) – unused, here to make swappable with pd.infer_freq
-
autots.tools.shaping.
long_to_wide
(df, date_col: str = 'datetime', value_col: str = 'value', id_col: str = 'series_id', aggfunc: str = 'first')¶ Take long data and convert into wide, cleaner data.
- Parameters:
df (pd.DataFrame) –
date_col (str) –
value_col (str) – the name of the column with the values of the time series (i.e. sales $)
id_col (str) – name of the id column, unique for each time series
aggfunc (str) – passed to pd.pivot_table, determines how to aggregate duplicates for series_id and datetime. Other options include “mean” and other numpy functions; beware that data must already be input as numeric type for these to work. If categorical data is provided, aggfunc=’first’ is recommended
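The reshaping can be illustrated with plain dictionaries. This sketch mimics aggfunc=’first’ by keeping the first value seen for each (datetime, series_id) pair; long_to_wide_sketch is a hypothetical name and the real function works on DataFrames through pd.pivot_table.

```python
def long_to_wide_sketch(rows):
    """Pivot long rows into {datetime: {series_id: value}}; first duplicate wins."""
    dates = sorted({r["datetime"] for r in rows})
    ids = sorted({r["series_id"] for r in rows})
    wide = {d: {i: None for i in ids} for d in dates}
    for r in rows:
        cell = wide[r["datetime"]]
        if cell[r["series_id"]] is None:  # aggfunc='first'
            cell[r["series_id"]] = r["value"]
    return wide


long_rows = [
    {"datetime": "2024-01-01", "series_id": "a", "value": 1},
    {"datetime": "2024-01-01", "series_id": "b", "value": 10},
    {"datetime": "2024-01-01", "series_id": "a", "value": 99},  # duplicate, ignored
    {"datetime": "2024-01-02", "series_id": "a", "value": 2},
]
print(long_to_wide_sketch(long_rows))
```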
-
autots.tools.shaping.
simple_train_test_split
(df, forecast_length: int = 10, min_allowed_train_percent: float = 0.3, verbose: int = 1)¶ Uses the last periods of forecast_length as the test set, the rest as train
- Parameters:
forecast_length (int) – number of future periods to predict
min_allowed_train_percent (float) – forecast length cannot be greater than 1 - this. Constrains the forecast length from being much larger than the training data. Note this includes NaNs in current configuration
- Returns:
train, test (both pd DataFrames)
-
autots.tools.shaping.
subset_series
(df, weights, n: int = 1000, random_state: int = 2020)¶ Return a sample of time series.
- Parameters:
df (pd.DataFrame) – wide df with series as columns and DT index
n (int) – number of unique time series to keep, or None
random_state (int) – random seed
autots.tools.transform module¶
Preprocessing data methods.
-
class
autots.tools.transform.
CenterLastValue
(rows: int = 1, **kwargs)¶ Bases:
autots.tools.transform.EmptyTransformer
Scale all data relative to the last value(s) of the series.
- Parameters:
rows (int) – number of rows to average from most recent data
-
fit
(df)¶ Learn behavior of data to change.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
fit_transform
(df)¶ Fits and Returns Magical DataFrame.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
static
get_new_params
(method: str = 'random')¶ Generate new random parameters
-
inverse_transform
(df, trans_method: str = 'forecast')¶ Return data to original or forecast form.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
transform
(df)¶ Return changed data.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
class
autots.tools.transform.
ClipOutliers
(method: str = 'clip', std_threshold: float = 4, fillna: str = None, **kwargs)¶ Bases:
autots.tools.transform.EmptyTransformer
PURGE THE OUTLIERS.
- Parameters:
method (str) – “clip” or “remove”
std_threshold (float) – number of std devs from mean to call an outlier
fillna (str) – fillna method to use per tools.impute.FillNA
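The ‘clip’ method can be sketched for one series as limiting values to the mean plus or minus std_threshold standard deviations. Using the population standard deviation is an assumption, and clip_outliers is a hypothetical helper.

```python
import statistics


def clip_outliers(values, std_threshold=4.0):
    """Limit values to mean +/- std_threshold standard deviations."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # assumption: population standard deviation
    lower = mean - std_threshold * std
    upper = mean + std_threshold * std
    return [min(max(v, lower), upper) for v in values]


data = [10.0] * 20 + [1000.0]
clipped = clip_outliers(data, std_threshold=2)
print(max(clipped))  # well below the original spike of 1000.0
```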
-
fit
(df)¶ Learn behavior of data to change.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
fit_transform
(df)¶ Fits and Returns Magical DataFrame.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
static
get_new_params
(method: str = 'random')¶ Generate new random parameters
-
inverse_transform
(df, trans_method: str = 'forecast')¶ Return data to original or forecast form.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
transform
(df)¶ Return changed data.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
class
autots.tools.transform.
CumSumTransformer
(**kwargs)¶ Bases:
autots.tools.transform.EmptyTransformer
Cumulative Sum of Data.
Warning
Inverse transformed values may not exactly equal the originals due to floating point imprecision. inverse_transform can only be applied to the original series, or to an immediately following forecast
-
fit
(df)¶ Fits.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
fit_transform
(df)¶ Fits and Returns Magical DataFrame
- Parameters:
df (pandas.DataFrame) – input dataframe
-
inverse_transform
(df, trans_method: str = 'forecast')¶ Returns data to original or forecast form
- Parameters:
df (pandas.DataFrame) – input dataframe
trans_method (str) – whether to inverse on original data, or on a following sequence - ‘original’ return original data to original numbers - ‘forecast’ inverse the transform on a dataset immediately following the original
-
transform
(df)¶ Returns changed data
- Parameters:
df (pandas.DataFrame) – input dataframe
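The two trans_method modes can be pictured on plain lists: undoing a cumulative sum needs the value that came just before the data being inverted, which is zero for the original series and the last training total for a forecast. The helper names are hypothetical.

```python
from itertools import accumulate


def cumsum(values):
    return list(accumulate(values))


def inverse_cumsum(values, last_train_total=0.0):
    """Undo a cumulative sum; pass the final training total for forecasts."""
    previous = [last_train_total] + list(values[:-1])
    return [v - p for v, p in zip(values, previous)]


train = [1.0, 2.0, 3.0]
transformed = cumsum(train)                        # [1.0, 3.0, 6.0]
print(inverse_cumsum(transformed))                 # 'original': [1.0, 2.0, 3.0]
forecast = [10.0, 15.0]                            # forecast on the cumulative scale
print(inverse_cumsum(forecast, transformed[-1]))   # 'forecast': [4.0, 5.0]
```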
-
autots.tools.transform.
DatepartRegression
¶ alias of
autots.tools.transform.DatepartRegressionTransformer
-
class
autots.tools.transform.
DatepartRegressionTransformer
(regression_model: dict = {'model': 'DecisionTree', 'model_params': {'max_depth': 5, 'min_samples_split': 2}}, datepart_method: str = 'expanded', polynomial_degree: int = None, **kwargs)¶ Bases:
autots.tools.transform.EmptyTransformer
Remove a regression on datepart from the data. See tools.seasonal.date_part
-
fit
(df)¶ Fits trend for later detrending.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
fit_transform
(df)¶ Fit and Return Detrended DataFrame.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
static
get_new_params
(method: str = 'random')¶ Generate new random parameters
-
inverse_transform
(df)¶ Return data to original form.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
transform
(df)¶ Return detrended data.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
class
autots.tools.transform.
Detrend
(model: str = 'GLS', phi: float = 1.0, window: int = None, **kwargs)¶ Bases:
autots.tools.transform.EmptyTransformer
Remove a linear trend from the data.
-
fit
(df)¶ Fits trend for later detrending.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
fit_transform
(df)¶ Fit and Return Detrended DataFrame.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
static
get_new_params
(method: str = 'random')¶ Generate new random parameters
-
inverse_transform
(df)¶ Return data to original form. Will only match original if phi==1
- Parameters:
df (pandas.DataFrame) – input dataframe
-
transform
(df)¶ Return detrended data.
- Parameters:
df (pandas.DataFrame) – input dataframe
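The fit, transform and inverse cycle of detrending can be sketched with an ordinary least squares line against the row position. Note that this swaps in plain OLS for the default ‘GLS’ model and ignores phi and window; all helper names are hypothetical.

```python
def fit_line(values):
    """Ordinary least squares fit of values against their positions 0..n-1."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    covariance = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    variance = sum((x - x_mean) ** 2 for x in range(n))
    slope = covariance / variance
    return slope, y_mean - slope * x_mean


def detrend(values, slope, intercept):
    return [y - (intercept + slope * x) for x, y in enumerate(values)]


def retrend(residuals, slope, intercept, start=0):
    """Add the fitted line back; start is the position of the first row."""
    return [r + intercept + slope * (start + i) for i, r in enumerate(residuals)]


train = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(train)
residuals = detrend(train, slope, intercept)
print(retrend(residuals, slope, intercept))            # back to the training data
print(retrend([0.0, 0.0], slope, intercept, start=4))  # trend continued: [9.0, 11.0]
```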
-
class
autots.tools.transform.
DifferencedTransformer
(**kwargs)¶ Bases:
autots.tools.transform.EmptyTransformer
Difference from lag n value. inverse_transform can only be applied to the original series, or an immediately following forecast
- Parameters:
lag (int) – number of periods to shift (not implemented, default = 1)
-
fit
(df)¶ Fit.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
fit_transform
(df)¶ Fits and Returns Magical DataFrame
- Parameters:
df (pandas.DataFrame) – input dataframe
-
inverse_transform
(df, trans_method: str = 'forecast')¶ Returns data to original or forecast form
- Parameters:
df (pandas.DataFrame) – input dataframe
trans_method (str) – whether to inverse on original data, or on a following sequence - ‘original’ return original data to original numbers - ‘forecast’ inverse the transform on a dataset immediately following the original
-
transform
(df)¶ Return differenced data.
- Parameters:
df (pandas.DataFrame) – input dataframe
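Differencing with a lag of 1 and its inverse can be sketched on plain lists. The inverse needs a known starting level: the first training value for ‘original’, or the last training value for ‘forecast’. Dropping the first row is a simplification of this sketch, and the helper names are hypothetical.

```python
from itertools import accumulate


def difference(values):
    """First difference; the first element has no predecessor and is dropped."""
    return [b - a for a, b in zip(values, values[1:])]


def inverse_difference(diffs, start):
    """Rebuild levels by accumulating differences onto a known starting level."""
    return list(accumulate(diffs, initial=start))[1:]


train = [5.0, 7.0, 10.0]
diffs = difference(train)                                 # [2.0, 3.0]
print([train[0]] + inverse_difference(diffs, train[0]))   # 'original': [5.0, 7.0, 10.0]
print(inverse_difference([1.0, 1.0], train[-1]))          # 'forecast': [11.0, 12.0]
```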
-
class
autots.tools.transform.
Discretize
(discretization: str = 'center', n_bins: int = 10, nan_flag=False, **kwargs)¶ Bases:
autots.tools.transform.EmptyTransformer
Round/convert data to bins.
- Parameters:
discretization (str) – method of binning to apply:
None - no discretization
‘center’ - values are rounded to center value of each bin
‘lower’ - values are rounded to lower range of closest bin
‘upper’ - values are rounded up to upper edge of closest bin
‘sklearn-quantile’, ‘sklearn-uniform’, ‘sklearn-kmeans’ - sklearn kbins discretizer
n_bins (int) – number of bins to group data into.
nan_flag (bool) – set to True if this has to run on NaN values
-
fit
(df)¶ Learn behavior of data to change.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
fit_transform
(df)¶ Fits and Returns Magical DataFrame.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
static
get_new_params
(method: str = 'random')¶ Generate new random parameters
-
inverse_transform
(df, trans_method: str = 'forecast')¶ Return data to original or forecast form.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
transform
(df)¶ Return changed data.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
class
autots.tools.transform.
EWMAFilter
(span: int = 7, **kwargs)¶ Bases:
autots.tools.transform.EmptyTransformer
Irreversible filters of Exponential Weighted Moving Average
- Parameters:
span (int) – span of exponential period to convert to alpha
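The span-to-alpha conversion follows the usual definition alpha = 2 / (span + 1). The sketch below applies it with the simple recursive form of the average; whether the transformer uses this form or a bias-adjusted one is not stated here, so treat that as an assumption. ewma is a hypothetical helper.

```python
def ewma(values, span=7):
    """Exponentially weighted moving average, recursive (unadjusted) form."""
    alpha = 2 / (span + 1)  # standard span-to-alpha conversion
    smoothed = [values[0]]
    for v in values[1:]:
        smoothed.append(alpha * v + (1 - alpha) * smoothed[-1])
    return smoothed


print(ewma([0.0, 4.0, 4.0], span=7))  # [0.0, 1.0, 1.75]
```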
-
fit_transform
(df)¶ Fit and Return Detrended DataFrame.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
static
get_new_params
(method: str = 'random')¶ Generate new random parameters
-
transform
(df)¶ Return detrended data.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
class
autots.tools.transform.
EmptyTransformer
(name: str = 'EmptyTransformer', **kwargs)¶ Bases:
object
Base transformer returning raw data.
-
fit
(df)¶ Learn behavior of data to change.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
fit_transform
(df)¶ Fits and Returns Magical DataFrame.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
static
get_new_params
(method: str = 'random')¶ Generate new random parameters
-
inverse_transform
(df, trans_method: str = 'forecast')¶ Return data to original or forecast form.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
transform
(df)¶ Return changed data.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
class
autots.tools.transform.
FastICA
(**kwargs)¶ Bases:
autots.tools.transform.EmptyTransformer
sklearn FastICA for signal decomposition. But need to store columns.
- Parameters:
span (int) – span of exponential period to convert to alpha
-
fit
(df)¶ Learn behavior of data to change.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
fit_transform
(df)¶ Fits and Returns Magical DataFrame.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
static
get_new_params
(method: str = 'random')¶ Generate new random parameters
-
inverse_transform
(df, trans_method: str = 'forecast')¶ Return data to original or forecast form.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
transform
(df)¶ Return changed data.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
class
autots.tools.transform.
GeneralTransformer
(fillna: str = None, transformations: dict = {}, transformation_params: dict = {}, grouping: str = None, reconciliation: str = None, grouping_ids=None, random_seed: int = 2020)¶ Bases:
object
Fill NA values, then apply mathematical transformations.
Expects a chronologically sorted pandas.DataFrame with a DatetimeIndex, only numeric data, and a ‘wide’ (one column per series) shape.
Warning
inverse_transform will not fully return the original data under many conditions:
- the primary intention of inverse_transform is to inverse for forecast (immediately following the historical time period) data from models, not to return original data
- NAs filled will be returned with the filled value
- Discretization, statsmodels filters, Round, Slice, ClipOutliers cannot be inversed
- RollingMean, PctChange, CumSum, Seasonal Difference, and DifferencedTransformer will only return original or an immediately following forecast; by default ‘forecast’ is expected, ‘original’ can be set in trans_method
- Parameters:
fillna (str) – method to fill NA, passed through to FillNA():
‘ffill’ - fill most recent non-na value forward until another non-na value is reached
‘zero’ - fill with zero. Useful for sales and other data where NA usually means $0.
‘mean’ - fill all missing values with the series’ overall average value
‘median’ - fill all missing values with the series’ overall median value
‘rolling_mean’ - fill with mean of last n (window = 10) values
‘rolling_mean_24’ - fill with avg of last 24
‘ffill_mean_biased’ - simple avg of ffill and mean
‘fake_date’ - shifts forward data over nan, thus values will have incorrect timestamps
‘IterativeImputer’ - sklearn iterative imputer
most of the interpolate methods from pandas.interpolate
transformations (dict) – transformations to apply {0: “MinMaxScaler”, 1: “Detrend”, …}:
‘None’
‘MinMaxScaler’ - Sklearn MinMaxScaler
‘PowerTransformer’ - Sklearn PowerTransformer
‘QuantileTransformer’ - Sklearn
‘MaxAbsScaler’ - Sklearn
‘StandardScaler’ - Sklearn
‘RobustScaler’ - Sklearn
‘PCA’, ‘FastICA’ - performs sklearn decomposition and returns n-cols worth of n_components
‘Detrend’ - fit then remove a linear regression from the data
‘RollingMeanTransformer’ - 10 period rolling average, can receive a custom window by transformation_param if used as second_transformation
‘FixedRollingMean’ - same as RollingMean, but with inverse_transform disabled, so smoothed forecasts are maintained.
‘RollingMean10’ - 10 period rolling average (smoothing)
‘RollingMean100thN’ - Rolling mean of periods of len(train)/100 (minimum 2)
‘DifferencedTransformer’ - makes each value the difference of that value and the previous value
‘PctChangeTransformer’ - converts to pct_change, not recommended if lots of zeroes in data
‘SinTrend’ - removes a sin trend (fitted to each column) from the data
‘CumSumTransformer’ - makes value sum of all previous
‘PositiveShift’ - makes all values >= 1
‘Log’ - log transform (uses PositiveShift first as necessary)
‘IntermittentOccurrence’ - -1, 1 for non median values
‘SeasonalDifference’ - remove the last lag values from all values
‘SeasonalDifferenceMean’ - remove the average lag values from all
‘SeasonalDifference7’, ‘12’, ‘28’ - non-parameterized version of Seasonal
‘CenterLastValue’ - center data around tail of dataset
‘Round’ - round values on inverse or transform
‘Slice’ - use only recent records
‘ClipOutliers’ - remove outliers
‘Discretize’ - bin or round data into groups
‘DatepartRegression’ - remove a trend trained on datetime index
“ScipyFilter” - filter data (lose information but smoother!) from scipy
“HPFilter” - statsmodels hp_filter
“STLFilter” - seasonal decompose and keep just one part of decomposition
“EWMAFilter” - use an exponential weighted moving average to smooth data
transformation_params (dict) – params of transformers {0: {}, 1: {‘model’: ‘Poisson’}, …} pass through dictionary of empty dictionaries to utilize defaults
random_seed (int) – random state passed through where applicable
-
fill_na
(df, window: int = 10)¶
- Parameters:
df (pandas.DataFrame) – Datetime Indexed
window (int) – passed through to rolling mean fill technique
- Returns:
pandas.DataFrame
-
fit
(df)¶ Apply transformations and return transformer object.
- Parameters:
df (pandas.DataFrame) – Datetime Indexed
-
fit_transform
(df)¶ Directly fit and apply transformations to convert df.
-
inverse_transform
(df, trans_method: str = 'forecast', fillzero: bool = False)¶ Undo the madness.
- Parameters:
df (pandas.DataFrame) – Datetime Indexed
trans_method (str) – ‘forecast’ or ‘original’ passed through
fillzero (bool) – if inverse returns NaN, fill with zero
-
classmethod
retrieve_transformer
(transformation: str = None, param: dict = {}, df=None, random_seed: int = 2020)¶ Retrieves a specific transformer object from a string.
- Parameters:
df (pandas.DataFrame) – Datetime Indexed - required to set params for some transformers
transformation (str) – name of desired method
param (dict) – dict of kwargs to pass (legacy: an actual param)
- Returns:
transformer object
-
transform
(df)¶ Apply transformations to convert df.
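How a numbered dictionary of transformations might be applied and undone can be sketched with two toy transformers. Shift and Scale are hypothetical classes invented for this note, and inverting in reverse key order is an assumption that follows from how chained transformations compose, not a statement about the library's internals.

```python
class Shift:
    """Toy transformer: add a constant."""

    def __init__(self, amount):
        self.amount = amount

    def transform(self, values):
        return [v + self.amount for v in values]

    def inverse_transform(self, values):
        return [v - self.amount for v in values]


class Scale:
    """Toy transformer: multiply by a constant."""

    def __init__(self, factor):
        self.factor = factor

    def transform(self, values):
        return [v * self.factor for v in values]

    def inverse_transform(self, values):
        return [v / self.factor for v in values]


def apply_chain(values, steps):
    for key in sorted(steps):  # 0, 1, 2, ...
        values = steps[key].transform(values)
    return values


def invert_chain(values, steps):
    for key in sorted(steps, reverse=True):  # undo the last step first
        values = steps[key].inverse_transform(values)
    return values


steps = {0: Shift(1.0), 1: Scale(2.0)}
transformed = apply_chain([1.0, 2.0], steps)
print(transformed)                       # [4.0, 6.0]
print(invert_chain(transformed, steps))  # [1.0, 2.0]
```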
-
class
autots.tools.transform.
HPFilter
(part: str = 'trend', lamb: float = 1600, **kwargs)¶ Bases:
autots.tools.transform.EmptyTransformer
Irreversible filters.
- Parameters:
lamb (int) – lambda for hpfilter
-
fit_transform
(df)¶ Fit and Return Detrended DataFrame.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
static
get_new_params
(method: str = 'random')¶ Generate new random parameters
-
transform
(df)¶ Return detrended data.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
class
autots.tools.transform.
IntermittentOccurrence
(center: str = 'median', **kwargs)¶ Bases:
autots.tools.transform.EmptyTransformer
Intermittent inspired binning predicts probability of not center.
Does not inverse to original values!
- Parameters:
center (str) – one of “mean”, “median”, “midhinge”
-
fit
(df)¶ Fits shift interval.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
fit_transform
(df)¶ Fit and Return Detrended DataFrame.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
static
get_new_params
(method: str = 'random')¶ Generate new random parameters
-
inverse_transform
(df)¶ Return data to original form.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
transform
(df)¶ 0 if Median. 1 if > Median, -1 if less.
- Parameters:
df (pandas.DataFrame) – input dataframe
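The rule stated for transform (0 at the center, 1 above it, -1 below it) can be sketched with plain pandas and NumPy. This is only an illustration that assumes the default center of “median”; sign_around_median is a hypothetical helper and not part of autots.

```python
import numpy as np
import pandas as pd


def sign_around_median(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: -1, 0 or 1 relative to each column's median."""
    center = df.median()         # one median per column
    return np.sign(df - center)  # -1 below, 0 equal, 1 above


df = pd.DataFrame({"a": [0.0, 0.0, 5.0, 0.0, 2.0]})
flags = sign_around_median(df)
```

Because only the sign survives, the original magnitudes cannot be recovered, which is consistent with the note that this transformer does not inverse to original values.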
-
class
autots.tools.transform.
PCA
(**kwargs)¶ Bases:
autots.tools.transform.EmptyTransformer
sklearn PCA for signal decomposition. But need to store columns.
- Parameters:
span (int) – span of exponential period to convert to alpha
-
fit
(df)¶ Learn behavior of data to change.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
fit_transform
(df)¶ Fits and Returns Magical DataFrame.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
static
get_new_params
(method: str = 'random')¶ Generate new random parameters
-
inverse_transform
(df, trans_method: str = 'forecast')¶ Return data to original or forecast form.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
transform
(df)¶ Return changed data.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
class
autots.tools.transform.
PctChangeTransformer
(**kwargs)¶ Bases:
autots.tools.transform.EmptyTransformer
% Change of Data.
Warning
Because % change doesn’t play well with zeroes, zeroes are replaced by the positive (absolute) value of the lowest non-zero value. Inverse transformed values will also not be ‘exactly’ equal to the originals due to floating point imprecision. inverse_transform can only be applied to the original series, or to an immediately following forecast.
-
fit
(df)¶ Fits.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
fit_transform
(df)¶ Fit and Return Magical DataFrame.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
inverse_transform
(df, trans_method: str = 'forecast')¶ Returns data to original or forecast form
- Parameters:
df (pandas.DataFrame) – input dataframe
trans_method (str) – whether to inverse on original data, or on a following sequence - ‘original’ return original data to original numbers - ‘forecast’ inverse the transform on a dataset immediately following the original
-
transform
(df)¶ Returns changed data.
- Parameters:
df (pandas.DataFrame) – input dataframe
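The two trans_method modes can be illustrated with a small pandas sketch. It is not the autots implementation (the zero replacement described in the warning is omitted), and the sample numbers are made up for illustration.

```python
import numpy as np
import pandas as pd

history = pd.Series([100.0, 110.0, 99.0, 108.9])

# Forward transform: fractional change from the previous value.
pct = history.pct_change()

# 'original'-style inverse: rebuild the series from its first value.
rebuilt = history.iloc[0] * (1 + pct.fillna(0)).cumprod()

# 'forecast'-style inverse: forecasts arrive as fractional changes and
# are chained onto the last observed value of the original series.
forecast_pct = pd.Series([0.10, -0.50])
forecast = history.iloc[-1] * (1 + forecast_pct).cumprod()
```

The forecast branch depends on the last observed value, which is why the inverse only makes sense for data that immediately follows the original series.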
-
class
autots.tools.transform.
PositiveShift
(log: bool = False, center_one: bool = True, squared=False, **kwargs)¶ Bases:
autots.tools.transform.EmptyTransformer
Shift each series if necessary to assure all values >= 1.
- Parameters:
log (bool) – whether to include a log transform.
center_one (bool) – whether to shift to 1 instead of 0.
squared (bool) – whether to square (**2) values after shift.
-
fit
(df)¶ Fits shift interval.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
fit_transform
(df)¶ Fit and Return Detrended DataFrame.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
inverse_transform
(df)¶ Return data to original form.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
transform
(df)¶ Return detrended data.
- Parameters:
df (pandas.DataFrame) – input dataframe
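A minimal sketch of the shift idea, assuming the shift is computed per column and applied only when a column’s minimum falls below 1. The exact rule used by autots (including the log, center_one and squared options) is not shown here.

```python
import pandas as pd

df = pd.DataFrame({"a": [-3.0, 0.0, 4.0], "b": [2.0, 5.0, 9.0]})

# Assumed rule: raise a column only when its minimum is below 1.
shift = (1 - df.min()).clip(lower=0)

shifted = df + shift        # every value is now >= 1
restored = shifted - shift  # the inverse subtracts the stored shift
```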
-
autots.tools.transform.
RandomTransform
(transformer_list: dict = {None: 0.0, 'MinMaxScaler': 0.05, 'PowerTransformer': 0.02, 'QuantileTransformer': 0.05, 'MaxAbsScaler': 0.05, 'StandardScaler': 0.04, 'RobustScaler': 0.05, 'PCA': 0.01, 'FastICA': 0.01, 'Detrend': 0.1, 'RollingMeanTransformer': 0.02, 'RollingMean100thN': 0.01, 'DifferencedTransformer': 0.07, 'SinTrend': 0.01, 'PctChangeTransformer': 0.01, 'CumSumTransformer': 0.02, 'PositiveShift': 0.02, 'Log': 0.01, 'IntermittentOccurrence': 0.01, 'SeasonalDifference': 0.1, 'cffilter': 0.01, 'bkfilter': 0.05, 'convolution_filter': 0.001, 'HPFilter': 0.01, 'DatepartRegression': 0.01, 'ClipOutliers': 0.05, 'Discretize': 0.03, 'CenterLastValue': 0.01, 'Round': 0.02, 'Slice': 0.02, 'ScipyFilter': 0.02, 'STLFilter': 0.01, 'EWMAFilter': 0.02}, transformer_max_depth: int = 4, na_prob_dict: dict = {'ffill': 0.4, 'fake_date': 0.1, 'rolling_mean': 0.1, 'rolling_mean_24': 0.1, 'IterativeImputer': 0.05, 'mean': 0.06, 'zero': 0.05, 'ffill_mean_biased': 0.1, 'median': 0.03, None: 0.001, 'interpolate': 0.4, 'KNNImputer': 0.05, 'IterativeImputerExtraTrees': 0.0001}, fast_params: bool = None, superfast_params: bool = None, traditional_order: bool = False)¶ Return a dict of randomly chosen transformation selections.
SinTrend is used as a signal that slow parameters are allowed.
-
class
autots.tools.transform.
RollingMeanTransformer
(window: int = 10, fixed: bool = False, **kwargs)¶ Bases:
autots.tools.transform.EmptyTransformer
Attempt at Rolling Mean with built-in inverse_transform for time series. inverse_transform can only be applied to the original series, or to an immediately following forecast. Does not play well with data containing NaNs. Inverse transformed values will also not be ‘exactly’ equal to the originals due to floating point imprecision.
- Parameters:
window (int) – number of periods to take mean over
-
fit
(df)¶ Fits.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
fit_transform
(df)¶ Fits and Returns Magical DataFrame.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
static
get_new_params
(method: str = 'random')¶ Generate new random parameters
-
inverse_transform
(df, trans_method: str = 'forecast')¶ Returns data to original or forecast form
- Parameters:
df (pandas.DataFrame) – input dataframe
trans_method (str) – whether to inverse on original data, or on a following sequence - ‘original’ return original data to original numbers - ‘forecast’ inverse the transform on a dataset immediately following the original
-
transform
(df)¶ Returns rolling data.
- Parameters:
df (pandas.DataFrame) – input dataframe
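The restriction on inverse_transform follows from how a rolling mean is undone: each mean fixes only the sum of one window, so earlier original values are needed to unroll the rest. The NumPy sketch below illustrates that reasoning on a single series; it is an assumption-laden illustration rather than the library’s code.

```python
import numpy as np

window = 3
x = np.array([2.0, 4.0, 6.0, 8.0, 10.0, 3.0])

# Forward: mean of each full window of `window` consecutive values.
means = np.convolve(x, np.ones(window) / window, mode="valid")

# Inverse: start from the first (window - 1) original values, then
# solve each window sum for its one unknown, newest value.
recovered = list(x[: window - 1])
for m in means:
    previous = sum(recovered[-(window - 1):])
    recovered.append(window * m - previous)
recovered = np.array(recovered)
```

The repeated multiply-and-subtract step is also where small floating point differences can accumulate, matching the note about values not returning as ‘exactly’ equal.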
-
class
autots.tools.transform.
Round
(decimals: int = 0, on_transform: bool = False, on_inverse: bool = True, force_int: bool = False, **kwargs)¶ Bases:
autots.tools.transform.EmptyTransformer
Round all values. Convert into Integers if decimal <= 0.
Inverse_transform will not undo the transformation!
- Parameters:
method (str) – only “middle”, in future potentially up/ceiling floor/down
decimals (int) – number of decimal places to round to.
on_transform (bool) – perform rounding on transformation
on_inverse (bool) – perform rounding on inverse transform
-
fit
(df)¶ Learn behavior of data to change.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
fit_transform
(df)¶ Fits and Returns Magical DataFrame.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
static
get_new_params
(method: str = 'random')¶ Generate new random parameters
-
inverse_transform
(df, trans_method: str = 'forecast')¶ Return data to original or forecast form.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
transform
(df)¶ Return changed data.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
class
autots.tools.transform.
STLFilter
(decomp_type='STL', part: str = 'trend', seasonal: int = 7, **kwargs)¶ Bases:
autots.tools.transform.EmptyTransformer
Irreversible filters.
- Parameters:
decomp_type (str) – which decomposition to use
part (str) – which part of decomposition to return
seasonal (int) – seasonal component of STL
-
fit_transform
(df)¶ Fit and Return Detrended DataFrame.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
static
get_new_params
(method: str = 'random')¶ Generate new random parameters
-
transform
(df)¶ Return detrended data.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
class
autots.tools.transform.
ScipyFilter
(method: str = 'hilbert', method_args: list = None, **kwargs)¶ Bases:
autots.tools.transform.EmptyTransformer
Irreversible filters from Scipy
- Parameters:
method (str) – one of “hilbert”, “wiener”, “savgol_filter”, “butter”, “cheby1”, “cheby2”, “ellip”, “bessel”
method_args (list) – passed to filter as appropriate
-
fit
(df)¶ Fits filter.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
fit_transform
(df)¶ Fit and Return Detrended DataFrame.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
static
get_new_params
(method: str = 'random')¶ Generate new random parameters
-
inverse_transform
(df)¶ Return data to original form.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
transform
(df)¶ Return detrended data.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
class
autots.tools.transform.
SeasonalDifference
(lag_1: int = 7, method: str = 'LastValue', **kwargs)¶ Bases:
autots.tools.transform.EmptyTransformer
Remove seasonal component.
- Parameters:
lag_1 (int) – length of seasonal period to remove.
method (str) – ‘LastValue’, ‘Mean’, ‘Median’ to construct seasonality
-
fit
(df)¶ Fits.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
fit_transform
(df)¶ Fits and Returns Magical DataFrame.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
static
get_new_params
(method: str = 'random')¶ Generate new random parameters
-
inverse_transform
(df, trans_method: str = 'forecast')¶ Returns data to original or forecast form
- Parameters:
df (pandas.DataFrame) – input dataframe
trans_method (str) – whether to inverse on original data, or on a following sequence - ‘original’ return original data to original numbers - ‘forecast’ inverse the transform on a dataset immediately following the original
-
transform
(df)¶ Returns rolling data.
- Parameters:
df (pandas.DataFrame) – input dataframe
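A rough sketch of the ‘LastValue’ idea on a single series: take the final lag_1 observations as a seasonal profile, subtract it from every row, and add it back to a forecast. The phase alignment used here is an assumption made for illustration and may differ from the library’s internals.

```python
import numpy as np

lag = 3
history = np.array([1.0, 2.0, 3.0, 2.0, 4.0, 6.0, 3.0, 5.0, 7.0])
n = len(history)

# Assumed 'LastValue' profile: the final `lag` observations.
season = history[-lag:]

# Subtract the profile from every row, phase-aligned to the series end.
transformed = history - season[(np.arange(n) - n) % lag]

# 'forecast'-style inverse: the profile repeats from the first step
# after the history, so step h gets season[h % lag] added back.
forecast_diff = np.array([0.5, 0.5, 0.5, 0.5])
forecast = forecast_diff + season[np.arange(len(forecast_diff)) % lag]
```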
-
class
autots.tools.transform.
SinTrend
(**kwargs)¶ Bases:
autots.tools.transform.EmptyTransformer
Modelling sin.
-
fit
(df)¶ Fits trend for later detrending.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
fit_sin
(tt, yy)¶ Fit sin to the input time sequence, and return fitting parameters “amp”, “omega”, “phase”, “offset”, “freq”, “period” and “fitfunc”
from user unsym @ https://stackoverflow.com/questions/16716302/how-do-i-fit-a-sine-curve-to-my-data-with-pylab-and-numpy
-
fit_transform
(df)¶ Fits and Returns Detrended DataFrame.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
inverse_transform
(df)¶ Returns data to original form.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
transform
(df)¶ Returns detrended data.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
-
class
autots.tools.transform.
Slice
(method: str = '100', forecast_length: int = 30, **kwargs)¶ Bases:
autots.tools.transform.EmptyTransformer
Take the .tail() of the data returning only most recent values.
Inverse_transform will not undo the transformation!
- Parameters:
method (str) – only “middle”, in future potentially up/ceiling floor/down
forecast_length (int) – forecast horizon, scales some slice windows
-
fit
(df)¶ Learn behavior of data to change.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
fit_transform
(df)¶ Fits and Returns Magical DataFrame.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
static
get_new_params
(method: str = 'random')¶ Generate new random parameters
-
inverse_transform
(df, trans_method: str = 'forecast')¶ Return data to original or forecast form.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
transform
(df)¶ Return changed data.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
class
autots.tools.transform.
StatsmodelsFilter
(method: str = 'bkfilter', **kwargs)¶ Bases:
autots.tools.transform.EmptyTransformer
Irreversible filters.
- Parameters:
method (str) – bkfilter or cffilter or convolution_filter
-
fit_transform
(df)¶ Fit and Return Detrended DataFrame.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
transform
(df)¶ Return detrended data.
- Parameters:
df (pandas.DataFrame) – input dataframe
-
autots.tools.transform.
clip_outliers
(df, std_threshold: float = 3)¶ Replace outliers above threshold with that threshold. Axis = 0.
- Parameters:
df (pandas.DataFrame) – DataFrame containing numeric data
std_threshold (float) – The number of standard deviations away from mean to count as outlier.
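The clipping described here can be approximated with pandas alone. The sketch assumes the band is the column mean plus or minus std_threshold sample standard deviations; a threshold of 2 is used only because the sample is tiny, and the data is invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": [1.0, 2.0, 3.0, 2.0, 1.0, 3.0, 2.0, 50.0]})
std_threshold = 2  # small sample, so a tighter band than the default 3

mean, std = df.mean(), df.std()
lower = mean - std_threshold * std
upper = mean + std_threshold * std

# Values beyond the band are replaced by the band edge, column by column.
clipped = df.clip(lower=lower, upper=upper, axis=1)
```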
-
autots.tools.transform.
get_transformer_params
(transformer: str = 'EmptyTransformer', method: str = None)¶ Retrieve new random params for new Transformers.
-
autots.tools.transform.
remove_outliers
(df, std_threshold: float = 3)¶ Replace outliers with np.nan. https://stackoverflow.com/questions/23199796/detect-and-exclude-outliers-in-pandas-data-frame
- Parameters:
df (pandas.DataFrame) – DataFrame containing numeric data, DatetimeIndex
std_threshold (float) – The number of standard deviations away from mean to count as outlier.
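A comparable sketch for masking rather than clipping, again assuming a band of std_threshold sample standard deviations around each column mean. It illustrates the idea and is not a copy of the function body.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1.0, 2.0, 3.0, 2.0, 1.0, 3.0, 2.0, 50.0]})
std_threshold = 2  # small sample, so a tighter band than the default 3

# Keep values inside the band around each column mean; mask the rest.
deviation = (df - df.mean()).abs()
within = deviation.le(std_threshold * df.std(), axis=1)
cleaned = df.where(within)  # outliers become NaN
```

The NaN values left behind can then be filled by whichever fill method suits the series.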
-
autots.tools.transform.
simple_context_slicer
(df, method: str = 'None', forecast_length: int = 30)¶ Condensed version of context_slicer with more limited options.
- Parameters:
df (pandas.DataFrame) – training data frame to slice
method (str) – option to slice dataframe:
‘None’ - return unaltered dataframe
‘HalfMax’ - return half of dataframe
‘ForecastLength’ - return dataframe equal to length of forecast
‘2ForecastLength’ - return dataframe equal to twice length of forecast (also takes 4, 6, 8, 10 in addition to 2)
‘n’ - any integer length to slice by
‘-n’ - full length less this amount
“0.n” - this percent of the full data
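The method options can be read as rules for how many trailing rows to keep. The function below, slice_recent, is a hypothetical re-implementation written only to make those rules concrete; details such as rounding for ‘HalfMax’ are assumptions.

```python
import pandas as pd


def slice_recent(df: pd.DataFrame, method: str = "None",
                 forecast_length: int = 30) -> pd.DataFrame:
    """Hypothetical sketch of the slicing options, not the autots code."""
    if method == "None":
        return df
    if method == "HalfMax":
        return df.tail(len(df) // 2)
    if method.endswith("ForecastLength"):
        # '' means 1x, '2' means 2x, '4' means 4x, and so on.
        multiple = int(method[: -len("ForecastLength")] or 1)
        return df.tail(multiple * forecast_length)
    number = float(method)
    if 0 < number < 1:
        return df.tail(int(len(df) * number))  # fraction of the data
    if number < 0:
        return df.tail(len(df) + int(number))  # full length less n
    return df.tail(int(number))                # last n rows


df = pd.DataFrame({"y": range(100)})
recent = slice_recent(df, "2ForecastLength", forecast_length=10)
```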
-
autots.tools.transform.
transformer_list_to_dict
(transformer_list)¶ Convert various possibilities to dict.
autots.tools.window_functions module¶
-
autots.tools.window_functions.
last_window
(df, window_size: int = 10, input_dim: str = 'univariate', normalize_window: bool = False)¶ Pandas based function to provide the last window of window_maker.
-
autots.tools.window_functions.
retrieve_closest_indices
(df, num_indices, forecast_length, window_size: int = 10, distance_metric: str = 'braycurtis', stride_size: int = 1, start_index: int = None, include_differenced: bool = False, include_last: bool = True, verbose: int = 0)¶ Find next indices closest to the final segment of forecast_length
- Parameters:
df (pd.DataFrame) – source data in wide format
num_indices (int) – number of indices to return
forecast_length (int) – length of forecast
window_size (int) – length of comparison
distance_metric (str) – distance measure from scipy and nan_euclidean
stride_size (int) – length of spacing between windows
start_index (int) – index to begin creation of windows from
include_differenced (bool) – if True, also compare on differences
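The general idea of finding past windows that resemble the most recent one can be sketched in NumPy. Euclidean distance stands in here for the configurable distance_metric, and the sketch ignores forecast_length, stride_size and differencing, so it is a simplified illustration on one series rather than the function’s actual logic.

```python
import numpy as np

series = np.array(
    [1.0, 2.0, 3.0, 9.0, 9.0, 9.0, 1.0, 2.0, 3.1, 5.0, 1.0, 2.0, 3.0]
)
window_size, num_indices = 3, 2

# Candidate windows: every full window except the final one itself.
starts = np.arange(len(series) - window_size)
candidates = np.stack([series[s : s + window_size] for s in starts])
target = series[-window_size:]

# Euclidean distance is an assumption standing in for distance_metric.
distances = np.linalg.norm(candidates - target, axis=1)
closest_starts = starts[np.argsort(distances)[:num_indices]]
```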
-
autots.tools.window_functions.
rolling_window_view
(array, window_shape=(0, ), axis=None, writeable=False)¶ Create a view of array which for every point gives the n-dimensional neighbourhood of size window. New dimensions are added at the end of array or after the corresponding original dimension.
Based on: https://gist.github.com/seberg/3866040 but designed to match the newer np.sliding_window_view
- Parameters:
array (np.array) – Array to which the rolling window is applied.
window_shape (int) – Either a single integer to create a window of only the last axis or a tuple to create it for the last len(window) axes. 0 can be used to ignore a dimension in the window.
axis (int) – If given, must have the same size as window. In this case window is interpreted as the size in the dimension given by axis, i.e. a window of (2, 1) is equivalent to window=2 and axis=-2.
- Returns:
A view on array which is smaller to fit the windows and has windows added dimensions (0s not counting), ie. every point of array is an array of size window.
-
autots.tools.window_functions.
sliding_window_view
(array, window_shape=(0, ), axis=None, writeable=False, **kwargs)¶ Toggles between numpy and internal version depending on np.__version__.
-
autots.tools.window_functions.
window_id_maker
(window_size: int, max_steps: int, start_index: int = 0, stride_size: int = 1, skip_size: int = 1)¶ Create indices for array of multiple window slices of data
- Parameters:
window_size (int) – length of time history to include
max_steps (int) – the maximum number of windows to create
start_index (int) – if to not start at the first point, start at this point
stride_size (int) – number of skips between each window start point
skip_size (int) – number of skips between each obs in a window (downsamples)
- Returns:
np.array with 3D shape (num windows, window_length, num columns/series), or a 2D array if only a 1D array is provided
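How start_index, stride_size and skip_size combine into window indices can be shown with NumPy broadcasting. This sketch derives the number of windows from the data length instead of taking max_steps, which is a simplification made for illustration.

```python
import numpy as np

window_size, stride_size, skip_size, start_index = 3, 2, 1, 0
data = np.arange(16).reshape(8, 2)  # (num_obs, num_series)

# Number of windows that fit entirely inside the data.
span = (window_size - 1) * skip_size + 1
n_windows = (len(data) - start_index - span) // stride_size + 1

# Row i holds the observation indices of window i.
window_ids = (
    start_index
    + stride_size * np.arange(n_windows)[:, None]
    + skip_size * np.arange(window_size)[None, :]
)

windows = data[window_ids]  # (num_windows, window_size, num_series)
```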
-
autots.tools.window_functions.
window_maker
(df, window_size: int = 10, input_dim: str = 'univariate', normalize_window: bool = False, shuffle: bool = False, output_dim: str = 'forecast_length', forecast_length: int = 1, max_windows: int = 5000, regression_type: str = None, future_regressor=None, random_seed: int = 1234)¶ Convert a dataset into slices with history and y forecast.
- Parameters:
df (pd.DataFrame) – wide format df with sorted index
window_size (int) – length of history to use for X window
input_dim (str) – univariate or multivariate. If multivariate, all series in single X row
shuffle (bool) – (deprecated)
output_dim (str) – ‘forecast_length’ or ‘1step’ where 1 step is basically forecast_length=1
forecast_length (int) – number of periods ahead that will be forecast
max_windows (int) – a cap on total number of windows to generate. If exceeded, a random sample of this many windows is selected.
regression_type (str) – None or “user” if to try to concat regressor to windows
future_regressor (pd.DataFrame) – values of regressor if used
random_seed (int) – seed for consistent random selection
- Returns:
X, Y
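The X, Y pairing for the univariate case can be pictured as follows. This NumPy sketch covers a single series with no normalization, regressors or window cap, so it is a reduced illustration of the idea rather than the function itself.

```python
import numpy as np

series = np.arange(10, dtype=float)  # one univariate series
window_size, forecast_length = 4, 2

# Each sample pairs `window_size` past values with the values that follow.
n_samples = len(series) - window_size - forecast_length + 1
X = np.stack([series[i : i + window_size] for i in range(n_samples)])
Y = np.stack([
    series[i + window_size : i + window_size + forecast_length]
    for i in range(n_samples)
])
```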
-
autots.tools.window_functions.
window_maker_2
(array, window_size: int, max_steps: int = None, start_index: int = 0, stride_size: int = 1, skip_size: int = 1)¶ Create array of multiple window slices of data. Note that this returns a different orientation than window_maker_3.
- Parameters:
array (np.array) – source of historic information of shape (num_obs, num_series)
window_size (int) – length of time history to include
max_steps (int) – the maximum number of windows to create
start_index (int) – if to not start at the first point, start at this point
stride_size (int) – number of skips between each window start point
skip_size (int) – number of skips between each obs in a window (downsamples)
- Returns:
np.array with 3D shape (num windows, window_length, num columns/series), or a 2D array if only a 1D array is provided
-
autots.tools.window_functions.
window_maker_3
(array, window_size: int, **kwargs)¶ Stride tricks version of window. About 40% faster than window_maker_2. Note that this returns a different orientation than window_maker_2.
- Parameters:
array (np.array) – in shape of (num_obs, num_series)
window_size (int) – length of slice of history
**kwargs – passed to np.lib.stride_tricks.sliding_window_view
- Returns:
np.array with 3D shape (num windows, num columns/series, window_length), or a 2D array if only a 1D array is provided
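The orientation difference between the two window makers can be seen directly with NumPy’s sliding_window_view (available in NumPy 1.20 and later). The sketch calls NumPy itself rather than the autots wrapper, and the final moveaxis step is just one way to switch layouts.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

data = np.arange(12).reshape(6, 2)  # (num_obs, num_series)
window_size = 3

# Windows over the time axis; the window dimension is appended last,
# giving (num_windows, num_series, window_length).
windows = sliding_window_view(data, window_size, axis=0)

# Moving the last axis gives (num_windows, window_length, num_series).
reoriented = np.moveaxis(windows, -1, 1)
```

The view shares memory with the source array and is read-only by default, which is consistent with the writeable=False default shown in the signatures above.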
Module contents¶
Basic utilities.
-
-