Luminaire Outlier Detection Models: Structural Modeling
-
exception luminaire.model.lad_structural.LADStructuralError(message)
Exception class for Luminaire structural anomaly detection model.
-
class luminaire.model.lad_structural.LADStructuralHyperParams(include_holidays_exog=True, p=2, q=2, is_log_transformed=True, max_ft_freq=3)
Hyperparameter class for the Luminaire structural anomaly detection model.
- Parameters
include_holidays_exog (bool, optional) – Whether to include holidays as exogenous variables in the regression. Holidays are defined in LADHolidays.
p (int, optional) – Order of the AR component of the model.
q (int, optional) – Order of the MA component of the model.
is_log_transformed (bool, optional) – A flag to specify whether to take a log transform of the input data. If the data contain negative values, is_log_transformed is ignored even if it is set to True.
max_ft_freq (int, optional) – The maximum frequency order for the Fourier transformation.
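The following is a minimal sketch of how two of these hyperparameters could be interpreted, not Luminaire's internal implementation. It assumes that max_ft_freq bounds the number of sine/cosine harmonics used as seasonal regressors and that the log transform is log(1 + x); both are assumptions made for illustration, and the helper names fourier_terms and maybe_log_transform are hypothetical.

```python
import math


def fourier_terms(t, period, max_ft_freq):
    """Return sine/cosine pairs for harmonics 1..max_ft_freq at time index t.

    Assumption: max_ft_freq caps the highest harmonic; this is an
    illustrative reading, not Luminaire's confirmed behavior.
    """
    terms = []
    for k in range(1, max_ft_freq + 1):
        angle = 2.0 * math.pi * k * t / period
        terms.extend([math.sin(angle), math.cos(angle)])
    return terms


def maybe_log_transform(values, is_log_transformed=True):
    """Apply log(1 + x) unless disabled or the data contain negatives.

    Returns the (possibly transformed) values and a flag telling whether
    the transform was actually applied. log1p is an assumed choice.
    """
    if not is_log_transformed or any(v < 0 for v in values):
        return list(values), False
    return [math.log1p(v) for v in values], True


# Weekly seasonality with three harmonics gives six regressors per time step.
weekly = fourier_terms(t=3, period=7, max_ft_freq=3)

# A negative value disables the transform even though the flag is True.
kept, applied = maybe_log_transform([1326.0, -5.0, 1432.0], is_log_transformed=True)
```

The second return value makes the fallback visible to the caller, which mirrors the documented behavior that the flag is ignored when the data contain negatives.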
-
class luminaire.model.lad_structural.LADStructuralModel(hyper_params: {'include_holidays_exog': True, 'p': 2, 'q': 2, 'is_log_transformed': True, 'max_ft_freq': 3}, freq, min_ts_length=None, max_ts_length=None, min_ts_mean=None, min_ts_mean_window=None, **kwargs)
A LAD structural time series model.
- Parameters
hyper_params (dict) – Hyper parameters for Luminaire structural modeling. See luminaire.optimization.hyperparameter_optimization.HyperparameterOptimization for detailed information.
freq (str) – The frequency of the time series. A Pandas offset such as 'D', 'H', or 'M'.
min_ts_length (int, optional) – The minimum required length of the time series for training.
max_ts_length (int, optional) – The maximum length of the time series used for training.
min_ts_mean (float, optional) – Minimum average value required in the most recent window of the time series. This optional parameter can be used to avoid over-alerting on noisy, low-volume time series.
min_ts_mean_window (int, optional) – Size of the most recent window over which the mean compared against min_ts_mean is calculated.
Note
This class should be used to manually configure the structural model. Exact configuration parameters can be found in luminaire.optimization.hyperparameter_optimization.HyperparameterOptimization. An optimal configuration can be obtained by using LAD hyperparameter optimization.
>>> from luminaire.model.lad_structural import LADStructuralModel
>>> hyper = {"include_holidays_exog": 0, "is_log_transformed": 1, "max_ft_freq": 2, "p": 5, "q": 1}
>>> lad_struct_model = LADStructuralModel(hyper_params=hyper, freq='D')
>>> lad_struct_model
<luminaire.model.lad_structural.LADStructuralModel object at 0x103efe320>
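As a rough illustration of the min_ts_mean and min_ts_mean_window parameters described above, the check below compares the mean of the most recent window against a threshold. This is a sketch under stated assumptions: the helper name recent_mean_ok is hypothetical, and what Luminaire does when the threshold is not met is not covered here.

```python
def recent_mean_ok(values, min_ts_mean=None, min_ts_mean_window=None):
    """Return True when the recent-window mean meets the minimum.

    Assumptions (for illustration only): a missing threshold disables
    the check, and a missing window size means the whole series is used.
    """
    if min_ts_mean is None:
        return True
    window = values[-min_ts_mean_window:] if min_ts_mean_window else values
    return sum(window) / len(window) >= min_ts_mean


# A series whose volume has recently collapsed fails the check.
collapsed = recent_mean_ok([1500.0, 1600.0, 2.0, 1.0, 3.0], min_ts_mean=10, min_ts_mean_window=3)

# The same values with healthy recent volume pass it.
healthy = recent_mean_ok([2.0, 1.0, 3.0, 1500.0, 1600.0], min_ts_mean=10, min_ts_mean_window=2)
```

A caller could use such a check to suppress alerts on series that are too sparse for the anomaly probabilities to be meaningful.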
-
score(observed_value, pred_date, **kwargs)
This function scores a value observed at a data date given a trained LAD structural model object.
- Parameters
observed_value (float) – Observed time series value on the prediction date.
pred_date (str) – Prediction date. Needs to be in yyyy-mm-dd or yyyy-mm-dd hh:mm:ss format.
- Returns
Anomaly flag, anomaly probability, prediction and other related metrics.
- Return type
dict
>>> model
<luminaire.model.lad_structural.LADStructuralModel object at 0x11c1c3550>
>>> model._params['training_end_date']  # Last data date for training time series
'2020-06-07 00:00:00'
>>> model.score(2000, '2020-06-08')
{'Success': True, 'IsLogTransformed': 0, 'AdjustedActual': 2000, 'Prediction': 1943.20426163425, 'StdErr': 93.084646777553, 'CILower': 1785.519523590432, 'CIUpper': 2100.88899967807, 'ConfLevel': 90.0, 'ExogenousHolidays': 0, 'IsAnomaly': False, 'IsAnomalyExtreme': False, 'AnomalyProbability': 0.42671448831719605, 'DownAnomalyProbability': 0.286642755841402, 'UpAnomalyProbability': 0.713357244158598, 'ModelFreshness': 0.1}
>>> model.score(2500, '2020-06-09')
{'Success': True, 'IsLogTransformed': 0, 'AdjustedActual': 2500, 'Prediction': 2028.989933854948, 'StdErr': 93.6623172459385, 'CILower': 1861.009403637476, 'CIUpper': 2186.97046407242, 'ConfLevel': 90.0, 'ExogenousHolidays': 0, 'IsAnomaly': True, 'IsAnomalyExtreme': True, 'AnomalyProbability': 0.9999987021695071, 'DownAnomalyProbability': 6.489152464261849e-07, 'UpAnomalyProbability': 0.9999993510847536, 'ModelFreshness': 0.2}
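The two sample results above can be condensed with a small helper. The sketch below uses only fields that appear in the structural score output; the helper name summarize_score is hypothetical. The relations it relies on (AnomalyProbability matching the gap between the up and down probabilities, and IsAnomaly agreeing with a comparison against ConfLevel / 100) are read off these two samples and are not a statement of Luminaire's internal formulas.

```python
def summarize_score(result):
    """Condense a structural score() result dict into a short summary."""
    up = result["UpAnomalyProbability"]
    down = result["DownAnomalyProbability"]
    return {
        # Which side of the prediction the observation leans toward.
        "direction": "up" if up >= down else "down",
        # Gap between the one-sided probabilities.
        "two_sided": abs(up - down),
        # Assumed reading: anomaly when the probability exceeds ConfLevel.
        "exceeds_conf_level": result["AnomalyProbability"] > result["ConfLevel"] / 100.0,
        # Whether the observation falls inside the reported interval.
        "inside_interval": result["CILower"] <= result["AdjustedActual"] <= result["CIUpper"],
    }


# Subsets of the two sample outputs shown above.
normal_day = {
    "AdjustedActual": 2000, "CILower": 1785.519523590432, "CIUpper": 2100.88899967807,
    "ConfLevel": 90.0, "IsAnomaly": False, "AnomalyProbability": 0.42671448831719605,
    "DownAnomalyProbability": 0.286642755841402, "UpAnomalyProbability": 0.713357244158598,
}
spike_day = {
    "AdjustedActual": 2500, "CILower": 1861.009403637476, "CIUpper": 2186.97046407242,
    "ConfLevel": 90.0, "IsAnomaly": True, "AnomalyProbability": 0.9999987021695071,
    "DownAnomalyProbability": 6.489152464261849e-07, "UpAnomalyProbability": 0.9999993510847536,
}

normal_summary = summarize_score(normal_day)
spike_summary = summarize_score(spike_day)
```

For the first sample the observation sits inside the interval and is not flagged; for the second it lies above CIUpper and is flagged, which agrees with the IsAnomaly values in the output.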
-
train(data, optimize=False, **kwargs)
This function trains a structural LAD model for a given time series.
- Parameters
data (pandas.DataFrame) – Input time series data
optimize (bool, optional) – Flag indicating whether the method is being called from hyperparameter optimization.
- Returns
Success flag, the model date, and the trained LAD structural model object.
- Return type
tuple[bool, str, LADStructuralModel object]
>>> from luminaire.exploration.data_exploration import DataExploration
>>> from luminaire.model.lad_structural import LADStructuralModel
>>> data
               raw  interpolated
2020-01-01  1326.0        1326.0
2020-01-02  1552.0        1552.0
2020-01-03  1432.0        1432.0
2020-01-04  1470.0        1470.0
2020-01-05  1565.0        1565.0
...            ...           ...
2020-06-03  1934.0        1934.0
2020-06-04  1873.0        1873.0
2020-06-05  1674.0        1674.0
2020-06-06  1747.0        1747.0
2020-06-07  1782.0        1782.0
>>> hyper = {"include_holidays_exog": 0, "is_log_transformed": 0, "max_ft_freq": 2, "p": 5, "q": 1}
>>> de_obj = DataExploration(freq='D', is_log_transformed=0)
>>> data, pre_prc = de_obj.profile(data)
>>> pre_prc
{'success': True, 'trend_change_list': ['2020-04-01 00:00:00'], 'change_point_list': ['2020-03-16 00:00:00'], 'is_log_transformed': 0, 'min_ts_mean': None, 'ts_start': '2020-01-01 00:00:00', 'ts_end': '2020-06-07 00:00:00'}
>>> lad_struct_obj = LADStructuralModel(hyper_params=hyper, freq='D')
>>> model = lad_struct_obj.train(data=data, **pre_prc)
>>> model
(True, '2020-06-07 00:00:00', <luminaire.model.lad_structural.LADStructuralModel object at 0x126edf588>)
Luminaire Outlier Detection Models: Factoring holidays as exogenous
-
class luminaire.model.model_utils.LADHolidays(name=None, holiday_rules=None)
A class that generates holiday calendars to be used as external features in the batch outlier detection model. By default, holidays include:
Memorial Day, plus the weekend leading into it
Veterans Day, plus the weekend leading into it
Labor Day
Presidents' Day
Martin Luther King Jr. Day
Valentine’s Day
Mother’s Day
Father’s Day
Independence Day (actual and observed)
Halloween
Super Bowl
Easter
Thanksgiving, plus the following weekend
Christmas Eve, Christmas Day, and all dates up to New Year’s Day (actual and observed)
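To make the "plus the weekend" entries in the list above concrete, the sketch below derives three of these holidays with the standard library. It is not how LADHolidays builds its calendar; the helper names are hypothetical, and reading "the following weekend" as the Saturday and Sunday after Thanksgiving is an assumption.

```python
from datetime import date, timedelta


def nth_weekday(year, month, weekday, n):
    """Date of the n-th given weekday (Monday=0) in a month."""
    first = date(year, month, 1)
    offset = (weekday - first.weekday()) % 7
    return first + timedelta(days=offset + 7 * (n - 1))


def last_weekday(year, month, weekday):
    """Date of the last given weekday (Monday=0) in a month."""
    # Step to the first day of the next month, then back one day.
    next_month = date(year + (month == 12), month % 12 + 1, 1)
    last = next_month - timedelta(days=1)
    return last - timedelta(days=(last.weekday() - weekday) % 7)


def sample_holidays(year):
    """Map dates to labels for a few of the holidays listed above."""
    memorial = last_weekday(year, 5, 0)         # last Monday in May
    thanksgiving = nth_weekday(year, 11, 3, 4)  # fourth Thursday in November
    days = {
        nth_weekday(year, 9, 0, 1): "Labor Day",  # first Monday in September
        memorial: "Memorial Day",
        thanksgiving: "Thanksgiving",
    }
    # Weekend leading into Memorial Day: the Sunday and Saturday before it.
    for back in (1, 2):
        days[memorial - timedelta(days=back)] = "Memorial Day weekend"
    # Assumed reading of "following weekend": Saturday and Sunday after.
    for ahead in (2, 3):
        days[thanksgiving + timedelta(days=ahead)] = "Thanksgiving weekend"
    return days


holidays_2020 = sample_holidays(2020)
```

Each flagged date could then be turned into a 0/1 exogenous column aligned with the time series index.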
Luminaire Outlier Detection Models: Kalman Filter
-
class luminaire.model.lad_filtering.LADFilteringHyperParams(is_log_transformed=True)
Hyperparameter class for the Luminaire filtering anomaly detection model.
- Parameters
is_log_transformed (bool, optional) – A flag to specify whether to take a log transform of the input data. If the data contain negative values, is_log_transformed is ignored even if it is set to True.
-
class luminaire.model.lad_filtering.LADFilteringModel(hyper_params: {'is_log_transformed': True}, freq, min_ts_length=None, max_ts_length=None, **kwargs)
A Markovian state space model. This model detects anomalies based on the residual process obtained through Kalman filter based model estimation.
- Parameters
hyper_params (dict) – Hyper parameters for Luminaire filtering modeling. See luminaire.optimization.hyperparameter_optimization.HyperparameterOptimization for detailed information.
freq (str) – The frequency of the time series. A Pandas offset such as 'D', 'H', or 'M'.
min_ts_length (int, optional) – The minimum required length of the time series for training.
max_ts_length (int, optional) – The maximum length of the time series used for training.
>>> from luminaire.model.lad_filtering import LADFilteringModel
>>> hyper = {"is_log_transformed": 1}
>>> lad_filtering_model = LADFilteringModel(hyper_params=hyper, freq='D')
>>> lad_filtering_model
<luminaire.model.lad_filtering.LADFilteringModel object at 0x103efe320>
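The idea of detecting anomalies from the residual process of a Kalman filter can be sketched with a one-dimensional local-level model. This is only an illustration of the general technique: the model form, the noise variances, and the function name kalman_residuals are assumptions, not the state space specification that LADFilteringModel uses.

```python
import math


def kalman_residuals(values, process_var=1.0, obs_var=25.0):
    """Standardized one-step-ahead residuals for a local-level model.

    Assumed model (illustration only): the hidden level follows a random
    walk with variance process_var, observed with noise variance obs_var.
    """
    level = values[0]    # initial state estimate
    level_var = obs_var  # initial state uncertainty
    residuals = []
    for y in values[1:]:
        # Predict: the level carries over, its uncertainty grows.
        pred_var = level_var + process_var
        innovation = y - level
        innovation_var = pred_var + obs_var
        residuals.append(innovation / math.sqrt(innovation_var))
        # Update: blend prediction and observation using the Kalman gain.
        gain = pred_var / innovation_var
        level = level + gain * innovation
        level_var = (1.0 - gain) * pred_var
    return residuals


series = [100.0, 102.0, 99.0, 101.0, 100.0, 160.0]
z_scores = kalman_residuals(series)

# Flag observations whose standardized residual is unusually large.
flags = [abs(z) > 3.0 for z in z_scores]
```

Only the final jump to 160 produces a standardized residual beyond the threshold; the earlier fluctuations stay well inside it.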
-
score(observed_value, pred_date, synthetic_actual=None, **kwargs)
This function scores a value observed at a data date given a trained LAD filtering model object.
- Parameters
observed_value (float) – Observed time series value on the prediction date.
pred_date (str) – Prediction date. Needs to be in yyyy-mm-dd or yyyy-mm-dd hh:mm:ss format.
synthetic_actual (float, optional) – Synthetic time series value. This is an artificial value used to optimize classification accuracy in Luminaire hyperparameter optimization.
- Returns
Model results and LAD filtering model object
- Return type
tuple[dict, LADFilteringModel object]
>>> model
<luminaire.model.lad_filtering.LADFilteringModel object at 0x11f0b2b38>
>>> model._params['training_end_date']
'2020-06-07 00:00:00'
>>> model.score(2000, '2020-06-08')
({'Success': True, 'AdjustedActual': 0.10110881711268949, 'ConfLevel': 90.0, 'Prediction': 1934.153554885343, 'PredStdErr': 212.4399633739204, 'IsAnomaly': False, 'IsAnomalyExtreme': False, 'AnomalyProbability': 0.4244056403219776, 'DownAnomalyProbability': 0.2877971798390112, 'UpAnomalyProbability': 0.7122028201609888, 'NonStationarityDiffOrder': 2, 'ModelFreshness': 0.1}, <luminaire.model.lad_filtering.LADFilteringModel object at 0x11f3c0860>)
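Because pred_date must be given as yyyy-mm-dd or yyyy-mm-dd hh:mm:ss, a caller may want to validate the string before scoring. The helper below is a hypothetical pre-check written with the standard library; it is not part of Luminaire and does not reflect how score parses the date internally.

```python
from datetime import datetime

# The two formats named in the pred_date description.
_PRED_DATE_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d")


def parse_pred_date(pred_date):
    """Parse a prediction date string, raising ValueError on other formats."""
    for fmt in _PRED_DATE_FORMATS:
        try:
            return datetime.strptime(pred_date, fmt)
        except ValueError:
            continue
    raise ValueError(
        f"pred_date must be yyyy-mm-dd or yyyy-mm-dd hh:mm:ss, got {pred_date!r}"
    )


daily = parse_pred_date("2020-06-08")
hourly = parse_pred_date("2020-06-08 13:00:00")
```

Rejecting malformed dates up front keeps formatting mistakes separate from genuine scoring failures.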
-
train(data, **kwargs)
This function trains a filtering LAD model for a given time series.
- Parameters
data (pandas.DataFrame) – Input time series data
- Returns
The success flag, the model date, and the trained LAD filtering model object.
- Return type
tuple[bool, str, LADFilteringModel object]
>>> from luminaire.exploration.data_exploration import DataExploration
>>> from luminaire.model.lad_filtering import LADFilteringModel
>>> data
               raw  interpolated
2020-01-01  1326.0        1326.0
2020-01-02  1552.0        1552.0
2020-01-03  1432.0        1432.0
2020-01-04  1470.0        1470.0
2020-01-05  1565.0        1565.0
...            ...           ...
2020-06-03  1934.0        1934.0
2020-06-04  1873.0        1873.0
2020-06-05  1674.0        1674.0
2020-06-06  1747.0        1747.0
2020-06-07  1782.0        1782.0
>>> hyper = {"is_log_transformed": 1}
>>> de_obj = DataExploration(freq='D', is_log_transformed=1, fill_rate=0.95)
>>> data, pre_prc = de_obj.profile(data)
>>> pre_prc
{'success': True, 'trend_change_list': ['2020-04-01 00:00:00'], 'change_point_list': ['2020-03-16 00:00:00'], 'is_log_transformed': 1, 'min_ts_mean': None, 'ts_start': '2020-01-01 00:00:00', 'ts_end': '2020-06-07 00:00:00'}
>>> lad_filter_obj = LADFilteringModel(hyper_params=hyper, freq='D')
>>> model = lad_filter_obj.train(data=data, **pre_prc)
>>> model
(True, '2020-06-07 00:00:00', <luminaire.model.lad_filtering.LADFilteringModel object at 0x11b6c4f60>)
-
exception luminaire.model.lad_filtering.LADFilteringModelError(message)
Exception class for Luminaire filtering anomaly detection model.