Luminaire Outlier Detection Models: Structural Modeling

exception luminaire.model.lad_structural.LADStructuralError(message)

Exception class for Luminaire structural anomaly detection model.

class luminaire.model.lad_structural.LADStructuralHyperParams(include_holidays_exog=True, p=2, q=2, is_log_transformed=True, max_ft_freq=3)

Hyperparameter class for Luminaire structural anomaly detection model.

Parameters
  • include_holidays_exog (bool, optional) – Whether to include holidays as exogenous variables in the regression. Holidays are defined in LADHolidays.

  • p (int, optional) – Order for the AR component of the model.

  • q (int, optional) – Order for the MA component of the model.

  • is_log_transformed (bool, optional) – A flag to specify whether to take a log transform of the input data. If the data contain negative values, is_log_transformed is ignored even if it is set to True.

  • max_ft_freq (int, optional) – The maximum frequency order for the Fourier transformation.
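
To make max_ft_freq more concrete, the sketch below builds sine and cosine regressors up to a given order. It illustrates the general idea of Fourier seasonality terms only. The helper name fourier_terms and the weekly period of 7 are assumptions made for illustration, not Luminaire's internal code.

```python
import math


def fourier_terms(t, period, max_order):
    """Return sin/cos pairs for orders 1..max_order at time index t.

    Hypothetical helper for illustration; not part of Luminaire.
    """
    terms = []
    for k in range(1, max_order + 1):
        angle = 2.0 * math.pi * k * t / period
        terms.extend([math.sin(angle), math.cos(angle)])
    return terms


# Daily data with an assumed weekly period and an order of 3 (the default max_ft_freq)
row = fourier_terms(t=0, period=7, max_order=3)
print(len(row))  # one sin/cos pair per order
```

Each additional order adds one sine and one cosine column, so a larger max_ft_freq lets a regression capture finer seasonal shapes at the cost of more parameters.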

class luminaire.model.lad_structural.LADStructuralModel(hyper_params: {‘include_holidays_exog’: True, ‘p’: 2, ‘q’: 2, ‘is_log_transformed’: True, ‘max_ft_freq’: 3}, freq, min_ts_length=None, max_ts_length=None, min_ts_mean=None, min_ts_mean_window=None, **kwargs)

A LAD structural time series model.

Parameters
  • hyper_params (dict) – Hyper parameters for Luminaire structural modeling. See luminaire.optimization.hyperparameter_optimization.HyperparameterOptimization for detailed information.

  • freq (str) – The frequency of the time-series. A Pandas offset such as ‘D’, ‘H’, or ‘M’.

  • min_ts_length (int, optional) – The minimum required length of the time series for training.

  • max_ts_length (int, optional) – The maximum required length of the time series for training.

  • min_ts_mean (float, optional) – Minimum average value in the most recent window of the time series. This optional parameter can be used to avoid over-alerting on noisy, low-volume time series.

  • min_ts_mean_window (int, optional) – Size of the most recent window to calculate min_ts_mean.
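
As a rough illustration of how min_ts_mean and min_ts_mean_window relate, the sketch below averages the most recent window and compares it with a threshold. The helper passes_min_mean is hypothetical, and how Luminaire applies this check internally is not shown here. The sample values are the last five observations of the series printed in the train example of this section.

```python
def passes_min_mean(values, min_ts_mean, min_ts_mean_window):
    """Hypothetical check: is the mean of the last window at least min_ts_mean?"""
    recent = values[-min_ts_mean_window:]
    return sum(recent) / len(recent) >= min_ts_mean


# Last five daily values from the sample series (2020-06-03 to 2020-06-07)
recent_values = [1934.0, 1873.0, 1674.0, 1747.0, 1782.0]
print(passes_min_mean(recent_values, min_ts_mean=1000, min_ts_mean_window=5))
```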

Note

This class should be used to manually configure the structural model. Exact configuration parameters can be found in luminaire.optimization.hyperparameter_optimization.HyperparameterOptimization. An optimal configuration can be obtained by using LAD hyperparameter optimization.

>>> hyper = {"include_holidays_exog": 0, "is_log_transformed": 1, "max_ft_freq": 2, "p": 5, "q": 1}
>>> lad_struct_model = LADStructuralModel(hyper_params=hyper, freq='D')
>>> lad_struct_model
<luminaire.model.lad_structural.LADStructuralModel object at 0x103efe320>
score(observed_value, pred_date, **kwargs)

This function scores a value observed at a data date given a trained LAD structural model object.

Parameters
  • observed_value (float) – Observed time series value on the prediction date.

  • pred_date (str) – Prediction date. Needs to be in yyyy-mm-dd or yyyy-mm-dd hh:mm:ss format.

Returns

Anomaly flag, anomaly probability, prediction and other related metrics.

Return type

dict

>>> model
<luminaire.model.lad_structural.LADStructuralModel object at 0x11c1c3550>
>>> model._params['training_end_date'] # Last data date for training time series
'2020-06-07 00:00:00'
>>> model.score(2000, '2020-06-08')
{'Success': True, 'IsLogTransformed': 0, 'AdjustedActual': 2000, 'Prediction': 1943.20426163425,
'StdErr': 93.084646777553, 'CILower': 1785.519523590432, 'CIUpper': 2100.88899967807, 'ConfLevel': 90.0,
'ExogenousHolidays': 0, 'IsAnomaly': False, 'IsAnomalyExtreme': False, 'AnomalyProbability': 0.42671448831719605,
'DownAnomalyProbability': 0.286642755841402, 'UpAnomalyProbability': 0.713357244158598, 'ModelFreshness': 0.1}
>>> model.score(2500, '2020-06-09')
{'Success': True, 'IsLogTransformed': 0, 'AdjustedActual': 2500, 'Prediction': 2028.989933854948,
'StdErr': 93.6623172459385, 'CILower': 1861.009403637476, 'CIUpper': 2186.97046407242, 'ConfLevel': 90.0,
'ExogenousHolidays': 0, 'IsAnomaly': True, 'IsAnomalyExtreme': True, 'AnomalyProbability': 0.9999987021695071,
'DownAnomalyProbability': 6.489152464261849e-07, 'UpAnomalyProbability': 0.9999993510847536,
'ModelFreshness': 0.2}
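
A few relationships in the first output above can be checked by hand. The sketch below copies selected fields from that output and verifies that the observed value lies inside the confidence interval, that the up and down probabilities sum to one, and that AnomalyProbability matches twice the larger directional probability minus one. These are observations about the printed numbers, not a description of how Luminaire computes them.

```python
import math

# Selected fields copied from the first score() output above
result = {
    "AdjustedActual": 2000,
    "CILower": 1785.519523590432,
    "CIUpper": 2100.88899967807,
    "IsAnomaly": False,
    "AnomalyProbability": 0.42671448831719605,
    "DownAnomalyProbability": 0.286642755841402,
    "UpAnomalyProbability": 0.713357244158598,
}

up = result["UpAnomalyProbability"]
down = result["DownAnomalyProbability"]

# Observed value falls within [CILower, CIUpper]
inside_ci = result["CILower"] <= result["AdjustedActual"] <= result["CIUpper"]
# Directional probabilities are complementary in this output
sums_to_one = math.isclose(up + down, 1.0)
# Overall probability agrees with 2 * max(up, down) - 1 in this output
matches_two_sided = math.isclose(result["AnomalyProbability"], 2 * max(up, down) - 1)

print(inside_ci, sums_to_one, matches_two_sided)
```

In practice, IsAnomaly and IsAnomalyExtreme are the fields to act on, while the directional probabilities indicate whether the observation sits above or below the prediction.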
train(data, optimize=False, **kwargs)

This function trains a structural LAD model for a given time series.

Parameters
  • data (pandas.DataFrame) – Input time series data

  • optimize (bool, optional) – Flag indicating whether train is being called from hyperparameter optimization.

Returns

Success flag, the model date, and the trained LAD structural model object.

Return type

tuple[bool, str, LADStructuralModel object]

>>> data
               raw interpolated
2020-01-01  1326.0       1326.0
2020-01-02  1552.0       1552.0
2020-01-03  1432.0       1432.0
2020-01-04  1470.0       1470.0
2020-01-05  1565.0       1565.0
...            ...          ...
2020-06-03  1934.0       1934.0
2020-06-04  1873.0       1873.0
2020-06-05  1674.0       1674.0
2020-06-06  1747.0       1747.0
2020-06-07  1782.0       1782.0
>>> hyper = {"include_holidays_exog": 0, "is_log_transformed": 0, "max_ft_freq": 2, "p": 5, "q": 1}
>>> de_obj = DataExploration(freq='D', is_log_transformed=0)
>>> data, pre_prc = de_obj.profile(data)
>>> pre_prc
{'success': True, 'trend_change_list': ['2020-04-01 00:00:00'], 'change_point_list': ['2020-03-16 00:00:00'],
'is_log_transformed': 0, 'min_ts_mean': None, 'ts_start': '2020-01-01 00:00:00',
'ts_end': '2020-06-07 00:00:00'}
>>> lad_struct_obj = LADStructuralModel(hyper_params=hyper, freq='D')
>>> model = lad_struct_obj.train(data=data, **pre_prc)
>>> model
(True, '2020-06-07 00:00:00', <luminaire.model.lad_structural.LADStructuralModel object at 0x126edf588>)
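
Because train returns a tuple rather than the model alone, it helps to unpack the result and check the success flag before scoring. The sketch below shows one way to do that. The helper handle_train_result is hypothetical, and a plain object() stands in for the trained model so the snippet runs without Luminaire installed. What the remaining tuple elements hold when the flag is False is not specified here.

```python
def handle_train_result(train_result):
    """Unpack the (success, model_date, model) tuple returned by train().

    Hypothetical helper for illustration; raises if training did not succeed.
    """
    success, model_date, trained_model = train_result
    if not success:
        raise RuntimeError("training failed; no model available for scoring")
    return model_date, trained_model


# Stand-in tuple shaped like the output above; object() replaces the real model
model_date, trained_model = handle_train_result(
    (True, "2020-06-07 00:00:00", object())
)
print(model_date)
```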

Luminaire Outlier Detection Models: Factoring holidays as exogenous

class luminaire.model.model_utils.LADHolidays(name=None, holiday_rules=None)

A class that generates holiday calendars to be used as external features in the batch outlier detection model. By default, holidays include:

  • Memorial Day, plus the weekend leading into it

  • Veterans Day, plus the weekend leading into it

  • Labor Day

  • President’s Day

  • Martin Luther King Jr. Day

  • Valentine’s Day

  • Mother’s Day

  • Father’s Day

  • Independence Day (actual and observed)

  • Halloween

  • Super Bowl

  • Easter

  • Thanksgiving, plus the following weekend

  • Christmas Eve, Christmas Day, and all dates up to New Year’s Day (actual and observed)
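
Several entries in the list above are floating holidays with attached weekends. As a minimal sketch of what such rules look like, the code below computes two of them with the standard library. The function names are hypothetical and this is not how LADHolidays is implemented. The sketch also assumes that "the following weekend" means the Saturday and Sunday after Thanksgiving.

```python
from datetime import date, timedelta


def memorial_day_with_weekend(year):
    """Last Monday of May plus the preceding Saturday and Sunday."""
    day = date(year, 5, 31)
    while day.weekday() != 0:  # 0 = Monday
        day -= timedelta(days=1)
    return [day - timedelta(days=2), day - timedelta(days=1), day]


def thanksgiving_with_weekend(year):
    """Fourth Thursday of November plus the following Saturday and Sunday."""
    first = date(year, 11, 1)
    offset = (3 - first.weekday()) % 7  # days until the first Thursday (3 = Thursday)
    thanksgiving = first + timedelta(days=offset + 21)
    return [
        thanksgiving,
        thanksgiving + timedelta(days=2),
        thanksgiving + timedelta(days=3),
    ]


print(memorial_day_with_weekend(2020))
print(thanksgiving_with_weekend(2020))
```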

Luminaire Outlier Detection Models: Kalman Filter

class luminaire.model.lad_filtering.LADFilteringHyperParams(is_log_transformed=True)

Hyperparameter class for Luminaire filtering anomaly detection model.

Parameters

is_log_transformed (bool, optional) – A flag to specify whether to take a log transform of the input data. If the data contain negative values, is_log_transformed is ignored even if it is set to True.

class luminaire.model.lad_filtering.LADFilteringModel(hyper_params: {‘is_log_transformed’: True}, freq, min_ts_length=None, max_ts_length=None, **kwargs)

A Markovian state space model. This model detects anomalies based on the residual process obtained through Kalman filter-based model estimation.

Parameters
  • hyper_params (dict) – Hyper parameters for Luminaire filtering modeling. See luminaire.optimization.hyperparameter_optimization.HyperparameterOptimization for detailed information.

  • freq (str) – The frequency of the time-series. A Pandas offset such as ‘D’, ‘H’, or ‘M’.

  • min_ts_length (int, optional) – The minimum required length of the time series for training.

  • max_ts_length (int, optional) – The maximum required length of the time series for training.
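
To illustrate residual-based detection with a Kalman filter in general terms, the sketch below runs a one-dimensional local-level filter and flags points whose standardized one-step-ahead residual is large. This is a textbook simplification with fixed, assumed variances and an assumed threshold of 3. It is not the model that LADFilteringModel fits, and the function name is hypothetical.

```python
import math


def local_level_residuals(series, process_var=1.0, obs_var=1.0):
    """Run a 1-D local-level Kalman filter and return standardized residuals.

    Illustrative only: variances are fixed assumptions, not estimated from data.
    """
    level, level_var = series[0], obs_var  # initialize from the first observation
    residuals = []
    for y in series[1:]:
        pred_var = level_var + process_var      # predict step
        innovation = y - level                  # one-step-ahead residual
        innovation_var = pred_var + obs_var
        residuals.append(innovation / math.sqrt(innovation_var))
        gain = pred_var / innovation_var        # Kalman gain
        level = level + gain * innovation       # update step
        level_var = (1.0 - gain) * pred_var
    return residuals


series = [10.0, 10.5, 9.8, 10.2, 25.0]
flags = [abs(r) > 3.0 for r in local_level_residuals(series)]
print(flags)
```

Only the final jump to 25.0 produces a residual far outside the range the filter expects, which is the intuition behind scoring new observations against the filtered state.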

>>> hyper = {"is_log_transformed": 1}
>>> lad_filtering_model = LADFilteringModel(hyper_params=hyper, freq='D')
>>> lad_filtering_model
<luminaire.model.lad_filtering.LADFilteringModel object at 0x103efe320>
score(observed_value, pred_date, synthetic_actual=None, **kwargs)

This function scores a value observed at a data date given a trained LAD filtering model object.

Parameters
  • observed_value (float) – Observed time series value on the prediction date.

  • pred_date (str) – Prediction date. Needs to be in yyyy-mm-dd or yyyy-mm-dd hh:mm:ss format.

  • synthetic_actual (float, optional) – Synthetic time series value. This is an artificial value used to optimize classification accuracy in Luminaire hyperparameter optimization.

Returns

Model results and LAD filtering model object

Return type

tuple[dict, LADFilteringModel object]

>>> model
<luminaire.model.lad_filtering.LADFilteringModel object at 0x11f0b2b38>
>>> model._params['training_end_date']
'2020-06-07 00:00:00'
>>> model.score(2000, '2020-06-08')
({'Success': True, 'AdjustedActual': 0.10110881711268949, 'ConfLevel': 90.0, 'Prediction': 1934.153554885343,
'PredStdErr': 212.4399633739204, 'IsAnomaly': False, 'IsAnomalyExtreme': False,
'AnomalyProbability': 0.4244056403219776, 'DownAnomalyProbability': 0.2877971798390112,
'UpAnomalyProbability': 0.7122028201609888, 'NonStationarityDiffOrder': 2, 'ModelFreshness': 0.1},
<luminaire.model.lad_filtering.LADFilteringModel object at 0x11f3c0860>)
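
The pred_date argument must be a string in yyyy-mm-dd or yyyy-mm-dd hh:mm:ss format, and the example above scores the day after training_end_date. The sketch below derives such date strings from a training end date using only the standard library. The helper next_pred_dates is hypothetical and assumes a daily frequency ('D').

```python
from datetime import datetime, timedelta


def next_pred_dates(training_end_date, periods):
    """Return `periods` daily pred_date strings following training_end_date.

    Hypothetical helper; assumes daily frequency and the
    'yyyy-mm-dd hh:mm:ss' format shown for training_end_date.
    """
    end = datetime.strptime(training_end_date, "%Y-%m-%d %H:%M:%S")
    return [
        (end + timedelta(days=i)).strftime("%Y-%m-%d")
        for i in range(1, periods + 1)
    ]


print(next_pred_dates("2020-06-07 00:00:00", 2))
```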
train(data, **kwargs)

This function trains a filtering LAD model for a given time series.

Parameters

data (pandas.DataFrame) – Input time series data

Returns

Success flag, the model date, and the trained LAD filtering model object.

Return type

tuple[bool, str, LADFilteringModel object]

>>> data
               raw interpolated
2020-01-01  1326.0       1326.0
2020-01-02  1552.0       1552.0
2020-01-03  1432.0       1432.0
2020-01-04  1470.0       1470.0
2020-01-05  1565.0       1565.0
...            ...          ...
2020-06-03  1934.0       1934.0
2020-06-04  1873.0       1873.0
2020-06-05  1674.0       1674.0
2020-06-06  1747.0       1747.0
2020-06-07  1782.0       1782.0
>>> hyper = {"is_log_transformed": 1}
>>> de_obj = DataExploration(freq='D', is_log_transformed=1, fill_rate=0.95)
>>> data, pre_prc = de_obj.profile(data)
>>> pre_prc
{'success': True, 'trend_change_list': ['2020-04-01 00:00:00'], 'change_point_list': ['2020-03-16 00:00:00'],
'is_log_transformed': 1, 'min_ts_mean': None, 'ts_start': '2020-01-01 00:00:00',
'ts_end': '2020-06-07 00:00:00'}
>>> lad_filter_obj = LADFilteringModel(hyper_params=hyper, freq='D')
>>> model = lad_filter_obj.train(data=data, **pre_prc)
>>> model
(True, '2020-06-07 00:00:00', <luminaire.model.lad_filtering.LADFilteringModel object at 0x11b6c4f60>)
exception luminaire.model.lad_filtering.LADFilteringModelError(message)

Exception class for Luminaire filtering anomaly detection model.