Luminaire Streaming Anomaly Detection Models: Window Density Model

class luminaire.model.window_density.WindowDensityHyperParams(freq='M', ignore_window=None, max_missing_train_prop=0.1, is_log_transformed=False, baseline_type='aggregated', detection_method=None, min_window_length=None, max_window_length=None, window_length=None, ma_window_length=None, detrend_method='ma')

Hyperparameter class for Luminaire Window density model.

Parameters
  • freq (str) – The frequency of the time series. Luminaire provides default configurations for ‘S’, ‘M’, ‘QM’, ‘H’, and ‘D’. Any other frequency should be specified as ‘custom’, with the configuration set manually.

  • ignore_window (int, optional) – A time window to exclude from training.

  • max_missing_train_prop (float, optional) – Maximum proportion of missing observations allowed in the training data.

  • is_log_transformed (bool, optional) – A flag to specify whether to take a log transform of the input data. If the data contain negative values, is_log_transformed is ignored even if it is set to True.

  • baseline_type (str, optional) –

    A string flag that specifies whether to use the previous sub-window from the training data as the scoring baseline or to aggregate the overall window into a baseline. Possible values:

    • “last_window”

    • “aggregated”

  • detection_method (str, optional) –

    A string that selects between two window testing methods. Possible values:

    • “kldiv” (KL divergence)

    • “sign_test” (Wilcoxon signed-rank test)

  • min_window_length (int, optional) – Minimum size of the scoring window / a stable training sub-window length.

    Note

    This is not the minimum size of the whole training window, which is the combination of stable sub-windows.

  • max_window_length (int, optional) – Maximum size of the scoring window / a stable training sub-window length.

    Note

    This is not the maximum size of the whole training window, which is the combination of stable sub-windows.

  • window_length (int, optional) – Size of the scoring window / a stable training sub-window length.

    Note

    This is not the size of the whole training window, which is the combination of stable sub-windows.

  • ma_window_length (int, optional) – Size of the window used to detrend the scoring window / stable training sub-windows through the moving average method.

    Note

    ma_window_length should be small enough to maintain the stable structure of the training / scoring window and large enough to remove the trend. The ideal size is somewhere between (0.1 * window_length) and (0.25 * window_length).

  • detrend_method (str, optional) –

    A string that selects between two stationarizing methods. Possible values:

    • “ma” (moving average based)

    • “diff” (differencing based)
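The two detrend_method options and the ma_window_length guidance above can be illustrated with a minimal, library-independent sketch. The function names below (moving_average_detrend, difference_detrend, suggested_ma_window_range) are hypothetical and are not part of Luminaire. The trailing-window moving average and lag-1 differencing are assumptions chosen for illustration; Luminaire's internal implementation may differ.

```python
def moving_average_detrend(values, ma_window_length):
    """Subtract a trailing moving average from each point (illustrative only).

    The first ma_window_length - 1 points have no full window and are dropped.
    """
    if not 1 <= ma_window_length <= len(values):
        raise ValueError("ma_window_length must be between 1 and len(values)")
    detrended = []
    window_sum = sum(values[:ma_window_length])
    for i in range(ma_window_length - 1, len(values)):
        if i >= ma_window_length:
            # Slide the window forward by one observation.
            window_sum += values[i] - values[i - ma_window_length]
        detrended.append(values[i] - window_sum / ma_window_length)
    return detrended


def difference_detrend(values, lag=1):
    """Remove trend by differencing each point against the point `lag` steps back."""
    return [values[i] - values[i - lag] for i in range(lag, len(values))]


def suggested_ma_window_range(window_length):
    """Return the (0.1 * window_length, 0.25 * window_length) guideline as integers."""
    return window_length // 10, window_length // 4


# A purely linear trend is reduced to a constant by either method.
trend = [2 * t for t in range(10)]
ma_result = moving_average_detrend(trend, ma_window_length=3)
diff_result = difference_detrend(trend)
```

For a window_length of 1440, the guideline range works out to 144 through 360 observations.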

class luminaire.model.window_density.WindowDensityModel(hyper_params: {‘freq’: ‘M’, ‘ignore_window’: None, ‘max_missing_train_prop’: 0.1, ‘is_log_transformed’: False, ‘baseline_type’: ‘aggregated’, ‘detection_method’: ‘kldiv’, ‘min_window_length’: 720, ‘max_window_length’: 120960, ‘window_length’: 1440, ‘ma_window_length’: 60, ‘detrend_method’: ‘ma’}, **kwargs)

This model detects anomalous windows using KL divergence (for high-frequency data) and the Wilcoxon signed-rank test (for low-frequency data).

Parameters

hyper_params (dict) – Hyperparameters for the Luminaire window density model. See luminaire.model.window_density.WindowDensityHyperParams for detailed information.

Returns

Anomaly probability for the execution window and other related model outputs

Return type

list[dict]
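To give some intuition for the KL-divergence comparison mentioned in the class description, the following is a minimal sketch of comparing a scoring window against a baseline window. It assumes a simple shared-histogram density estimate with additive smoothing; this is a generic illustration and not Luminaire's actual density estimation or scoring code. The names histogram_probs, kl_divergence, and window_kl are hypothetical.

```python
import math


def histogram_probs(values, lo, hi, bins, smoothing=1e-6):
    """Estimate bin probabilities on [lo, hi], smoothed so no bin is exactly zero."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # Clamp so the maximum value falls into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    total = len(values) + smoothing * bins
    return [(c + smoothing) / total for c in counts]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same bins."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def window_kl(baseline, window, bins=10):
    """KL divergence of the scoring window's histogram from the baseline's."""
    lo = min(min(baseline), min(window))
    hi = max(max(baseline), max(window))
    if lo == hi:
        return 0.0  # Both windows are the same constant; nothing to compare.
    p = histogram_probs(window, lo, hi, bins)
    q = histogram_probs(baseline, lo, hi, bins)
    return kl_divergence(p, q)


baseline = [float(i % 10) for i in range(100)]
shifted = [v + 20.0 for v in baseline]

same_score = window_kl(baseline, baseline)
shifted_score = window_kl(baseline, shifted)
```

A window with the same distribution as the baseline yields a divergence of zero, while a shifted window yields a large positive value. How such a divergence is converted into an anomaly probability is specific to the library and is not shown here.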

score(data, **kwargs)

Scores the input time series for anomalies.

Parameters

data (pandas.DataFrame) – Input time series to score

Returns

Output dictionary with scoring summary.

Return type

dict

>>> data
                        raw interpolated
index
2018-10-06 00:00:00  204800       204800
2018-10-06 01:00:00  222218       222218
2018-10-06 02:00:00  218903       218903
2018-10-06 03:00:00  190639       190639
2018-10-06 04:00:00  148214       148214
2018-10-06 05:00:00  106358       106358
2018-10-06 06:00:00   70081        70081
2018-10-06 07:00:00   47748        47748
2018-10-06 08:00:00   36837        36837
2018-10-06 09:00:00   33023        33023
2018-10-06 10:00:00   44432        44432
2018-10-06 11:00:00   72773        72773
2018-10-06 12:00:00  115180       115180
2018-10-06 13:00:00  157568       157568
2018-10-06 14:00:00  180174       180174
2018-10-06 15:00:00  190048       190048
2018-10-06 16:00:00  188391       188391
2018-10-06 17:00:00  189233       189233
2018-10-06 18:00:00  191703       191703
2018-10-06 19:00:00  189848       189848
2018-10-06 20:00:00  192685       192685
2018-10-06 21:00:00  196743       196743
2018-10-06 22:00:00  193016       193016
2018-10-06 23:00:00  196441       196441
>>> model
<luminaire.model.window_density.WindowDensityModel object at 0x7fcaab72fdd8>
>>> model.score(data)
{'Success': True, 'ConfLevel': 99.9, 'IsAnomaly': False, 'AnomalyProbability': 0.6963188902776808}
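The dictionary returned by score can be consumed downstream using the keys shown in the example output above. The following is a minimal sketch; the helper name summarize_score is hypothetical and not part of Luminaire, and the sketch assumes only the four keys that appear in the example.

```python
def summarize_score(result):
    """Build a one-line summary from a scoring result dictionary (illustrative)."""
    if not result.get("Success", False):
        return "scoring failed"
    status = "anomalous" if result["IsAnomaly"] else "normal"
    return "{} window (anomaly probability {:.3f}, confidence level {}%)".format(
        status, result["AnomalyProbability"], result["ConfLevel"]
    )


# The result dictionary from the example above.
example = {
    "Success": True,
    "ConfLevel": 99.9,
    "IsAnomaly": False,
    "AnomalyProbability": 0.6963188902776808,
}
summary = summarize_score(example)
print(summary)  # normal window (anomaly probability 0.696, confidence level 99.9%)
```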

train(data, **kwargs)

Trains the model on the input time series.

Parameters

data – Input time series.

Returns

Training summary with a success flag.

Return type

tuple(bool, python model object)

>>> data
                        raw interpolated
index
2017-10-02 00:00:00  118870       118870
2017-10-02 01:00:00  121914       121914
2017-10-02 02:00:00  116097       116097
2017-10-02 03:00:00   94511        94511
2017-10-02 04:00:00   68330        68330
...                     ...          ...
2018-10-10 19:00:00  219908       219908
2018-10-10 20:00:00  219149       219149
2018-10-10 21:00:00  207232       207232
2018-10-10 22:00:00  198741       198741
2018-10-10 23:00:00  213751       213751
>>> hyper_params = WindowDensityHyperParams(freq='H').params
>>> wdm_obj = WindowDensityModel(hyper_params=hyper_params)
>>> success, model = wdm_obj.train(data)
>>> success, model
(True, <luminaire.model.window_density.WindowDensityModel object at 0x7fd7c5a34e80>)