Luminaire Streaming Anomaly Detection Models: Window Density Model
class luminaire.model.window_density.WindowDensityHyperParams(freq='M', ignore_window=None, max_missing_train_prop=0.1, is_log_transformed=False, baseline_type='aggregated', detection_method=None, min_window_length=None, max_window_length=None, window_length=None, ma_window_length=None, detrend_method='ma')
Hyperparameter class for Luminaire Window density model.
- Parameters
freq (str) – The frequency of the time-series. Luminaire supports default configuration for ‘S’, ‘M’, ‘QM’, ‘H’, ‘D’. Any other frequency type should be specified as ‘custom’ and configuration should be set manually.
ignore_window (int, optional) – A time window to be ignored during training.
max_missing_train_prop (float, optional) – Maximum proportion of missing observations allowed in the training data.
is_log_transformed (bool, optional) – A flag to specify whether to take a log transform of the input data. If the data contain negative values, this flag is ignored even if it is set to True.
baseline_type (str, optional) –
A string flag to specify whether to use the previous sub-window from the training data as the baseline for scoring, or to aggregate the overall window into a baseline. Possible values:
"last_window"
"aggregated"
detection_method (str, optional) –
A string that selects between two window testing methods. Possible values:
"kldiv" (KL divergence)
"sign_test" (Wilcoxon signed-rank test)
min_window_length (int, optional) – Minimum size of the scoring window / a stable training sub-window length.
Note
This is not the minimum size of the whole training window, which is the combination of stable sub-windows.
max_window_length (int, optional) – Maximum size of the scoring window / a stable training sub-window length.
Note
This is not the maximum size of the whole training window, which is the combination of stable sub-windows.
window_length (int, optional) – Size of the scoring window / a stable training sub-window length.
Note
This is not the size of the whole training window, which is the combination of stable sub-windows.
ma_window_length (int, optional) – Size of the moving-average window used to detrend the scoring window / stable training sub-windows.
Note
ma_window_length should be small enough to maintain the stable structure of the training / scoring window and large enough to remove the trend. The ideal size is typically between (0.1 * window_length) and (0.25 * window_length).
detrend_method (str, optional) –
A string that selects between two stationarizing methods. Possible values:
"ma" (moving average based)
"diff" (differencing based)
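The two detrending options and the ma_window_length guidance above can be illustrated with a small standalone sketch. It is not Luminaire's implementation: the helper names (recommended_ma_range, detrend_ma, detrend_diff) are hypothetical, and the use of a trailing (rather than centered) moving average is an assumption made only for illustration.

```python
def recommended_ma_range(window_length):
    """Hypothetical helper: the range suggested in the note above,
    i.e. between 0.1 * window_length and 0.25 * window_length."""
    return max(1, window_length // 10), max(1, window_length // 4)


def detrend_ma(values, ma_window_length):
    """Subtract a trailing moving average from each point (assumed variant).

    The first ma_window_length - 1 points have no full window and are dropped.
    """
    detrended = []
    for i in range(ma_window_length - 1, len(values)):
        window = values[i - ma_window_length + 1 : i + 1]
        detrended.append(values[i] - sum(window) / ma_window_length)
    return detrended


def detrend_diff(values):
    """First-order differencing: each point minus its predecessor."""
    return [b - a for a, b in zip(values, values[1:])]


# A series with a pure linear trend becomes constant under both methods.
trend = [2 * i for i in range(10)]
ma_result = detrend_ma(trend, 3)
diff_result = detrend_diff(trend)
ma_low, ma_high = recommended_ma_range(100)
```

On the linear series, both helpers return a constant sequence, which is the sense in which either method removes the trend; on real data the choice of ma_window_length trades off trend removal against preserving the window's structure, as the note describes.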
class luminaire.model.window_density.WindowDensityModel(hyper_params: {'freq': 'M', 'ignore_window': None, 'max_missing_train_prop': 0.1, 'is_log_transformed': False, 'baseline_type': 'aggregated', 'detection_method': 'kldiv', 'min_window_length': 720, 'max_window_length': 120960, 'window_length': 1440, 'ma_window_length': 60, 'detrend_method': 'ma'}, **kwargs)
This model detects anomalous windows using KL divergence (for high-frequency data) and the Wilcoxon signed-rank test (for low-frequency data).
- Parameters
hyper_params (dict) – Hyper parameters for Luminaire window density model. See luminaire.model.window_density.WindowDensityHyperParams for detailed information.
- Returns
Anomaly probability for the execution window and other related model outputs
- Return type
list[dict]
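As a rough illustration of the KL divergence idea mentioned above, the following standalone sketch compares the binned distributions of a baseline window and a scoring window. It does not reproduce Luminaire's internals: the bin edges, the additive smoothing, and the helper names (histogram, to_probs, kl_divergence) are all assumptions, and this page does not describe how Luminaire turns a divergence into an anomaly probability.

```python
import bisect
import math


def histogram(values, edges):
    """Count values into the bins defined by sorted edges.

    Values outside [edges[0], edges[-1]] are ignored; the last bin is closed
    on the right.
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        if edges[0] <= v <= edges[-1]:
            k = min(bisect.bisect_right(edges, v) - 1, len(counts) - 1)
            counts[k] += 1
    return counts


def to_probs(counts, smoothing=1.0):
    """Convert counts to probabilities with additive smoothing, so that no
    bin has zero probability (assumed choice, needed for a finite KL value)."""
    total = sum(counts) + smoothing * len(counts)
    return [(c + smoothing) / total for c in counts]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same bins."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


edges = [0, 2, 4, 6, 8, 10]
baseline = [1, 2, 2, 3, 3, 3, 4, 4, 5]
shifted = [v + 3 for v in baseline]  # same shape, shifted level

p = to_probs(histogram(baseline, edges))
q = to_probs(histogram(shifted, edges))

same_window = kl_divergence(p, p)     # identical distributions
shifted_window = kl_divergence(q, p)  # scoring window vs. baseline
```

Identical windows give a divergence of zero, while the shifted window gives a strictly positive value; a window-level detector of this kind flags a window when its divergence from the baseline is unusually large.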
score(data, **kwargs)
Scores the input series for anomalies.
- Parameters
data (pandas.DataFrame) – Input time series to score
- Returns
Output dictionary with scoring summary.
- Return type
dict
>>> data
                        raw interpolated
index
2018-10-06 00:00:00  204800       204800
2018-10-06 01:00:00  222218       222218
2018-10-06 02:00:00  218903       218903
2018-10-06 03:00:00  190639       190639
2018-10-06 04:00:00  148214       148214
2018-10-06 05:00:00  106358       106358
2018-10-06 06:00:00   70081        70081
2018-10-06 07:00:00   47748        47748
2018-10-06 08:00:00   36837        36837
2018-10-06 09:00:00   33023        33023
2018-10-06 10:00:00   44432        44432
2018-10-06 11:00:00   72773        72773
2018-10-06 12:00:00  115180       115180
2018-10-06 13:00:00  157568       157568
2018-10-06 14:00:00  180174       180174
2018-10-06 15:00:00  190048       190048
2018-10-06 16:00:00  188391       188391
2018-10-06 17:00:00  189233       189233
2018-10-06 18:00:00  191703       191703
2018-10-06 19:00:00  189848       189848
2018-10-06 20:00:00  192685       192685
2018-10-06 21:00:00  196743       196743
2018-10-06 22:00:00  193016       193016
2018-10-06 23:00:00  196441       196441
>>> model
<luminaire.model.window_density.WindowDensityModel object at 0x7fcaab72fdd8>
>>> model.score(data)
{'Success': True, 'ConfLevel': 99.9, 'IsAnomaly': False, 'AnomalyProbability': 0.6963188902776808}
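The scoring summary is a plain dictionary, so downstream code can consume it directly. The following is a minimal sketch of one way to do that; summarize_score is a hypothetical helper, and only the four keys shown in the example output above are assumed to be present.

```python
def summarize_score(result):
    """Turn a scoring summary dict into a one-line message (hypothetical helper).

    Relies only on the keys shown in the example output:
    'Success', 'ConfLevel', 'IsAnomaly', 'AnomalyProbability'.
    """
    if not result.get('Success'):
        return 'scoring failed'
    status = 'anomalous' if result['IsAnomaly'] else 'normal'
    return (
        f"window is {status} "
        f"(probability {result['AnomalyProbability']:.3f}, "
        f"confidence level {result['ConfLevel']})"
    )


# The dictionary from the example above.
example = {
    'Success': True,
    'ConfLevel': 99.9,
    'IsAnomaly': False,
    'AnomalyProbability': 0.6963188902776808,
}
message = summarize_score(example)
```

Checking 'Success' before reading the other keys is a defensive choice here, since this page does not show what the dictionary contains when scoring fails.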
train(data, **kwargs)
Trains the model on the input time series.
- Parameters
data (pandas.DataFrame) – Input time series.
- Returns
Training summary with a success flag.
- Return type
tuple(bool, python model object)
>>> data
                        raw interpolated
index
2017-10-02 00:00:00  118870       118870
2017-10-02 01:00:00  121914       121914
2017-10-02 02:00:00  116097       116097
2017-10-02 03:00:00   94511        94511
2017-10-02 04:00:00   68330        68330
...                     ...          ...
2018-10-10 19:00:00  219908       219908
2018-10-10 20:00:00  219149       219149
2018-10-10 21:00:00  207232       207232
2018-10-10 22:00:00  198741       198741
2018-10-10 23:00:00  213751       213751
>>> hyper_params = WindowDensityHyperParams(freq='H').params
>>> wdm_obj = WindowDensityModel(hyper_params=hyper_params)
>>> success, model = wdm_obj.train(data)
>>> success, model
(True, <luminaire.model.window_density.WindowDensityModel object at 0x7fd7c5a34e80>)
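Because max_missing_train_prop caps the proportion of missing observations allowed in the training data, it can be useful to check a series before calling train. The sketch below is a standalone illustration under stated assumptions: missing_proportion and within_missing_limit are hypothetical helpers, missing values are assumed to appear as None or NaN, and how Luminaire itself counts missing observations is not described on this page.

```python
import math


def missing_proportion(values):
    """Fraction of entries that are None or NaN (assumed missing markers)."""
    if not values:
        raise ValueError("values must not be empty")
    missing = sum(
        1 for v in values
        if v is None or (isinstance(v, float) and math.isnan(v))
    )
    return missing / len(values)


def within_missing_limit(values, max_missing_train_prop=0.1):
    """True if the missing fraction does not exceed the given limit.

    The default of 0.1 mirrors the default of max_missing_train_prop.
    """
    return missing_proportion(values) <= max_missing_train_prop


raw = [1, None, 3, float('nan'), 5, 6, 7, 8, 9, 10]
prop = missing_proportion(raw)            # 2 of 10 entries are missing
ok_default = within_missing_limit(raw)    # compared against 0.1
ok_relaxed = within_missing_limit(raw, 0.25)
```

A check like this only indicates whether the data would plausibly satisfy the limit; the success flag returned by train remains the authoritative signal.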