04 Model Training¶

[Estimated execution time: 9 min]

The toolkit allows training models while keeping some parameters fixed. In this notebook we will show:

  • Training of the model
  • Training while keeping some parameters fixed
  • Saving and loading models
In [1]:
import mogptk
import numpy as np
import pandas as pd

Load data set¶

For this tutorial we will use the air quality data set. The data set contains hourly averaged responses from an array of five metal oxide chemical sensors embedded in an Air Quality Chemical Multisensor Device. The device was located in the field in a significantly polluted area of an Italian city. Data were recorded for one year, starting in March 2004, and represent the longest freely available recordings of a deployed air quality chemical sensor device.

We will only use five columns: CO(GT), NMHC(GT), C6H6(GT), NOx(GT), NO2(GT). For more information on data loading check out 01 Data Loading. For more information on data handling check out 02 Data Preparation.

For each sensor the value -200 is used as a placeholder for missing or erroneous measurements. We will ignore these readings by converting them to NaN.

In [2]:
df = pd.read_csv('data/AirQualityUCI.csv', delimiter=';')

# Replace missing values with NaN
df.replace(-200.0, np.nan, inplace=True)

# First two columns are date and time
# We convert them to a single column in datetime format
df['Date'] = pd.to_datetime(df['Date'] + ' ' + df['Time'], format='%d/%m/%Y %H.%M.%S')

# Define a reference date to measure elapsed hours from
ini_date = pd.Timestamp('2004-03-10 00:00:00.0')

# Get elapsed hours
df['Time'] = (df['Date'] - ini_date) / pd.Timedelta(hours=1)

# Use only the first eight days of data
df2 = df[df['Date'] < pd.Timestamp('2004-03-19 00:00:00.0')]

cols = ['CO(GT)', 'NMHC(GT)', 'C6H6(GT)', 'NOx(GT)', 'NO2(GT)']
dataset = mogptk.LoadDataFrame(df2, x_col='Time', y_col=cols)

Remove additional data to simulate sensor failure. For each channel we first remove 40% of the observations at random and then remove complete ranges, so that the model has to reconstruct them from the other channels through the learned cross-correlations.

In [3]:
for channel in dataset:
    channel.remove_randomly(pct=0.4)

# drop relative ranges to simulate sensor failure
dataset[0].remove_relative_range(0.2, 0.3)
dataset[1].remove_relative_range(0.8, 1.0)
dataset[2].remove_relative_range(0.9, 1.0)
dataset[3].remove_relative_range(0.8, 1.0)
dataset[4].remove_relative_range(0.0, 0.2)

for channel in dataset:
    channel.transform(mogptk.TransformDetrend(degree=1))
    channel.transform(mogptk.TransformStandard())

dataset.plot(transformed=True);

Create model¶

We create a multi-output Gaussian process model with a MOSM kernel and initialize its parameters using the BNSE method.

In [6]:
model = mogptk.MOSM(dataset, Q=3)
model.init_parameters(method='BNSE')
model.plot_prediction(title='Prediction of untrained model', transformed=True);

Train model¶

We now train the model using mogptk.Model.train for a limited number of iterations in order to obtain a preliminary estimate of the parameters.

In [7]:
model.train(method='Adam', lr=0.1, iters=1000, verbose=True);
Starting optimization using Adam
‣ Model: MOSM
‣ Channels: 5
‣ Parameters: 80
‣ Training points: 487
‣ Initial loss: 654.106

Start Adam:
     0/1000   0:00:00  loss=     654.106
    10/1000   0:00:01  loss=     674.926
    20/1000   0:00:02  loss=     669.321
    30/1000   0:00:04  loss=     666.539
    40/1000   0:00:05  loss=     661.297
    50/1000   0:00:06  loss=     655.917
    60/1000   0:00:07  loss=     650.004
    70/1000   0:00:08  loss=     642.524
    80/1000   0:00:09  loss=     634.656
    90/1000   0:00:10  loss=     626.946
   100/1000   0:00:10  loss=     619.651
   110/1000   0:00:11  loss=      613.15
   120/1000   0:00:12  loss=     610.893
   130/1000   0:00:13  loss=     604.817
   140/1000   0:00:14  loss=     600.995
   150/1000   0:00:15  loss=     599.471
   160/1000   0:00:15  loss=     598.367
   170/1000   0:00:16  loss=     597.703
   180/1000   0:00:17  loss=     597.291
   190/1000   0:00:18  loss=     597.065
   200/1000   0:00:19  loss=     596.813
   210/1000   0:00:20  loss=     596.656
   220/1000   0:00:21  loss=     596.599
   230/1000   0:00:21  loss=     596.409
   240/1000   0:00:22  loss=     596.413
   250/1000   0:00:23  loss=     596.346
   260/1000   0:00:24  loss=     596.239
   270/1000   0:00:25  loss=     596.127
   280/1000   0:00:26  loss=     596.035
   290/1000   0:00:27  loss=     596.134
   300/1000   0:00:28  loss=     596.008
   310/1000   0:00:29  loss=     595.912
   320/1000   0:00:30  loss=     595.874
   330/1000   0:00:31  loss=     595.779
   340/1000   0:00:32  loss=     595.745
   350/1000   0:00:32  loss=     595.783
   360/1000   0:00:33  loss=     595.699
   370/1000   0:00:34  loss=     595.635
   380/1000   0:00:35  loss=     595.783
   390/1000   0:00:36  loss=     595.466
   400/1000   0:00:37  loss=     595.494
   410/1000   0:00:38  loss=     595.481
   420/1000   0:00:39  loss=     595.402
   430/1000   0:00:40  loss=     595.419
   440/1000   0:00:41  loss=     595.407
   450/1000   0:00:42  loss=     595.243
   460/1000   0:00:43  loss=     595.307
   470/1000   0:00:43  loss=     595.222
   480/1000   0:00:45  loss=     595.085
   490/1000   0:00:46  loss=     595.186
   500/1000   0:00:47  loss=     595.107
   510/1000   0:00:48  loss=     595.349
   520/1000   0:00:48  loss=      595.11
   530/1000   0:00:49  loss=      595.54
   540/1000   0:00:50  loss=     595.104
   550/1000   0:00:51  loss=     594.955
   560/1000   0:00:52  loss=     594.957
   570/1000   0:00:53  loss=     594.803
   580/1000   0:00:54  loss=     594.941
   590/1000   0:00:54  loss=     594.802
   600/1000   0:00:55  loss=     594.785
   610/1000   0:00:56  loss=     594.853
   620/1000   0:00:57  loss=     594.864
   630/1000   0:00:58  loss=     594.603
   640/1000   0:00:58  loss=     594.815
   650/1000   0:00:59  loss=     594.827
   660/1000   0:01:00  loss=     594.563
   670/1000   0:01:01  loss=     594.564
   680/1000   0:01:02  loss=     594.495
   690/1000   0:01:03  loss=      594.58
   700/1000   0:01:03  loss=      594.43
   710/1000   0:01:04  loss=     594.596
   720/1000   0:01:05  loss=     594.629
   730/1000   0:01:06  loss=     594.463
   740/1000   0:01:07  loss=     594.418
   750/1000   0:01:07  loss=      594.62
   760/1000   0:01:08  loss=     594.498
   770/1000   0:01:09  loss=     594.295
   780/1000   0:01:10  loss=     594.396
   790/1000   0:01:11  loss=     594.455
   800/1000   0:01:12  loss=      594.22
   810/1000   0:01:13  loss=     594.211
   820/1000   0:01:14  loss=     594.281
   830/1000   0:01:15  loss=     594.223
   840/1000   0:01:16  loss=     594.116
   850/1000   0:01:17  loss=     594.314
   860/1000   0:01:19  loss=      594.18
   870/1000   0:01:20  loss=     594.736
   880/1000   0:01:21  loss=     594.133
   890/1000   0:01:23  loss=      594.17
   900/1000   0:01:24  loss=     594.218
   910/1000   0:01:25  loss=     594.312
   920/1000   0:01:26  loss=     594.176
   930/1000   0:01:28  loss=     594.038
   940/1000   0:01:29  loss=     594.092
   950/1000   0:01:30  loss=     594.047
   960/1000   0:01:32  loss=     594.044
   970/1000   0:01:33  loss=     594.152
   980/1000   0:01:34  loss=     594.502
   990/1000   0:01:36  loss=     594.111
  1000/1000   0:01:37  loss=     594.342
Finished

Optimization finished in 1 minute 37 seconds
‣ Iterations: 1000
‣ Final loss: 594.342
In [8]:
model.plot_prediction();
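
Since the next steps modify the kernel parameters in place, we could optionally save a copy of this preliminary model first (a small sketch using model.save, which is covered in more detail below; the file name is just an example):

model.save('air_quality_mosm_preliminary')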

Fixing parameters¶

We will now set the delays to zero and fix them so they are not modified while optimizing. We will also fix the frequencies (the mean parameters of the MOSM kernel), given that the periodicities were found fairly accurately in just a few iterations.

We can fix a parameter by accessing the kernel in the model and setting its train attribute to False. We can also set the value of a parameter using assign(value). Each kernel exposes different parameters; their names can be listed with the print_parameters function. If the model uses a mixture of kernels, be sure to select the right component by indexing the kernel attribute of the model.

In [9]:
model.gpr.kernel.delay.assign(np.zeros((5,3)), train=False)
model.gpr.kernel.mean.train = False

Now we can print which parameters are fixed and which are not.

In [10]:
model.print_parameters()
Name            Range       Value
MOSM.weight     [1e-08, ∞)  [[4.69705789 5.87990564 3.4396564 ] [3.21551217 5.06082294 2.71982326] [0.22543415 0.83956785 3.05787416] [2.86367564 1.0442355 6.79036392] [7.51541443 1.91071836 0.15951713]]
MOSM.mean       fixed       [[[0.08310058] [0.0418448 ] [0.08531389]] [[0.03145669] [0.12657541] [0.06897176]] [[0.05949989] [0.05451525] [0.03284833]] [[0.02596503] [0.05479789] [0.04252641]] [[0.04208808] [0.06834659] [0.05261081]]]
MOSM.variance   [1e-08, ∞)  [[[9.94415955e-06] [5.81471414e-06] [8.90008308e-06]] [[2.10327670e-06] [4.39357313e-06] [5.50383045e-05]] [[1.60642024e-06] [8.03545534e-06] [7.55700437e-06]] [[1.97086577e-06] [7.69422173e-06] [3.25577832e-06]] [[5.26383860e-06] [2.22929242e-06] [9.30638644e-07]]]
MOSM.delay      fixed       [[[0.] [0.] [0.]] [[0.] [0.] [0.]] [[0.] [0.] [0.]] [[0.] [0.] [0.]] [[0.] [0.] [0.]]]
MOSM.phase      (-∞, ∞)     [[-1.31571023e-48 -2.12128876e-10 1.28717538e-02] [ 2.18930725e-01 6.00550786e-01 -1.28717148e-02] [ 9.16404405e-02 1.93903152e-01 -5.77215222e-01] [ 6.59575064e-01 2.16089797e-01 -8.78020438e-01] [-4.38733103e-01 -4.34552734e-01 1.00904977e+00]]
Gaussian.scale  [1e-08, ∞)  [0.69024386 0.81211921 0.942778 0.77775096 0.66968503]
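
If we only need to check a single parameter, we can also read its train flag directly (a small sketch, assuming each parameter exposes the same train attribute we set above):

print(model.gpr.kernel.delay.train)  # False: delays stay fixed
print(model.gpr.kernel.mean.train)   # False: frequencies stay fixed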

Then we can train again, starting from the last parameters obtained, now with the fixed values in place. Note that the number of trainable parameters drops from 80 to 50, since the delay and mean matrices (5 channels × 3 components each) are no longer optimized.

In [11]:
model.train(method='Adam', lr=0.5, iters=50, verbose=True);
Starting optimization using Adam
‣ Model: MOSM
‣ Channels: 5
‣ Parameters: 50
‣ Training points: 487
‣ Initial loss: 594.34

Start Adam:
  1000/1050   0:00:00  loss=      594.34
  1001/1050   0:00:00  loss=      595.27
  1002/1050   0:00:00  loss=     594.297
  1003/1050   0:00:00  loss=     594.497
  1004/1050   0:00:00  loss=     595.459
  1005/1050   0:00:00  loss=     595.171
  1006/1050   0:00:00  loss=     594.808
  1007/1050   0:00:00  loss=     594.162
  1008/1050   0:00:00  loss=     594.715
  1009/1050   0:00:01  loss=     594.862
  1010/1050   0:00:01  loss=     594.713
  1011/1050   0:00:01  loss=     594.259
  1012/1050   0:00:01  loss=     594.268
  1013/1050   0:00:01  loss=     594.384
  1014/1050   0:00:01  loss=     594.368
  1015/1050   0:00:01  loss=     594.211
  1016/1050   0:00:01  loss=     594.275
  1017/1050   0:00:01  loss=     594.263
  1018/1050   0:00:01  loss=     594.146
  1019/1050   0:00:01  loss=     594.052
  1020/1050   0:00:01  loss=      594.13
  1021/1050   0:00:02  loss=     594.199
  1022/1050   0:00:02  loss=     594.149
  1023/1050   0:00:02  loss=     594.034
  1024/1050   0:00:02  loss=         594
  1025/1050   0:00:02  loss=     594.055
  1026/1050   0:00:02  loss=     594.073
  1027/1050   0:00:02  loss=     594.024
  1028/1050   0:00:02  loss=     593.987
  1029/1050   0:00:02  loss=     593.995
  1030/1050   0:00:02  loss=     593.985
  1031/1050   0:00:03  loss=     593.956
  1032/1050   0:00:03  loss=     593.955
  1033/1050   0:00:03  loss=     593.971
  1034/1050   0:00:03  loss=     593.947
  1035/1050   0:00:03  loss=     593.906
  1036/1050   0:00:03  loss=      593.91
  1037/1050   0:00:03  loss=     593.924
  1038/1050   0:00:03  loss=       593.9
  1039/1050   0:00:03  loss=     593.884
  1040/1050   0:00:03  loss=     593.888
  1041/1050   0:00:03  loss=     593.868
  1042/1050   0:00:03  loss=     593.858
  1043/1050   0:00:04  loss=     593.867
  1044/1050   0:00:04  loss=     593.848
  1045/1050   0:00:04  loss=     593.837
  1046/1050   0:00:04  loss=     593.839
  1047/1050   0:00:04  loss=     593.821
  1048/1050   0:00:04  loss=     593.823
  1049/1050   0:00:04  loss=     593.815
  1050/1050   0:00:04  loss=     593.802
Finished

Optimization finished in 4.610 seconds
‣ Iterations: 50
‣ Final loss: 593.802
In [12]:
model.plot_prediction();

Load and save models¶

In order to create checkpoints we can alternate calls to mogptk.Model.train and mogptk.Model.save. The saved checkpoints can then be loaded using mogptk.LoadModel.

In [13]:
# Allow the delays to be optimized again
model.gpr.kernel.delay.train = True
In [14]:
checkpoints = 4
iters_per_checkpoint = 100

for i in range(checkpoints):
    model.train(method='Adam', lr=0.5, iters=iters_per_checkpoint)
    model.save('air_quality_mosm_%d' % (i,))
In [15]:
model.plot_prediction();

We can then load a model as follows.

In [16]:
model = mogptk.LoadModel('air_quality_mosm_0')
model.plot_prediction();
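
Loading restores the model parameters as they were at the time of saving, so we could, for example, resume training from the last checkpoint created above (a minimal sketch; 'air_quality_mosm_3' is the final file written by the loop):

model = mogptk.LoadModel('air_quality_mosm_3')
model.train(method='Adam', lr=0.1, iters=100)  # continue optimizing from the restored parameters
model.plot_prediction();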