The goal of this notebook is to compare different multi-output Gaussian process models, such as the MOSM, CSM, SM-LMC, and the CONV. The dataset consists of four weather stations in the south of England, each of which has been recording the following statistics every 5 minutes since April 2012:
The four stations have the following locations:
We replicate an experiment similar to that of Álvarez and Lawrence (https://papers.nips.cc/paper/3553-sparse-convolved-gaussian-processes-for-multi-output-regression), in which we train a multi-output Gaussian process on the Tidal Height data of the four sensors.
import numpy as np
import pandas as pd
import mogptk
We will load the four data series and combine their date and time fields. Additionally, we select a single week in June 2020.
start_date = '2020-06-01'
end_date = '2020-06-08'

def load_station(filename, name):
    # Load a station's CSV, merge the Date and Time columns into a single
    # datetime column, and keep only the selected week.
    print('Loading ' + name)
    df = pd.read_csv(filename, compression='gzip', low_memory=False)
    df['Date'] = pd.to_datetime(df['Date'] + ' ' + df['Time'], format='%d/%m/%Y %H:%M')
    df = df.drop(columns=['Time'])
    return df.loc[(df['Date'] >= start_date) & (df['Date'] < end_date)]

bramblemet = load_station('data/bramblemet/bramblemet.csv.gz', 'Bramble')
cambermet = load_station('data/bramblemet/cambermet.csv.gz', 'Camber')
chimet = load_station('data/bramblemet/chimet.csv.gz', 'Chi')
sotonmet = load_station('data/bramblemet/sotonmet.csv.gz', 'Soton')
bramblemet.head(5)
Loading Bramble Loading Camber Loading Chi Loading Soton
| | Date | WSPD | WD | GST | ATMP | WTMP | BARO | DEPTH | VIS |
|---|---|---|---|---|---|---|---|---|---|
| 37972 | 2020-06-01 00:00:00 | 12.5 | 56 | 13.9 | 15.8 | 16.1 | 1021.9 | 1.91 | 27 |
| 37973 | 2020-06-01 00:05:00 | 11.5 | 56 | 13.0 | 15.7 | 16.3 | 1021.9 | 1.86 | 27 |
| 37974 | 2020-06-01 00:15:00 | 10.6 | 55 | 11.0 | 15.8 | 16.3 | 1021.8 | 1.80 | 27 |
| 37975 | 2020-06-01 00:20:00 | 9.5 | 47 | 10.3 | 15.7 | 16.4 | 1021.8 | 1.76 | 27 |
| 37976 | 2020-06-01 00:25:00 | 10.0 | 48 | 11.6 | 15.7 | 16.4 | 1021.7 | 1.75 | 27 |
We load the data series into MOGPTK and detrend the data to improve trainability. We remove 90% of the data points at random so that training is faster and requires less memory; this has the added benefit that the removed points serve as a held-out set on which we can calculate the prediction error. In addition to removing points at random, we also remove a few contiguous ranges to see how well the models impute larger gaps.
dataset = mogptk.DataSet(
mogptk.LoadDataFrame(bramblemet, x_col='Date', y_col=['DEPTH'], name='Bramble'),
mogptk.LoadDataFrame(cambermet, x_col='Date', y_col=['DEPTH'], name='Camber'),
mogptk.LoadDataFrame(chimet, x_col='Date', y_col=['DEPTH'], name='Chi'),
mogptk.LoadDataFrame(sotonmet, x_col='Date', y_col=['DEPTH'], name='Soton'),
)
for data in dataset:
    data.transform(mogptk.TransformDetrend)
    data.remove_randomly(pct=0.9)
dataset['Bramble'].remove_range(start='2020-06-02', end='2020-06-03')
dataset['Camber'].remove_range(start='2020-06-06', end='2020-06-07')
dataset['Chi'].remove_range(start='2020-06-07', end='2020-06-08')
dataset['Soton'].remove_range(start='2020-06-04', end='2020-06-05')
dataset.plot(transformed=True);
dataset.plot_spectrum(per='day', maxfreq=5, transformed=True);
In the spectra of our dataset we can observe that all channels have a dominant frequency of slightly less than two cycles per day. This matches the tides: the ebb-and-flow cycle has a period of about 12 hours and 25 minutes, which amounts to just under two full periods per day, $1.93$ to be exact.
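The $1.93$ figure follows directly from the tidal period; a quick sanity check (not part of the original notebook cells):

```python
# The (M2) tidal cycle of ~12 hours 25 minutes gives just under
# two full periods per day.
tidal_period_min = 12 * 60 + 25              # 745 minutes per ebb/flow cycle
cycles_per_day = 24 * 60 / tidal_period_min  # minutes per day / period
print(round(cycles_per_day, 2))              # → 1.93
```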
We will now train the MOSM, CSM, SM-LMC, and CONV kernels on this data set and compare their prediction errors. Training using the Adam optimizer with a relatively high learning rate has proven to give good results.
method = 'Adam'
lr = 0.02
iters = 500
mosm = mogptk.MOSM(dataset, Q=1)
mosm.init_parameters(method='LS')
mosm.train(method=method, lr=lr, iters=iters, verbose=True)
mosm.print_parameters()
mosm.plot_prediction(transformed=True);
Starting optimization using Adam ‣ Model: MOSM ‣ Channels: 4 ‣ Parameters: 24 ‣ Training points: 656 ‣ Initial loss: 786.062 … Optimization finished in 51.212 seconds ‣ Iterations: 500 ‣ Final loss: 276.189
Name | Range | Value |
---|---|---|
MOSM.weight | [1e-08, ∞) | [[21.98881847] [24.92294238] [24.905429 ] [23.23260914]] |
MOSM.mean | [1e-08, 0.1] | [[[0.00134441]] [[0.00134449]] [[0.00133982]] [[0.00134212]]] |
MOSM.variance | [1e-08, ∞) | [[[1.e-08]] [[1.e-08]] [[1.e-08]] [[1.e-08]]] |
MOSM.delay | (-∞, ∞) | [[[-0.00430478]] [[-0.03177763]] [[ 0.06000117]] [[-0.0458639 ]]] |
MOSM.phase | (-∞, ∞) | [[ 0.00302489] [-0.01283903] [ 0.00893065] [-0.00023575]] |
Gaussian.scale | [1e-08, ∞) | [0.46666492 0.5027038 0.49471458 0.51616945] |
csm = mogptk.CSM(dataset, Q=2)
csm.init_parameters(method='LS')
csm.train(method=method, lr=lr, iters=iters, verbose=True)
csm.print_parameters()
csm.plot_prediction();
Starting optimization using Adam ‣ Model: CSM ‣ Channels: 4 ‣ Parameters: 24 ‣ Training points: 656 ‣ Initial loss: 776.742 … Optimization finished in 1 minute 13 seconds ‣ Iterations: 500 ‣ Final loss: 267.842
Name | Range | Value |
---|---|---|
Mixture[0].CSM.amplitude | [1e-08, ∞) | [[1.33006447] [1.70562927] [1.73567783] [1.49981674]] |
Mixture[0].CSM.mean | [1e-08, 0.1] | [0.00134278] |
Mixture[0].CSM.variance | [1e-08, ∞) | [1.e-08] |
Mixture[0].CSM.shift | (-∞, ∞) | [[ 0.00027924] [-0.01055069] [ 0.01106413] [-0.00224129]] |
Mixture[1].CSM.amplitude | [1e-08, ∞) | [[0.12330041] [0.11775069] [0.09704823] [0.17043213]] |
Mixture[1].CSM.mean | [1e-08, 0.1] | [0.0474827] |
Mixture[1].CSM.variance | [1e-08, ∞) | [1.e-08] |
Mixture[1].CSM.shift | (-∞, ∞) | [[ 0.07079462] [-0.16967516] [-0.07995379] [ 0.06302846]] |
Gaussian.scale | [1e-08, ∞) | [0.46427281 0.50143449 0.49446221 0.5148525 ] |
sm_lmc = mogptk.SM_LMC(dataset, Q=2)
sm_lmc.init_parameters(method='LS')
sm_lmc.train(method=method, lr=lr, iters=iters, verbose=True)
sm_lmc.print_parameters()
sm_lmc.plot_prediction();
Starting optimization using Adam ‣ Model: SM-LMC ‣ Channels: 4 ‣ Parameters: 16 ‣ Training points: 656 ‣ Initial loss: 801.481 … Optimization finished in 51.066 seconds ‣ Iterations: 500 ‣ Final loss: 298.878
Name | Range | Value |
---|---|---|
LMC[0].Spectral.magnitude | fixed | 1.0000000000000002 |
LMC[0].Spectral.mean | [1e-08, 0.1] | [0.00134533] |
LMC[0].Spectral.variance | [1e-08, ∞) | [1.e-08] |
LMC[1].Spectral.magnitude | fixed | 1.0000000000000002 |
LMC[1].Spectral.mean | [1e-08, 0.1] | [0.04775743] |
LMC[1].Spectral.variance | [1e-08, ∞) | [1.e-08] |
LMC.weight | [1e-08, ∞) | [[[0.77631484] [0.22574124]] [[0.88511433] [0.21209765]] [[0.87966789] [0.18451463]] [[0.81840271] [0.26983282]]] |
Gaussian.scale | [1e-08, ∞) | [0.46282907 0.50399841 0.49855575 0.5143879 ] |
Above we can see that the models fit the data fairly well. For a quantitative analysis, below we calculate the prediction error on the removed data points using the MAE, MAPE, and RMSE metrics. In this case the CSM model achieves the lowest error on all three metrics, with MOSM a close second.
mogptk.error(mosm, csm, sm_lmc, disp=True)
| Name | MAE | MAPE | RMSE |
|---|---|---|---|
| MOSM | 0.242323 | 11.379547 | 0.299774 |
| CSM | 0.238431 | 11.189683 | 0.298232 |
| SM-LMC | 0.245076 | 11.510077 | 0.304930 |
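For reference, the three metrics above follow their standard definitions; a minimal sketch with hypothetical example arrays (mogptk's `error` may differ in details such as MAPE scaling):

```python
import numpy as np

def mae(y_true, y_pred):
    # mean absolute error
    return np.mean(np.abs(y_true - y_pred))

def mape(y_true, y_pred):
    # mean absolute percentage error, in percent
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

def rmse(y_true, y_pred):
    # root mean squared error
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

# Hypothetical data, not from the notebook:
y_true = np.array([2.0, 4.0, 5.0])
y_pred = np.array([2.5, 3.5, 5.5])
print(mae(y_true, y_pred))   # → 0.5
print(rmse(y_true, y_pred))  # → 0.5
```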