03 Parameter estimation¶

[Estimated execution time: 2 min]

Spectral kernels are flexible but difficult to optimize. To address this, the toolkit provides several ways to initialize the parameters before training.

In this notebook we will showcase the different initializations for the single-output spectral mixture kernel and for multi-output kernels.

For the single-output spectral mixture kernel there are four methods:

  • IPS: uses a heuristic defined in Andrew Gordon Wilson's PhD thesis
    • Draw inverse lengthscales from the truncated Gaussian |N(0, max_dist^2)|, where max_dist is the maximum distance between two points
    • Draw means from Unif(0, 0.5 / min_dist), where min_dist is the minimum distance between two points
    • Set the mixture weights to roughly the standard deviation of the y values divided by the number of mixtures
  • BNSE: uses Bayesian Nonparametric Spectral Estimation (Tobar 2018) to estimate the power spectral density (PSD) of the signal
    • Find the peaks in the PSD and order them by magnitude
    • Take the means as the positions of the first Q peaks
    • Take the lengthscales as the widths of the peaks multiplied by 2
    • Take the mixture weights as the peak magnitudes, normalized so their sum equals the channel variance
  • LS (Lomb-Scargle): uses the Lomb-Scargle periodogram to obtain an estimate of the PSD, then follows the same heuristic as BNSE
  • SM: fits an independent GP with a spectral mixture kernel to each channel, then uses those parameters as initial parameters
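The IPS heuristic above can be sketched in a few lines of NumPy. This is an illustrative reimplementation, not mogptk's internal code; the function name `ips_init` and its signature are hypothetical:

```python
import numpy as np

def ips_init(t, y, q, rng=np.random.default_rng(0)):
    """Sketch of the IPS heuristic (illustrative, not mogptk's implementation)."""
    t = np.sort(t)
    max_dist = t[-1] - t[0]        # maximum distance between two points
    min_dist = np.min(np.diff(t))  # minimum distance between two points
    # inverse lengthscales drawn from the truncated Gaussian |N(0, max_dist^2)|
    lengthscales = 1.0 / np.abs(rng.normal(0.0, max_dist, size=q))
    # means drawn from Unif(0, 0.5 / min_dist), i.e. up to the Nyquist frequency
    means = rng.uniform(0.0, 0.5 / min_dist, size=q)
    # weights roughly std(y) divided by the number of mixtures
    weights = np.full(q, np.std(y) / q)
    return lengthscales, means, weights
```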

The noise for each channel is initialized to $1/30$ of the channel's variance.
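In NumPy terms, the noise initialization amounts to the following (a sketch with a stand-in signal; the variable names are illustrative):

```python
import numpy as np

# Stand-in for one channel's y values.
y = np.sin(np.linspace(0.0, 20.0, 500))

# Each channel's noise variance starts at 1/30 of that channel's variance.
noise_init = np.var(y) / 30.0
```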

In [1]:
import mogptk
import numpy as np

Synthetic Data¶

To showcase the initialization, we first create an artificial dataset whose channel is a sum of sinusoids:

$$ y(t) = \sum_{i=1}^{3} a_i \sin(2\pi f_i t) + \epsilon, \quad \epsilon \sim \mathcal{N}(0, \sigma_n^2) $$

With known frequencies $f_i$ and amplitudes $a_i$, when fitting the spectral mixture kernel we should expect the means to take values close to the frequencies and the mixture weights to be in proportion to the amplitudes.

First we create a numpy array with the function, then we remove half of the data so we can evaluate predictions on the missing range.

In [2]:
n_points = 500
frequencies = [0.2, 1.0, 2.0]
amplitudes = [1.0, 0.5, 0.5]

t = np.linspace(0.0, 20.0, n_points)
y = np.zeros(n_points)
for i in range(3):
    y += amplitudes[i] * np.sin(2.0*np.pi * frequencies[i] * t)
y += np.random.normal(scale=0.4, size=n_points)

# data class
data = mogptk.Data(t, y)
data.remove_range(start=10.0)
data.plot();

We plot the spectrum and note that there are three peaks, one for each frequency.

In [3]:
data.plot_spectrum(maxfreq=3.0);
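The peak-based heuristics (BNSE and Lomb-Scargle) operate on an estimated PSD. A minimal sketch of the "take the top-Q peaks as means" step, using a plain FFT periodogram instead of BNSE and a hypothetical helper name:

```python
import numpy as np

def top_q_peaks(t, y, q):
    """Sketch: pick the Q largest local maxima of an FFT periodogram."""
    dt = t[1] - t[0]                  # assumes a uniform time grid
    freqs = np.fft.rfftfreq(len(y), d=dt)
    psd = np.abs(np.fft.rfft(y))**2
    # local maxima: bins larger than both neighbours
    peaks = np.where((psd[1:-1] > psd[:-2]) & (psd[1:-1] > psd[2:]))[0] + 1
    order = peaks[np.argsort(psd[peaks])[::-1]]  # sort peaks by magnitude
    return freqs[order[:q]]
```

With the synthetic signal above, the three largest peaks land near the generating frequencies 0.2, 1.0 and 2.0.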

Without initialization¶

The kernel parameters are sampled randomly from $(0, 1]$, as can be seen in the output of the spectral mixture kernel's plot_spectrum function.

In [4]:
model = mogptk.SM(data, Q=3)
model.plot_spectrum(title='PSD of spectral mixture kernel with random parameters', maxfreq=3.0);
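Roughly, "random parameters" amounts to independent uniform draws per mixture component. A NumPy sketch (mogptk's exact sampler may differ; NumPy samples $[0, 1)$ rather than $(0, 1]$):

```python
import numpy as np

q = 3                               # number of mixture components
rng = np.random.default_rng(1)
weights   = rng.uniform(size=q)     # one uniform draw per parameter
means     = rng.uniform(size=q)
variances = rng.uniform(size=q)
```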

Now we can make predictions with the untrained, non-initialized model.

In [5]:
model.plot_prediction(title='Untrained non-initialized model');

Initialization¶

Now we will initialize the parameters to improve training results. Change the method argument below to try the other initializations; see mogptk.SM.init_parameters for more information.

In [6]:
model.init_parameters(method='LS')
model.plot_spectrum(title='PSD with LS initialization', maxfreq=3.0);
In [7]:
model.print_parameters()
model.plot_prediction(title='Untrained model with LS initialization');
Name | Range | Value
IMO[0].SM.magnitude | [1e-08, ∞) | [1.03716489 0.22403165 0.21952776]
IMO[0].SM.mean | [1e-08, [[12.475000000000403], [12.475000000000403], [12.475000000000403]]] | [[0.197105 ] [1.9972475] [0.9892675]]
IMO[0].SM.variance | [1e-08, ∞) | [[0.00150509] [0.00143937] [0.00147651]]
Gaussian.scale | [1e-08, ∞) | [1.]

Training¶

In [8]:
model.train(iters=100, lr=0.2, verbose=True)
model.plot_spectrum(title='Spectrum of trained model', maxfreq=3.0);
Starting optimization using Adam
‣ Model: SM
‣ Channels: 1
‣ Parameters: 10
‣ Training points: 251
‣ Initial loss: 276.331

Start Adam:
    0/100   0:00:00  loss=     276.331
    1/100   0:00:00  loss=     300.771
    2/100   0:00:00  loss=     275.839
    3/100   0:00:00  loss=     274.518
    4/100   0:00:00  loss=     273.013
    5/100   0:00:00  loss=     258.905
    6/100   0:00:00  loss=     256.549
    7/100   0:00:00  loss=     258.029
    8/100   0:00:00  loss=     247.423
    9/100   0:00:00  loss=     244.125
   10/100   0:00:00  loss=     244.102
   11/100   0:00:00  loss=      238.85
   12/100   0:00:00  loss=     233.589
   13/100   0:00:00  loss=     230.749
   14/100   0:00:00  loss=     228.507
   15/100   0:00:00  loss=     223.948
   16/100   0:00:00  loss=     220.254
   17/100   0:00:00  loss=     217.788
   18/100   0:00:00  loss=     215.056
   19/100   0:00:00  loss=       211.2
   20/100   0:00:00  loss=     207.923
   21/100   0:00:00  loss=     205.399
   22/100   0:00:00  loss=     202.757
   23/100   0:00:00  loss=     199.434
   24/100   0:00:00  loss=     196.833
   25/100   0:00:00  loss=     194.453
   26/100   0:00:00  loss=     192.109
   27/100   0:00:00  loss=     189.386
   28/100   0:00:00  loss=      186.99
   29/100   0:00:00  loss=     185.008
   30/100   0:00:00  loss=     182.927
   31/100   0:00:00  loss=     180.632
   32/100   0:00:00  loss=     178.812
   33/100   0:00:00  loss=      177.17
   34/100   0:00:00  loss=     175.437
   35/100   0:00:00  loss=     173.672
   36/100   0:00:00  loss=     172.218
   37/100   0:00:00  loss=     171.004
   38/100   0:00:00  loss=     169.551
   39/100   0:00:00  loss=     168.333
   40/100   0:00:00  loss=     167.384
   41/100   0:00:00  loss=     166.388
   42/100   0:00:00  loss=     165.362
   43/100   0:00:00  loss=     164.644
   44/100   0:00:00  loss=     163.987
   45/100   0:00:00  loss=     163.211
   46/100   0:00:01  loss=     162.736
   47/100   0:00:01  loss=     162.297
   48/100   0:00:01  loss=     161.749
   49/100   0:00:01  loss=     161.491
   50/100   0:00:01  loss=     161.163
   51/100   0:00:01  loss=     160.852
   52/100   0:00:01  loss=     160.728
   53/100   0:00:01  loss=     160.485
   54/100   0:00:01  loss=     160.352
   55/100   0:00:01  loss=      160.27
   56/100   0:00:01  loss=     160.106
   57/100   0:00:01  loss=     160.104
   58/100   0:00:01  loss=     159.962
   59/100   0:00:01  loss=      159.96
   60/100   0:00:01  loss=     159.865
   61/100   0:00:01  loss=     159.835
   62/100   0:00:01  loss=     159.772
   63/100   0:00:01  loss=     159.737
   64/100   0:00:01  loss=     159.654
   65/100   0:00:01  loss=     159.622
   66/100   0:00:01  loss=     159.537
   67/100   0:00:01  loss=     159.515
   68/100   0:00:01  loss=     159.426
   69/100   0:00:01  loss=     159.414
   70/100   0:00:01  loss=     159.461
   71/100   0:00:01  loss=     159.916
   72/100   0:00:01  loss=     163.196
   73/100   0:00:01  loss=     174.534
   74/100   0:00:01  loss=     163.311
   75/100   0:00:01  loss=     170.107
   76/100   0:00:01  loss=     160.789
   77/100   0:00:01  loss=     175.552
   78/100   0:00:01  loss=     159.896
   79/100   0:00:01  loss=     174.038
   80/100   0:00:01  loss=     197.842
   81/100   0:00:01  loss=     203.956
   82/100   0:00:01  loss=     169.124
   83/100   0:00:01  loss=     183.666
   84/100   0:00:01  loss=     159.967
   85/100   0:00:01  loss=     176.526
   86/100   0:00:01  loss=     161.143
   87/100   0:00:01  loss=     162.098
   88/100   0:00:01  loss=     166.551
   89/100   0:00:01  loss=     161.725
   90/100   0:00:01  loss=     163.497
   91/100   0:00:01  loss=     161.155
   92/100   0:00:01  loss=     161.728
   93/100   0:00:01  loss=     163.276
   94/100   0:00:01  loss=     160.718
   95/100   0:00:01  loss=     159.806
   96/100   0:00:01  loss=     162.057
   97/100   0:00:01  loss=     161.631
   98/100   0:00:01  loss=     159.621
   99/100   0:00:01  loss=      160.05
  100/100   0:00:01  loss=     161.252
Finished

Optimization finished in 1.844 seconds
‣ Iterations: 100
‣ Final loss: 161.252
In [9]:
model.print_parameters()
model.plot_prediction(title='Prediction with trained model');
Name | Range | Value
IMO[0].SM.magnitude | [1e-08, ∞) | [0.3331183 0.09311104 0.10393019]
IMO[0].SM.mean | [1e-08, [[12.475000000000403], [12.475000000000403], [12.475000000000403]]] | [[0.20323631] [1.96778747] [1.00955058]]
IMO[0].SM.variance | [1e-08, ∞) | [[0.00032792] [0.00077363] [0.00054005]]
Gaussian.scale | [1e-08, ∞) | [0.42904129]

Now we can compare the trained model against the original frequencies and amplitudes. Note the magnitude and mean parameters of the spectral mixture kernel for each mixture.

In [10]:
print('Target frequencies:', frequencies)
print('Target amplitudes:', amplitudes)
Target frequencies: [0.2, 1.0, 2.0]
Target amplitudes: [1.0, 0.5, 0.5]
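We can check the recovery numerically using the values printed above (copied by hand, so they are specific to this run):

```python
import numpy as np

# Targets used to generate the data.
target_freqs = np.array([0.2, 1.0, 2.0])
target_amps  = np.array([1.0, 0.5, 0.5])

# Learned parameters, copied from the printed output above.
learned_means = np.array([0.20323631, 1.96778747, 1.00955058])
learned_mags  = np.array([0.3331183, 0.09311104, 0.10393019])

# Sort the learned means so they line up with the target frequencies.
order = np.argsort(learned_means)
freq_err = np.abs(learned_means[order] - target_freqs)

# The largest magnitude belongs to the component near f = 0.2,
# matching the largest amplitude.
strongest_mean = learned_means[np.argmax(learned_mags)]
```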

Initialization for multi output kernels¶

First we create a dataset similar to the one used in the 00 - Quick Start tutorial.

In [11]:
n_points = 100
t = np.linspace(0, 6, n_points)

y1 = np.sin(6.0*t) + 0.2*np.random.normal(size=len(t))
y2 = np.sin(6.0*t + 2.0) + 0.2*np.random.normal(size=len(t))
y3 = np.sin(6.0*t) - np.sin(4 * t) + 0.2*np.random.normal(size=len(t))
y4 = 4.0*np.sin(6.0*(t-2.0)) + 0.3*np.random.normal(size=len(t))

dataset = mogptk.DataSet(t, [y1, y2, y3, y4], names=['A','B','C','D'])
for data in dataset:
    data.remove_randomly(pct=0.3)
dataset[0].remove_range(start=2.0)

Create model¶

In [33]:
model = mogptk.MOSM(dataset, Q=2)
model.plot_prediction(title='Untrained non-initialized model');

Initialization¶

In [34]:
model.init_parameters(method='LS')
model.print_parameters()
model.plot_prediction(title='Untrained model with LS initialization');
Name | Range | Value
MOSM.weight | [1e-08, ∞) | [[0.67934309 0.26302221] [0.64051521 0.19840758] [0.71189351 0.68434904] [2.64804312 0.72973996]]
MOSM.mean | [1e-08, [[[8.250000000000007], [8.250000000000007]], [[8.250000000000007], [8.250000000000007]], [[8.250000000000007], [8.250000000000007]], [[8.250000000000007], [8.250000000000007]]]] | [[[0.9405 ] [0.789525]] [[0.957 ] [1.6929 ]] [[0.65835 ] [0.91905 ]] [[0.95535 ] [0.718575]]]
MOSM.variance | [1e-08, ∞) | [[[0.03938323] [0.00123242]] [[0.00334273] [0.00418076]] [[0.00370318] [0.00323805]] [[0.00370517] [0.00140309]]]
MOSM.delay | (-∞, ∞) | [[[0.] [0.]] [[0.] [0.]] [[0.] [0.]] [[0.] [0.]]]
MOSM.phase | (-∞, ∞) | [[0. 0.] [0. 0.] [0. 0.] [0. 0.]]
Gaussian.scale | [1e-08, ∞) | [1. 1. 1. 1.]

Training¶

In [35]:
model.train(iters=1000, lr=0.02, verbose=True)
model.print_parameters()
Starting optimization using Adam
‣ Model: MOSM
‣ Channels: 4
‣ Parameters: 44
‣ Training points: 233
‣ Initial loss: 272.922

Start Adam:
     0/1000   0:00:00  loss=     272.922
    10/1000   0:00:00  loss=     244.433
    20/1000   0:00:00  loss=       238.4
    30/1000   0:00:00  loss=     231.599
    40/1000   0:00:01  loss=     226.586
    50/1000   0:00:01  loss=     221.943
    60/1000   0:00:01  loss=     217.406
    70/1000   0:00:01  loss=      212.94
    80/1000   0:00:02  loss=       208.5
    90/1000   0:00:02  loss=     204.091
   100/1000   0:00:02  loss=     199.713
   110/1000   0:00:03  loss=     195.367
   120/1000   0:00:03  loss=     191.053
   130/1000   0:00:03  loss=     186.713
   140/1000   0:00:03  loss=     182.436
   150/1000   0:00:03  loss=     178.184
   160/1000   0:00:04  loss=     173.981
   170/1000   0:00:04  loss=     169.819
   180/1000   0:00:04  loss=     165.696
   190/1000   0:00:04  loss=     161.613
   200/1000   0:00:05  loss=     157.568
   210/1000   0:00:05  loss=     153.557
   220/1000   0:00:05  loss=     149.583
   230/1000   0:00:05  loss=     145.645
   240/1000   0:00:06  loss=     141.743
   250/1000   0:00:06  loss=     137.882
   260/1000   0:00:06  loss=     134.083
   270/1000   0:00:06  loss=     130.265
   280/1000   0:00:07  loss=     126.551
   290/1000   0:00:07  loss=     122.826
   300/1000   0:00:07  loss=     119.129
   310/1000   0:00:07  loss=     115.483
   320/1000   0:00:08  loss=     111.979
   330/1000   0:00:08  loss=     108.337
   340/1000   0:00:08  loss=     104.811
   350/1000   0:00:08  loss=     101.546
   360/1000   0:00:09  loss=     97.9782
   370/1000   0:00:09  loss=     94.5513
   380/1000   0:00:09  loss=     91.1818
   390/1000   0:00:10  loss=     87.8638
   400/1000   0:00:10  loss=     84.7113
   410/1000   0:00:10  loss=     81.3957
   420/1000   0:00:10  loss=     78.3696
   430/1000   0:00:11  loss=     75.0573
   440/1000   0:00:11  loss=     72.1413
   450/1000   0:00:11  loss=     68.9978
   460/1000   0:00:11  loss=     66.1592
   470/1000   0:00:12  loss=     63.1692
   480/1000   0:00:12  loss=     60.8196
   490/1000   0:00:12  loss=     57.5414
   500/1000   0:00:12  loss=     54.8551
   510/1000   0:00:13  loss=     52.5929
   520/1000   0:00:13  loss=     49.8251
   530/1000   0:00:13  loss=     48.1137
   540/1000   0:00:13  loss=     46.3959
   550/1000   0:00:14  loss=     47.5355
   560/1000   0:00:14  loss=     42.5368
   570/1000   0:00:14  loss=     39.1047
   580/1000   0:00:14  loss=     38.2302
   590/1000   0:00:15  loss=     35.6005
   600/1000   0:00:15  loss=      32.571
   610/1000   0:00:15  loss=     31.0133
   620/1000   0:00:15  loss=     27.4557
   630/1000   0:00:15  loss=     25.6062
   640/1000   0:00:16  loss=     22.9987
   650/1000   0:00:16  loss=     21.1538
   660/1000   0:00:16  loss=     19.9133
   670/1000   0:00:16  loss=     19.3223
   680/1000   0:00:17  loss=     18.7254
   690/1000   0:00:17  loss=     21.9106
   700/1000   0:00:17  loss=     19.1564
   710/1000   0:00:17  loss=     29.6331
   720/1000   0:00:18  loss=     18.4793
   730/1000   0:00:18  loss=     22.5961
   740/1000   0:00:18  loss=     42.4699
   750/1000   0:00:18  loss=     36.1078
   760/1000   0:00:19  loss=     31.2565
   770/1000   0:00:19  loss=      20.875
   780/1000   0:00:19  loss=     13.0969
   790/1000   0:00:19  loss=     11.1169
   800/1000   0:00:20  loss=     9.06666
   810/1000   0:00:20  loss=     6.37071
   820/1000   0:00:20  loss=     6.44954
   830/1000   0:00:20  loss=      4.8584
   840/1000   0:00:21  loss=     6.47215
   850/1000   0:00:21  loss=     12.9368
   860/1000   0:00:21  loss=     2.75654
   870/1000   0:00:22  loss=   -0.625024
   880/1000   0:00:22  loss=    -2.22882
   890/1000   0:00:22  loss=    -3.16455
   900/1000   0:00:22  loss=    -4.42509
   910/1000   0:00:23  loss=    -5.17897
   920/1000   0:00:23  loss=    -5.90046
   930/1000   0:00:23  loss=    -6.53462
   940/1000   0:00:23  loss=    -7.08328
   950/1000   0:00:24  loss=    -6.90281
   960/1000   0:00:24  loss=    -6.98623
   970/1000   0:00:24  loss=    -8.57248
   980/1000   0:00:25  loss=    -8.50756
   990/1000   0:00:25  loss=    -9.26701
  1000/1000   0:00:25  loss=    -9.52155
Finished

Optimization finished in 25.611 seconds
‣ Iterations: 1000
‣ Final loss: -9.52155
Name | Range | Value
MOSM.weight | [1e-08, ∞) | [[0.28977133 1.253362 ] [1.30707872 0.16339175] [1.49662833 1.22960456] [3.9762585 1.78731461]]
MOSM.mean | [1e-08, [[[8.250000000000007], [8.250000000000007]], [[8.250000000000007], [8.250000000000007]], [[8.250000000000007], [8.250000000000007]], [[8.250000000000007], [8.250000000000007]]]] | [[[1.10855194] [0.94898086]] [[0.9561575 ] [1.5064556 ]] [[0.63263751] [0.94939088]] [[0.9564591 ] [0.94890793]]]
MOSM.variance | [1e-08, ∞) | [[[0.00890093] [0.00454058]] [[0.00506477] [0.00400559]] [[0.00268854] [0.00388693]] [[0.00492967] [0.00398121]]]
MOSM.delay | (-∞, ∞) | [[[-2.59980282e-001] [ 3.21308374e-001]] [[ 3.36093980e-002] [-1.49722270e-105]] [[-1.02971189e-001] [-2.31448049e-001]] [[ 1.15299908e-003] [ 3.20473412e-001]]]
MOSM.phase | (-∞, ∞) | [[-2.32637571e-001 2.91212777e-001] [ 8.73330157e-002 -1.16733714e-105] [-1.00829815e-001 -1.68755757e-001] [-5.75270362e-002 2.70990855e-001]]
Gaussian.scale | [1e-08, ∞) | [0.21020249 0.20579194 0.20674967 0.27351801]
In [36]:
model.plot_prediction(title='Trained model');