Getting Started:
30 seconds to InferPy
The core data structure of InferPy is the probabilistic model, defined as a set of random variables with a conditional dependency structure. A random variable is an object parameterized by a set of NumPy arrays.
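For instance, a random variable can be built directly from array-valued parameters. The following is a minimal sketch: the variable name and parameter values are illustrative, and we assume Normal accepts NumPy arrays for loc and scale, as the description above suggests.

import numpy as np
from inferpy.models import Normal

# A Normal variable parameterized by NumPy arrays (illustrative values;
# array-valued loc/scale follow the description above, not a verified API call)
theta = Normal(loc=np.zeros(5), scale=np.ones(5))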
Let’s look at a simple (Bayesian) probabilistic component analysis model. Graphically, the model can be represented with plate notation.
We start by defining the prior over the global parameters:
import inferpy as inf
from inferpy.models import Normal

# K defines the number of components
K = 10
# d defines the number of dimensions
d = 20

# Prior for the principal components
with inf.replicate(size=K):
    w = Normal(loc=0, scale=1, dim=d)  # w.shape = [K,d]
InferPy supports plate notation through the construct with inf.replicate(size=K), which replicates K times the random variables enclosed within it. Every replicated variable is assumed to be independent.
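As a quick check of this behavior, the sketch below (continuing the session above; the size and dim values are illustrative) shows the leading replication dimension that replicate adds:

# Variables defined inside replicate gain a leading replication dimension
with inf.replicate(size=3):
    y = Normal(loc=0, scale=1, dim=2)  # y.shape = [3,2]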
This with inf.replicate(size=N) construct is also useful when defining the model for the data:
# Number of observations
N = 1000

# Define the generative model
with inf.replicate(size=N):
    z = Normal(0, 1, dim=K)  # z.shape = [N,K]
    x = Normal(inf.matmul(z, w), 1.0, observed=True, dim=d)  # x.shape = [N,d]
As noted above, the variables are surrounded by a with statement to indicate that the defined random variables will be repeatedly used in each data sample. In this case, every replicated variable is conditionally independent given the variable \(\mathbf{w}\) defined above.
Once the random variables of the model are defined, the probabilistic model itself can be created and compiled. The probabilistic model defines a joint probability distribution over all these random variables.
from inferpy import ProbModel
# Define the model
pca = ProbModel(varlist = [w,z,x])
# Compile the model
pca.compile(infMethod = 'KLqp')
During model compilation we specify the inference method that will be used to learn the model. For instance, the same model can be compiled with a different method:
# Compile the model with a different inference method
pca.compile(infMethod = 'Variational')
The inference method can be further configured. But, as in Keras, a core principle is to make things reasonably simple, while allowing the user full control if needed.
Every random variable object is equipped with methods such as log_prob() and sample(). Similarly, a probabilistic model is equipped with the same methods. We can then sample data from the model and compute the log-likelihood of a data set:
# Sample data from the model
data = pca.sample(size = 100)
# Compute the log-likelihood of a data set
log_like = pca.log_prob(data)
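The same methods are also available on each individual random variable. The sketch below assumes log_prob() takes the sampled value as its argument; the exact signature may differ across InferPy versions.

# Sample from the latent variable z and evaluate its log-probability
z_sample = z.sample()
z_log_prob = z.log_prob(z_sample)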
Of course, you can fit your model with a given data set:
# Compile and fit the model with training data
pca.compile()
pca.fit(data)

# Extract the hidden representation from a set of observations
hidden_encoding = pca.posterior(z)
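As a quick sanity check (a sketch; the structure of the returned object depends on the InferPy version), the recovered encoding can be inspected directly:

# Inspect the posterior encoding of the latent variable z
print(hidden_encoding)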