Guide to Building Probabilistic Models¶
Getting Started with Probabilistic Models¶
InferPy focuses on hirearchical probabilistic models structured in two different layers:
- A prior model defining a joint distribution \(p(\mathbf{w})\) over the global parameters of the model. \(\mathbf{w}\) can be a single random variable or a bunch of random variables with any given dependency structure.
- A data or observation model defining a joint conditional distribution \(p(\mathbf{x},\mathbf{z}|\mathbf{w})\) over the observed quantities \(\mathbf{x}\) and the the local hidden variables \(\mathbf{z}\) governing the observation \(\mathbf{x}\). This data model is specified in a single-sample basis. There are many models of interest without local hidden variables, in that case, we simply specify the conditional \(p(\mathbf{x}|\mathbf{w})\). Similarly, either \(\mathbf{x}\) or \(\mathbf{z}\) can be a single random variable or a bunch of random variables with any given dependency structure.
For example, a Bayesian PCA model has the following graphical structure,

Bayesian PCA
The prior model are the variables \(w_k\). The data model is the part of the model surrounded by the box indexed by N.
And this is how this Bayesian PCA model is denfined in InferPy:
import edward as ed
import inferpy as inf
from inferpy.models import Normal
K, d, N = 5, 10, 200
# model definition
with inf.ProbModel() as m:
#define the weights
with inf.replicate(size=K):
w = Normal(0, 1, dim=d)
# define the generative model
with inf.replicate(size=N):
z = Normal(0, 1, dim=K)
x = Normal(inf.matmul(z,w), 1.0, observed=True, dim=d)
m.compile()
The with inf.replicate(size = N)
sintaxis is used to replicate the
random variables contained within this construct. It follows from the
so-called plateau notation to define the data generation part of a
probabilistic model. Every replicated variable is conditionally
idependent given the previous random variables (if any) defined
outside the with statement.
Random Variables¶
Following Edward’s approach, a random variable \(x\) is an object parametrized by a tensor \(\theta\) (i.e. a TensorFlow’s tensor or numpy’s ndarray). The number of random variables in one object is determined by the dimensions of its parameters (like in Edward) or by the ‘dim’ argument (inspired by PyMC3 and Keras):
import inferpy as inf
import tensorflow as tf
import numpy as np
# different ways of declaring 1 batch of 5 Normal distributions
x = inf.models.Normal(loc = 0, scale=1, dim=5) # x.shape = [5]
x = inf.models.Normal(loc = [0, 0, 0, 0, 0], scale=1) # x.shape = [5]
x = inf.models.Normal(loc = np.zeros(5), scale=1) # x.shape = [5]
x = inf.models.Normal(loc = 0, scale=tf.ones(5)) # x.shape = [5]
The with inf.replicate(size = N)
sintaxis can also be used to define
multi-dimensional objects:
with inf.replicate(size=10):
x = inf.models.Normal(loc=0, scale=1, dim=5) # x.shape = [10,5]
Following Edward’s approach, the multivariate dimension is the innermost (right-most) dimension of the parameters.
Note that indexing is supported:
y = x[7,4] # y.shape = [1]
y2 = x[7] # y2.shape = [5]
y3 = x[7,:] # y2.shape = [5]
y4 = x[:,4] # y4.shape = [10]
Moreover, we may use indexation for defining new variables whose indexes may be other (discrete) variables:
z = inf.models.Categorical(logits = np.zeros(5))
yz = inf.models.Normal(loc=x[0,z], scale=1) # yz.shape = [1]
Any random variable in InferPy contain the following (optional) input parameters in the constructor:
validate_args
: Python boolean indicating that possibly expensive checks with the input parameters are enabled. By default, it is set toFalse
.allow_nan_stats
: WhenTrue
, the value “NaN” is used to indicate the result is undefined. Otherwise an exception is raised. Its default value isTrue
.name
: Python string with the name of the underlying Tensor object.observed
: Python boolean which is used to indicate whether a variable is observable or not . The default value isFalse
dim
: dimension of the variable. The default value isNone
Inferpy supports a wide range of probability distributions. Details of the specific arguments for each supported distributions are specified in the following sections. `
Probabilistic Models¶
A probabilistic model defines a joint distribution over observable
and non-observable variables, \(p(\mathbf{w}, \mathbf{z}, \mathbf{x})\) for the
running example. The variables in the model are the ones defined using the
with inf.ProbModel() as pca:
construct. Alternatively, we can also use a builder,
m = inf.ProbModel(varlist=[w,z,x])
m.compile()
The model must be compiled before it can be used.
Like any random variable object, a probabilistic model is equipped with
methods such as sample()
, log_prob()
and sum_log_prob()
. Then, we can sample data
from the model and compute the log-likelihood of a data set:
data = m.sample(1000)
log_like = m.log_prob(data)
sum_log_like = m.sum_log_prob(data)
Random variables can be involved in expressive deterministic operations. Dependecies between variables are modelled by setting a given variable as a parameter of another variable. For example:
with inf.ProbModel() as m:
theta = inf.models.Beta(0.5,0.5)
z = inf.models.Categorical(probs=[theta, 1-theta], name="z")
m.sample()
Moreover, we might consider using the function inferpy.case
as the parameter of other random variables:
# Categorical variable depending on another categorical variable
with inf.ProbModel() as m2:
y = inf.models.Categorical(probs=[0.4,0.6], name="y")
x = inf.models.Categorical(probs=inf.case({y.equal(0): [0.0, 1.0],
y.equal(1): [1.0, 0.0] }), name="x")
m2.sample()
# Categorical variable depending on a Normal distributed variable
with inf.ProbModel() as m3:
a = inf.models.Normal(0,1, name="a")
b = inf.models.Categorical(probs=inf.case({a>0: [0.0, 1.0],
a<=0: [1.0, 0.0]}), name="b")
m3.sample()
# Normal distributed variable depending on a Categorical variable
with inf.ProbModel() as m4:
d = inf.models.Categorical(probs=[0.4,0.6], name="d")
c = inf.models.Normal(loc=inf.case({d.equal(0): 0.,
d.equal(1): 100.}), scale=1., name="c")
m4.sample()
Note that we might use the case function inside the replicate construct. The result will be a multi-batch random variable having the same distribution for each batch. When obtaining a sample from the model, each sample of a given batch in x is independent of the rest.
with inf.ProbModel() as m:
y = inf.models.Categorical(probs=[0.4,0.6], name="y")
with inf.replicate(size=10):
x = inf.models.Categorical(probs=inf.case({y.equal(0): [0.5, 0.5],
y.equal(1): [1.0, 0.0] }), name="x")
m.sample()
We can also use the functions inferpy.case_states
or inferpy.gather
for defining
the same model.
with inf.ProbModel() as m:
y = inf.models.Categorical(probs=[0.4,0.6], name="y")
x = inf.models.Categorical(probs=inf.case_states(y, {0: [0.0, 1.0],
1: [1.0, 0.0] }), name="x")
m.sample()
with inf.ProbModel() as m:
y = inf.models.Categorical(probs=[0.4,0.6], name="y")
with inf.replicate(size=10):
x = inf.models.Categorical(probs=inf.gather([[0.5, 0.5], [1.0, 0.0]], y), name="x")
m.sample()
We can use the function inferpy.case_states
with a list of variables (or multidimensional variables):
y = inf.models.Categorical(probs=[0.5,0.5], name="y", dim=2)
p = inf.case_states(y, {(0,0): [1.0, 0.0, 0.0, 0.0], (0,1): [0.0, 1.0, 0.0, 0.0],
(1, 0): [0.0, 0.0, 1.0, 0.0], (1,1): [0.0, 0.0, 0.0, 1.0]} )
with inf.replicate(size=10):
x = inf.models.Categorical(probs=p, name="x")
####
y = inf.models.Categorical(probs=[0.5,0.5], name="y", dim=1)
z = inf.models.Categorical(probs=[0.5,0.5], name="z", dim=1)
p = inf.case_states((y,z), {(0,0): [1.0, 0.0, 0.0, 0.0], (0,1): [0.0, 1.0, 0.0, 0.0],
(1, 0): [0.0, 0.0, 1.0, 0.0], (1,1): [0.0, 0.0, 0.0, 1.0]} )
with inf.replicate(size=10):
x = inf.models.Categorical(probs=p, name="x")
####
p = inf.case_states([y,z], {(0,0): [1.0, 0.0, 0.0, 0.0], (0,1): [0.0, 1.0, 0.0, 0.0],
(1, 0): [0.0, 0.0, 1.0, 0.0], (1,1): [0.0, 0.0, 0.0, 1.0]} )
with inf.replicate(size=10):
x = inf.models.Categorical(probs=p, name="x")
Supported Probability Distributions¶
Supported probability distributions are located in the package inferpy.models
. All of them
have inferpy.models.RandomVariable
as superclass. A list with all the supported distributions can be obtained as
as follows.
>>> inf.models.ALLOWED_VARS
['Bernoulli', 'Beta', 'Categorical', 'Deterministic', 'Dirichlet', 'Exponential', 'Gamma', 'InverseGamma', 'Laplace', 'Multinomial', 'Normal', 'Poisson', 'Uniform']
Bernoulli¶
Binary distribution which takes the value 1 with probability \(p\) and the value with \(1-p\). Its probability mass function is
An example of definition in InferPy of a random variable following a Bernoulli distribution is shown below. Note that the
input parameter probs
corresponds to \(p\) in the previous equation.
x = inf.models.Bernoulli(probs=0.5)
# or
x = inf.models.Bernoulli(logits=0)
This distribution can be initialized by indicating the logit function of the probability, i.e., \(logit(p) = log(\frac{p}{1-p})\).
Beta¶
Continuous distribution defined in the interval \([0,1]\) and parametrized by two positive shape parameters, denoted \(\alpha\) and \(\beta\).
where B is the beta function
The definition of a random variable following a Beta distribution is done as follows.
x = inf.models.Beta(concentration0=0.5, concentration1=0.5)
# or simply:
x = inf.models.Beta(0.5,0.5)
Note that the input parameters concentration0
and concentration1
correspond to the shape
parameters \(\alpha\) and \(\beta\) respectively.
Categorical¶
Discrete probability distribution that can take \(k\) possible states or categories. The probability of each state is separately defined:
where \(\mathbf{p} = (p_1, p_2, \ldots, p_k)\) is a k-dimensional vector with the probability associated to each possible state.
The definition of a random variable following a Categorical distribution is done as follows.
x = inf.models.Categorical(probs=[0.5,0.5])
# or
x = inf.models.Categorical(logits=[0,0])
Deterministic¶
The deterministic distribution is a probability distribution in a space (continuous or discrete) that always takes the same value \(k_0\). Its probability density (or mass) function can be defined as follows.
The definition of a random variable following a Beta distribution is done as follows:
x = inf.models.Deterministic(loc=5)
where the input parameter loc
corresponds to the value \(k_0\).
Dirichlet¶
Dirichlet distribution is a continuous multivariate probability distribution parmeterized by a vector of positive reals \((\alpha_1,\alpha_2,\ldots,\alpha_k)\). It is a multivariate generalization of the beta distribution. Dirichlet distributions are commonly used as prior distributions in Bayesian statistics. The Dirichlet distribution of order \(k \geq 2\) has the following density function.
The definition of a random variable following a Beta distribution is done as follows:
x = inf.models.Dirichlet(concentration=[5,1])
# or simply:
x = inf.models.Dirichlet([5,1])
where the input parameter concentration
is the vector \((\alpha_1,\alpha_2,\ldots,\alpha_k)\).
Exponential¶
The exponential distribution (also known as negative exponential distribution) is defined over a continuous domain and describes the time between events in a Poisson point process, i.e., a process in which events occur continuously and independently at a constant average rate. Its probability density function is
where \(\lambda>0\) is the rate or inverse scale.
The definition of a random variable following a exponential distribution is done as follows:
x = inf.models.Exponential(rate=1)
# or simply
x = inf.models.Exponential(1)
where the input parameter rate
corresponds to the value \(\lambda\).
Gamma¶
The Gamma distribution is a continuous probability distribution parametrized by a concentration (or shape) parameter \(\alpha>0\), and an inverse scale parameter \(\lambda>0\) called rate. Its density function is defined as follows.
for \(x > 0\) and where \(\Gamma(\alpha)\) is the gamma function.
The definition of a random variable following a gamma distribution is done as follows:
x = inf.models.Gamma(concentration=3, rate=2)
where the input parameters concentration
and rate
corespond to \(\alpha\) and \(\beta\) respectively.
Inverse-gamma¶
The Inverse-gamma distribution is a continuous probability distribution which is the distribution of the reciprocal of a variable distributed according to the gamma distribution. It is also parametrized by a concentration (or shape) parameter \(\alpha>0\), and an inverse scale parameter \(\lambda>0\) called rate. Its density function is defined as follows.
for \(x > 0\) and where \(\Gamma(\alpha)\) is the gamma function.
The definition of a random variable following a inverse-gamma distribution is done as follows:
x = inf.models.InverseGamma(concentration=3, rate=2)
where the input parameters concentration
and rate
corespond to \(\alpha\) and \(\beta\) respectively.
Laplace¶
The Laplace distribution is a continuous probability distribution with the following density function
The definition of a random variable following a Beta distribution is done as follows:
x = inf.models.Laplace(loc=0, scale=1)
# or simply
x = inf.models.Laplace(0,1)
where the input parameter loc
and scale
correspond to \(\mu\) and \(\sigma\) respectively.
Multinomial¶
The multinomial is a discrete distribution which models the probability of counts resulting from repeating \(n\) times an experiment with \(k\) possible outcomes. Its probability mass function is defined below.
where \(\mathbf{p}\) is a k-dimensional vector defined as \(\mathbf{p} = (p_1, p_2, \ldots, p_k)\) with the probability associated to each possible outcome.
The definition of a random variable following a multinomial distribution is done as follows:
x = inf.models.Multinomial(total_count=4, probs=[0.5,0.5])
# or
x = inf.models.Multinomial(total_count=4, logits=[0,0])
Multivariate-Normal¶
A multivariate-normal (or Gaussian) defines a set of normal-distributed variables which are assumed to be idependent. In other words, the covariance matrix is diagonal.
A single multivariate-normal distribution defined on \(\mathbb{R}^2\) can be defined as follows.
x = inf.models.MultivariateNormalDiag(
loc=[1., -1],
scale_diag=[1, 2.]
)
Normal¶
The normal (or Gaussian) distribution is a continuous probability distribution with the following density function
where \(\mu\) is the mean or expectation of the distribution, \(\sigma\) is the standard deviation, and \(\sigma^{2}\) is the variance.
A normal distribution can be defined as follows.
x = inf.models.Normal(loc=0, scale=1)
# or
x = inf.models.Normal(0,1)
where the input parameter loc
and scale
correspond to \(\mu\) and \(\sigma\) respectively.
Poisson¶
The Poisson distribution is a discrete probability distribution for modeling the number of times an event occurs in an interval of time or space. Its probability mass function is
where \(\lambda\) is the rate or number of events per interval.
A Poisson distribution can be defined as follows.
x = inf.models.Poisson(rate=4)
# or
x = inf.models.Poisson(4)
Uniform¶
The continuous uniform distribution or rectangular distribution assings the same probability to any \(x\) in the interval \([a,b]\).
A uniform distribution can be defined as follows.
x = inf.models.Uniform(low=1, high=3)
# or
inf.models.Uniform(1,3)
where the input parameters low
and high
correspond to the lower and upper bounds of the interval \([a,b]\).