waltlabtools.core module

Core functionality for the waltlabtools module.

Includes the classes Model and CalCurve, and core functions for assay analysis.

Everything in waltlabtools.core is automatically imported with waltlabtools, so it can be accessed via, e.g.,

import waltlabtools as wlt  # waltlabtools main functionality

cal_curve = wlt.CalCurve()  # creates a new empty calibration curve

class CalCurve(model=None, coefs=(), lod=-inf, lod_sds=3, force_lod=False)[source]

Bases: object

Calibration curve.

A calibration curve is the result of regressing the calibrator data with a specific functional form.

Parameters
  • model (Model or str) -- The functional model to use. Should be a valid Model object or a string referring to a built-in Model.

  • coefs (list-like) -- Numerical values of the parameters specified by model.

  • lod (numeric, optional) -- Lower limit of detection (LOD).

  • lod_sds (numeric, default 3) -- Number of standard deviations above blank at which the lower limit of detection is calculated. Common values include 2.5 (Quanterix), 3 (Walt Lab), and 10 (lower limit of quantification, LLOQ).

  • force_lod (bool, default False) -- Should readings below the LOD be set to the LOD?

bound_lod(x_flat)[source]

Sets values below the limit of detection (LOD) to the LOD.

If force_lod is True, returns a version of the given data with all values below the LOD set to be the LOD. Otherwise, returns the original data.

Parameters

x_flat (array) -- Data to be bounded. Must be an array, such as the output of flatten.

Returns

array -- If force_lod is True, a copy of x_flat in which every value below the LOD has been replaced by the LOD, so that all values are at or above the LOD. Otherwise, the original array.
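
The clamping behavior described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: bound_lod_sketch is a hypothetical stand-alone function that takes lod and force_lod as arguments (the real method reads them from the CalCurve) and works on lists rather than arrays.

```python
def bound_lod_sketch(x_flat, lod, force_lod=False):
    """Return x_flat with values below lod raised to lod.

    Hypothetical stand-in for CalCurve.bound_lod, using plain lists
    instead of arrays.
    """
    if not force_lod:
        # Nothing to do: hand back the data unchanged.
        return x_flat
    # Clamp from below: every value smaller than the LOD becomes the LOD.
    return [max(value, lod) for value in x_flat]


readings = [0.2, 1.5, 0.05, 3.0]
print(bound_lod_sketch(readings, lod=0.5, force_lod=True))   # [0.5, 1.5, 0.5, 3.0]
print(bound_lod_sketch(readings, lod=0.5, force_lod=False))  # [0.2, 1.5, 0.05, 3.0]
```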

classmethod from_data(x, y, model, lod_sds=3, force_lod: bool = False, use_inverse: bool = False, weights='1/y^2', p0=None, bounds=None, method=None, corr='c4')[source]

Constructs a calibration curve from data.

Parameters
  • x (array-like) -- The independent variable, e.g., concentration.

  • y (array-like) -- The dependent variable, e.g., fluorescence.

  • model (Model) -- Mathematical model used.

  • lod (numeric, optional) -- Lower limit of detection (LOD).

  • lod_sds (numeric, default 3) -- Number of standard deviations above blank at which the lower limit of detection is calculated. Common values include 2.5 (Quanterix), 3 (Walt Lab), and 10 (lower limit of quantification, LLOQ).

  • force_lod (bool, default False) -- Should readings below the LOD be set to the LOD?

  • use_inverse (bool, default False) -- Should x be regressed as a function of y instead?

  • weights (str or array-like, default "1/y^2") -- Weights to be used. If array-like, should be the same size as x and y. Otherwise, can be one of the following:

    • "1/y^2" : Inverse-squared (1/y^2) weighting.

    • "1" : Equal weighting for all data points.

    Other strings raise a NotImplementedError.

  • p0 (array-like, optional) -- Initial guess for the parameters. If provided, must have the same length as the number of parameters. If None, then the initial values will all be 1 (if the number of parameters for the function can be determined using introspection, otherwise a ValueError is raised).

  • bounds (2-tuple of array-like, optional) -- Lower and upper bounds on parameters. Defaults to no bounds. Each element of the tuple must be either an array with the length equal to the number of parameters, or a scalar (in which case the bound is taken to be the same for all parameters). Use np.inf with an appropriate sign to disable bounds on all or some parameters.

  • method ({"lm", "trf", "dogbox"}, optional) -- Method to use for optimization. See scipy.optimize.least_squares for more details. Default is "lm" for unconstrained problems and "trf" if bounds are provided. The method "lm" won’t work when the number of observations is less than the number of variables; use "trf" or "dogbox" in this case.

  • corr ({"n", "n-1", "n-1.5", "c4"} or numeric, default "c4") -- The sample standard deviation underestimates the population standard deviation for a normally distributed variable. This parameter specifies how that bias is corrected. Options:

    • "n" : Divide by the number of samples to yield the uncorrected sample standard deviation.

    • "n-1" : Divide by the number of samples minus one to yield the square root of the unbiased sample variance.

    • "n-1.5" : Divide by the number of samples minus 1.5 to yield the approximate unbiased sample standard deviation.

    • "c4" : Divide by the correction factor to yield the exact unbiased sample standard deviation.

    • If numeric, gives the delta degrees of freedom.

Returns

CalCurve

classmethod from_function(fun, inverse, lod: float = -inf, lod_sds=3, force_lod=False, xscale='linear', yscale='linear')[source]

Constructs a calibration curve from a function.

Parameters
  • fun (function) -- Forward function, mapping values to measurement readings.

  • inverse (function) -- Inverse function, mapping measurement readings to values.

  • lod (numeric, optional) -- Lower limit of detection (LOD).

  • lod_sds (numeric, default 3) -- Number of standard deviations above blank at which the lower limit of detection is calculated. Common values include 2.5 (Quanterix), 3 (Walt Lab), and 10 (lower limit of quantification, LLOQ).

  • force_lod (bool, default False) -- Should readings below the LOD be set to the LOD?

  • xscale, yscale ({"linear", "log", "symlog", "logit"}, default "linear") -- The natural scaling transformations for x and y. For example, "log" means that the data may be distributed log-normally and are best visualized on a log scale.

Returns

CalCurve

fun(x)[source]

Forward function, mapping values to measurement readings.

Use fun to convert values (e.g., concentration) to the measurement readings (e.g., fluorescence) that they should yield.

Parameters

x (numeric or array-like) -- Values, such as concentration.

Returns

y (same as input or array) -- Measurement readings, such as fluorescence, calculated from the values x using the calibration curve. If possible, y will be the same type, size, and shape as x; if not, y will be a 1D array of the same size as x.

inverse(y)[source]

Inverse function, mapping measurement readings to values.

Use inverse to convert measurement readings (e.g., fluorescence) to values (e.g., concentration) of the sample.

Parameters

y (numeric or array-like) -- Measurement readings, such as fluorescence.

Returns

x (same as input or array) -- Values, such as concentration, calculated from the measurement readings y using the calibration curve. If possible, x will be the same type, size, and shape as y; if not, x will be a 1D array of the same size as y.

class Model(fun=None, inverse=None, name: str = '', params=(), xscale='linear', yscale='linear')[source]

Bases: object

Mathematical model for calibration curve fitting.

A Model is an object with a function and its inverse, with one or more free parameters that can be fit to calibration curve data.

Parameters
  • fun (function) -- Forward functional form. Should be a function which takes in x and other parameters and returns y. The first parameter of fun should be x, and the remaining parameters should be the coefficients which are fit to the data (typically floats).

  • inverse (function) -- Inverse functional form. Should be a function which takes in y and other parameters and returns x. The first parameter of inverse should be y, and the remaining parameters should be the same coefficients as in fun.

  • name (str) -- The name of the function. For example, "4PL" or "linear".

  • params (list-like of str) -- The names of the parameters for the function. This should be the same length as the number of arguments which fun and inverse take after their inputs x and y, respectively.

  • xscale, yscale ({"linear", "log", "symlog", "logit"}, default "linear") -- The natural scaling transformations for x and y. For example, "log" means that the data may be distributed log-normally and are best visualized on a log scale.
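
The signature convention for fun and inverse can be illustrated with a simple linear form. This is only a sketch: linear_fun, linear_inverse, and the coefficient names a and b are hypothetical and are not taken from the built-in models. It shows the data argument coming first, the coefficients following in the same order in both functions, and how binding fitted coefficients produces the one-argument forward and inverse functions that a calibration curve exposes.

```python
from functools import partial


def linear_fun(x, a, b):
    """Forward form: x first, then the coefficients a (slope) and b (intercept)."""
    return a * x + b


def linear_inverse(y, a, b):
    """Inverse form: y first, then the same coefficients as linear_fun."""
    return (y - b) / a


# Names of the coefficients that follow x (or y), in signature order.
params = ("a", "b")

# Binding fitted coefficients yields one-argument functions, which is
# the role played by a calibration curve's fun and inverse.
coefs = (2.0, 0.5)
fun = partial(linear_fun, a=coefs[0], b=coefs[1])
inverse = partial(linear_inverse, a=coefs[0], b=coefs[1])

concentration = 3.0
signal = fun(concentration)   # 2.0 * 3.0 + 0.5
recovered = inverse(signal)   # maps the reading back to the value
```

With a real Model, these two functions and the parameter names would be passed as fun, inverse, and params.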

aeb(fon_)[source]

The average number of enzymes per bead.

Converts the fraction of on-beads (fon) to the average number of enzymes per bead (AEB) using Poisson statistics. The formula used is aeb_ = -log(1 - fon_).

Parameters

fon_ (numeric or array-like) -- A scalar or array of fractions of beads which are "on."

Returns

aeb_ (same as input, or array) -- The average number of enzymes per bead.

See also

fon

inverse of aeb
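
The Poisson conversion and its inverse can be written with the standard library alone. The sketch below is scalar-only, and aeb_sketch and fon_sketch are hypothetical names; it applies the formulas aeb_ = -log(1 - fon_) and fon_ = 1 - exp(-aeb_) quoted in this module, using log1p and expm1 for accuracy at small fractions (a choice made for this sketch, not necessarily the library's).

```python
import math


def aeb_sketch(fon_):
    """AEB from the fraction of on-beads: aeb_ = -log(1 - fon_)."""
    # log1p(-f) computes log(1 - f) accurately for small f.
    return -math.log1p(-fon_)


def fon_sketch(aeb_):
    """Fraction of on-beads from AEB: fon_ = 1 - exp(-aeb_)."""
    # expm1(-a) computes exp(-a) - 1 accurately for small a.
    return -math.expm1(-aeb_)


# Half of the beads being "on" corresponds to AEB = log(2).
half_on = aeb_sketch(0.5)
# The two conversions undo each other.
round_trip = fon_sketch(aeb_sketch(0.3))
```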

c4(n)[source]

Factor c4 for unbiased estimation of the standard deviation.

For a finite sample, the sample standard deviation tends to underestimate the population standard deviation. See, e.g., https://www.spcpress.com/pdf/DJW353.pdf for details. Dividing the sample standard deviation by the correction factor c4 gives an unbiased estimator of the population standard deviation.

Parameters

n (numeric or array) -- The number of samples.

Returns

numeric or array -- The correction factor, usually written c4 or b(n).

See also

numpy.std

standard deviation

lod

limit of detection
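
For reference, the standard closed form of the correction factor is c4(n) = sqrt(2 / (n - 1)) * Gamma(n / 2) / Gamma((n - 1) / 2). The sketch below evaluates it for a scalar n with the standard library; c4_sketch is a hypothetical name, and the use of log-gamma is a choice made here for numerical stability, not a statement about how the library computes it.

```python
import math


def c4_sketch(n):
    """Correction factor c4(n) for a sample of size n >= 2."""
    if n < 2:
        raise ValueError("c4 is defined for n >= 2")
    # Work with log-gamma so that large n does not overflow math.gamma.
    log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return math.sqrt(2 / (n - 1)) * math.exp(log_ratio)


# The factor is below 1 for small samples and approaches 1 as n grows,
# so dividing by it inflates the sample standard deviation slightly.
for n in (2, 3, 10, 100):
    print(n, round(c4_sketch(n), 4))
```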

flatten(data, on_bad_data='warn')[source]

Flattens most data structures.

Parameters
  • data (any) -- The data structure to be flattened. Can also be a primitive.

  • on_bad_data ({"error", "ignore", "warn"}, default "warn") -- Specifies what to do when the data cannot be coerced to an ndarray. Options are as follows:

    • "error" : Raises TypeError.

    • "ignore" : Returns a list or, failing that, the original object.

    • "warn" : Returns as in "ignore", but raises a warning.

Returns

flattened_data (array, list, or primitive) -- Flattened version of data. If on_bad_data="error", always an array.
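
The idea of flattening nested containers can be sketched recursively. This simplified illustration differs from flatten in several ways, all of them assumptions made for the sketch: flatten_sketch is a hypothetical name, it handles only lists and tuples, it always returns a list (a primitive is wrapped in a one-element list rather than returned as-is), and it has no on_bad_data handling.

```python
def flatten_sketch(data):
    """Flatten nested lists and tuples into one flat list."""
    if isinstance(data, (list, tuple)):
        flat = []
        for item in data:
            # Recurse so that arbitrarily deep nesting is unrolled.
            flat.extend(flatten_sketch(item))
        return flat
    # Anything else (numbers, strings, ...) is treated as a single element.
    return [data]


print(flatten_sketch([1, [2, (3, 4)], [[5]]]))  # [1, 2, 3, 4, 5]
```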

fon(aeb_)[source]

The fraction of beads which are on.

Converts the average enzymes per bead (AEB) to the fraction of on-beads (fon) using Poisson statistics. The formula used is fon_ = 1 - exp(-aeb_).

Parameters

aeb_ (numeric or array-like) -- A scalar or array of the average number of enzymes per bead.

Returns

fon_ (same as input, or array) -- The fractions of beads which are "on."

See also

aeb

inverse of fon

gmnd(data)[source]

Geometric meandian.

For details, see https://xkcd.com/2435/. This function compares the three most common measures of central tendency for a given dataset: the arithmetic mean, the geometric mean, and the median.

Parameters

data (array-like) -- The data for which to take the measure of central tendency.

Returns

central_tendencies (dict of str -> numeric) -- The measures of central tendency, ordered by their distance from the geometric meandian. Its keys are:

  • "gmnd" : geometric meandian (always first)

  • "arithmetic" : arithmetic mean

  • "geometric" : geometric mean

  • "median" : median
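
A minimal sketch of the procedure from the linked comic, under the assumption that gmnd follows it: replace the data by its (arithmetic mean, geometric mean, median) triple and repeat until the three values agree. The function names, the convergence tolerance, and the iteration cap are hypothetical, and the data are assumed to be strictly positive so that the geometric mean exists.

```python
import math
import statistics


def gmnd_sketch(data, rel_tol=1e-12, max_iter=1000):
    """Iterate (mean, geometric mean, median) until the three values agree."""
    values = [float(v) for v in data]
    for _ in range(max_iter):
        values = [
            statistics.fmean(values),
            statistics.geometric_mean(values),
            statistics.median(values),
        ]
        if math.isclose(max(values), min(values), rel_tol=rel_tol):
            break
    return values[0]


def central_tendencies_sketch(data):
    """Return the measures ordered by distance from the converged value."""
    center = gmnd_sketch(data)
    measures = {
        "arithmetic": statistics.fmean(data),
        "geometric": statistics.geometric_mean(data),
        "median": statistics.median(data),
    }
    ordered = sorted(measures.items(), key=lambda item: abs(item[1] - center))
    # Dicts keep insertion order, so "gmnd" stays first.
    return {"gmnd": center, **dict(ordered)}


result = central_tendencies_sketch([1, 1, 2, 3, 5])
```

For the data 1, 1, 2, 3, 5 the iteration settles near 2.089, and the median ends up closest to it, followed by the geometric and then the arithmetic mean.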

lod(blank_signal, inverse_fun=None, sds=3, corr='c4')[source]

Compute the limit of detection (LOD).

Parameters
  • blank_signal (array-like) -- Signal (e.g., average number of enzymes per bead, AEB) of the zero calibrator. Must have at least two elements.

  • inverse_fun (function or CalCurve) -- The functional form used for the calibration curve. If a function, it should accept the measurement reading (y, e.g., fluorescence) as its only argument and return the value (x, e.g., concentration). If inverse_fun is a CalCurve object, the LOD will be calculated from its inverse method.

  • sds (numeric, optional) -- Number of standard deviations above the mean of the background at which the limit of detection is calculated. Common values include 2.5 (Quanterix), 3 (Walt Lab), and 10 (lower limit of quantification, LLOQ).

  • corr ({"n", "n-1", "n-1.5", "c4"} or numeric, default "c4") -- The sample standard deviation underestimates the population standard deviation for a normally distributed variable. This parameter specifies how that bias is corrected. Options:

    • "n" : Divide by the number of samples to yield the uncorrected sample standard deviation.

    • "n-1" : Divide by the number of samples minus one to yield the square root of the unbiased sample variance.

    • "n-1.5" : Divide by the number of samples minus 1.5 to yield the approximate unbiased sample standard deviation.

    • "c4" : Divide by the correction factor to yield the exact unbiased sample standard deviation.

    • If numeric, gives the delta degrees of freedom.

Returns

lod_x (numeric) -- The limit of detection, in units of x (e.g., concentration).

See also

c4

unbiased estimation of the population standard deviation

numpy.std

standard deviation
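
The calculation can be sketched as: take the mean of the blank signal, add sds corrected standard deviations, and map the result through the inverse function. The sketch below uses only the standard library. The names sample_sd and lod_sketch are hypothetical, and returning the threshold in signal units when inverse_fun is None is an assumption of this sketch, not documented behavior.

```python
import math
import statistics


def sample_sd(values, corr="c4"):
    """Standard deviation of values under the given bias correction."""
    n = len(values)
    mean = statistics.fmean(values)
    sum_sq = sum((v - mean) ** 2 for v in values)
    if corr == "n":
        return math.sqrt(sum_sq / n)
    if corr == "n-1":
        return math.sqrt(sum_sq / (n - 1))
    if corr == "n-1.5":
        return math.sqrt(sum_sq / (n - 1.5))
    if corr == "c4":
        # Standard closed form of c4(n), evaluated via log-gamma.
        log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
        c4 = math.sqrt(2 / (n - 1)) * math.exp(log_ratio)
        return math.sqrt(sum_sq / (n - 1)) / c4
    # Numeric corr: treated as the delta degrees of freedom.
    return math.sqrt(sum_sq / (n - corr))


def lod_sketch(blank_signal, inverse_fun=None, sds=3, corr="c4"):
    """LOD = inverse(mean(blank) + sds * sd(blank))."""
    blank = list(blank_signal)
    lod_y = statistics.fmean(blank) + sds * sample_sd(blank, corr)
    # Assumption: with no inverse function, report the signal threshold.
    return lod_y if inverse_fun is None else inverse_fun(lod_y)


blank = [0.010, 0.012, 0.011, 0.013]
# Hypothetical linear calibration y = 2 * x, so the inverse is y / 2.
lod_x = lod_sketch(blank, inverse_fun=lambda y: y / 2, sds=3, corr="c4")
```

Because c4 is below 1, the "c4" option gives a slightly larger standard deviation, and hence a slightly higher LOD, than "n-1", which in turn exceeds "n".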

regress(model, x, y, use_inverse: bool = False, weights='1/y^2', p0=None, bounds=None, method=None)[source]

Performs a (nonlinear) regression and returns the fitted coefficients.

Parameters
  • model (waltlabtools.Model or str) -- The functional model to use. Should be a valid waltlabtools.Model object or a string referring to a built-in Model.

  • x (array-like) -- The independent variable, e.g., concentration.

  • y (array-like) -- The dependent variable, e.g., fluorescence.

  • use_inverse (bool, default False) -- Should x be regressed as a function of y instead?

  • weights (str or array-like, default "1/y^2") -- Weights to be used. If array-like, should be the same size as x and y. Otherwise, can be one of the following:

    • "1/y^2" : Inverse-squared (1/y^2) weighting.

    • "1" : Equal weighting for all data points.

    Other strings raise a NotImplementedError.

  • p0 (array-like, optional) -- Initial guess for the parameters. If provided, must have the same length as the number of parameters. If None, then the initial values will all be 1 (if the number of parameters for the function can be determined using introspection, otherwise a ValueError is raised).

  • bounds (2-tuple of array-like, optional) -- Lower and upper bounds on parameters. Defaults to no bounds. Each element of the tuple must be either an array with the length equal to the number of parameters, or a scalar (in which case the bound is taken to be the same for all parameters). Use np.inf with an appropriate sign to disable bounds on all or some parameters.

  • method ({"lm", "trf", "dogbox"}, optional) -- Method to use for optimization. See scipy.optimize.least_squares for more details. Default is "lm" for unconstrained problems and "trf" if bounds are provided. The method "lm" won’t work when the number of observations is less than the number of variables; use "trf" or "dogbox" in this case.

Returns

popt (array) -- Optimal values for the parameters so that the sum of the squared residuals is minimized.

See also

scipy.optimize.curve_fit

backend function used by regress
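
To illustrate what the weights options mean, the sketch below fits a straight line by weighted least squares in closed form, minimizing the sum of w * (y - (slope * x + intercept))^2. It is not how regress works internally (the documentation above names scipy.optimize.curve_fit as the backend, and arbitrary models are supported there); weighted_linear_fit is a hypothetical, linear-only stand-in that uses no third-party packages.

```python
def weighted_linear_fit(x, y, weights="1/y^2"):
    """Weighted least-squares fit of y = slope * x + intercept."""
    if isinstance(weights, str):
        if weights == "1/y^2":
            # Inverse-squared weighting: small readings count as much
            # as large ones in relative terms.
            w = [1.0 / (yi * yi) for yi in y]
        elif weights == "1":
            w = [1.0] * len(y)
        else:
            raise NotImplementedError(f"unsupported weights: {weights!r}")
    else:
        w = [float(wi) for wi in weights]

    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]  # exactly y = 2 * x + 1
slope, intercept = weighted_linear_fit(x, y)
```

On noise-free data every weighting recovers the same line; the choice of weights matters only once the readings scatter, where "1/y^2" keeps high-signal calibrators from dominating the fit.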