Simd Library Documentation.

Functions for accelerating neural network inference in Synet Framework. More...

Modules

 Activation functions
 Functions to accelerate activation functions in Synet Framework.
 
 Conversion functions
 Functions to accelerate conversion in Synet Framework.
 
 Convolution framework
 A framework to accelerate convolution in Synet Framework.
 
 Winograd functions
 Functions to accelerate Winograd convolution algorithm in Synet Framework.
 

Data Structures

struct  SimdConvolutionParameters
 

Typedefs

typedef void(* SimdGemm32fNNPtr) (size_t M, size_t N, size_t K, const float *alpha, const float *A, size_t lda, const float *B, size_t ldb, const float *beta, float *C, size_t ldc)
 Callback function type "SimdGemm32fNNPtr". More...
 
typedef struct SimdConvolutionParameters SimdConvolutionParameters
 

Enumerations

enum  SimdConvolutionActivationType {
  SimdConvolutionActivationIdentity = 0,
  SimdConvolutionActivationRelu,
  SimdConvolutionActivationLeakyRelu,
  SimdConvolutionActivationRestrictRange,
  SimdConvolutionActivationPrelu,
  SimdConvolutionActivationElu,
  SimdConvolutionActivationHswish
}
 
enum  SimdSynetEltwiseOperationType {
  SimdSynetEltwiseOperationProduct,
  SimdSynetEltwiseOperationSum,
  SimdSynetEltwiseOperationMax,
  SimdSynetEltwiseOperationMin
}
 
enum  SimdSynetUnaryOperation32fType {
  SimdSynetUnaryOperation32fAbs,
  SimdSynetUnaryOperation32fExp,
  SimdSynetUnaryOperation32fLog,
  SimdSynetUnaryOperation32fNeg,
  SimdSynetUnaryOperation32fRsqrt,
  SimdSynetUnaryOperation32fSqrt,
  SimdSynetUnaryOperation32fTanh,
  SimdSynetUnaryOperation32fZero
}
 
enum  SimdTensorFormatType {
  SimdTensorFormatUnknown = -1,
  SimdTensorFormatNchw,
  SimdTensorFormatNhwc,
  SimdTensorFormatNchw4c,
  SimdTensorFormatNchw8c,
  SimdTensorFormatNchw16c,
  SimdTensorFormatNchwXc,
  SimdTensorFormatOiyx,
  SimdTensorFormatYxio,
  SimdTensorFormatOyxi4o,
  SimdTensorFormatOyxi8o,
  SimdTensorFormatOyxi16o,
  SimdTensorFormatOyxiXo
}
 
enum  SimdTensorDataType {
  SimdTensorDataUnknown = -1,
  SimdTensorData32f,
  SimdTensorData32i,
  SimdTensorData8i,
  SimdTensorData8u
}
 

Functions

SIMD_API void SimdSynetAddBias (const float *bias, size_t channels, size_t spatial, float *dst, SimdTensorFormatType format)
 Adds a bias to given vector. More...
 
SIMD_API void * SimdSynetDeconvolution32fInit (size_t batch, const SimdConvolutionParameters *conv, SimdGemm32fNNPtr gemm)
 Initializes FP32 deconvolution algorithm. More...
 
SIMD_API size_t SimdSynetDeconvolution32fExternalBufferSize (const void *context)
 Gets size of external temporary buffer required for FP32 deconvolution algorithm. More...
 
SIMD_API size_t SimdSynetDeconvolution32fInternalBufferSize (const void *context)
 Gets size of internal buffer used inside FP32 deconvolution algorithm. More...
 
SIMD_API void SimdSynetDeconvolution32fSetParams (void *context, const float *weight, SimdBool *internal, const float *bias, const float *params)
 Sets weights, biases and parameters of activation function required for FP32 deconvolution algorithm. More...
 
SIMD_API void SimdSynetDeconvolution32fForward (void *context, const float *src, float *buf, float *dst)
 Performs forward propagation of FP32 deconvolution algorithm. More...
 
SIMD_API void SimdSynetEltwiseLayerForward (float const *const *src, const float *weight, size_t count, size_t size, SimdSynetEltwiseOperationType type, float *dst)
 This function is used for forward propagation of EltwiseLayer. More...
 
SIMD_API void SimdSynetFusedLayerForward0 (const float *src, const float *bias, const float *scale, size_t channels, size_t spatial, float *dst, SimdTensorFormatType format)
 This function is used for forward propagation of FusedLayer (type 0). More...
 
SIMD_API void SimdSynetFusedLayerForward1 (const float *src, const float *bias0, const float *scale1, const float *bias1, size_t channels, size_t spatial, float *dst, SimdTensorFormatType format)
 This function is used for forward propagation of FusedLayer (type 1). More...
 
SIMD_API void SimdSynetFusedLayerForward2 (const float *src, const float *scale, const float *bias, size_t channels, size_t spatial, const float *slope, float *dst, SimdTensorFormatType format)
 This function is used for forward propagation of FusedLayer (type 2). More...
 
SIMD_API void SimdSynetFusedLayerForward3 (const float *src, const float *scale, const float *bias, size_t channels, size_t spatial, float *dst, SimdTensorFormatType format)
 This function is used for forward propagation of FusedLayer (type 3). More...
 
SIMD_API void SimdSynetFusedLayerForward4 (const float *src, const float *bias0, const float *scale1, const float *bias1, size_t channels, size_t spatial, float *dst, SimdTensorFormatType format)
 This function is used for forward propagation of FusedLayer (type 4). More...
 
SIMD_API void SimdSynetFusedLayerForward8 (const float *src0, const float *src1, const float *src2, size_t channels, size_t spatial, float *dst, SimdTensorFormatType format)
 This function is used for forward propagation of FusedLayer (type 8). More...
 
SIMD_API void SimdSynetFusedLayerForward9 (const float *src0, const float *src1, const float *scale, const float *bias, size_t channels0, size_t channels1, size_t spatial, float *dst0, float *dst1, SimdTensorFormatType format)
 This function is used for forward propagation of FusedLayer (type 9). More...
 
SIMD_API void SimdSynetInnerProductLayerForward (const float *src, const float *weight, const float *bias, size_t count, size_t size, float *dst)
 This function is used for forward propagation of InnerProductLayer. More...
 
SIMD_API void SimdSynetLrnLayerCrossChannels (const float *src, size_t half, size_t channels, size_t spatial, const float *k, float *dst, SimdTensorFormatType format)
 This function is used for forward propagation of LrnLayer (cross channels normalization). More...
 
SIMD_API void * SimdSynetMergedConvolution32fInit (size_t batch, const SimdConvolutionParameters *convs, size_t count, SimdBool add)
 Initializes FP32 merged convolution algorithm. More...
 
SIMD_API size_t SimdSynetMergedConvolution32fExternalBufferSize (const void *context)
 Gets size of external temporary buffer required for FP32 merged convolution algorithm. More...
 
SIMD_API size_t SimdSynetMergedConvolution32fInternalBufferSize (const void *context)
 Gets size of internal buffer used inside FP32 merged convolution algorithm. More...
 
SIMD_API void SimdSynetMergedConvolution32fSetParams (void *context, const float *const *weight, SimdBool *internal, const float *const *bias, const float *const *params)
 Sets weights, biases and parameters of activation function required for FP32 merged convolution algorithm. More...
 
SIMD_API void SimdSynetMergedConvolution32fForward (void *context, const float *src, float *buf, float *dst)
 Performs forward propagation of FP32 merged convolution algorithm. More...
 
SIMD_API void SimdSynetPoolingForwardAverage (const float *src, size_t srcC, size_t srcH, size_t srcW, size_t kernelY, size_t kernelX, size_t strideY, size_t strideX, size_t padY, size_t padX, float *dst, size_t dstH, size_t dstW, SimdBool excludePad, SimdTensorFormatType format)
 This function is used for forward propagation of PoolingLayer (AveragePooling). More...
 
SIMD_API void SimdSynetPoolingForwardMax (const float *src, size_t srcC, size_t srcH, size_t srcW, size_t kernelY, size_t kernelX, size_t strideY, size_t strideX, size_t padY, size_t padX, float *dst, size_t dstH, size_t dstW, SimdBool trans)
 This function is used for forward propagation of PoolingLayer (MaxPooling). More...
 
SIMD_API void SimdSynetScaleLayerForward (const float *src, const float *scale, const float *bias, size_t channels, size_t spatial, float *dst, SimdTensorFormatType format)
 This function is used for forward propagation of ScaleLayer. More...
 
SIMD_API void SimdSynetShuffleLayerForward (const float *src0, size_t srcC0, const float *src1, size_t srcC1, size_t spatial, float *dst0, float *dst1, size_t dstC, SimdTensorFormatType format)
 This function is used for forward propagation of ShuffleLayer. More...
 
SIMD_API void SimdSynetSoftmaxLayerForward (const float *src, size_t outer, size_t count, size_t inner, float *dst)
 This function is used for forward propagation of SoftmaxLayer. More...
 
SIMD_API SimdTensorFormatType SimdSynetSpecifyTensorFormat (SimdTensorFormatType format)
 Specifies hardware optimized tensor format of 5D-tensor for (input/output) image or 2D-convolution filter. More...
 
SIMD_API size_t SimdSynetTensorAlignment (SimdTensorFormatType format)
 Gets alignment required for current tensor format. More...
 
SIMD_API void SimdSynetUnaryOperation32fLayerForward (const float *src, size_t size, SimdSynetUnaryOperation32fType type, float *dst)
 This function is used for forward propagation of UnaryOperationLayer. More...
 

Detailed Description

Functions for accelerating neural network inference in Synet Framework.

Typedef Documentation

◆ SimdGemm32fNNPtr

typedef void(* SimdGemm32fNNPtr) (size_t M, size_t N, size_t K, const float *alpha, const float *A, size_t lda, const float *B, size_t ldb, const float *beta, float *C, size_t ldc)

Callback function type "SimdGemm32fNNPtr".

The function has to perform general matrix multiplication (for 32-bit float numbers).

C(M, N) = alpha*A(M, K)*B(K, N) + beta*C(M, N);
Parameters
[in]M- a height of A and height of C matrices.
[in]N- a width of B and width of C matrices.
[in]K- a width of A and height of B matrices.
[in]alpha- a pointer to multiplier of the first term.
[in]A- a pointer to input A matrix.
[in]lda- a leading dimension of A matrix.
[in]B- a pointer to input B matrix.
[in]ldb- a leading dimension of B matrix.
[in]beta- a pointer to multiplier of the second term.
[out]C- a pointer to output C matrix.
[in]ldc- a leading dimension of C matrix.
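
A minimal, unoptimized sketch of a callback that satisfies the contract above. It assumes row-major storage and that lda, ldb and ldc are row strides measured in elements (this page does not state the storage order, so treat that as an assumption). The name RefGemm32fNN is hypothetical and is not part of the library.

```c
#include <stddef.h>

/* Hypothetical reference callback with the SimdGemm32fNNPtr signature.
   Assumption: row-major matrices, lda/ldb/ldc are row strides in elements. */
void RefGemm32fNN(size_t M, size_t N, size_t K, const float *alpha,
                  const float *A, size_t lda, const float *B, size_t ldb,
                  const float *beta, float *C, size_t ldc)
{
    for (size_t i = 0; i < M; ++i)
    {
        for (size_t j = 0; j < N; ++j)
        {
            /* Dot product of row i of A and column j of B. */
            float sum = 0.0f;
            for (size_t k = 0; k < K; ++k)
                sum += A[i * lda + k] * B[k * ldb + j];
            /* C = alpha*A*B + beta*C, element by element. */
            C[i * ldc + j] = alpha[0] * sum + beta[0] * C[i * ldc + j];
        }
    }
}
```

A function with this shape could be passed as the gemm argument of SimdSynetDeconvolution32fInit, which also accepts NULL.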

◆ SimdConvolutionParameters

Enumeration Type Documentation

◆ SimdConvolutionActivationType

Describes type of activation function. It is used in SimdSynetConvolution32fInit, SimdSynetConvolution8iInit, SimdSynetDeconvolution32fInit and SimdSynetMergedConvolution32fInit.

Enumerator
SimdConvolutionActivationIdentity 

Identity (activation function is absent).

SimdConvolutionActivationRelu 

ReLU activation function.

dst[i] = Max(0, src[i]);
SimdConvolutionActivationLeakyRelu 

Leaky ReLU activation function. It has one parameter: slope (params[0]).

dst[i] = src[i] > 0 ? src[i] : slope*src[i];
SimdConvolutionActivationRestrictRange 

The activation function restricts range. It has two parameters: lower (params[0]) and upper (params[1]) bound.

dst[i] = Min(Max(lower, src[i]), upper);
SimdConvolutionActivationPrelu 

PReLU (parametric ReLU) activation function. It has m parameters: slopes[m] (m = dstC, n = dstH*dstW).

dst[i*n + j] = src[i*n + j] > 0 ? src[i*n + j] : slopes[i]*src[i*n + j];
SimdConvolutionActivationElu 

ELU activation function. It has one parameter: alpha (params[0]).

dst[i] = src[i] >= 0 ? src[i] : alpha*(Exp(src[i]) - 1);
SimdConvolutionActivationHswish 

H-Swish (https://arxiv.org/pdf/1905.02244.pdf) activation function. It has two parameters: shift (params[0]) and scale (params[1]).

dst[i] = Max(Min(src[i], shift) + shift, 0)*scale*src[i];
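
The two-parameter formulas above can be sketched as scalar helpers. The names MinF, MaxF, RefRestrictRange and RefHswish are hypothetical and exist only for this illustration; the arithmetic follows the enumerator descriptions. With shift = 3 and scale = 1/6 the H-Swish expression reduces to x * ReLU6(x + 3) / 6.

```c
/* Hypothetical helpers; not part of the library. */
static float MinF(float a, float b) { return a < b ? a : b; }
static float MaxF(float a, float b) { return a > b ? a : b; }

/* SimdConvolutionActivationRestrictRange: params[0] = lower, params[1] = upper. */
float RefRestrictRange(float x, const float *params)
{
    return MinF(MaxF(params[0], x), params[1]);
}

/* SimdConvolutionActivationHswish: params[0] = shift, params[1] = scale. */
float RefHswish(float x, const float *params)
{
    return MaxF(MinF(x, params[0]) + params[0], 0.0f) * params[1] * x;
}
```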

◆ SimdSynetEltwiseOperationType

Describes operation type used in function SimdSynetEltwiseLayerForward.

Enumerator
SimdSynetEltwiseOperationProduct 

Product.

SimdSynetEltwiseOperationSum 

Weighted sum.

SimdSynetEltwiseOperationMax 

Maximum.

SimdSynetEltwiseOperationMin 

Minimum.

◆ SimdSynetUnaryOperation32fType

Describes operation type used in function SimdSynetUnaryOperation32fLayerForward.

Enumerator
SimdSynetUnaryOperation32fAbs 

Gets absolute value for every point of input tensor.

SimdSynetUnaryOperation32fExp 

Gets exponent for every point of input tensor.

SimdSynetUnaryOperation32fLog 

Gets logarithm for every point of input tensor.

SimdSynetUnaryOperation32fNeg 

Gets negative for every point of input tensor.

SimdSynetUnaryOperation32fRsqrt 

Gets reciprocal square root for every point of input tensor.

SimdSynetUnaryOperation32fSqrt 

Gets square root for every point of input tensor.

SimdSynetUnaryOperation32fTanh 

Gets hyperbolic tangent for every point of input tensor.

SimdSynetUnaryOperation32fZero 

Gets zero value for every point of input tensor.

◆ SimdTensorFormatType

Describes Synet Framework 4D-tensor format type.

Enumerator
SimdTensorFormatUnknown 

Unknown tensor format.

SimdTensorFormatNchw 

NCHW (N - batch, C - channels, H - height, W - width) 4D-tensor format of (input/output) image.

SimdTensorFormatNhwc 

NHWC (N - batch, H - height, W - width, C - channels) 4D-tensor format of (input/output) image.

SimdTensorFormatNchw4c 

NCHW4c (N - batch, C - (channels + 3) / 4, H - height, W - width, 4c - channels grouped by 4) special 5D-tensor format of (input/output) image optimized for SSE and NEON.

SimdTensorFormatNchw8c 

NCHW8c (N - batch, C - (channels + 7) / 8, H - height, W - width, 8c - channels grouped by 8) special 5D-tensor format of (input/output) image optimized for AVX and AVX2.

SimdTensorFormatNchw16c 

NCHW16c (N - batch, C - (channels + 15) / 16, H - height, W - width, 16c - channels grouped by 16) special 5D-tensor format of (input/output) image optimized for AVX-512.

SimdTensorFormatNchwXc 

Unspecified hardware optimized 5D-tensor format of (input/output) image. Specific format (SimdTensorFormatNchw4c, SimdTensorFormatNchw8c or SimdTensorFormatNchw16c) is determined by function SimdSynetSpecifyTensorFormat.

SimdTensorFormatOiyx 

OIYX (O - output channels, I - input channels, Y - kernel height, X - kernel width) 4D-tensor format of 2D-convolution filter.

SimdTensorFormatYxio 

YXIO (Y - kernel height, X - kernel width, I - input channels, O - output channels) 4D-tensor format of 2D-convolution filter.

SimdTensorFormatOyxi4o 

OYXI4o (O - (output channels + 3)/4, Y - kernel height, X - kernel width, I - input channels, 4o - output channels grouped by 4) special 5D-tensor format of 2D-convolution filter optimized for SSE and NEON.

SimdTensorFormatOyxi8o 

OYXI8o (O - (output channels + 7)/8, Y - kernel height, X - kernel width, I - input channels, 8o - output channels grouped by 8) special 5D-tensor format of 2D-convolution filter optimized for AVX and AVX2.

SimdTensorFormatOyxi16o 

OYXI16o (O - (output channels + 15)/16, Y - kernel height, X - kernel width, I - input channels, 16o - output channels grouped by 16) special 5D-tensor format of 2D-convolution filter optimized for AVX-512.

SimdTensorFormatOyxiXo 

Unspecified hardware optimized 5D-tensor format of 2D-convolution filter. Specific format (SimdTensorFormatOyxi4o, SimdTensorFormatOyxi8o or SimdTensorFormatOyxi16o) is determined by function SimdSynetSpecifyTensorFormat.
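
To make the image layouts concrete, the sketch below computes the element offset of (c, h, w) inside one image for NCHW, NHWC and the blocked NCHWxc formats. The blocked formula is inferred from the dimension order given in the descriptions above (channel block, height, width, channel within block) and should be read as an assumption. All function names are hypothetical.

```c
#include <stddef.h>

/* NCHW: channel planes stored one after another. */
size_t OffsetNchw(size_t c, size_t h, size_t w, size_t H, size_t W)
{
    return (c * H + h) * W + w;
}

/* NHWC: all channels of one pixel stored together. */
size_t OffsetNhwc(size_t c, size_t h, size_t w, size_t W, size_t C)
{
    return (h * W + w) * C + c;
}

/* NCHWxc with block = 4, 8 or 16: channels are split into blocks and
   the channel index inside a block is the innermost dimension. */
size_t OffsetNchwXc(size_t c, size_t h, size_t w, size_t H, size_t W, size_t block)
{
    return (((c / block) * H + h) * W + w) * block + c % block;
}

/* Number of floats in one blocked image: channels are rounded up to a
   whole number of blocks, as in "(channels + 3) / 4" above. */
size_t SizeNchwXc(size_t C, size_t H, size_t W, size_t block)
{
    return (C + block - 1) / block * block * H * W;
}
```

For example, an image with 6 channels stored as NCHW4c occupies two channel blocks, so it needs room for 8 channels per pixel.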

◆ SimdTensorDataType

Describes Synet Framework tensor data type.

Enumerator
SimdTensorDataUnknown 

Unknown tensor data type.

SimdTensorData32f 

32-bit floating point.

SimdTensorData32i 

32-bit signed integer.

SimdTensorData8i 

8-bit signed integer.

SimdTensorData8u 

8-bit unsigned integer.

Function Documentation

◆ SimdSynetAddBias()

void SimdSynetAddBias ( const float *  bias,
size_t  channels,
size_t  spatial,
float *  dst,
SimdTensorFormatType  format 
)

Adds a bias to given vector.

Algorithm's details (example for NCHW tensor format):

for(c = 0; c < channels; ++c)
    for(s = 0; s < spatial; ++s)
         dst[c*spatial + s] += bias[c];
Note
This function is used in Synet Framework.
Parameters
[in]bias- a pointer to the 32-bit float array with bias coefficients. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)).
[in]channels- a number of channels in the image tensor.
[in]spatial- a spatial size of image tensor.
[in,out]dst- a pointer to cumulative 32-bit image tensor. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)) * spatial.
[in]format- a format of image tensor.
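
The algorithm above covers NCHW only. The sketch below places it next to an NHWC counterpart, assuming that in NHWC the channel index is the innermost one (which follows from the SimdTensorFormatNhwc description). The function names are hypothetical scalar references, not library entry points.

```c
#include <stddef.h>

/* NCHW: each channel is a contiguous run of 'spatial' values. */
void RefAddBiasNchw(const float *bias, size_t channels, size_t spatial, float *dst)
{
    for (size_t c = 0; c < channels; ++c)
        for (size_t s = 0; s < spatial; ++s)
            dst[c * spatial + s] += bias[c];
}

/* NHWC: each spatial position holds 'channels' consecutive values. */
void RefAddBiasNhwc(const float *bias, size_t channels, size_t spatial, float *dst)
{
    for (size_t s = 0; s < spatial; ++s)
        for (size_t c = 0; c < channels; ++c)
            dst[s * channels + c] += bias[c];
}
```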

◆ SimdSynetDeconvolution32fInit()

void * SimdSynetDeconvolution32fInit ( size_t  batch,
const SimdConvolutionParameters *  conv,
SimdGemm32fNNPtr  gemm 
)

Initializes FP32 deconvolution algorithm.

Parameters
[in]batch- a batch size.
[in]conv- a pointer to deconvolution parameters.
[in]gemm- a pointer to external function of matrix multiplication. Can be NULL.
Returns
a pointer to FP32 deconvolution context. On error it returns NULL. It must be released with function SimdRelease. This pointer is used in functions SimdSynetDeconvolution32fExternalBufferSize, SimdSynetDeconvolution32fInternalBufferSize, SimdSynetDeconvolution32fSetParams and SimdSynetDeconvolution32fForward.

◆ SimdSynetDeconvolution32fExternalBufferSize()

size_t SimdSynetDeconvolution32fExternalBufferSize ( const void *  context)

Gets size of external temporary buffer required for FP32 deconvolution algorithm.

Parameters
[in]context- a pointer to FP32 deconvolution context. It must be created by function SimdSynetDeconvolution32fInit and released by function SimdRelease.
Returns
size of external temporary buffer required for FP32 deconvolution algorithm.

◆ SimdSynetDeconvolution32fInternalBufferSize()

size_t SimdSynetDeconvolution32fInternalBufferSize ( const void *  context)

Gets size of internal buffer used inside FP32 deconvolution algorithm.

Parameters
[in]context- a pointer to FP32 deconvolution context. It must be created by function SimdSynetDeconvolution32fInit and released by function SimdRelease.
Returns
size of internal buffer used inside FP32 deconvolution algorithm.

◆ SimdSynetDeconvolution32fSetParams()

void SimdSynetDeconvolution32fSetParams ( void *  context,
const float *  weight,
SimdBool *  internal,
const float *  bias,
const float *  params 
)

Sets weights, biases and parameters of activation function required for FP32 deconvolution algorithm.

Parameters
[in,out]context- a pointer to FP32 deconvolution context. It must be created by function SimdSynetDeconvolution32fInit and released by function SimdRelease.
[in]weight- a pointer to deconvolution weights.
[out]internal- a flag signaling that the weights are stored in the internal buffer. Can be NULL.
[in]bias- a pointer to bias. Can be NULL.
[in]params- a pointer to parameters of activation functions (see SimdConvolutionActivationType). Can be NULL.

◆ SimdSynetDeconvolution32fForward()

void SimdSynetDeconvolution32fForward ( void *  context,
const float *  src,
float *  buf,
float *  dst 
)

Performs forward propagation of FP32 deconvolution algorithm.

Parameters
[in]context- a pointer to FP32 deconvolution context. It must be created by function SimdSynetDeconvolution32fInit and released by function SimdRelease.
[in]src- a pointer to input tensor.
[out]buf- a pointer to external temporary buffer. The size of the external temporary buffer is determined by function SimdSynetDeconvolution32fExternalBufferSize. Can be NULL (it causes usage of internal buffer).
[out]dst- a pointer to output tensor.

◆ SimdSynetEltwiseLayerForward()

void SimdSynetEltwiseLayerForward ( float const *const *  src,
const float *  weight,
size_t  count,
size_t  size,
SimdSynetEltwiseOperationType  type,
float *  dst 
)

This function is used for forward propagation of EltwiseLayer.

Algorithm's details for SimdSynetEltwiseOperationProduct:

for(j = 0; j < size; ++j)
    dst[j] = 1;
for(i = 0; i < count; ++i)
    for(j = 0; j < size; ++j)
        dst[j] *= src[i][j];

Algorithm's details for SimdSynetEltwiseOperationSum:

for(j = 0; j < size; ++j)
    dst[j] = 0;
for(i = 0; i < count; ++i)
    for(j = 0; j < size; ++j)
        dst[j] += src[i][j]*weight[i];

Algorithm's details for SimdSynetEltwiseOperationMax:

for(j = 0; j < size; ++j)
    dst[j] = -FLT_MAX;
for(i = 0; i < count; ++i)
    for(j = 0; j < size; ++j)
        dst[j] = Max(dst[j], src[i][j]);

Algorithm's details for SimdSynetEltwiseOperationMin:

for(j = 0; j < size; ++j)
    dst[j] = FLT_MAX;
for(i = 0; i < count; ++i)
    for(j = 0; j < size; ++j)
        dst[j] = Min(dst[j], src[i][j]);
Note
This function is used in Synet Framework.
Parameters
[in]src- a pointer to pointers to the input 32-bit float arrays.
[in]weight- a pointer to the 32-bit float array with sum coefficients. It is needed only for the SimdSynetEltwiseOperationSum operation type; otherwise it can be NULL.
[in]count- a count of input arrays. Must be at least 2.
[in]size- a size of the input and output arrays.
[in]type- a type of operation (see SimdSynetEltwiseOperationType).
[out]dst- a pointer to the output 32-bit float array.
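
A compact scalar sketch of the four loops above, folded into one function. Instead of seeding dst with 1, 0, -FLT_MAX or FLT_MAX it seeds the accumulator with the first input, which should give the same results for finite inputs because count is at least 2. The enum EltOp and the name RefEltwise are hypothetical stand-ins so that the block compiles without the library header.

```c
#include <stddef.h>

/* Hypothetical mirror of SimdSynetEltwiseOperationType. */
typedef enum { EltProduct, EltSum, EltMax, EltMin } EltOp;

void RefEltwise(const float *const *src, const float *weight, size_t count,
                size_t size, EltOp op, float *dst)
{
    for (size_t j = 0; j < size; ++j)
    {
        /* Seed with the first input (weighted for the sum), then fold in the rest. */
        float acc = (op == EltSum) ? src[0][j] * weight[0] : src[0][j];
        for (size_t i = 1; i < count; ++i)
        {
            float v = src[i][j];
            switch (op)
            {
            case EltProduct: acc *= v; break;
            case EltSum: acc += v * weight[i]; break;
            case EltMax: acc = acc > v ? acc : v; break;
            case EltMin: acc = acc < v ? acc : v; break;
            }
        }
        dst[j] = acc;
    }
}
```

As in the documented function, weight is read only for the sum operation, so NULL is acceptable for the other three.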

◆ SimdSynetFusedLayerForward0()

void SimdSynetFusedLayerForward0 ( const float *  src,
const float *  bias,
const float *  scale,
size_t  channels,
size_t  spatial,
float *  dst,
SimdTensorFormatType  format 
)

This function is used for forward propagation of FusedLayer (type 0).

Algorithm's details (example for NCHW tensor format):

for(c = 0; c < channels; ++c)
    for(s = 0; s < spatial; ++s)
    {
        o = c*spatial + s;
        x = src[o] + bias[c];
        dst[o] = (x - abs(x))*scale[c] + max(0, x);
    }
Note
This function is used in Synet Framework.
Parameters
[in]src- a pointer to the 32-bit float array with input image tensor. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)) * spatial.
[in]bias- a pointer to the 32-bit float array with bias coefficients. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)).
[in]scale- a pointer to the 32-bit float array with scale coefficients. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)).
[in]channels- a number of channels in the (input/output) image tensor.
[in]spatial- a spatial size of (input/output) image tensor.
[out]dst- a pointer to the 32-bit float array with output image tensor. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)) * spatial.
[in]format- a format of (input/output) image tensor.
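
The expression in the algorithm above is easier to read once expanded: for x >= 0 the term (x - abs(x)) is zero and the result is x, while for x < 0 it equals 2*x and the result is 2*scale[c]*x. In other words, the layer behaves like a per-channel leaky ReLU with slope 2*scale[c] applied to src + bias. A scalar sketch for the NCHW case (the name RefFused0Nchw is hypothetical):

```c
#include <stddef.h>

/* Hypothetical scalar reference of the type 0 fused layer, NCHW layout. */
void RefFused0Nchw(const float *src, const float *bias, const float *scale,
                   size_t channels, size_t spatial, float *dst)
{
    for (size_t c = 0; c < channels; ++c)
    {
        for (size_t s = 0; s < spatial; ++s)
        {
            size_t o = c * spatial + s;
            float x = src[o] + bias[c];
            float ax = x < 0.0f ? -x : x;   /* abs(x) */
            float relu = x > 0.0f ? x : 0.0f; /* max(0, x) */
            dst[o] = (x - ax) * scale[c] + relu;
        }
    }
}
```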

◆ SimdSynetFusedLayerForward1()

void SimdSynetFusedLayerForward1 ( const float *  src,
const float *  bias0,
const float *  scale1,
const float *  bias1,
size_t  channels,
size_t  spatial,
float *  dst,
SimdTensorFormatType  format 
)

This function is used for forward propagation of FusedLayer (type 1).

Algorithm's details (example for NCHW tensor format):

for(c = 0; c < channels; ++c)
    for(s = 0; s < spatial; ++s)
    {
        o = c*spatial + s;
        x = src[o] + bias0[c];
        dst[o] = max(0, -x)*scale1[c] + bias1[c] + max(0, x);
    }
Note
This function is used in Synet Framework.
Parameters
[in]src- a pointer to the 32-bit float array with input image tensor. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)) * spatial.
[in]bias0- a pointer to the 32-bit float array with bias0 coefficients. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)).
[in]scale1- a pointer to the 32-bit float array with scale1 coefficients. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)).
[in]bias1- a pointer to the 32-bit float array with bias1 coefficients. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)).
[in]channels- a number of channels in the (input/output) image tensor.
[in]spatial- a spatial size of (input/output) image tensor.
[out]dst- a pointer to the 32-bit float array with output image tensor. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)) * spatial.
[in]format- a format of (input/output) image tensor.

◆ SimdSynetFusedLayerForward2()

void SimdSynetFusedLayerForward2 ( const float *  src,
const float *  scale,
const float *  bias,
size_t  channels,
size_t  spatial,
const float *  slope,
float *  dst,
SimdTensorFormatType  format 
)

This function is used for forward propagation of FusedLayer (type 2).

Algorithm's details (example for NCHW tensor format):

for(c = 0; c < channels; ++c)
    for(s = 0; s < spatial; ++s)
    {
        o = c*spatial + s;
        x = src[o]*scale[c]  + bias[c];
        dst[o] = max(0, x) + min(0, x)*slope[0];
    }
Note
This function is used in Synet Framework.
Parameters
[in]src- a pointer to the 32-bit float array with input image tensor. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)) * spatial.
[in]scale- a pointer to the 32-bit float array with scale coefficients. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)).
[in]bias- a pointer to the 32-bit float array with bias coefficients. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)).
[in]channels- a number of channels in the (input/output) image tensor.
[in]spatial- a spatial size of (input/output) image tensor.
[in]slope- a pointer to the 32-bit float slope coefficient.
[out]dst- a pointer to the 32-bit float array with output image tensor. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)) * spatial.
[in]format- a format of (input/output) image tensor.

◆ SimdSynetFusedLayerForward3()

void SimdSynetFusedLayerForward3 ( const float *  src,
const float *  scale,
const float *  bias,
size_t  channels,
size_t  spatial,
float *  dst,
SimdTensorFormatType  format 
)

This function is used for forward propagation of FusedLayer (type 3).

Algorithm's details (example for NCHW tensor format):

for(c = 0; c < channels; ++c)
    for(s = 0; s < spatial; ++s)
    {
        o = c*spatial + s;
        x = src[o] + bias[c];
        dst[o] = max(0, x) + min(0, x)*scale[c];
    }
Note
This function is used in Synet Framework.
Parameters
[in]src- a pointer to the 32-bit float array with input image tensor. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)) * spatial.
[in]bias- a pointer to the 32-bit float array with bias coefficients. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)).
[in]scale- a pointer to the 32-bit float array with scale coefficients. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)).
[in]channels- a number of channels in the (input/output) image tensor.
[in]spatial- a spatial size of (input/output) image tensor.
[out]dst- a pointer to the 32-bit float array with output image tensor. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)) * spatial.
[in]format- a format of (input/output) image tensor.

◆ SimdSynetFusedLayerForward4()

void SimdSynetFusedLayerForward4 ( const float *  src,
const float *  bias0,
const float *  scale1,
const float *  bias1,
size_t  channels,
size_t  spatial,
float *  dst,
SimdTensorFormatType  format 
)

This function is used for forward propagation of FusedLayer (type 4).

Algorithm's details (example for NCHW tensor format):

for(c = 0; c < channels; ++c)
    for(s = 0; s < spatial; ++s)
    {
        x = src[c*spatial + s] + bias0[c];
        dst[c*spatial + s] = max(0, x);
        dst[(c + channels)*spatial + s] = max(0, x*scale1[0] + bias1[0]);
    }
Note
This function is used in Synet Framework.
Parameters
[in]src- a pointer to the 32-bit float array with input image tensor. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)) * spatial.
[in]bias0- a pointer to the 32-bit float array with bias0 coefficients. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)).
[in]scale1- a pointer to the 32-bit float array with scale1 coefficients. The size of the array is 1.
[in]bias1- a pointer to the 32-bit float array with bias1 coefficients. The size of the array is 1.
[in]channels- a number of channels in the input image tensor. Output image tensor has 2 * channels.
[in]spatial- a spatial size of (input/output) image tensor.
[out]dst- a pointer to the 32-bit float array with output image tensor. The size of the array is SimdAlign (2 * channels, SimdSynetTensorAlignment (format)) * spatial.
[in]format- a format of (input/output) image tensor.

◆ SimdSynetFusedLayerForward8()

void SimdSynetFusedLayerForward8 ( const float *  src0,
const float *  src1,
const float *  src2,
size_t  channels,
size_t  spatial,
float *  dst,
SimdTensorFormatType  format 
)

This function is used for forward propagation of FusedLayer (type 8).

Algorithm's details (example for NCHW tensor format):

for(c = 0; c < channels; ++c)
    for(s = 0; s < spatial; ++s)
    {
        o = c*spatial + s;
        dst[o] = src0[o] + src1[o]*src2[c];
    }
Note
This function is used in Synet Framework.
Parameters
[in]src0- a pointer to the first input 32-bit float array. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)) * spatial.
[in]src1- a pointer to the second input 32-bit float array. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)) * spatial.
[in]src2- a pointer to the third input 32-bit float array. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)).
[in]channels- a number of channels in the (input/output) image tensor.
[in]spatial- a spatial size of (input/output) image tensor.
[out]dst- a pointer to the output 32-bit float array. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)) * spatial.
[in]format- a format of (input/output) image tensor.

◆ SimdSynetFusedLayerForward9()

void SimdSynetFusedLayerForward9 ( const float *  src0,
const float *  src1,
const float *  scale,
const float *  bias,
size_t  channels0,
size_t  channels1,
size_t  spatial,
float *  dst0,
float *  dst1,
SimdTensorFormatType  format 
)

This function is used for forward propagation of FusedLayer (type 9).

Algorithm's details (example for NCHW tensor format):

for(c = 0; c < channels0; ++c)
    for(s = 0; s < spatial; ++s)
    {
        dst0[c*spatial + s] = max(0, src0[c*spatial + s]*scale[c] + bias[c]);
        if(dst1)
            dst1[c*spatial + s] = src0[c*spatial + s];
    }
for(c = 0; c < channels1; ++c)
    for(s = 0; s < spatial; ++s)
    {
        dst0[(c + channels0)*spatial + s] = max(0, src1[c*spatial + s]*scale[channels0 + c] + bias[channels0 + c]);
        if(dst1)
            dst1[(c + channels0)*spatial + s] = src1[c*spatial + s];
    }
Note
This function is used in Synet Framework.
Parameters
[in]src0- a pointer to the first input 32-bit float array. The size of the array is SimdAlign (channels0, SimdSynetTensorAlignment (format)) * spatial.
[in]src1- a pointer to the second input 32-bit float array. The size of the array is SimdAlign (channels1, SimdSynetTensorAlignment (format)) * spatial.
[in]scale- a pointer to the 32-bit float array with scale coefficients. The size of the array is SimdAlign (channels0 + channels1, SimdSynetTensorAlignment (format)).
[in]bias- a pointer to the 32-bit float array with bias coefficients. The size of the array is SimdAlign (channels0 + channels1, SimdSynetTensorAlignment (format)).
[in]channels0- a number of channels in the first input image tensor.
[in]channels1- a number of channels in the second input image tensor.
[in]spatial- a spatial size of (input/output) image tensor.
[out]dst0- a pointer to the first output 32-bit float array. The size of the array is SimdAlign (channels0 + channels1, SimdSynetTensorAlignment (format)) * spatial.
[out]dst1- a pointer to the second output 32-bit float array. The size of the array is SimdAlign (channels0 + channels1, SimdSynetTensorAlignment (format)) * spatial. The pointer can be NULL.
[in]format- a format of (input/output) image tensor.
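
Read as a whole, the two loops above concatenate src0 and src1 along the channel axis, apply a per-channel scale, bias and ReLU into dst0, and optionally copy the raw concatenation into dst1. A scalar sketch for NCHW that makes this structure explicit (Fused9Part and RefFused9Nchw are hypothetical names):

```c
#include <stddef.h>

/* Processes one input: scale + bias + ReLU into dst0, optional raw copy into dst1. */
static void Fused9Part(const float *src, const float *scale, const float *bias,
                       size_t channels, size_t spatial, float *dst0, float *dst1)
{
    for (size_t c = 0; c < channels; ++c)
    {
        for (size_t s = 0; s < spatial; ++s)
        {
            size_t o = c * spatial + s;
            float y = src[o] * scale[c] + bias[c];
            dst0[o] = y > 0.0f ? y : 0.0f;
            if (dst1)
                dst1[o] = src[o];
        }
    }
}

/* Hypothetical scalar reference of the type 9 fused layer, NCHW layout. */
void RefFused9Nchw(const float *src0, const float *src1, const float *scale,
                   const float *bias, size_t channels0, size_t channels1,
                   size_t spatial, float *dst0, float *dst1)
{
    /* The second input lands right after the channels0 planes of the first. */
    size_t shift = channels0 * spatial;
    Fused9Part(src0, scale, bias, channels0, spatial, dst0, dst1);
    Fused9Part(src1, scale + channels0, bias + channels0, channels1, spatial,
               dst0 + shift, dst1 ? dst1 + shift : NULL);
}
```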

◆ SimdSynetInnerProductLayerForward()

void SimdSynetInnerProductLayerForward ( const float *  src,
const float *  weight,
const float *  bias,
size_t  count,
size_t  size,
float *  dst 
)

This function is used for forward propagation of InnerProductLayer.

Algorithm's details:

for(i = 0; i < count; ++i)
{
    dst[i] = (bias ? bias[i] : 0);
    for(j = 0; j < size; ++j)
       dst[i] += src[j]*weight[i*size + j];
}
Note
This function is used in Synet Framework.
Parameters
[in]src- a pointer to the input 32-bit float array. The size of the array must be equal to size.
[in]weight- a pointer to the 32-bit float array with weight coefficients. The size of the array must be equal to count*size.
[in]bias- a pointer to the 32-bit float array with bias coefficients. The size of the array must be equal to count. Can be NULL.
[in]count- a size of output array.
[in]size- a size of input array.
[out]dst- a pointer to the output 32-bit float array. The size of the array must be equal to count.
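The algorithm above can be written as a plain scalar C function, which is handy as a baseline when checking the optimized routine. This is only a sketch: the name InnerProductRef is hypothetical and not part of the library, and the weights are assumed to be stored row by row (count rows of size elements), as the index expression weight[i*size + j] suggests.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar reference, not part of the Simd API:
   dst[i] = bias[i] + sum over j of src[j] * weight[i*size + j]. */
void InnerProductRef(const float* src, const float* weight, const float* bias,
    size_t count, size_t size, float* dst)
{
    assert(src && weight && dst); /* bias may be NULL */
    for (size_t i = 0; i < count; ++i)
    {
        float sum = bias ? bias[i] : 0.0f;
        for (size_t j = 0; j < size; ++j)
            sum += src[j] * weight[i * size + j];
        dst[i] = sum;
    }
}
```

With this layout the call is a matrix-vector product: a count x size weight matrix multiplied by an input vector of length size.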

◆ SimdSynetLrnLayerCrossChannels()

void SimdSynetLrnLayerCrossChannels ( const float *  src,
size_t  half,
size_t  channels,
size_t  spatial,
const float *  k,
float *  dst,
SimdTensorFormatType  format 
)

This function is used for forward propagation of LrnLayer (cross channels normalization).

Algorithm's details (example for NCHW tensor format):

for(c = 0; c < channels; ++c)
    for(s = 0; s < spatial; ++s)
    {
        lo = Max(0, c - half);
        hi = Min(channels, c + half + 1);
        sum = 0;
        for(i = lo; i < hi; ++i)
            sum += Square(src[i*spatial + s]);
        dst[c*spatial + s] = src[c*spatial + s]*Pow(k[0] + sum*k[1], k[2]);
    }
Note
This function is used in Synet Framework.
Parameters
[in]src- a pointer to the 32-bit float array with input image tensor. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)) * spatial.
[in]half- a local normalization half size.
[in]channels- a number of channels in the (input/output) image tensor.
[in]spatial- a spatial size of (input/output) image tensor.
[in]k- a pointer to the 32-bit float array with 3 coefficients (see algorithm details).
[out]dst- a pointer to the 32-bit float array with output image tensor. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)) * spatial.
[in]format- a format of (input/output) image tensor.

◆ SimdSynetMergedConvolution32fInit()

void * SimdSynetMergedConvolution32fInit ( size_t  batch,
const SimdConvolutionParameters *  convs,
size_t  count,
SimdBool  add 
)

Initializes FP32 merged convolution algorithm.

Parameters
[in]batch- a batch size.
[in]convs- an array with convolutions parameters.
[in]count- a number of merged convolutions.
[in]add- a flag that signals whether the output must be added to the source value.
Returns
a pointer to FP32 merged convolution context. On error it returns NULL. It must be released with the function SimdRelease. This pointer is used in functions SimdSynetMergedConvolution32fExternalBufferSize, SimdSynetMergedConvolution32fInternalBufferSize, SimdSynetMergedConvolution32fSetParams and SimdSynetMergedConvolution32fForward.

◆ SimdSynetMergedConvolution32fExternalBufferSize()

size_t SimdSynetMergedConvolution32fExternalBufferSize ( const void *  context)

Gets size of external temporary buffer required for FP32 merged convolution algorithm.

Parameters
[in]context- a pointer to FP32 merged convolution context. It must be created by function SimdSynetMergedConvolution32fInit and released by function SimdRelease.
Returns
size of external temporary buffer required for FP32 merged convolution algorithm.

◆ SimdSynetMergedConvolution32fInternalBufferSize()

size_t SimdSynetMergedConvolution32fInternalBufferSize ( const void *  context)

Gets size of internal buffer used inside FP32 merged convolution algorithm.

Parameters
[in]context- a pointer to FP32 merged convolution context. It must be created by function SimdSynetMergedConvolution32fInit and released by function SimdRelease.
Returns
size of internal buffer used inside FP32 merged convolution algorithm.

◆ SimdSynetMergedConvolution32fSetParams()

void SimdSynetMergedConvolution32fSetParams ( void *  context,
const float *const *  weight,
SimdBool *  internal,
const float *const *  bias,
const float *const *  params 
)

Sets weights, biases and parameters of activation function required for FP32 merged convolution algorithm.

Parameters
[in,out]context- a pointer to FP32 merged convolution context. It must be created by function SimdSynetMergedConvolution32fInit and released by function SimdRelease.
[in]weight- a pointer to the array with pointers to convolution weights. The array size is determined by number of merged convolutions.
[out]internal- a pointer to the array of flags indicating whether the weights are stored in the internal buffer. The array size is determined by number of merged convolutions. Can be NULL.
[in]bias- a pointer to the array with pointers to bias. The array size is determined by number of merged convolutions. Can be NULL.
[in]params- a pointer to the array with pointers to parameters of the activation functions (see SimdConvolutionActivationType). The array size is determined by number of merged convolutions. Can be NULL.

◆ SimdSynetMergedConvolution32fForward()

void SimdSynetMergedConvolution32fForward ( void *  context,
const float *  src,
float *  buf,
float *  dst 
)

Performs forward propagation of FP32 merged convolution algorithm.

Parameters
[in]context- a pointer to FP32 merged convolution context. It must be created by function SimdSynetMergedConvolution32fInit and released by function SimdRelease.
[in]src- a pointer to input image.
[out]buf- a pointer to external temporary buffer. The size of the external temporary buffer is determined by function SimdSynetMergedConvolution32fExternalBufferSize. Can be NULL (it causes usage of internal buffer).
[out]dst- a pointer to output image.

◆ SimdSynetPoolingForwardAverage()

void SimdSynetPoolingForwardAverage ( const float *  src,
size_t  srcC,
size_t  srcH,
size_t  srcW,
size_t  kernelY,
size_t  kernelX,
size_t  strideY,
size_t  strideX,
size_t  padY,
size_t  padX,
float *  dst,
size_t  dstH,
size_t  dstW,
SimdBool  excludePad,
SimdTensorFormatType  format 
)

This function is used for forward propagation of PoolingLayer (AveragePooling).

Note
This function is used in Synet Framework.
Parameters
[in]src- a pointer to the input 32-bit float array. The size of the array must be equal to srcC*srcH*srcW.
[in]srcC- a number of input and output channels.
[in]srcH- an input height.
[in]srcW- an input width.
[in]kernelY- a height of the pooling kernel.
[in]kernelX- a width of the pooling kernel.
[in]strideY- a y-stride of the pooling.
[in]strideX- an x-stride of the pooling.
[in]padY- a pad to the top of the input image.
[in]padX- a pad to the left of the input image.
[out]dst- a pointer to the output 32-bit float array. The size of the array must be equal to srcC*dstH*dstW.
[in]dstH- an output height.
[in]dstW- an output width.
[in]excludePad- a flag to exclude the padding from the average value calculation.
[in]format- a format of (input/output) image tensor.
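The parameters above can be illustrated with a scalar sketch of average pooling for the NCHW layout. Several things here are assumptions rather than documented behaviour: the name PoolingAverageRef is hypothetical, excludePad is passed as a plain int instead of SimdBool, and the meaning of the flag is assumed to be "divide by the number of elements that lie inside the image" when set and "divide by kernelY*kernelX" otherwise.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar reference for the NCHW case, not part of the Simd API.
   Assumption: excludePad selects the divisor (in-image element count vs. full kernel area). */
void PoolingAverageRef(const float* src, size_t srcC, size_t srcH, size_t srcW,
    size_t kernelY, size_t kernelX, size_t strideY, size_t strideX,
    size_t padY, size_t padX, float* dst, size_t dstH, size_t dstW, int excludePad)
{
    assert(src && dst && kernelY > 0 && kernelX > 0);
    for (size_t c = 0; c < srcC; ++c)
        for (size_t dy = 0; dy < dstH; ++dy)
            for (size_t dx = 0; dx < dstW; ++dx)
            {
                /* Window [y0, y1) x [x0, x1) in input coordinates, clamped to the image. */
                size_t y0 = dy * strideY > padY ? dy * strideY - padY : 0;
                size_t x0 = dx * strideX > padX ? dx * strideX - padX : 0;
                size_t y1 = dy * strideY + kernelY > padY ? dy * strideY + kernelY - padY : 0;
                size_t x1 = dx * strideX + kernelX > padX ? dx * strideX + kernelX - padX : 0;
                if (y1 > srcH) y1 = srcH;
                if (x1 > srcW) x1 = srcW;
                if (y0 > y1) y0 = y1; /* window lies completely outside the image */
                if (x0 > x1) x0 = x1;
                float sum = 0.0f;
                for (size_t y = y0; y < y1; ++y)
                    for (size_t x = x0; x < x1; ++x)
                        sum += src[(c * srcH + y) * srcW + x];
                size_t area = excludePad ? (y1 - y0) * (x1 - x0) : kernelY * kernelX;
                dst[(c * dstH + dy) * dstW + dx] = area ? sum / (float)area : 0.0f;
            }
}
```

For a 2x2 input with a 2x2 kernel, stride 2 and padding 1, each output window covers exactly one input element, so the two divisor choices differ by a factor of 4.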

◆ SimdSynetPoolingForwardMax()

void SimdSynetPoolingForwardMax ( const float *  src,
size_t  srcC,
size_t  srcH,
size_t  srcW,
size_t  kernelY,
size_t  kernelX,
size_t  strideY,
size_t  strideX,
size_t  padY,
size_t  padX,
float *  dst,
size_t  dstH,
size_t  dstW,
SimdBool  trans 
)

This function is used for forward propagation of PoolingLayer (MaxPooling).

Note
This function is used in Synet Framework.
Parameters
[in]src- a pointer to the input 32-bit float array. The size of the array must be equal to srcC*srcH*srcW.
[in]srcC- a number of input and output channels.
[in]srcH- an input height.
[in]srcW- an input width.
[in]kernelY- a height of the pooling kernel.
[in]kernelX- a width of the pooling kernel.
[in]strideY- a y-stride of the pooling.
[in]strideX- an x-stride of the pooling.
[in]padY- a pad to the top of the input image.
[in]padX- a pad to the left of the input image.
[out]dst- a pointer to the output 32-bit float array. The size of the array must be equal to srcC*dstH*dstW.
[in]dstH- an output height.
[in]dstW- an output width.
[in]trans- a flag of transposed input and output data (SimdFalse - NCHW order, SimdTrue - NHWC order).
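The effect of the trans flag on indexing can be shown with a scalar sketch of max pooling. The name PoolingMaxRef is hypothetical, trans is passed as a plain int instead of SimdBool, and it is an assumption that positions falling into the padding are simply ignored when the maximum is taken.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* Hypothetical scalar reference, not part of the Simd API.
   trans == 0: NCHW order, trans != 0: NHWC order. */
void PoolingMaxRef(const float* src, size_t srcC, size_t srcH, size_t srcW,
    size_t kernelY, size_t kernelX, size_t strideY, size_t strideX,
    size_t padY, size_t padX, float* dst, size_t dstH, size_t dstW, int trans)
{
    assert(src && dst);
    for (size_t c = 0; c < srcC; ++c)
        for (size_t dy = 0; dy < dstH; ++dy)
            for (size_t dx = 0; dx < dstW; ++dx)
            {
                float best = -FLT_MAX;
                for (size_t ky = 0; ky < kernelY; ++ky)
                    for (size_t kx = 0; kx < kernelX; ++kx)
                    {
                        size_t y = dy * strideY + ky, x = dx * strideX + kx;
                        /* Assumption: positions inside the padding are skipped. */
                        if (y < padY || x < padX || y - padY >= srcH || x - padX >= srcW)
                            continue;
                        y -= padY;
                        x -= padX;
                        size_t si = trans ? (y * srcW + x) * srcC + c : (c * srcH + y) * srcW + x;
                        if (src[si] > best)
                            best = src[si];
                    }
                size_t di = trans ? (dy * dstW + dx) * srcC + c : (c * dstH + dy) * dstW + dx;
                dst[di] = best;
            }
}
```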

◆ SimdSynetScaleLayerForward()

void SimdSynetScaleLayerForward ( const float *  src,
const float *  scale,
const float *  bias,
size_t  channels,
size_t  spatial,
float *  dst,
SimdTensorFormatType  format 
)

This function is used for forward propagation of ScaleLayer.

Algorithm's details (example for NCHW tensor format):

for(c = 0; c < channels; ++c)
    for(s = 0; s < spatial; ++s)
        dst[c*spatial + s] = src[c*spatial + s]*scale[c] + (bias ? bias[c] : 0);
Note
This function is used in Synet Framework.
Parameters
[in]src- a pointer to the 32-bit float array with input image tensor. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)) * spatial.
[in]scale- a pointer to the 32-bit float array with scale coefficients. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)).
[in]bias- a pointer to the 32-bit float array with bias coefficients. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)). Can be NULL.
[in]channels- a number of channels in the (input/output) image tensor.
[in]spatial- a spatial size of (input/output) image tensor.
[out]dst- a pointer to the 32-bit float array with output image tensor. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)) * spatial.
[in]format- a format of (input/output) image tensor.
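The pseudocode above covers only the NCHW layout. The sketch below adds the NHWC case, where the channel index varies fastest. The name ScaleLayerRef and the int flag nhwc are hypothetical stand-ins for the library function and its format parameter, and blocked formats such as NCHW8c are not handled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar reference, not part of the Simd API.
   nhwc == 0: channel-major (NCHW) indexing as in the pseudocode above;
   nhwc != 0: channel-minor (NHWC) indexing. */
void ScaleLayerRef(const float* src, const float* scale, const float* bias,
    size_t channels, size_t spatial, float* dst, int nhwc)
{
    assert(src && scale && dst); /* bias may be NULL */
    for (size_t c = 0; c < channels; ++c)
        for (size_t s = 0; s < spatial; ++s)
        {
            size_t i = nhwc ? s * channels + c : c * spatial + s;
            dst[i] = src[i] * scale[c] + (bias ? bias[c] : 0.0f);
        }
}
```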

◆ SimdSynetShuffleLayerForward()

void SimdSynetShuffleLayerForward ( const float *  src0,
size_t  srcC0,
const float *  src1,
size_t  srcC1,
size_t  spatial,
float *  dst0,
float *  dst1,
size_t  dstC,
SimdTensorFormatType  format 
)

This function is used for forward propagation of ShuffleLayer.

Note
This function is used in Synet Framework.
Parameters
[in]src0- a pointer to the 32-bit float array with the first input image tensor.
[in]srcC0- a number of channels in the first input image tensor. It must be an even number.
[in]src1- a pointer to the 32-bit float array with the second input image tensor.
[in]srcC1- a number of channels in the second input image tensor. It must be an even number.
[in]spatial- a spatial size of (input/output) image tensors.
[out]dst0- a pointer to the 32-bit float array with the first output image tensor.
[out]dst1- a pointer to the 32-bit float array with the second output image tensor.
[in]dstC- a number of channels in the first and the second output image tensors.
[in]format- a format of (input/output) image tensors.

◆ SimdSynetSoftmaxLayerForward()

void SimdSynetSoftmaxLayerForward ( const float *  src,
size_t  outer,
size_t  count,
size_t  inner,
float *  dst 
)

This function is used for forward propagation of SoftmaxLayer.

Note
This function is used in Synet Framework.
Parameters
[in]src- a pointer to the input 32-bit float array. The size of the array must be equal to outer*count*inner.
[in]outer- an outer size of input and output arrays.
[in]count- a size of softmax dimension.
[in]inner- an inner size of input and output arrays.
[out]dst- a pointer to the output 32-bit float array. The size of the array must be equal to outer*count*inner.

◆ SimdSynetSpecifyTensorFormat()

SimdTensorFormatType SimdSynetSpecifyTensorFormat ( SimdTensorFormatType  format)

Specifies hardware optimized tensor format of 5D-tensor for (input/output) image or 2D-convolution filter.

Note
This function is used in Synet Framework.
Parameters
[in]format- an unspecified hardware optimized 5D-tensor format of (input/output) image or 2D-convolution filter. It can be SimdTensorFormatNchwXc or SimdTensorFormatOyxiXo.
Returns
specified hardware optimized 5D-tensor format.

◆ SimdSynetTensorAlignment()

size_t SimdSynetTensorAlignment ( SimdTensorFormatType  format)

Gets alignment required for current tensor format.

Note
This function is used in Synet Framework.
Parameters
[in]format- a tensor format.
Returns
alignment required for current tensor format.
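Several functions in this section state array sizes as SimdAlign (channels, SimdSynetTensorAlignment (format)) * spatial. The sketch below shows how such a size could be computed. It assumes that SimdAlign rounds its first argument up to a multiple of the second; AlignUp and TensorSize are hypothetical helpers, and the alignment value is passed in directly rather than obtained from the library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for SimdAlign: rounds size up to a multiple of align. */
size_t AlignUp(size_t size, size_t align)
{
    assert(align > 0);
    return (size + align - 1) / align * align;
}

/* Number of floats needed for a tensor whose channel count is padded to the alignment. */
size_t TensorSize(size_t channels, size_t spatial, size_t alignment)
{
    return AlignUp(channels, alignment) * spatial;
}
```

For example, 3 channels with an alignment of 8 and a spatial size of 10 need 8 * 10 = 80 floats, while an alignment of 1 needs only 30.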

◆ SimdSynetUnaryOperation32fLayerForward()

void SimdSynetUnaryOperation32fLayerForward ( const float *  src,
size_t  size,
SimdSynetUnaryOperation32fType  type,
float *  dst 
)

This function is used for forward propagation of UnaryOperationLayer.

Note
This function is used in Synet Framework.
Parameters
[in]src- a pointer to the input 32-bit float array.
[in]size- a size of the input and output arrays.
[in]type- a unary operation type (see SimdSynetUnaryOperation32fType).
[out]dst- a pointer to the output 32-bit float array.