Functions for accelerating neural network inference in Synet Framework. More...
Modules | |
Activation functions | |
Functions to accelerate activation functions in Synet Framework. | |
Conversion functions | |
Functions to accelerate conversion in Synet Framework. | |
Convolution framework | |
A framework to accelerate convolution in Synet Framework. | |
Winograd functions | |
Functions to accelerate Winograd convolution algorithm in Synet Framework. | |
Data Structures | |
struct | SimdConvolutionParameters |
Typedefs | |
typedef void(* | SimdGemm32fNNPtr) (size_t M, size_t N, size_t K, const float *alpha, const float *A, size_t lda, const float *B, size_t ldb, const float *beta, float *C, size_t ldc) |
Callback function type SimdGemm32fNNPtr. More... | |
typedef struct SimdConvolutionParameters | SimdConvolutionParameters |
Functions | |
SIMD_API void | SimdSynetAddBias (const float *bias, size_t channels, size_t spatial, float *dst, SimdTensorFormatType format) |
Adds a bias to the given image tensor. More... | |
SIMD_API void * | SimdSynetDeconvolution32fInit (size_t batch, const SimdConvolutionParameters *conv, SimdGemm32fNNPtr gemm) |
Initializes FP32 deconvolution algorithm. More... | |
SIMD_API size_t | SimdSynetDeconvolution32fExternalBufferSize (const void *context) |
Gets size of external temporary buffer required for FP32 deconvolution algorithm. More... | |
SIMD_API size_t | SimdSynetDeconvolution32fInternalBufferSize (const void *context) |
Gets size of internal buffer used inside FP32 deconvolution algorithm. More... | |
SIMD_API void | SimdSynetDeconvolution32fSetParams (void *context, const float *weight, SimdBool *internal, const float *bias, const float *params) |
Sets weights, biases and parameters of activation function required for FP32 deconvolution algorithm. More... | |
SIMD_API void | SimdSynetDeconvolution32fForward (void *context, const float *src, float *buf, float *dst) |
Performs forward propagation of FP32 deconvolution algorithm. More... | |
SIMD_API void | SimdSynetEltwiseLayerForward (float const *const *src, const float *weight, size_t count, size_t size, SimdSynetEltwiseOperationType type, float *dst) |
This function is used for forward propagation of EltwiseLayer. More... | |
SIMD_API void | SimdSynetFusedLayerForward0 (const float *src, const float *bias, const float *scale, size_t channels, size_t spatial, float *dst, SimdTensorFormatType format) |
This function is used for forward propagation of FusedLayer (type 0). More... | |
SIMD_API void | SimdSynetFusedLayerForward1 (const float *src, const float *bias0, const float *scale1, const float *bias1, size_t channels, size_t spatial, float *dst, SimdTensorFormatType format) |
This function is used for forward propagation of FusedLayer (type 1). More... | |
SIMD_API void | SimdSynetFusedLayerForward2 (const float *src, const float *scale, const float *bias, size_t channels, size_t spatial, const float *slope, float *dst, SimdTensorFormatType format) |
This function is used for forward propagation of FusedLayer (type 2). More... | |
SIMD_API void | SimdSynetFusedLayerForward3 (const float *src, const float *scale, const float *bias, size_t channels, size_t spatial, float *dst, SimdTensorFormatType format) |
This function is used for forward propagation of FusedLayer (type 3). More... | |
SIMD_API void | SimdSynetFusedLayerForward4 (const float *src, const float *bias0, const float *scale1, const float *bias1, size_t channels, size_t spatial, float *dst, SimdTensorFormatType format) |
This function is used for forward propagation of FusedLayer (type 4). More... | |
SIMD_API void | SimdSynetFusedLayerForward8 (const float *src0, const float *src1, const float *src2, size_t channels, size_t spatial, float *dst, SimdTensorFormatType format) |
This function is used for forward propagation of FusedLayer (type 8). More... | |
SIMD_API void | SimdSynetFusedLayerForward9 (const float *src0, const float *src1, const float *scale, const float *bias, size_t channels0, size_t channels1, size_t spatial, float *dst0, float *dst1, SimdTensorFormatType format) |
This function is used for forward propagation of FusedLayer (type 9). More... | |
SIMD_API void | SimdSynetInnerProductLayerForward (const float *src, const float *weight, const float *bias, size_t count, size_t size, float *dst) |
This function is used for forward propagation of InnerProductLayer. More... | |
SIMD_API void | SimdSynetLrnLayerCrossChannels (const float *src, size_t half, size_t channels, size_t spatial, const float *k, float *dst, SimdTensorFormatType format) |
This function is used for forward propagation of LrnLayer (cross channels normalization). More... | |
SIMD_API void * | SimdSynetMergedConvolution32fInit (size_t batch, const SimdConvolutionParameters *convs, size_t count, SimdBool add) |
Initializes FP32 merged convolution algorithm. More... | |
SIMD_API size_t | SimdSynetMergedConvolution32fExternalBufferSize (const void *context) |
Gets size of external temporary buffer required for FP32 merged convolution algorithm. More... | |
SIMD_API size_t | SimdSynetMergedConvolution32fInternalBufferSize (const void *context) |
Gets size of internal buffer used inside FP32 merged convolution algorithm. More... | |
SIMD_API void | SimdSynetMergedConvolution32fSetParams (void *context, const float *const *weight, SimdBool *internal, const float *const *bias, const float *const *params) |
Sets weights, biases and parameters of activation function required for FP32 merged convolution algorithm. More... | |
SIMD_API void | SimdSynetMergedConvolution32fForward (void *context, const float *src, float *buf, float *dst) |
Performs forward propagation of FP32 merged convolution algorithm. More... | |
SIMD_API void | SimdSynetPoolingForwardAverage (const float *src, size_t srcC, size_t srcH, size_t srcW, size_t kernelY, size_t kernelX, size_t strideY, size_t strideX, size_t padY, size_t padX, float *dst, size_t dstH, size_t dstW, SimdBool excludePad, SimdTensorFormatType format) |
This function is used for forward propagation of PoolingLayer (AveragePooling). More... | |
SIMD_API void | SimdSynetPoolingForwardMax (const float *src, size_t srcC, size_t srcH, size_t srcW, size_t kernelY, size_t kernelX, size_t strideY, size_t strideX, size_t padY, size_t padX, float *dst, size_t dstH, size_t dstW, SimdBool trans) |
This function is used for forward propagation of PoolingLayer (MaxPooling). More... | |
SIMD_API void | SimdSynetScaleLayerForward (const float *src, const float *scale, const float *bias, size_t channels, size_t spatial, float *dst, SimdTensorFormatType format) |
This function is used for forward propagation of ScaleLayer. More... | |
SIMD_API void | SimdSynetShuffleLayerForward (const float *src0, size_t srcC0, const float *src1, size_t srcC1, size_t spatial, float *dst0, float *dst1, size_t dstC, SimdTensorFormatType format) |
This function is used for forward propagation of ShuffleLayer. More... | |
SIMD_API void | SimdSynetSoftmaxLayerForward (const float *src, size_t outer, size_t count, size_t inner, float *dst) |
This function is used for forward propagation of SoftmaxLayer. More... | |
SIMD_API SimdTensorFormatType | SimdSynetSpecifyTensorFormat (SimdTensorFormatType format) |
Specifies hardware optimized tensor format of 5D-tensor for (input/output) image or 2D-convolution filter. More... | |
SIMD_API size_t | SimdSynetTensorAlignment (SimdTensorFormatType format) |
Gets alignment required for current tensor format. More... | |
SIMD_API void | SimdSynetUnaryOperation32fLayerForward (const float *src, size_t size, SimdSynetUnaryOperation32fType type, float *dst) |
This function is used for forward propagation of UnaryOperationLayer. More... | |
Detailed Description
Functions for accelerating neural network inference in Synet Framework.
Typedef Documentation
◆ SimdGemm32fNNPtr
typedef void(* SimdGemm32fNNPtr) (size_t M, size_t N, size_t K, const float *alpha, const float *A, size_t lda, const float *B, size_t ldb, const float *beta, float *C, size_t ldc) |
Callback function type SimdGemm32fNNPtr.
The function has to perform general matrix multiplication (for 32-bit float numbers).
C(M, N) = alpha*A(M, K)*B(K, N) + beta*C(M, N);
- Parameters
-
[in] M - a height of A and height of C matrices. [in] N - a width of B and width of C matrices. [in] K - a width of A and height of B matrices. [in] alpha - a pointer to multiplier of the first term. [in] A - a pointer to input A matrix. [in] lda - a leading dimension of A matrix. [in] B - a pointer to input B matrix. [in] ldb - a leading dimension of B matrix. [in] beta - a pointer to multiplier of the second term. [out] C - a pointer to output C matrix. [in] ldc - a leading dimension of C matrix.
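The contract above can be illustrated with a plain, non-optimized callback. This is only a sketch: the name ReferenceGemm32fNN is hypothetical, and it assumes row-major storage, so that element (i, j) of A is A[i*lda + j].

```cpp
#include <stddef.h>

// Reference callback matching the SimdGemm32fNNPtr signature.
// Assumption: all matrices are row-major; lda, ldb and ldc are row strides in elements.
void ReferenceGemm32fNN(size_t M, size_t N, size_t K, const float* alpha,
    const float* A, size_t lda, const float* B, size_t ldb,
    const float* beta, float* C, size_t ldc)
{
    for (size_t i = 0; i < M; ++i)
    {
        for (size_t j = 0; j < N; ++j)
        {
            // Dot product of row i of A and column j of B.
            float sum = 0.0f;
            for (size_t k = 0; k < K; ++k)
                sum += A[i * lda + k] * B[k * ldb + j];
            // C(M, N) = alpha*A(M, K)*B(K, N) + beta*C(M, N)
            C[i * ldc + j] = alpha[0] * sum + beta[0] * C[i * ldc + j];
        }
    }
}
```

A function of this shape could be passed as the gemm argument of SimdSynetDeconvolution32fInit; passing NULL instead is also allowed according to that function's parameter description.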
◆ SimdConvolutionParameters
typedef struct SimdConvolutionParameters SimdConvolutionParameters |
Describes convolution (deconvolution) parameters. It is used in SimdSynetConvolution32fInit, SimdSynetConvolution8iInit, SimdSynetDeconvolution32fInit and SimdSynetMergedConvolution32fInit.
Enumeration Type Documentation
◆ SimdConvolutionActivationType
Describes type of activation function. It is used in SimdSynetConvolution32fInit, SimdSynetConvolution8iInit, SimdSynetDeconvolution32fInit and SimdSynetMergedConvolution32fInit.
Enumerator | |
---|---|
SimdConvolutionActivationIdentity | Identity (activation function is absent). |
SimdConvolutionActivationRelu | ReLU activation function. dst[i] = Max(0, src[i]); |
SimdConvolutionActivationLeakyRelu | Leaky ReLU activation function. It has one parameter: slope (params[0]). dst[i] = src[i] > 0 ? src[i] : slope*src[i]; |
SimdConvolutionActivationRestrictRange | The activation function restricts range. It has two parameters: lower (params[0]) and upper (params[1]) bound. dst[i] = Min(Max(lower, src[i]), upper); |
SimdConvolutionActivationPrelu | PReLU (parametric ReLU) activation function. It has m parameters: slopes[m] (m = dstC, n = dstH*dstW). dst[i*n + j] = src[i*n + j] > 0 ? src[i*n + j] : slopes[i]*src[i*n + j]; |
SimdConvolutionActivationElu | ELU activation function. It has one parameter: alpha (params[0]). dst[i] = src[i] >= 0 ? src[i] : alpha*(Exp(src[i]) - 1); |
SimdConvolutionActivationHswish | H-Swish (https://arxiv.org/pdf/1905.02244.pdf) activation function. It has two parameters: shift (params[0]) and scale (params[1]). dst[i] = Max(Min(src[i], shift) + shift, 0)*scale*src[i]; |
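The formulas in the table can be written as scalar functions. The sketch below is for illustration only; the function names are hypothetical and are not part of the library. With shift = 3 and scale = 1/6 the last function reduces to the usual h-swish, x*ReLU6(x + 3)/6.

```cpp
#include <algorithm>
#include <cmath>

// Scalar versions of the activation formulas listed in the table above.
// All names are illustrative; each parameter corresponds to an entry of params[].
float ActRelu(float x) { return std::max(0.0f, x); }

float ActLeakyRelu(float x, float slope) { return x > 0.0f ? x : slope * x; }

float ActRestrictRange(float x, float lower, float upper)
{
    return std::min(std::max(lower, x), upper);
}

float ActElu(float x, float alpha)
{
    return x >= 0.0f ? x : alpha * (std::exp(x) - 1.0f);
}

float ActHswish(float x, float shift, float scale)
{
    return std::max(std::min(x, shift) + shift, 0.0f) * scale * x;
}
```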
◆ SimdSynetEltwiseOperationType
Describes operation type used in function SimdSynetEltwiseLayerForward.
Enumerator | |
---|---|
SimdSynetEltwiseOperationProduct | Product. |
SimdSynetEltwiseOperationSum | Weighted sum. |
SimdSynetEltwiseOperationMax | Maximum. |
SimdSynetEltwiseOperationMin | Minimum. |
◆ SimdSynetUnaryOperation32fType
Describes operation type used in function SimdSynetUnaryOperation32fLayerForward.
◆ SimdTensorFormatType
enum SimdTensorFormatType |
Describes Synet Framework 4D-tensor format type.
Enumerator | |
---|---|
SimdTensorFormatUnknown | Unknown tensor format. |
SimdTensorFormatNchw | NCHW (N - batch, C - channels, H - height, W - width) 4D-tensor format of (input/output) image. |
SimdTensorFormatNhwc | NHWC (N - batch, H - height, W - width, C - channels) 4D-tensor format of (input/output) image. |
SimdTensorFormatNchw4c | NCHW4c (N - batch, C - (channels + 3) / 4, H - height, W - width, 4c - channels grouped by 4) special 5D-tensor format of (input/output) image optimized for SSE and NEON. |
SimdTensorFormatNchw8c | NCHW8c (N - batch, C - (channels + 7) / 8, H - height, W - width, 8c - channels grouped by 8) special 5D-tensor format of (input/output) image optimized for AVX and AVX2. |
SimdTensorFormatNchw16c | NCHW16c (N - batch, C - (channels + 15) / 16, H - height, W - width, 16c - channels grouped by 16) special 5D-tensor format of (input/output) image optimized for AVX-512. |
SimdTensorFormatNchwXc | Unspecified hardware optimized 5D-tensor format of (input/output) image. Specific format (SimdTensorFormatNchw4c, SimdTensorFormatNchw8c or SimdTensorFormatNchw16c) is determined by function SimdSynetSpecifyTensorFormat. |
SimdTensorFormatOiyx | OIYX (O - output channels, I - input channels, Y - kernel height, X - kernel width) 4D-tensor format of 2D-convolution filter. |
SimdTensorFormatYxio | YXIO (Y - kernel height, X - kernel width, I - input channels, O - output channels) 4D-tensor format of 2D-convolution filter. |
SimdTensorFormatOyxi4o | OYXI4o (O - (output channels + 3)/4, Y - kernel height, X - kernel width, I - input channels, 4o - output channels grouped by 4) special 5D-tensor format of 2D-convolution filter optimized for SSE and NEON. |
SimdTensorFormatOyxi8o | OYXI8o (O - (output channels + 7)/8, Y - kernel height, X - kernel width, I - input channels, 8o - output channels grouped by 8) special 5D-tensor format of 2D-convolution filter optimized for AVX and AVX2. |
SimdTensorFormatOyxi16o | OYXI16o (O - (output channels + 15)/16, Y - kernel height, X - kernel width, I - input channels, 16o - output channels grouped by 16) special 5D-tensor format of 2D-convolution filter optimized for AVX-512. |
SimdTensorFormatOyxiXo | Unspecified hardware optimized 5D-tensor format of 2D-convolution filter. Specific format (SimdTensorFormatOyxi4o, SimdTensorFormatOyxi8o or SimdTensorFormatOyxi16o) is determined by function SimdSynetSpecifyTensorFormat. |
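To make the image layouts concrete, the sketch below computes the linear offset of element (c, h, w) within one image of a batch. It is an interpretation of the dimension order given in the table, with the last listed dimension varying fastest; the helper names are hypothetical and are not library functions.

```cpp
#include <stddef.h>

// Offset of (c, h, w) inside one image for NCHW: channels are the slowest dimension.
size_t OffsetNchw(size_t c, size_t h, size_t w, size_t H, size_t W)
{
    return (c * H + h) * W + w;
}

// Offset of (c, h, w) inside one image for NHWC: channels are the fastest dimension.
size_t OffsetNhwc(size_t c, size_t h, size_t w, size_t W, size_t C)
{
    return (h * W + w) * C + c;
}

// Offset for the blocked NCHWXc layouts (X = 4, 8 or 16).
// Channels are split into blocks of X; the index inside the block varies fastest.
size_t OffsetNchwXc(size_t c, size_t h, size_t w, size_t H, size_t W, size_t X)
{
    return (((c / X) * H + h) * W + w) * X + c % X;
}

// Number of floats in one image for a blocked layout: channels are rounded up to a multiple of X.
size_t SizeNchwXc(size_t C, size_t H, size_t W, size_t X)
{
    return ((C + X - 1) / X) * X * H * W;
}
```

For example, a 6-channel 2x3 image stored as NCHW4c occupies the space of 8 channels, because the channel count is rounded up to the block size.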
◆ SimdTensorDataType
enum SimdTensorDataType |
Describes Synet Framework tensor data type.
Function Documentation
◆ SimdSynetAddBias()
void SimdSynetAddBias | ( | const float * | bias, |
size_t | channels, | ||
size_t | spatial, | ||
float * | dst, | ||
SimdTensorFormatType | format | ||
) |
Adds a bias to the given image tensor.
Algorithm's details (example for NCHW tensor format):
for(c = 0; c < channels; ++c)
    for(s = 0; s < spatial; ++s)
        dst[c*spatial + s] += bias[c];
- Note
- This function is used in Synet Framework.
- Parameters
-
[in] bias - a pointer to the 32-bit float array with bias coefficients. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)). [in] channels - a number of channels in the image tensor. [in] spatial - a spatial size of image tensor. [in,out] dst - a pointer to cumulative 32-bit image tensor. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)) * spatial. [in] format - a format of image tensor.
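The documentation gives the algorithm only for the NCHW format. The sketch below repeats it and adds an NHWC variant for comparison; the NHWC indexing is an assumption based on the channel-last ordering of that format, and both function names are hypothetical.

```cpp
#include <stddef.h>

// NCHW: every channel is a contiguous plane of `spatial` values.
void AddBiasNchwRef(const float* bias, size_t channels, size_t spatial, float* dst)
{
    for (size_t c = 0; c < channels; ++c)
        for (size_t s = 0; s < spatial; ++s)
            dst[c * spatial + s] += bias[c];
}

// NHWC (assumed indexing): the channels of one spatial position are contiguous.
void AddBiasNhwcRef(const float* bias, size_t channels, size_t spatial, float* dst)
{
    for (size_t s = 0; s < spatial; ++s)
        for (size_t c = 0; c < channels; ++c)
            dst[s * channels + c] += bias[c];
}
```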
◆ SimdSynetDeconvolution32fInit()
void * SimdSynetDeconvolution32fInit | ( | size_t | batch, |
const SimdConvolutionParameters * | conv, | ||
SimdGemm32fNNPtr | gemm | ||
) |
Initializes FP32 deconvolution algorithm.
- Parameters
-
[in] batch - a batch size. [in] conv - a pointer to deconvolution parameters. [in] gemm - a pointer to external function of matrix multiplication. Can be NULL.
- Returns
- a pointer to FP32 deconvolution context. On error it returns NULL. It must be released with function SimdRelease. This pointer is used in functions SimdSynetDeconvolution32fExternalBufferSize, SimdSynetDeconvolution32fInternalBufferSize, SimdSynetDeconvolution32fSetParams and SimdSynetDeconvolution32fForward.
◆ SimdSynetDeconvolution32fExternalBufferSize()
size_t SimdSynetDeconvolution32fExternalBufferSize | ( | const void * | context | ) |
Gets size of external temporary buffer required for FP32 deconvolution algorithm.
- Parameters
-
[in] context - a pointer to FP32 deconvolution context. It must be created by function SimdSynetDeconvolution32fInit and released by function SimdRelease.
- Returns
- size of external temporary buffer required for FP32 deconvolution algorithm.
◆ SimdSynetDeconvolution32fInternalBufferSize()
size_t SimdSynetDeconvolution32fInternalBufferSize | ( | const void * | context | ) |
Gets size of internal buffer used inside FP32 deconvolution algorithm.
- Parameters
-
[in] context - a pointer to FP32 deconvolution context. It must be created by function SimdSynetDeconvolution32fInit and released by function SimdRelease.
- Returns
- size of internal buffer used inside FP32 deconvolution algorithm.
◆ SimdSynetDeconvolution32fSetParams()
void SimdSynetDeconvolution32fSetParams | ( | void * | context, |
const float * | weight, | ||
SimdBool * | internal, | ||
const float * | bias, | ||
const float * | params | ||
) |
Sets weights, biases and parameters of activation function required for FP32 deconvolution algorithm.
- Parameters
-
[in,out] context - a pointer to FP32 deconvolution context. It must be created by function SimdSynetDeconvolution32fInit and released by function SimdRelease. [in] weight - a pointer to deconvolution weights. [out] internal - a flag signaling that the weights are stored in the internal buffer. Can be NULL. [in] bias - a pointer to bias. Can be NULL. [in] params - a pointer to parameters of activation functions (see SimdConvolutionActivationType). Can be NULL.
◆ SimdSynetDeconvolution32fForward()
void SimdSynetDeconvolution32fForward | ( | void * | context, |
const float * | src, | ||
float * | buf, | ||
float * | dst | ||
) |
Performs forward propagation of FP32 deconvolution algorithm.
- Parameters
-
[in] context - a pointer to FP32 deconvolution context. It must be created by function SimdSynetDeconvolution32fInit and released by function SimdRelease. [in] src - a pointer to input tensor. [out] buf - a pointer to external temporary buffer. The size of the external temporary buffer is determined by function SimdSynetDeconvolution32fExternalBufferSize. Can be NULL (it causes usage of internal buffer). [out] dst - a pointer to output tensor.
◆ SimdSynetEltwiseLayerForward()
void SimdSynetEltwiseLayerForward | ( | float const *const * | src, |
const float * | weight, | ||
size_t | count, | ||
size_t | size, | ||
SimdSynetEltwiseOperationType | type, | ||
float * | dst | ||
) |
This function is used for forward propagation of EltwiseLayer.
Algorithm's details for SimdSynetEltwiseOperationProduct:
for(j = 0; j < size; ++j)
    dst[j] = 1;
for(i = 0; i < count; ++i)
    for(j = 0; j < size; ++j)
        dst[j] *= src[i][j];
Algorithm's details for SimdSynetEltwiseOperationSum:
for(j = 0; j < size; ++j)
    dst[j] = 0;
for(i = 0; i < count; ++i)
    for(j = 0; j < size; ++j)
        dst[j] += src[i][j]*weight[i];
Algorithm's details for SimdSynetEltwiseOperationMax:
for(j = 0; j < size; ++j)
    dst[j] = -FLT_MAX;
for(i = 0; i < count; ++i)
    for(j = 0; j < size; ++j)
        dst[j] = Max(dst[j], src[i][j]);
Algorithm's details for SimdSynetEltwiseOperationMin:
for(j = 0; j < size; ++j)
    dst[j] = FLT_MAX;
for(i = 0; i < count; ++i)
    for(j = 0; j < size; ++j)
        dst[j] = Min(dst[j], src[i][j]);
- Note
- This function is used in Synet Framework.
- Parameters
-
[in] src - a pointer to pointers to the input 32-bit float arrays. [in] weight - a pointer to the 32-bit float array with sum coefficients. It is needed only for the SimdSynetEltwiseOperationSum operation type; otherwise it can be NULL. [in] count - a count of input arrays. Must be at least 2. [in] size - a size of the input and output arrays. [in] type - a type of operation (see SimdSynetEltwiseOperationType). [out] dst - a pointer to the output 32-bit float array.
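The four algorithms differ only in the initial value and the combining step, so they can be folded into one reference routine. This is a sketch under assumptions: the enum EltwiseOp is a hypothetical stand-in for SimdSynetEltwiseOperationType, and the loops are interchanged relative to the pseudo-code (outer loop over elements), which does not change the result.

```cpp
#include <stddef.h>
#include <algorithm>
#include <cfloat>

// Hypothetical stand-in for SimdSynetEltwiseOperationType (illustrative names only).
enum EltwiseOp { EltwiseProduct, EltwiseSum, EltwiseMax, EltwiseMin };

void EltwiseForwardRef(float const* const* src, const float* weight,
    size_t count, size_t size, EltwiseOp type, float* dst)
{
    for (size_t j = 0; j < size; ++j)
    {
        // Neutral starting value of the selected operation.
        float acc = 0.0f;
        if (type == EltwiseProduct) acc = 1.0f;
        else if (type == EltwiseMax) acc = -FLT_MAX;
        else if (type == EltwiseMin) acc = FLT_MAX;

        for (size_t i = 0; i < count; ++i)
        {
            const float v = src[i][j];
            switch (type)
            {
            case EltwiseProduct: acc *= v; break;
            case EltwiseSum: acc += v * weight[i]; break; // weight is read only here
            case EltwiseMax: acc = std::max(acc, v); break;
            case EltwiseMin: acc = std::min(acc, v); break;
            }
        }
        dst[j] = acc;
    }
}
```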
◆ SimdSynetFusedLayerForward0()
void SimdSynetFusedLayerForward0 | ( | const float * | src, |
const float * | bias, | ||
const float * | scale, | ||
size_t | channels, | ||
size_t | spatial, | ||
float * | dst, | ||
SimdTensorFormatType | format | ||
) |
This function is used for forward propagation of FusedLayer (type 0).
Algorithm's details (example for NCHW tensor format):
for(c = 0; c < channels; ++c)
    for(s = 0; s < spatial; ++s)
    {
        o = c*spatial + s;
        x = src[o] + bias[c];
        dst[o] = (x - abs(x))*scale[c] + max(0, x);
    }
- Note
- This function is used in Synet Framework.
- Parameters
-
[in] src - a pointer to the 32-bit float array with input image tensor. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)) * spatial. [in] bias - a pointer to the 32-bit float array with bias coefficients. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)). [in] scale - a pointer to the 32-bit float array with scale coefficients. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)). [in] channels - a number of channels in the (input/output) image tensor. [in] spatial - a spatial size of (input/output) image tensor. [out] dst - a pointer to the 32-bit float array with output image tensor. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)) * spatial. [in] format - a format of (input/output) image tensor.
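A direct transcription of the NCHW pseudo-code may help when checking results. Note that for x >= 0 the term (x - abs(x)) vanishes and the output is x, while for x < 0 the output is 2*scale[c]*x. The function name below is hypothetical; this is a reference sketch, not the library's optimized code.

```cpp
#include <stddef.h>
#include <algorithm>
#include <cmath>

// Reference version of FusedLayer (type 0) for the NCHW format only.
void FusedLayerForward0NchwRef(const float* src, const float* bias, const float* scale,
    size_t channels, size_t spatial, float* dst)
{
    for (size_t c = 0; c < channels; ++c)
    {
        for (size_t s = 0; s < spatial; ++s)
        {
            const size_t o = c * spatial + s;
            const float x = src[o] + bias[c];
            // x >= 0: result is x; x < 0: result is 2*scale[c]*x.
            dst[o] = (x - std::fabs(x)) * scale[c] + std::max(0.0f, x);
        }
    }
}
```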
◆ SimdSynetFusedLayerForward1()
void SimdSynetFusedLayerForward1 | ( | const float * | src, |
const float * | bias0, | ||
const float * | scale1, | ||
const float * | bias1, | ||
size_t | channels, | ||
size_t | spatial, | ||
float * | dst, | ||
SimdTensorFormatType | format | ||
) |
This function is used for forward propagation of FusedLayer (type 1).
Algorithm's details (example for NCHW tensor format):
for(c = 0; c < channels; ++c)
    for(s = 0; s < spatial; ++s)
    {
        o = c*spatial + s;
        x = src[o] + bias0[c];
        dst[o] = max(0, -x)*scale1[c] + bias1[c] + max(0, x);
    }
- Note
- This function is used in Synet Framework.
- Parameters
-
[in] src - a pointer to the 32-bit float array with input image tensor. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)) * spatial. [in] bias0 - a pointer to the 32-bit float array with bias0 coefficients. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)). [in] scale1 - a pointer to the 32-bit float array with scale1 coefficients. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)). [in] bias1 - a pointer to the 32-bit float array with bias1 coefficients. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)). [in] channels - a number of channels in the (input/output) image tensor. [in] spatial - a spatial size of (input/output) image tensor. [out] dst - a pointer to the 32-bit float array with output image tensor. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)) * spatial. [in] format - a format of (input/output) image tensor.
◆ SimdSynetFusedLayerForward2()
void SimdSynetFusedLayerForward2 | ( | const float * | src, |
const float * | scale, | ||
const float * | bias, | ||
size_t | channels, | ||
size_t | spatial, | ||
const float * | slope, | ||
float * | dst, | ||
SimdTensorFormatType | format | ||
) |
This function is used for forward propagation of FusedLayer (type 2).
Algorithm's details (example for NCHW tensor format):
for(c = 0; c < channels; ++c)
    for(s = 0; s < spatial; ++s)
    {
        o = c*spatial + s;
        x = src[o]*scale[c] + bias[c];
        dst[o] = max(0, x) + min(0, x)*slope[0];
    }
- Note
- This function is used in Synet Framework.
- Parameters
-
[in] src - a pointer to the 32-bit float array with input image tensor. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)) * spatial. [in] scale - a pointer to the 32-bit float array with scale coefficients. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)). [in] bias - a pointer to the 32-bit float array with bias coefficients. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)). [in] channels - a number of channels in the (input/output) image tensor. [in] spatial - a spatial size of (input/output) image tensor. [in] slope - a pointer to the 32-bit float slope coefficient. [out] dst - a pointer to the 32-bit float array with output image tensor. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)) * spatial. [in] format - a format of (input/output) image tensor.
◆ SimdSynetFusedLayerForward3()
void SimdSynetFusedLayerForward3 | ( | const float * | src, |
const float * | scale, | ||
const float * | bias, | ||
size_t | channels, | ||
size_t | spatial, | ||
float * | dst, | ||
SimdTensorFormatType | format | ||
) |
This function is used for forward propagation of FusedLayer (type 3).
Algorithm's details (example for NCHW tensor format):
for(c = 0; c < channels; ++c)
    for(s = 0; s < spatial; ++s)
    {
        o = c*spatial + s;
        x = src[o] + bias[c];
        dst[o] = max(0, x) + min(0, x)*scale[c];
    }
- Note
- This function is used in Synet Framework.
- Parameters
-
[in] src - a pointer to the 32-bit float array with input image tensor. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)) * spatial. [in] bias - a pointer to the 32-bit float array with bias coefficients. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)). [in] scale - a pointer to the 32-bit float array with scale coefficients. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)). [in] channels - a number of channels in the (input/output) image tensor. [in] spatial - a spatial size of (input/output) image tensor. [out] dst - a pointer to the 32-bit float array with output image tensor. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)) * spatial. [in] format - a format of (input/output) image tensor.
◆ SimdSynetFusedLayerForward4()
void SimdSynetFusedLayerForward4 | ( | const float * | src, |
const float * | bias0, | ||
const float * | scale1, | ||
const float * | bias1, | ||
size_t | channels, | ||
size_t | spatial, | ||
float * | dst, | ||
SimdTensorFormatType | format | ||
) |
This function is used for forward propagation of FusedLayer (type 4).
Algorithm's details (example for NCHW tensor format):
for(c = 0; c < channels; ++c)
    for(s = 0; s < spatial; ++s)
    {
        x = src[c*spatial + s] + bias0[c];
        dst[c*spatial + s] = std::max((T)0, x);
        dst[(c + channels)*spatial + s] = std::max((T)0, x*scale1[0] + bias1[0]);
    }
- Note
- This function is used in Synet Framework.
- Parameters
-
[in] src - a pointer to the 32-bit float array with input image tensor. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)) * spatial. [in] bias0 - a pointer to the 32-bit float array with bias0 coefficients. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)). [in] scale1 - a pointer to the 32-bit float array with scale1 coefficients. The size of the array is 1. [in] bias1 - a pointer to the 32-bit float array with bias1 coefficients. The size of the array is 1. [in] channels - a number of channels in the input image tensor. Output image tensor has 2 * channels. [in] spatial - a spatial size of (input/output) image tensor. [out] dst - a pointer to the 32-bit float array with output image tensor. The size of the array is SimdAlign (2 * channels, SimdSynetTensorAlignment (format)) * spatial. [in] format - a format of (input/output) image tensor.
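Because the output of this layer has twice as many channels as the input, a small reference version makes the buffer layout easier to follow. The sketch below covers only the NCHW format and uses a hypothetical function name; the caller is assumed to provide dst with room for 2*channels*spatial floats.

```cpp
#include <stddef.h>
#include <algorithm>

// Reference version of FusedLayer (type 4) for the NCHW format only.
// dst must hold 2*channels*spatial floats: the first half receives ReLU(x),
// the second half receives ReLU(x*scale1[0] + bias1[0]).
void FusedLayerForward4NchwRef(const float* src, const float* bias0,
    const float* scale1, const float* bias1,
    size_t channels, size_t spatial, float* dst)
{
    for (size_t c = 0; c < channels; ++c)
    {
        for (size_t s = 0; s < spatial; ++s)
        {
            const float x = src[c * spatial + s] + bias0[c];
            dst[c * spatial + s] = std::max(0.0f, x);
            dst[(c + channels) * spatial + s] = std::max(0.0f, x * scale1[0] + bias1[0]);
        }
    }
}
```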
◆ SimdSynetFusedLayerForward8()
void SimdSynetFusedLayerForward8 | ( | const float * | src0, |
const float * | src1, | ||
const float * | src2, | ||
size_t | channels, | ||
size_t | spatial, | ||
float * | dst, | ||
SimdTensorFormatType | format | ||
) |
This function is used for forward propagation of FusedLayer (type 8).
Algorithm's details (example for NCHW tensor format):
for(c = 0; c < channels; ++c)
    for(s = 0; s < spatial; ++s)
    {
        o = c*spatial + s;
        dst[o] = src0[o] + src1[o]*src2[c];
    }
- Note
- This function is used in Synet Framework.
- Parameters
-
[in] src0 - a pointer to the first input 32-bit float array. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)) * spatial. [in] src1 - a pointer to the second input 32-bit float array. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)) * spatial. [in] src2 - a pointer to the third input 32-bit float array. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)). [in] channels - a number of channels in the (input/output) image tensor. [in] spatial - a spatial size of (input/output) image tensor. [out] dst - a pointer to the output 32-bit float array. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)) * spatial. [in] format - a format of (input/output) image tensor.
◆ SimdSynetFusedLayerForward9()
void SimdSynetFusedLayerForward9 | ( | const float * | src0, |
const float * | src1, | ||
const float * | scale, | ||
const float * | bias, | ||
size_t | channels0, | ||
size_t | channels1, | ||
size_t | spatial, | ||
float * | dst0, | ||
float * | dst1, | ||
SimdTensorFormatType | format | ||
) |
This function is used for forward propagation of FusedLayer (type 9).
Algorithm's details (example for NCHW tensor format):
for(c = 0; c < channels0; ++c)
    for(s = 0; s < spatial; ++s)
    {
        dst0[c*spatial + s] = max(0, src0[c*spatial + s]*scale[c] + bias[c]);
        if(dst1)
            dst1[c*spatial + s] = src0[c*spatial + s];
    }
for(c = 0; c < channels1; ++c)
    for(s = 0; s < spatial; ++s)
    {
        dst0[(c + channels0)*spatial + s] = max(0, src1[c*spatial + s]*scale[channels0 + c] + bias[channels0 + c]);
        if(dst1)
            dst1[(c + channels0)*spatial + s] = src1[c*spatial + s];
    }
- Note
- This function is used in Synet Framework.
- Parameters
-
[in] src0 - a pointer to the first input 32-bit float array. The size of the array is SimdAlign (channels0, SimdSynetTensorAlignment (format)) * spatial. [in] src1 - a pointer to the second input 32-bit float array. The size of the array is SimdAlign (channels1, SimdSynetTensorAlignment (format)) * spatial. [in] scale - a pointer to the 32-bit float array with scale coefficients. The size of the array is SimdAlign (channels0 + channels1, SimdSynetTensorAlignment (format)). [in] bias - a pointer to the 32-bit float array with bias coefficients. The size of the array is SimdAlign (channels0 + channels1, SimdSynetTensorAlignment (format)). [in] channels0 - a number of channels in the first input image tensor. [in] channels1 - a number of channels in the second input image tensor. [in] spatial - a spatial size of (input/output) image tensor. [out] dst0 - a pointer to the first output 32-bit float array. The size of the array is SimdAlign (channels0 + channels1, SimdSynetTensorAlignment (format)) * spatial. [out] dst1 - a pointer to the second output 32-bit float array. The size of the array is SimdAlign (channels0 + channels1, SimdSynetTensorAlignment (format)) * spatial. The pointer can be NULL. [in] format - a format of (input/output) image tensor.
◆ SimdSynetInnerProductLayerForward()
void SimdSynetInnerProductLayerForward | ( | const float * | src, |
const float * | weight, | ||
const float * | bias, | ||
size_t | count, | ||
size_t | size, | ||
float * | dst | ||
) |
This function is used for forward propagation of InnerProductLayer.
Algorithm's details:
for(i = 0; i < count; ++i)
{
    dst[i] = (bias ? bias[i] : 0);
    for(j = 0; j < size; ++j)
        dst[i] += src[j]*weight[i*size + j];
}
- Note
- This function is used in Synet Framework.
- Parameters
-
[in] src - a pointer to the input 32-bit float array. The size of the array must be equal to size. [in] weight - a pointer to the 32-bit float array with weight coefficients. The size of the array must be equal to count*size. [in] bias - a pointer to the 32-bit float array with bias coefficients. The size of the array must be equal to count. Can be NULL. [in] count - a size of output array. [in] size - a size of input array. [out] dst - a pointer to the output 32-bit float array. The size of the array must be equal to count.
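The inner product is a matrix-vector multiplication with a row-major weight matrix of count rows and size columns. A minimal reference sketch, with a hypothetical function name, could look like this:

```cpp
#include <stddef.h>

// Reference inner product: dst = weight * src (+ bias), weight is count x size, row-major.
void InnerProductForwardRef(const float* src, const float* weight, const float* bias,
    size_t count, size_t size, float* dst)
{
    for (size_t i = 0; i < count; ++i)
    {
        float sum = bias ? bias[i] : 0.0f; // bias may be NULL
        for (size_t j = 0; j < size; ++j)
            sum += src[j] * weight[i * size + j];
        dst[i] = sum;
    }
}
```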
◆ SimdSynetLrnLayerCrossChannels()
void SimdSynetLrnLayerCrossChannels | ( | const float * | src, |
size_t | half, | ||
size_t | channels, | ||
size_t | spatial, | ||
const float * | k, | ||
float * | dst, | ||
SimdTensorFormatType | format | ||
) |
This function is used for forward propagation of LrnLayer (cross channels normalization).
Algorithm's details (example for NCHW tensor format):
for(c = 0; c < channels; ++c)
    for(s = 0; s < spatial; ++s)
    {
        lo = Max(0, c - half);
        hi = Min(channels, c + half + 1);
        sum = 0;
        for(i = lo; i < hi; ++i)
            sum += Square(src[i*spatial + s]);
        dst[c*spatial + s] = src[c*spatial + s]*Pow(k[0] + sum*k[1], k[2]);
    }
- Note
- This function is used in Synet Framework.
- Parameters
-
[in] src - a pointer to the 32-bit float array with input image tensor. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)) * spatial. [in] half - a local normalization half size. [in] channels - a number of channels in the (input/output) image tensor. [in] spatial - a spatial size of (input/output) image tensor. [in] k - a pointer to the 32-bit float array with 3 coefficients (see algorithm details). [out] dst - a pointer to the 32-bit float array with output image tensor. The size of the array is SimdAlign (channels, SimdSynetTensorAlignment (format)) * spatial. [in] format - a format of (input/output) image tensor.
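The cross-channel window [c - half, c + half] is clipped at the tensor borders. The sketch below transcribes the NCHW pseudo-code into C++; the function name is hypothetical, and the lower bound is computed without subtraction underflow because the indices are unsigned.

```cpp
#include <stddef.h>
#include <algorithm>
#include <cmath>

// Reference cross-channel LRN for the NCHW format only.
// k[0], k[1], k[2] are used exactly as in the pseudo-code: src * pow(k[0] + sum*k[1], k[2]).
void LrnCrossChannelsNchwRef(const float* src, size_t half, size_t channels,
    size_t spatial, const float* k, float* dst)
{
    for (size_t c = 0; c < channels; ++c)
    {
        // Clipped window of channels [lo, hi) around channel c.
        const size_t lo = c > half ? c - half : 0;
        const size_t hi = std::min(channels, c + half + 1);
        for (size_t s = 0; s < spatial; ++s)
        {
            float sum = 0.0f;
            for (size_t i = lo; i < hi; ++i)
                sum += src[i * spatial + s] * src[i * spatial + s];
            dst[c * spatial + s] = src[c * spatial + s] * std::pow(k[0] + sum * k[1], k[2]);
        }
    }
}
```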
◆ SimdSynetMergedConvolution32fInit()
void * SimdSynetMergedConvolution32fInit | ( | size_t | batch, |
const SimdConvolutionParameters * | convs, | ||
size_t | count, | ||
SimdBool | add | ||
) |
Initializes FP32 merged convolution algorithm.
- Parameters
-
[in] batch - a batch size.
[in] convs - an array with convolutions parameters.
[in] count - a number of merged convolutions.
[in] add - a flag that signals whether the output must be added to the source value.
- Returns
- a pointer to FP32 merged convolution context. On error it returns NULL. It must be released with function SimdRelease. This pointer is used in functions SimdSynetMergedConvolution32fExternalBufferSize, SimdSynetMergedConvolution32fInternalBufferSize, SimdSynetMergedConvolution32fSetParams and SimdSynetMergedConvolution32fForward.
◆ SimdSynetMergedConvolution32fExternalBufferSize()
size_t SimdSynetMergedConvolution32fExternalBufferSize | ( | const void * | context | ) |
Gets size of external temporary buffer required for FP32 merged convolution algorithm.
- Parameters
-
[in] context - a pointer to FP32 merged convolution context. It must be created by function SimdSynetMergedConvolution32fInit and released by function SimdRelease.
- Returns
- size of external temporary buffer required for FP32 merged convolution algorithm.
◆ SimdSynetMergedConvolution32fInternalBufferSize()
size_t SimdSynetMergedConvolution32fInternalBufferSize | ( | const void * | context | ) |
Gets size of internal buffer used inside FP32 merged convolution algorithm.
- Parameters
-
[in] context - a pointer to FP32 merged convolution context. It must be created by function SimdSynetMergedConvolution32fInit and released by function SimdRelease.
- Returns
- size of internal buffer used inside FP32 merged convolution algorithm.
◆ SimdSynetMergedConvolution32fSetParams()
void SimdSynetMergedConvolution32fSetParams | ( | void * | context, |
const float *const * | weight, | ||
SimdBool * | internal, | ||
const float *const * | bias, | ||
const float *const * | params | ||
) |
Sets weights, biases and parameters of activation function required for FP32 merged convolution algorithm.
- Parameters
-
[in,out] context - a pointer to FP32 merged convolution context. It must be created by function SimdSynetMergedConvolution32fInit and released by function SimdRelease.
[in] weight - a pointer to the array with pointers to convolution weights. The array size is determined by number of merged convolutions.
[out] internal - a pointer to the array of flags signaling that weights are stored in the internal buffer. The array size is determined by number of merged convolutions. Can be NULL.
[in] bias - a pointer to the array with pointers to bias. The array size is determined by number of merged convolutions. Can be NULL.
[in] params - a pointer to the array with pointers to parameters of the activation functions (see SimdConvolutionActivationType). The array size is determined by number of merged convolutions. Can be NULL.
◆ SimdSynetMergedConvolution32fForward()
void SimdSynetMergedConvolution32fForward | ( | void * | context, |
const float * | src, | ||
float * | buf, | ||
float * | dst | ||
) |
Performs forward propagation of FP32 merged convolution algorithm.
- Parameters
-
[in] context - a pointer to FP32 merged convolution context. It must be created by function SimdSynetMergedConvolution32fInit and released by function SimdRelease.
[in] src - a pointer to input image.
[out] buf - a pointer to external temporary buffer. The size of the external temporary buffer is determined by function SimdSynetMergedConvolution32fExternalBufferSize. Can be NULL (it causes usage of internal buffer).
[out] dst - a pointer to output image.
◆ SimdSynetPoolingForwardAverage()
void SimdSynetPoolingForwardAverage | ( | const float * | src, |
size_t | srcC, | ||
size_t | srcH, | ||
size_t | srcW, | ||
size_t | kernelY, | ||
size_t | kernelX, | ||
size_t | strideY, | ||
size_t | strideX, | ||
size_t | padY, | ||
size_t | padX, | ||
float * | dst, | ||
size_t | dstH, | ||
size_t | dstW, | ||
SimdBool | excludePad, | ||
SimdTensorFormatType | format | ||
) |
This function is used for forward propagation of PoolingLayer (AveragePooling).
- Note
- This function is used in Synet Framework.
- Parameters
-
[in] src - a pointer to the input 32-bit float array. The size of the array must be equal to srcC*srcH*srcW.
[in] srcC - a number of input and output channels.
[in] srcH - an input height.
[in] srcW - an input width.
[in] kernelY - a height of the pooling kernel.
[in] kernelX - a width of the pooling kernel.
[in] strideY - a y-stride of the pooling.
[in] strideX - an x-stride of the pooling.
[in] padY - a pad to the top of the input image.
[in] padX - a pad to the left of the input image.
[out] dst - a pointer to the output 32-bit float array. The size of the array must be equal to srcC*dstH*dstW.
[in] dstH - an output height.
[in] dstW - an output width.
[in] excludePad - a flag to exclude padding from the average value calculation.
[in] format - a format of (input/output) image tensor.
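The effect of excludePad is easiest to see in a scalar version. The sketch below is an illustration for the NCHW layout only, not the library code. It assumes that with excludePad set the sum is divided by the number of elements that fall inside the input, and otherwise by kernelY*kernelX; this divisor rule is an assumption and is not stated in the description above. AveragePoolingNchwRef is a hypothetical name, and an int replaces SimdBool to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar NCHW average pooling; AveragePoolingNchwRef is a hypothetical name.
   Assumption: excludePad != 0 divides by the count of in-bounds elements,
   excludePad == 0 divides by kernelY*kernelX. */
void AveragePoolingNchwRef(const float *src, size_t srcC, size_t srcH, size_t srcW,
                           size_t kernelY, size_t kernelX, size_t strideY, size_t strideX,
                           size_t padY, size_t padX, float *dst, size_t dstH, size_t dstW,
                           int excludePad)
{
    assert(src && dst && kernelY > 0 && kernelX > 0);
    for (size_t c = 0; c < srcC; ++c)
        for (size_t dy = 0; dy < dstH; ++dy)
            for (size_t dx = 0; dx < dstW; ++dx)
            {
                float sum = 0.0f;
                size_t valid = 0;
                for (size_t ky = 0; ky < kernelY; ++ky)
                    for (size_t kx = 0; kx < kernelX; ++kx)
                    {
                        /* Coordinates in the padded image; compare before
                           subtracting the pad to avoid unsigned wrap-around. */
                        size_t y = dy * strideY + ky, x = dx * strideX + kx;
                        if (y < padY || x < padX || y - padY >= srcH || x - padX >= srcW)
                            continue;
                        sum += src[(c * srcH + (y - padY)) * srcW + (x - padX)];
                        ++valid;
                    }
                size_t area = excludePad ? valid : kernelY * kernelX;
                dst[(c * dstH + dy) * dstW + dx] = area ? sum / (float)area : 0.0f;
            }
}
```

With a 2x2 input, a 2x2 kernel, stride 2 and pad 1, every window covers exactly one real pixel: under this assumption the output equals the input when excludePad is set and a quarter of it otherwise.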
◆ SimdSynetPoolingForwardMax()
void SimdSynetPoolingForwardMax | ( | const float * | src, |
size_t | srcC, | ||
size_t | srcH, | ||
size_t | srcW, | ||
size_t | kernelY, | ||
size_t | kernelX, | ||
size_t | strideY, | ||
size_t | strideX, | ||
size_t | padY, | ||
size_t | padX, | ||
float * | dst, | ||
size_t | dstH, | ||
size_t | dstW, | ||
SimdBool | trans | ||
) |
This function is used for forward propagation of PoolingLayer (MaxPooling).
- Note
- This function is used in Synet Framework.
- Parameters
-
[in] src - a pointer to the input 32-bit float array. The size of the array must be equal to srcC*srcH*srcW.
[in] srcC - a number of input and output channels.
[in] srcH - an input height.
[in] srcW - an input width.
[in] kernelY - a height of the pooling kernel.
[in] kernelX - a width of the pooling kernel.
[in] strideY - a y-stride of the pooling.
[in] strideX - an x-stride of the pooling.
[in] padY - a pad to the top of the input image.
[in] padX - a pad to the left of the input image.
[out] dst - a pointer to the output 32-bit float array. The size of the array must be equal to srcC*dstH*dstW.
[in] dstH - an output height.
[in] dstW - an output width.
[in] trans - a flag of transposed input and output data (SimdFalse - NCHW order, SimdTrue - NHWC order).
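The trans flag only changes how elements are addressed. The following scalar sketch shows both orders side by side; it is an illustration, not the library code. MaxPoolingRef is a hypothetical name, an int replaces SimdBool, and positions that fall into the padding are assumed to be skipped.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* Scalar max pooling; MaxPoolingRef is a hypothetical name.
   trans == 0: NCHW order, trans != 0: NHWC order. */
void MaxPoolingRef(const float *src, size_t srcC, size_t srcH, size_t srcW,
                   size_t kernelY, size_t kernelX, size_t strideY, size_t strideX,
                   size_t padY, size_t padX, float *dst, size_t dstH, size_t dstW, int trans)
{
    assert(src && dst);
    for (size_t c = 0; c < srcC; ++c)
        for (size_t dy = 0; dy < dstH; ++dy)
            for (size_t dx = 0; dx < dstW; ++dx)
            {
                float best = -FLT_MAX; /* assumption: padded positions never win */
                for (size_t ky = 0; ky < kernelY; ++ky)
                    for (size_t kx = 0; kx < kernelX; ++kx)
                    {
                        size_t y = dy * strideY + ky, x = dx * strideX + kx;
                        if (y < padY || x < padX || y - padY >= srcH || x - padX >= srcW)
                            continue; /* position lies in the padding */
                        size_t sy = y - padY, sx = x - padX;
                        /* NHWC stores the channel index innermost, NCHW outermost. */
                        float v = trans ? src[(sy * srcW + sx) * srcC + c]
                                        : src[(c * srcH + sy) * srcW + sx];
                        if (v > best)
                            best = v;
                    }
                if (trans)
                    dst[(dy * dstW + dx) * srcC + c] = best;
                else
                    dst[(c * dstH + dy) * dstW + dx] = best;
            }
}
```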
◆ SimdSynetScaleLayerForward()
void SimdSynetScaleLayerForward | ( | const float * | src, |
const float * | scale, | ||
const float * | bias, | ||
size_t | channels, | ||
size_t | spatial, | ||
float * | dst, | ||
SimdTensorFormatType | format | ||
) |
This function is used for forward propagation of ScaleLayer.
Algorithm's details (example for NCHW tensor format):
for(c = 0; c < channels; ++c)
    for(s = 0; s < spatial; ++s)
        dst[c*spatial + s] = src[c*spatial + s]*scale[c] + (bias ? bias[c] : 0);
- Note
- This function is used in Synet Framework.
- Parameters
-
[in] src - a pointer to the 32-bit float array with input image tensor. The size of the array is SimdAlign(channels, SimdSynetTensorAlignment(format)) * spatial.
[in] scale - a pointer to the 32-bit float array with scale coefficients. The size of the array is SimdAlign(channels, SimdSynetTensorAlignment(format)).
[in] bias - a pointer to the 32-bit float array with bias coefficients. The size of the array is SimdAlign(channels, SimdSynetTensorAlignment(format)). Can be NULL.
[in] channels - a number of channels in the (input/output) image tensor.
[in] spatial - a spatial size of (input/output) image tensor.
[out] dst - a pointer to the 32-bit float array with output image tensor. The size of the array is SimdAlign(channels, SimdSynetTensorAlignment(format)) * spatial.
[in] format - a format of (input/output) image tensor.
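The algorithm details cover only the NCHW format. For NHWC the same per-channel scale and bias apply, but the channel index becomes the innermost one. The sketch below shows both layouts in scalar form; ScaleLayerRef and its nhwc flag are hypothetical stand-ins for the function and its format parameter, and no channel padding is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar scale layer for two layouts; ScaleLayerRef is a hypothetical name and
   the nhwc flag stands in for the format parameter. */
void ScaleLayerRef(const float *src, const float *scale, const float *bias,
                   size_t channels, size_t spatial, float *dst, int nhwc)
{
    assert(src && scale && dst);
    for (size_t c = 0; c < channels; ++c)
        for (size_t s = 0; s < spatial; ++s)
        {
            /* NCHW keeps each channel contiguous; NHWC interleaves channels per pixel. */
            size_t i = nhwc ? s * channels + c : c * spatial + s;
            dst[i] = src[i] * scale[c] + (bias ? bias[c] : 0.0f); /* bias may be NULL */
        }
}
```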
◆ SimdSynetShuffleLayerForward()
void SimdSynetShuffleLayerForward | ( | const float * | src0, |
size_t | srcC0, | ||
const float * | src1, | ||
size_t | srcC1, | ||
size_t | spatial, | ||
float * | dst0, | ||
float * | dst1, | ||
size_t | dstC, | ||
SimdTensorFormatType | format | ||
) |
This function is used for forward propagation of ShuffleLayer.
- Note
- This function is used in Synet Framework.
- Parameters
-
[in] src0 - a pointer to the 32-bit float array with the first input image tensor.
[in] srcC0 - a number of channels in the first input image tensor. It must be an even number.
[in] src1 - a pointer to the 32-bit float array with the second input image tensor.
[in] srcC1 - a number of channels in the second input image tensor. It must be an even number.
[in] spatial - a spatial size of (input/output) image tensors.
[out] dst0 - a pointer to the 32-bit float array with the first output image tensor.
[out] dst1 - a pointer to the 32-bit float array with the second output image tensor.
[in] dstC - a number of channels in the first and the second output image tensors.
[in] format - a format of (input/output) image tensors.
◆ SimdSynetSoftmaxLayerForward()
void SimdSynetSoftmaxLayerForward | ( | const float * | src, |
size_t | outer, | ||
size_t | count, | ||
size_t | inner, | ||
float * | dst | ||
) |
This function is used for forward propagation of SoftmaxLayer.
- Note
- This function is used in Synet Framework.
- Parameters
-
[in] src - a pointer to the input 32-bit float array. The size of the array must be equal to outer*count*inner.
[in] outer - an outer size of input and output arrays.
[in] count - a size of softmax dimension.
[in] inner - an inner size of input and output arrays.
[out] dst - a pointer to the output 32-bit float array. The size of the array must be equal to outer*count*inner.
◆ SimdSynetSpecifyTensorFormat()
SimdTensorFormatType SimdSynetSpecifyTensorFormat | ( | SimdTensorFormatType | format | ) |
Specifies hardware optimized tensor format of 5D-tensor for (input/output) image or 2D-convolution filter.
- Note
- This function is used in Synet Framework.
- Parameters
-
[in] format - an unspecified hardware optimized 5D-tensor format of (input/output) image or 2D-convolution filter. It can be SimdTensorFormatNchwXc or SimdTensorFormatOyxiXo.
- Returns
- specified hardware optimized 5D-tensor format.
◆ SimdSynetTensorAlignment()
size_t SimdSynetTensorAlignment | ( | SimdTensorFormatType | format | ) |
Gets alignment required for current tensor format.
- Note
- This function is used in Synet Framework.
- Parameters
-
[in] format - a tensor format.
- Returns
- alignment required for current tensor format.
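Several functions in this group size their arrays as SimdAlign(channels, SimdSynetTensorAlignment(format)) * spatial. The sketch below shows how such a size could be computed, assuming that SimdAlign rounds its first argument up to the nearest multiple of the second; AlignUp and AlignedTensorSize are hypothetical helpers, not library functions.

```c
#include <assert.h>
#include <stddef.h>

/* Rounds size up to the nearest multiple of align.
   AlignUp is a hypothetical stand-in for SimdAlign (assumed round-up behavior). */
size_t AlignUp(size_t size, size_t align)
{
    assert(align > 0);
    return (size + align - 1) / align * align;
}

/* Number of floats in a tensor whose channel count is padded to the alignment.
   The alignment argument plays the role of SimdSynetTensorAlignment(format). */
size_t AlignedTensorSize(size_t channels, size_t spatial, size_t alignment)
{
    return AlignUp(channels, alignment) * spatial;
}
```

Under this assumption a tensor with 3 channels, 10 spatial elements and an alignment of 8 occupies 80 floats, while an alignment of 1 gives the unpadded 30.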
◆ SimdSynetUnaryOperation32fLayerForward()
void SimdSynetUnaryOperation32fLayerForward | ( | const float * | src, |
size_t | size, | ||
SimdSynetUnaryOperation32fType | type, | ||
float * | dst | ||
) |
This function is used for forward propagation of UnaryOperationLayer.
- Note
- This function is used in Synet Framework.
- Parameters
-
[in] src - a pointer to the input 32-bit float array.
[in] size - a size of the input and output arrays.
[in] type - a unary operation type (see SimdSynetUnaryOperation32fType).
[out] dst - a pointer to the output 32-bit float array.