2020 | 2019 | 2018 | 2017 | 2016 | 2015 | 2014 | 2013
April 1, 2020 (version 4.6.88)
Algorithms
New features
- AVX-512VNNI extension support.
- AVX2, AVX-512BW, AVX-512VNNI and NEON optimizations of SynetConvolution8iNhwcDirect class.
- Base implementation and SSE4.1, AVX2, AVX-512BW and NEON optimizations of function SynetPoolingForwardMax8u.
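For orientation, the kind of operation an 8-bit max pooling function performs can be sketched in scalar C++. This is only an illustrative reference under stated assumptions: the name MaxPool8uNhwc, its parameters, the single-image NHWC layout and the absence of padding are all hypothetical and do not reproduce the actual signature of SynetPoolingForwardMax8u.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar reference (not the library API):
// 2D max pooling over 8-bit unsigned data, one image in NHWC layout, no padding.
std::vector<uint8_t> MaxPool8uNhwc(const std::vector<uint8_t>& src,
    size_t srcH, size_t srcW, size_t channels, size_t kernel, size_t stride)
{
    assert(kernel > 0 && stride > 0 && srcH >= kernel && srcW >= kernel);
    assert(src.size() == srcH * srcW * channels);
    const size_t dstH = (srcH - kernel) / stride + 1;
    const size_t dstW = (srcW - kernel) / stride + 1;
    // 0 is the smallest uint8_t value, so it is a valid identity for max.
    std::vector<uint8_t> dst(dstH * dstW * channels, 0);
    for (size_t dy = 0; dy < dstH; ++dy)
        for (size_t dx = 0; dx < dstW; ++dx)
            for (size_t ky = 0; ky < kernel; ++ky)
                for (size_t kx = 0; kx < kernel; ++kx)
                {
                    // In NHWC the channels of one pixel are contiguous in memory.
                    const uint8_t* s = src.data() +
                        ((dy * stride + ky) * srcW + dx * stride + kx) * channels;
                    uint8_t* d = dst.data() + (dy * dstW + dx) * channels;
                    for (size_t c = 0; c < channels; ++c)
                        d[c] = std::max(d[c], s[c]);
                }
    return dst;
}
```

The innermost loop runs over contiguous channel values, which is the part a SIMD implementation would typically process with byte-wise vector max instructions.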
Renaming
- SynetPoolingForwardMax to SynetPoolingForwardMax32f.
Improving
- SSE4.1 optimization of SynetConvolution8iNhwcDirect class.
Bug fixing
- Microsoft Visual Studio 2015 compiler error in function SynetConvert32fTo8u.
- Degradation of performance of AVX2 code.
- Microsoft Visual Studio compiler error in function Extract64i (32-bit mode).
Test framework
New features
- Tests for verifying functionality of function SynetPoolingForwardMax8u.
March 2, 2020 (version 4.5.87)
Algorithms
New features
- Parameter for bitwise compatibility with Inference Engine in function SynetScaleLayerForward.
- Parameter 'type' in function SynetShuffleLayerForward.
- Base implementation, SSE2, AVX2, AVX-512BW and NEON optimizations of function SynetConvert32fTo8u.
- SimdSynetCompatibilityType enumeration.
- Base implementation of SynetConvolution8iGemmNN class.
- Base implementation and SSE4.1 optimization of SynetConvolution8iNhwcDirect class.
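As a rough illustration of what a 32-bit float to 8-bit unsigned conversion involves, the following scalar C++ sketch applies an affine transform, rounds, and saturates to [0, 255]. The function name Convert32fTo8uRef and the single scale and shift parameters are assumptions made for this example; the real SynetConvert32fTo8u signature and its parameterization are not reproduced here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar reference (not the library API):
// dst[i] = saturate_to_u8(round(src[i] * scale + shift)).
void Convert32fTo8uRef(const float* src, size_t size, float scale, float shift, uint8_t* dst)
{
    assert(src != nullptr && dst != nullptr);
    for (size_t i = 0; i < size; ++i)
    {
        // Round to the nearest integer in the current rounding mode.
        const float v = std::nearbyint(src[i] * scale + shift);
        // Saturate so that out-of-range values clamp instead of wrapping.
        dst[i] = (uint8_t)std::min(255.0f, std::max(0.0f, v));
    }
}
```

Saturation rather than wrap-around is the design choice that matters here: values below 0 or above 255 clamp to the nearest representable byte.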
Renaming
- SimdSynetConvertImage to SimdSynetReorderImage.
- SimdSynetConvertFilter to SimdSynetReorderFilter.
Test framework
New features
- A new command-line test parameter -c: the number of channels in the test image for performance testing.
- A new command-line test parameter -mt: the minimal test execution time (in milliseconds).
- Tests for verifying functionality of SynetConvolution8i framework.
- Tests for verifying functionality of function SynetConvert32fTo8u.
Documentation
Bug fixing
- Error in description of method Detection::LoadStringXml.
February 3, 2020 (version 4.5.86)
Algorithms
New features
- SimdResizeMethodInferenceEngineInterp method in Resizer framework.
Improving
- Performance of Convolution32f framework (NHWC format, kernel=3x3, stride=1x1, large H and W).
- Performance of AVX-512F and NEON optimizations of function GemmPackA.
- Performance of Convolution32f framework (NHWC format, GemmNN method).
- Performance of SSE2, AVX, AVX2, AVX-512F and NEON optimizations of Convolution32f framework (NHWC format, NhwcDirect method, kernel=1x1).
- Performance of AVX-512F optimization of MergedConvolution32f framework (input convolution).
- Performance of AVX2 and AVX-512F optimizations of MergedConvolution32f framework (output convolution).
- Performance of Convolution32f framework (stride > 1).
- Performance of AVX-512F optimization of Gemm32fNN function (add 6x64 and 6x48 micro kernel).
Bug fixing
- Error in AVX-512F optimization of function WinogradKernel3x3Block2x2SetOutput (NCHW format).
- Error in SSE, AVX, AVX-512F and NEON optimizations of function SynetPoolingForwardAverage (NHWC format).
- Error in AVX-512F optimization of function SynetInnerProductLayerForward.
- Error in AVX, AVX2 and AVX-512F optimizations of function Gemm32fNT.
- Error in function WinogradKernel3x3Block4x4SetInput (padX != padY != padW != padH).
- Error in debug FLOPS annotation of Deconvolution32f framework.
- MergedConvolution32f framework doesn't work with stride == 3.
January 3, 2020 (version 4.5.85)
Algorithms
New features
- Base implementation, SSE2, AVX2, AVX-512F and NEON optimizations of function SynetUnaryOperation32fLayerForward.
- Base implementation, SSE2, AVX2, AVX-512F and NEON optimizations of function SynetSoftplus32f.
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function WinogradKernel2x2Block2x2SetFilter.
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function WinogradKernel2x2Block2x2SetInput.
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function WinogradKernel2x2Block2x2SetOutput.
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function WinogradKernel2x2Block4x4SetFilter.
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function WinogradKernel2x2Block4x4SetInput.
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function WinogradKernel2x2Block4x4SetOutput.
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function WinogradKernel1x3Block1x4SetFilter.
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function WinogradKernel1x3Block1x4SetInput.
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function WinogradKernel1x3Block1x4SetOutput.
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function WinogradKernel1x5Block1x4SetFilter.
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function WinogradKernel1x5Block1x4SetInput.
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function WinogradKernel1x5Block1x4SetOutput.
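For readers unfamiliar with the softplus activation named in the list above, a scalar C++ sketch may help. It assumes the common deep learning convention with a beta parameter and a threshold above which the function is treated as linear; the name Softplus32fRef and this exact parameterization are hypothetical and are not taken from the SynetSoftplus32f declaration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical scalar reference (not the library API):
// softplus(x) = log(1 + exp(beta * x)) / beta, and x itself when x > threshold.
void Softplus32fRef(const float* src, size_t size, float beta, float threshold, float* dst)
{
    assert(beta > 0.0f);
    for (size_t i = 0; i < size; ++i)
    {
        const float x = src[i];
        // For large x the exponent would overflow while the result is already ~x.
        dst[i] = x > threshold ? x : std::log1p(std::exp(beta * x)) / beta;
    }
}
```

The threshold branch avoids overflow in the exponential for large inputs, where softplus is numerically indistinguishable from the identity.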
Improving
- Performance of Convolution32f framework (NHWC format, kernel=1x1x1).
- Performance of Convolution32f framework (NHWC format, kernel=2x2).
- Performance of Convolution32f framework (NHWC format, kernel=1x3).
- Performance of Convolution32f framework (NHWC format, kernel=1x5).
Renaming
- NeuralSigmoid to SynetSigmoid32f.
- NeuralTanh to SynetTanh32f.
- NeuralRelu to SynetRelu32f.
- Winograd2x3SetFilter to WinogradKernel3x3Block2x2SetFilter.
- Winograd2x3SetInput to WinogradKernel3x3Block2x2SetInput.
- Winograd2x3SetOutput to WinogradKernel3x3Block2x2SetOutput.
- Winograd3x3SetFilter to WinogradKernel3x3Block3x3SetFilter.
- Winograd3x3SetInput to WinogradKernel3x3Block3x3SetInput.
- Winograd3x3SetOutput to WinogradKernel3x3Block3x3SetOutput.
- Winograd4x4SetFilter to WinogradKernel3x3Block4x4SetFilter.
- Winograd4x4SetInput to WinogradKernel3x3Block4x4SetInput.
- Winograd4x4SetOutput to WinogradKernel3x3Block4x4SetOutput.
Bug fixing
- Error in Convolution32f framework (kernel greater than input size, NHWC format).
- Potential crash in ContourDetector.
Test framework
New features
- Tests for verifying functionality of function SynetUnaryOperation32fLayerForward.
- Tests for verifying functionality of function SynetSoftplus32f.
- Tests for verifying functionality of function WinogradKernel2x2Block2x2SetFilter.
- Tests for verifying functionality of function WinogradKernel2x2Block2x2SetInput.
- Tests for verifying functionality of function WinogradKernel2x2Block2x2SetOutput.
- Tests for verifying functionality of function WinogradKernel2x2Block4x4SetFilter.
- Tests for verifying functionality of function WinogradKernel2x2Block4x4SetInput.
- Tests for verifying functionality of function WinogradKernel2x2Block4x4SetOutput.
- Tests for verifying functionality of function WinogradKernel1x3Block1x4SetFilter.
- Tests for verifying functionality of function WinogradKernel1x3Block1x4SetInput.
- Tests for verifying functionality of function WinogradKernel1x3Block1x4SetOutput.
- Tests for verifying functionality of function WinogradKernel1x5Block1x4SetFilter.
- Tests for verifying functionality of function WinogradKernel1x5Block1x4SetInput.
- Tests for verifying functionality of function WinogradKernel1x5Block1x4SetOutput.