AIfES 2 2.0.0
Functions required for inference of models. More...
Functions

    uint32_t aialgo_sizeof_inference_memory(aimodel_t *model)
        Calculate the memory requirements for intermediate results of an inference.

    uint32_t aialgo_sizeof_parameter_memory(aimodel_t *model)
        Calculate the memory requirements for the trainable parameters (like weights and biases) of the model.

    uint8_t aialgo_schedule_inference_memory(aimodel_t *model, void *memory_ptr, uint32_t memory_size)
        Assign the memory for intermediate results of an inference to the model.

    void aialgo_distribute_parameter_memory(aimodel_t *model, void *memory_ptr, uint32_t memory_size)
        Assign the memory for the trainable parameters (like weights and biases) of the model.

    aitensor_t * aialgo_forward_model(aimodel_t *model, aitensor_t *input_data)
        Perform a forward pass on the model.

    aitensor_t * aialgo_inference_model(aimodel_t *model, aitensor_t *input_data, aitensor_t *output_data)
        Perform an inference on the model / run the model.

    uint8_t aialgo_compile_model(aimodel_t *model)
        Initialize the model structure.

    void aialgo_quantize_model_f32_to_q7(aimodel_t *model_f32, aimodel_t *model_q7, aitensor_t *representative_dataset)
        Quantize model parameters (weights and biases).

    void aialgo_set_model_result_precision_q31(aimodel_t *model, uint16_t shift)
        Initialize the quantization parameters of the layer results for the Q31 data type.

    void aialgo_set_model_delta_precision_q31(aimodel_t *model, uint16_t shift)
        Initialize the quantization parameters of the layer deltas for the Q31 data type.

    void aialgo_set_model_gradient_precision_q31(aimodel_t *model, uint16_t shift)
        Initialize the quantization parameters of the gradients for the Q31 data type.

    void aialgo_print_model_structure(aimodel_t *model)
        Print the layer structure of the model with the configured parameters.
AIfES is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License along with this program. If not, see https://www.gnu.org/licenses/.
The functions cover memory allocation/scheduling, calculation of the forward pass, and quantization for model inference.
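As an overview of how these pieces fit together, here is a hedged sketch of a typical inference setup (layer construction is omitted; the umbrella header name `aifes.h` and a `model` whose layers are already connected are assumptions, so adapt the includes to your project):

```c
#include <stdlib.h>
#include "aifes.h"  /* assumed umbrella header for AIfES */

/* Sketch: `model` is an aimodel_t whose layers are already connected
 * and whose input/output layers are set. */
void run_model_once(aimodel_t *model, aitensor_t *input, aitensor_t *output)
{
    /* 1. Count layers/parameters and prepare internal structures. */
    aialgo_compile_model(model);

    /* 2. Query how much scratch memory the forward pass needs ... */
    uint32_t mem_size = aialgo_sizeof_inference_memory(model);

    /* 3. ... allocate it (a static buffer is also common on MCUs) ... */
    void *mem = malloc(mem_size);

    /* 4. ... and hand it to the model. */
    aialgo_schedule_inference_memory(model, mem, mem_size);

    /* 5. Run the model; results land in `output`. */
    aialgo_inference_model(model, input, output);

    free(mem);
}
```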
uint8_t aialgo_compile_model(aimodel_t *model)

Initialize the model structure.

Counts the number of layers and trainable parameters in a model as preparation for inference or training.

Parameters:
    *model    The model
void aialgo_distribute_parameter_memory(aimodel_t *model, void *memory_ptr, uint32_t memory_size)

Assign the memory for the trainable parameters (like weights and biases) of the model.

Only use this function if the parameters are not pre-trained or manually configured. Afterwards, the memory has to be initialized (for example by initializing the weights).

The required memory size can be calculated with aialgo_sizeof_parameter_memory().

Parameters:
    *model        The model
    *memory_ptr   Pointer to the memory block
    memory_size   Size of the memory block (for error checking)
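A hedged sketch of the train-from-scratch case this function targets (the header name `aifes.h` and the initializer `my_init_weights` are hypothetical placeholders, not AIfES APIs):

```c
#include <stdlib.h>
#include "aifes.h"  /* assumed umbrella header for AIfES */

/* Hypothetical user-supplied weight initializer. */
void my_init_weights(aimodel_t *model);

/* Sketch: reserve parameter memory for a model that will be trained
 * from scratch (no pre-trained weights), then initialize it. */
void setup_parameters(aimodel_t *model)
{
    uint32_t param_size = aialgo_sizeof_parameter_memory(model);
    void *param_mem = malloc(param_size);

    /* Distribute the block across the trainable parameter tensors. */
    aialgo_distribute_parameter_memory(model, param_mem, param_size);

    /* The memory is still uninitialized at this point -- the weights
     * must be set afterwards, e.g. by a user-supplied initializer. */
    my_init_weights(model);
}
```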
aitensor_t * aialgo_forward_model(aimodel_t *model, aitensor_t *input_data)

Perform a forward pass on the model.

The result is stored in the result tensor of the output layer, and a pointer to it is returned. The result lives in the inference memory and is only valid as long as that memory is valid. To get the output as a separate tensor, use aialgo_inference_model() instead.

Parameters:
    *model        The model
    *input_data   Input data tensor with the same shape as the input layer
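Because the returned tensor only aliases the inference memory, copying the data out is a common pattern; a hedged sketch (the header name and an f32 model are assumptions):

```c
#include <string.h>
#include "aifes.h"  /* assumed umbrella header for AIfES */

/* Sketch: copy the forward-pass result out of the inference memory.
 * Assumes an f32 model; `out_buf` must hold at least `n` floats. */
void forward_and_copy(aimodel_t *model, aitensor_t *input,
                      float *out_buf, uint32_t n)
{
    aitensor_t *result = aialgo_forward_model(model, input);

    /* `result->data` is only valid while the inference memory is
     * valid -- copy it out before the memory is reused or freed. */
    memcpy(out_buf, result->data, n * sizeof(float));
}
```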
aitensor_t * aialgo_inference_model(aimodel_t *model, aitensor_t *input_data, aitensor_t *output_data)

Perform an inference on the model / run the model.

Make sure to initialize the model (aialgo_compile_model()) and schedule the inference memory (for example with aialgo_schedule_inference_memory() or aialgo_schedule_training_memory()) before calling this function.

Parameters:
    *model         The model
    *input_data    Input data tensor with the same shape as the input layer
    *output_data   Empty tensor for the results of the inference, sized to match the model outputs
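The page's original example is not reproduced here; the following minimal sketch (a 2-feature f32 input with 1 output, where the shapes and the AITENSOR_2D_F32 helper macro are assumptions based on common AIfES usage) illustrates the call:

```c
#include "aifes.h"  /* assumed umbrella header for AIfES */

/* Sketch: run a compiled, memory-scheduled f32 model on one
 * 2-feature sample. Shapes and data type are assumptions. */
void classify_sample(aimodel_t *model)
{
    float in_data[2]  = {0.0f, 1.0f};
    float out_data[1];

    uint16_t in_shape[2]  = {1, 2};  /* 1 sample, 2 features */
    uint16_t out_shape[2] = {1, 1};  /* 1 sample, 1 output   */

    aitensor_t input  = AITENSOR_2D_F32(in_shape, in_data);
    aitensor_t output = AITENSOR_2D_F32(out_shape, out_data);

    aialgo_inference_model(model, &input, &output);
    /* out_data[0] now holds the model's output. */
}
```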
void aialgo_print_model_structure(aimodel_t *model)

Print the layer structure of the model with the configured parameters.

Parameters:
    *model    The model
void aialgo_quantize_model_f32_to_q7(aimodel_t *model_f32, aimodel_t *model_q7, aitensor_t *representative_dataset)

Quantize model parameters (weights and biases).

Parameters:
    *model_f32               Pointer to the model with single-precision floating-point parameters that should be quantized
    *model_q7                Pointer to the model with quantized, fixed-point parameters in Q7 format
    *representative_dataset  Pointer to a dataset that represents real model inputs, used to determine the fixed-point quantization parameters
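A hedged sketch of the quantization step (the preconditions in the comments are assumptions inferred from the memory functions above, not statements from this page):

```c
#include "aifes.h"  /* assumed umbrella header for AIfES */

/* Sketch: derive a Q7 model from a trained F32 model.
 * Assumed preconditions: `model_q7` mirrors the F32 structure with
 * Q7 layers, has been compiled, and has parameter memory
 * distributed. `rep_data` holds typical real inputs so the value
 * ranges (and thus the fixed-point shifts) can be determined. */
void make_q7_model(aimodel_t *model_f32, aimodel_t *model_q7,
                   aitensor_t *rep_data)
{
    aialgo_quantize_model_f32_to_q7(model_f32, model_q7, rep_data);
    /* model_q7 now carries the quantized weights and biases. */
}
```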
uint8_t aialgo_schedule_inference_memory(aimodel_t *model, void *memory_ptr, uint32_t memory_size)

Assign the memory for intermediate results of an inference to the model.

The required memory size can be calculated with aialgo_sizeof_inference_memory().

Parameters:
    *model        The model
    *memory_ptr   Pointer to the memory block
    memory_size   Size of the memory block (for error checking)
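On microcontrollers a statically allocated buffer is typical; a hedged sketch (the buffer size is arbitrary, and treating a nonzero return as an error is an assumption about the `uint8_t` error code):

```c
#include <stdint.h>
#include "aifes.h"  /* assumed umbrella header for AIfES */

/* Buffer size is an arbitrary example value -- always check it
 * against aialgo_sizeof_inference_memory() at runtime. */
#define INFERENCE_MEM_SIZE 2048
static uint8_t inference_mem[INFERENCE_MEM_SIZE];

uint8_t setup_inference_memory(aimodel_t *model)
{
    uint32_t needed = aialgo_sizeof_inference_memory(model);
    if (needed > INFERENCE_MEM_SIZE) {
        return 1;  /* static buffer too small for this model */
    }
    /* Assumption: the returned uint8_t is an error code. */
    return aialgo_schedule_inference_memory(model, inference_mem, needed);
}
```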
void aialgo_set_model_delta_precision_q31(aimodel_t *model, uint16_t shift)

Initialize the quantization parameters of the layer deltas for the Q31 data type.

Initializes the quantization parameters of the layer delta tensors (ailayer.deltas; output of the backward function) to the given shift and zero_point = 0.

Use this function when you train a model in the Q31 data type.

Parameters:
    *model    The model
    shift     Number of fractional bits (shift in Q31) of the layer deltas
void aialgo_set_model_gradient_precision_q31(aimodel_t *model, uint16_t shift)

Initialize the quantization parameters of the gradients for the Q31 data type.

Initializes the quantization parameters of the gradient tensors to the given shift and zero_point = 0.

Use this function when you train a model in the Q31 data type.

Parameters:
    *model    The model
    shift     Number of fractional bits (shift in Q31) of the gradients
void aialgo_set_model_result_precision_q31(aimodel_t *model, uint16_t shift)

Initialize the quantization parameters of the layer results for the Q31 data type.

Initializes the quantization parameters of the layer outputs (ailayer.result; output of the forward function) to the given shift and zero_point = 0.

Use this function, for example, when you train a model in the Q31 data type.

Parameters:
    *model    The model
    shift     Number of fractional bits (shift in Q31) of the layer results
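The three Q31 precision setters are typically called together before Q31 training; a hedged sketch (the shift value 20 is an arbitrary illustration, not a recommendation from this page):

```c
#include "aifes.h"  /* assumed umbrella header for AIfES */

/* Sketch: configure Q31 fixed-point precision before training a
 * model in Q31. A shift of 20 fractional bits is an arbitrary
 * example value; choose it to match your expected value ranges. */
void setup_q31_precision(aimodel_t *model)
{
    aialgo_set_model_result_precision_q31(model, 20);   /* forward results */
    aialgo_set_model_delta_precision_q31(model, 20);    /* backward deltas */
    aialgo_set_model_gradient_precision_q31(model, 20); /* gradients */
}
```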
uint32_t aialgo_sizeof_inference_memory(aimodel_t *model)

Calculate the memory requirements for intermediate results of an inference.

This memory is mainly for the result buffers of the layers.

Use aialgo_schedule_inference_memory() to assign the memory to the model.

Parameters:
    *model    The model
uint32_t aialgo_sizeof_parameter_memory(aimodel_t *model)

Calculate the memory requirements for the trainable parameters (like weights and biases) of the model.

Use aialgo_distribute_parameter_memory() to assign the memory to the model.

Parameters:
    *model    The model