Kernel Fusion
You can specify kernel fusion options under the kernel_fusion key of your configuration, like:
{
  "kernel_fusion": {
    "enable": "bool",
    "memory_efficient_fusion": "bool",
    "custom_cuda_kernels": "list"
  }
}
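The schema above only names the type of each option. As a minimal sketch of what a filled-in configuration might look like, the Python snippet below builds one and serializes it to JSON. The concrete values are illustrative assumptions, and it is also an assumption that custom_cuda_kernels takes kernel names as strings; the name used is one of the kernels documented under custom_cuda_kernels.

```python
import json

# Hypothetical concrete values for the kernel_fusion schema.
# Assumption: custom_cuda_kernels is a list of kernel names given as strings.
config = {
    "kernel_fusion": {
        "enable": True,
        "memory_efficient_fusion": False,
        "custom_cuda_kernels": ["FusedNoRepeatNGram"],
    }
}

# Serialize to JSON text, e.g. for writing to a configuration file.
config_json = json.dumps(config, indent=2)
print(config_json)
```

Building the configuration as a dictionary and serializing it with json.dumps avoids hand-written syntax errors such as trailing commas, which strict JSON parsers reject.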
3. custom_cuda_kernels: list
type: list
default: []
List of the custom CUDA kernels to use.
Currently, the following kernels are supported:
FusedRMSNorm: Efficient RMSNorm kernel. It is available when using T5.
FusedNoRepeatNGram: Runs n-gram blocking on the GPU during text generation. It is very effective for large-batch text generation.
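To make clear what these two kernels compute, here is a plain-Python reference sketch of the underlying operations. It is not the fused CUDA implementation and says nothing about how the library implements it; the function names rms_norm and banned_next_tokens and the default eps value are hypothetical and chosen only for illustration.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Reference RMSNorm: scale x by the reciprocal of its root mean square.

    Unlike standard LayerNorm, there is no mean subtraction and no bias.
    """
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def banned_next_tokens(tokens, n):
    """Reference n-gram blocking for a single sequence.

    Returns the set of tokens that, if generated next, would repeat an
    n-gram that already occurs in `tokens`.
    """
    if n < 1 or len(tokens) < n:
        return set()
    # The last n-1 tokens form the prefix of the n-gram about to be completed.
    prefix = tuple(tokens[len(tokens) - (n - 1):])
    banned = set()
    for i in range(len(tokens) - n + 1):
        # If an earlier n-gram starts with the same prefix, ban its last token.
        if tuple(tokens[i:i + n - 1]) == prefix:
            banned.add(tokens[i + n - 1])
    return banned


normalized = rms_norm([3.0, 4.0], [1.0, 1.0], eps=0.0)
blocked = banned_next_tokens([1, 2, 3, 1, 2], 3)
print(blocked)  # {3}
```

In the sketch, the n-gram scan runs once per sequence at every generation step, so its cost grows with batch size and sequence length. This is consistent with the note above that running it on the GPU helps most for large-batch generation.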