Session

Exhaustive Sparse Kernel Search with the Universal Sparse Tensor

The Universal Sparse Tensor (UST) decouples a tensor's sparsity from its memory layout, which gives greater flexibility and performance. A tensor format DSL (domain-specific language) describes how the sparse tensor is represented in memory. Type polymorphism over a small set of base operations then defines the vast space of instances of those operations. This talk demonstrates how the UST can be used to perform an exhaustive state space search over all CUDA kernels for PyTorch operations, covering many different iteration orderings and sparse storage formats. Combined with heuristic pruning techniques and, potentially, agent-assisted optimization, this methodology makes the acceleration of sparsity attainable in a way that was previously out of reach for most model developers.
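To give a rough feel for the size of such a search space, the following is a minimal sketch in plain Python. It is not the UST API or the speaker's method. The names `LEVEL_FORMATS`, `enumerate_candidates`, and `has_sparse_level` are hypothetical, and the two per-dimension formats ("dense" and "compressed") are an assumption chosen for illustration. The sketch enumerates every combination of loop order, dimension order, and per-dimension storage format for one rank-2 sparse operand, then applies a trivial pruning rule.

```python
from itertools import permutations, product

# Hypothetical per-dimension storage choices (an assumption, not the UST DSL).
LEVEL_FORMATS = ("dense", "compressed")


def enumerate_candidates(loop_indices, tensor_rank):
    """Yield every (loop order, dimension order, level formats) combination."""
    for loop_order in permutations(loop_indices):
        for dim_order in permutations(range(tensor_rank)):
            for levels in product(LEVEL_FORMATS, repeat=tensor_rank):
                yield loop_order, dim_order, levels


def has_sparse_level(candidate):
    """Toy pruning rule: keep candidates with at least one compressed dimension."""
    _, _, levels = candidate
    return "compressed" in levels


# A matmul-like loop nest over i, j, k with one rank-2 sparse operand.
candidates = list(enumerate_candidates(("i", "j", "k"), tensor_rank=2))
pruned = [c for c in candidates if has_sparse_level(c)]

print(len(candidates))  # 3! loop orders * 2! dimension orders * 2^2 formats = 48
print(len(pruned))      # the 12 all-dense candidates are dropped, leaving 36
```

Even this toy space grows factorially in the number of loop indices and exponentially in the tensor rank, which suggests why an exhaustive search benefits from being paired with pruning.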

Aart Bik

Distinguished Engineer at NVIDIA
