Session

Beyond the Basics: Authoring Custom Operators in PyTorch for Performance Gains

Unlock the full potential of PyTorch by building high-performance custom operators in C++ and CUDA. This session takes a deep dive into defining, implementing, and registering custom ops with the latest PyTorch APIs, including TORCH_LIBRARY for operator definition and backend-specific kernels. Attendees will learn how to write device-specific code, manage memory efficiently, and implement both forward and backward passes for autograd support. The talk covers hybrid Python/C++ registration, dynamic loading of compiled extensions, and integration with TorchScript and ExecuTorch for deployment on diverse hardware. Real-world examples will demonstrate how to profile bottlenecks, manage operator schemas, and replace standard modules (e.g., nn.Linear) with optimized custom variants. By the end, you’ll be equipped to extend PyTorch for specialized workloads, accelerate inference, and contribute robust extensions to the open-source community.

Harshita Varma

Associate Product Manager

Bengaluru, India


