Proposed by: Shivay Lamba
Efficient AI inference on the Edge with TensorFlow, Wasm & Kubernetes
Machine learning inference is computationally intensive and can benefit greatly from the speed of WebAssembly (Wasm); much of what we cover also applies to other compute-heavy tasks. What if I told you that you could improve the performance of an application you currently deploy in Linux containers? Enter Wasm and Kubernetes. The talk starts by introducing the audience to WebAssembly and showing how its speed, security, and other properties can be put to use in their deployments. One problem we face is that standard WebAssembly provides very limited access to the native OS and hardware, such as multi-core CPUs, GPUs, or TPUs, which is not ideal for the kind of systems we target. The talk therefore shows how the WebAssembly System Interface (WASI) can be used to get security, portability, and near-native speed for machine learning models. It then shows how to use Krustlet to run Wasm on Kubernetes, with a demo of deploying a model and a discussion of the considerations involved in doing so.
Source code/Reference: https://docs.google.com/presentation/d/1dMedpKK0OQ0Zo29X6Ln4BNXLPEeivZBJs9DUYqyqRtA/edit?usp=sharing
Talk duration: