It looks like ExecuTorch is aimed at edge devices, although not all of them.
I'm currently doing inference on GPUs with libtorch and have a few concerns: (1) libtorch/TorchScript seem to be on a path to deprecation, and (2) libtorch/TorchScript pull in enormously bloated libraries. Should I be looking at ExecuTorch? I don't currently see an NVIDIA backend or TensorRT integration in https://github.com/pytorch/executorch/tree/main/backends , but it seems like it might be possible. Is this something you are thinking about?
Is it possible to use ExecuTorch to run a lightweight language model, perhaps this one: https://github.com/facebookresearch/llama , on a smartphone in real time for a chatbot app? Please share some guidance.
[2] https://github.com/huggingface/candle/issues/313