Our great sponsors
-
WorkOS
The modern identity platform for B2B SaaS. The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.
An interesting side observation was to learn the semantics of such examples where 2 opposite samples share the same words but not the meaning. Later on, this allowed me to take a few actions to increase the overall performance of the model. I left out the lemmatization in the preprocessing pipeline and created feature engineering that applies additional weights to queries which start with verbs or include time for example. I have used quickadd, an open-source library for parsing time & date I have forked from ctparse (and upgraded with a lots of modifications).
PyInstaller seemed like the most maintained and developed tool to freeze python script into an executable, so I went with it. As expected, the freezed interface with the model was gigabytes large, so I had to figure out how to squeeze this. Fortunately, Onnx worked wonders and packaged the model into an inference only state, so I could throw away the Pytorch and Torchtext dependencies when freezing with Pyinstaller.Now the size of the executable with the model was 43MB instead of 4GB.
PyInstaller seemed like the most maintained and developed tool to freeze python script into an executable, so I went with it. As expected, the freezed interface with the model was gigabytes large, so I had to figure out how to squeeze this. Fortunately, Onnx worked wonders and packaged the model into an inference only state, so I could throw away the Pytorch and Torchtext dependencies when freezing with Pyinstaller.Now the size of the executable with the model was 43MB instead of 4GB.
Inspired by the IDE language server protocol, I created an API interface between the electron and the Python ML interface. ZeroMQ turned out be an invaluable resource as a fast and lightweight messaging queue between the two.