NeMo-Guardrails
NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.
I have been playing around with some models locally and building a Discord bot as a fun side project, and I want to set up some guardrails on the bot's inputs and outputs to make sure it isn't violating any ethical boundaries. I was going to use Nvidia's NeMo Guardrails, but they only support OpenAI currently. Are there any other good ways to control inputs?
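One model-agnostic option is to wrap whatever function produces the bot's reply with a check on the way in and a check on the way out. The sketch below is only an illustration of that idea, not NeMo Guardrails or any library's API: `BLOCKED_PATTERNS`, `violates_policy`, `guarded`, and `echo_model` are all hypothetical names, and the regex blocklist stands in for whatever classifier or moderation step you would actually trust.

```python
import re
from typing import Callable

# Hypothetical blocklist for illustration only; a real bot needs a far more
# careful policy (e.g. a moderation classifier) than a couple of regexes.
BLOCKED_PATTERNS = [
    re.compile(p, re.IGNORECASE)
    for p in (r"\bbuild a bomb\b", r"\bcredit card number\b")
]
REFUSAL = "Sorry, I can't help with that."


def violates_policy(text: str) -> bool:
    """Return True if any blocked pattern appears in the text."""
    return any(p.search(text) for p in BLOCKED_PATTERNS)


def guarded(generate: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap a prompt -> reply function with input and output checks."""

    def wrapper(prompt: str) -> str:
        # Input rail: refuse before the model ever sees the prompt.
        if violates_policy(prompt):
            return REFUSAL
        reply = generate(prompt)
        # Output rail: refuse if the model's reply breaks the policy.
        if violates_policy(reply):
            return REFUSAL
        return reply

    return wrapper


def echo_model(prompt: str) -> str:
    """Stand-in for a local model call (assumption, not a real model)."""
    return f"You said: {prompt}"


safe_model = guarded(echo_model)
```

Because the wrapper only needs a `str -> str` callable, the same checks apply whether the reply comes from a local model or a hosted API, and the bot's message handler would call `safe_model` instead of the model directly.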
I think of guardrails as another dimension of human preferences: whether you are training a model to answer questions better or to avoid saying horrifying stuff, you are teaching the model a preference. So I think it's a straightforward RLHF problem, just from a different perspective.
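To make the "same problem, different perspective" point concrete, here is a rough sketch of the pairwise (Bradley-Terry style) loss that is a common choice for training RLHF reward models. The function name and the reward numbers are made up for illustration; the point is only that a "more helpful vs. less helpful" pair and a "safe vs. harmful" pair go through the exact same objective.

```python
import math


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise loss: -log(sigmoid(reward_chosen - reward_rejected)).

    Computed as softplus(-margin) in a numerically stable form.
    """
    x = -(reward_chosen - reward_rejected)
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# Hypothetical reward-model scores for two kinds of preference pair.
helpfulness_pair = {"chosen": 1.5, "rejected": 0.2}  # better vs. worse answer
safety_pair = {"chosen": 0.8, "rejected": -2.0}      # refusal vs. harmful reply

for name, pair in (("helpfulness", helpfulness_pair), ("safety", safety_pair)):
    loss = preference_loss(pair["chosen"], pair["rejected"])
    print(f"{name}: loss = {loss:.4f}")
```

The loss shrinks as the reward model scores the preferred response higher than the rejected one, regardless of whether the preference was about quality or about harm.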