 | detoxify | xatkit
---|---|---
Mentions | 4 | 16
Stars | 839 | 174
Growth | 1.9% | 1.1%
Activity | 6.2 | 3.1
Last commit | 24 days ago | about 1 month ago
Language | Python |
License | Apache License 2.0 | Eclipse Public License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
detoxify
-
ML Discord Moderation Bot
I created a small Discord moderation bot built on https://github.com/unitaryai/detoxify; the source is available at https://gist.github.com/KrautByte/975f404969f4de8f4147e1bb4f7b64cb
- Cedille, the largest French language model, released in open source
-
Show HN: Cedille, the largest French language model, released in open source
Yeah, this kind of toxic output sadly still can happen :-/
We have fully analyzed the training dataset (1128 GB) using Detoxify (https://github.com/unitaryai/detoxify) to filter out problematic content. But of course detecting toxicity is a tough challenge in itself, so this process is imperfect at best.
We are using the RealToxicityPrompts framework (https://realtoxicityprompts.apps.allenai.org/) to analyse how toxic our models are and to steer our efforts in this direction. This means we are generating thousands of completions and analysing them to see how "nasty" the model is. We plan to write more on this topic soon.
But yeah, this is definitely far from being a solved problem, and our model (as well as all large language models) should be handled with care.
-
Implementing a toxicity detector in your chatbots
Detoxify grew out of three Kaggle competitions aimed at improving toxicity classifiers. Each competition targeted a different aspect of the toxicity classification problem.
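The dataset filtering mentioned in the Cedille comment above can be illustrated with a short sketch. This is not Cedille's actual pipeline: the `filter_toxic` helper, the `keyword_scorer` stand-in, and the 0.5 threshold are all hypothetical. The scorer is passed in as a function so the example runs without downloading a model; with the `detoxify` package installed, a function returning `Detoxify('original').predict(text)['toxicity']` could be plugged in instead.

```python
from typing import Callable, Iterable, List


def filter_toxic(
    texts: Iterable[str],
    score_fn: Callable[[str], float],
    threshold: float = 0.5,  # hypothetical cut-off, tune for your data
) -> List[str]:
    """Keep only the texts whose toxicity score is below the threshold."""
    return [text for text in texts if score_fn(text) < threshold]


def keyword_scorer(text: str) -> float:
    """Stand-in scorer (an assumption for this sketch, not a real classifier).

    Returns 1.0 if the text contains a blocked word, otherwise 0.0.
    A real setup would call a trained model here instead.
    """
    blocked = {"idiot", "stupid"}
    words = (word.strip(".,!?") for word in text.lower().split())
    return 1.0 if any(word in blocked for word in words) else 0.0


if __name__ == "__main__":
    samples = ["Have a nice day!", "You are an idiot."]
    print(filter_toxic(samples, keyword_scorer))
```

Keeping the scorer as a parameter makes it easy to swap the keyword stand-in for a real classifier, or to compare several classifiers on the same data.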
xatkit
-
The full tech stack to run a chatbot — behind the scenes of an open source bot platform
While we wait for these tools to pop up, is there any tech question about the internals of Xatkit you'd like answered? And if you want to read more about the technologies listed above, this Twitter thread gives some pointers to good tutorials for them.
-
How to build your own chatbot NLP engine
(obviously) Create your own chatbots (pairing it up with Xatkit or any other chatbot platform for all the front-end and behaviour processing components)
-
How to program a chatbot that reads all your website and answers questions based on its content
The easiest part is to create the chatbot. We'll obviously use Xatkit for this. The bot can have as many intents as you wish. The only part that we care about here is the default fallback state. Here, instead of saying something useless, e.g. "sorry I didn't get your question, can you rephrase it and try again?", we will ask Haystack to find us a solution.
-
On premises chatbot
Take a look at Xatkit (https://github.com/xatkit-bot-platform/xatkit). It's an open source chatbot development platform and very easy to deploy on your own premises as the bot is compiled into a single .jar.
-
Chatbots for freelancers or small business
But, IMHO, many business owners do not really want to create a bot by themselves, no matter how easy the chatbot development interface is. They want to give you the data (whatever type of data they already have, e.g. an Excel file with collected questions) and get a bot out of it. This is why at Xatkit we're now providing this type of "chatbot automatic generation services".
-
Choosing Java as your language for a Machine Learning project - Are we crazy???
There are ML libraries available for every language, so there is always a way to execute/train your neural networks outside the Python world. For instance, in Xatkit, we reuse Stanford's CoreNLP models in some of our language processors. And, if needed, there is always the option to wrap the ML model code in a Python server (I like the simplicity of Flask for this) and consume it via API calls to this server.
- Show HN: Chatbots generated from your eCommerce data
-
Feedback on our new product and website "Chatbots for e-commerce"
We have recently launched Xatkit, a pretrained chatbot for eCommerce. The reception so far has been lukewarm and we wonder whether:
-
(Beta testers needed) Xatkit - pretrained expert eCommerce bots to sell more doing less
Interested to give it a try? For FREE during the next two months? Visit: https://xatkit.com/ (and pls redistribute to your colleagues if you know anybody that could be interested, thanks!)
-
Beyond no-code: no-learn and no-work development
But this doesn't mean your no-code tool needs to stick to one specific category. As we do in Xatkit, you can offer different interfaces/importers on top of the same engine. You can even offer a low-code version for advanced users willing to use your tool's API to complement the result of the no-code approach.
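The default-fallback idea from the website-reading chatbot post above can be sketched in a few lines. This is only an illustration of the pattern under stated assumptions: `make_bot`, `search_site`, and `PAGES` are hypothetical names, the word-overlap search is a toy stand-in for a real document question-answering system, and none of this is Xatkit's or Haystack's actual API.

```python
from typing import Callable, Dict

# Hypothetical site content; a real bot would index the crawled website.
PAGES: Dict[str, str] = {
    "shipping": "Orders ship within two business days.",
    "returns": "Items can be returned within 30 days.",
}


def search_site(question: str) -> str:
    """Toy document search: return the page sharing the most words with the question."""
    words = set(question.lower().replace("?", "").split())
    return max(
        PAGES.values(),
        key=lambda text: len(words & set(text.lower().rstrip(".").split())),
    )


def make_bot(
    intents: Dict[str, str], fallback: Callable[[str], str]
) -> Callable[[str], str]:
    """Build a reply function: known intents first, fallback for everything else."""

    def reply(utterance: str) -> str:
        key = utterance.strip().lower().rstrip("?!.")
        if key in intents:
            return intents[key]
        # Default fallback state: delegate to the document search
        # instead of answering "sorry, I didn't get that".
        return fallback(utterance)

    return reply


if __name__ == "__main__":
    bot = make_bot({"hello": "Hi there!"}, search_site)
    print(bot("Hello!"))
    print(bot("When do orders ship?"))
```

The bot only needs a handful of explicit intents; anything it does not recognise is routed to the search function, which is where a document QA backend would be called.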
What are some alternatives?
quickai - QuickAI is a Python library that makes it extremely easy to experiment with state-of-the-art Machine Learning models.
rasa - 💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants
kogpt - KakaoBrain KoGPT (Korean Generative Pre-trained Transformer)
GerVADER - A German adaptation of the VADER sentiment analysis tool for social media texts
multi-label-sentiment-classifier - How to build a multi-label sentiment classifier with Tez and PyTorch
WooCommerce - A customizable, open-source ecommerce platform built on WordPress. Build any commerce solution you can imagine.
mesh-transformer-jax - Model parallel transformers in JAX and Haiku
Foundation - The most advanced responsive front-end framework in the world. Quickly create prototypes and production code for sites that work on any kind of device.
cedille-ai - ✒️ Cedille is a large French language model (6B), released under an open-source license
sagan - The spring.io site and reference application
finetune-gpt2xl - Guide: Finetune GPT2-XL (1.5 Billion Parameters) and finetune GPT-NEO (2.7 B) on a single GPU with Huggingface Transformers using DeepSpeed
BombPartyBot - A bot for JKLM bomb party