toxicity
datapane
toxicity | datapane | |
---|---|---|
11 | 30 | |
166 | 1,349 | |
0.0% | 0.5% | |
0.0 | 7.3 | |
almost 2 years ago | 7 months ago | |
Python | | |
MIT License | Apache License 2.0 | |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
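As a rough illustration of the two metrics described above, the sketch below computes month-over-month star growth and converts an activity score into a "top X%" share. The function names are hypothetical, and the linear 0 to 10 activity scale is an assumption extrapolated from the single example in the text (9.0 corresponds to the top 10%); the site's actual formula is not given here.

```python
def monthly_growth_pct(stars_prev: int, stars_now: int) -> float:
    """Month-over-month growth in stars, as a percentage of last month's count."""
    if stars_prev <= 0:
        raise ValueError("stars_prev must be positive")
    return (stars_now - stars_prev) / stars_prev * 100.0


def top_share_pct(activity: float) -> float:
    """Share of tracked projects at or above the given activity score.

    ASSUMPTION: the score is a linear 0-10 percentile scale, inferred
    only from the example "9.0 means top 10%". This is not a documented formula.
    """
    if not 0.0 <= activity <= 10.0:
        raise ValueError("activity must be between 0 and 10")
    return (10.0 - activity) * 10.0


if __name__ == "__main__":
    # Hypothetical star counts, not taken from either project.
    print(monthly_growth_pct(1000, 1005))
    # The example from the text: an activity of 9.0.
    print(top_share_pct(9.0))
```

Under that assumption, `top_share_pct(9.0)` returns `10.0`, which matches the example in the text.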
toxicity
- Perhaps It Is a Bad Thing That the Leading AI Companies Cannot Control Their AIs
I'm a PM at a human data company (https://www.surgehq.ai) that helps the large language model companies ensure their models are safe (we're the “clever prompt engineers” who helped Redwood assess their model performance).
We actually just published a blog post today that includes our perspective on building “AI red teams” and best practices for AI alignment/safety: https://www.surgehq.ai/blog/ai-red-teams-for-adversarial-tra...
- 30% of Google's Emotions Dataset Is Mislabeled
I'd love to chat. Want to reach out to the email in my profile? I'm the founder of a much higher-quality data startup (https://www.surgehq.ai), and previously built the human computation platforms at a couple FAANGs.
We work with a lot of the top AI/NLP companies and research labs, and do both the "typical" data labeling work (sentiment analysis, text categorization, etc), but also a lot more advanced stuff (e.g., training coding assistants, evaluating the new wave of large language models, adversarial labeling, etc -- so not just distinguishing cats and dogs, but rather making full use of the power of the human mind!).
- Building a No-Code Toxicity Classifier – By Talking to GitHub Copilot
> Rather than operating under a strict definition of toxicity, we asked our team to identify comments that they personally found toxic.
[0]: https://github.com/surge-ai/toxicity
- Ask HN: Who is hiring? (January 2022)
Love language? So do we, and our mission is to infuse AI with that same love. At Surge, we're building the human infrastructure to power NLP — from detecting hate speech, to parsing complex documents, to injecting human values into the next wave of language models. Our first product is a platform that helps ML teams create amazing, human-powered datasets to train AI in the richness of language. We're a team of former Google, Facebook, and Airbnb engineering leads, and we work with top companies at the forefront of machine learning. Our tech stack is Ruby on Rails, React, and Python. We’re rapidly growing, and we're looking for full-stack engineers to join the team and develop our product. To apply, please email [email protected] with a resume and 2-3 sentences describing your interest in Surge. We love personal projects and writings too!
More information: https://www.surgehq.ai/about#careers
A blog post explaining the problems we are working to solve: https://www.surgehq.ai/blog/the-ai-bottleneck-high-quality-h...
- The Toxicity Dataset – building the largest free dataset of online toxicity
- [Free] The Toxicity Dataset — building the world's largest free dataset of online toxicity [Github]
- The Toxicity Dataset — building the world's largest free dataset of online toxicity
- The Toxicity Dataset (1000 social media comments) — any ideas for interesting visualizations? [github]
- The Toxicity Dataset - free dataset of online toxicity (Github) - could be used for interesting portfolio projects
- The Toxicity Dataset — free dataset of online toxicity (Github)
datapane
- Datapane: Build and share data reports in 100% Python
- Polars: Company Formation Announcement
If you're looking for an easy way to build an HTML report using Python, you might find Datapane (https://github.com/datapane/datapane) helpful. I'm one of the people building it! We don't support polars yet (it's on the roadmap), but we do support pandas, so you can convert to a pandas DataFrame and include your data and any plots, etc.
- JupyterLab 4.0
If you're interested in an easier way to create reports using Python and Plotly/Pandas, you should check out our open-source library, Datapane: https://github.com/datapane/datapane - you can create a standalone, redistributable HTML file in a few lines of Python.
- Evidence – Business Intelligence as Code
You might be interested in what we're hacking on at Datapane (I'm one of the founders): https://github.com/datapane/datapane.
You can create standalone HTML data reports from Python/Jupyter in ~3 lines of code: https://docs.datapane.com/reports/overview/
- Ask HN: Fastest way to turn a Jupyter notebook into a website these days?
You can build web apps from Jupyter using Datapane [0]. I'm one of the founders, so let me know if I can help at all.
You can either export a static site [1] (and host on GH pages or S3), or, if you need backend logic, you can add Python functions [2] and serve on your favourite host (we use Fly).
We have specific Jupyter integration to automatically convert your notebook into an app [3].
[0] https://github.com/datapane/datapane
[1] https://docs.datapane.com/reference/reports/#datapane.proces...
[2] https://docs.datapane.com/apps/overview/
[3] https://docs.datapane.com/reports/jupyter-integration/#conve...
- Datapane – Build full-stack data apps in 100% Python
- Datapane - Build full-stack data apps in 100% Python
Our GitHub is https://github.com/datapane/datapane and you can get started here: https://docs.datapane.com/quickstart/
- Datapane: Build internal analytics products in minutes using Python
- Datapane - Build internal data products in 100% Python
Thanks a lot! Yes, absolutely, a few people have brought this up and we're working on removing the header right now. If I can help at all, feel free to reach us on GH Discussions: https://github.com/datapane/datapane/discussions
- Datapane/datapane: Build full-stack data analytics apps in Python
What are some alternatives?
hate-speech-and-offensive-language - Repository for the paper "Automated Hate Speech Detection and the Problem of Offensive Language", ICWSM 2017
streamlit - Streamlit — A faster way to build and share data apps.
seldon-core - An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models
dash - Data Apps & Dashboards for Python. No JavaScript Required.
zotero - Zotero is a free, easy-to-use tool to help you collect, organize, annotate, cite, and share your research sources.
jupyter-dash - OBSOLETE - Dash v2.11+ has Jupyter support built in!
Fleet - Open-source platform for IT, security, and infrastructure teams. (Linux, macOS, Chrome, Windows, cloud, data center)
perspective - A data visualization and analytics component, especially well-suited for large and/or streaming datasets.
zenml - ZenML 🙏: Build portable, production-ready MLOps pipelines. https://zenml.io.
superset - Apache Superset is a Data Visualization and Data Exploration Platform
deno - A modern runtime for JavaScript and TypeScript.
plotly - The interactive graphing library for Python :sparkles: This project now includes Plotly Express!