Building a No-Code Toxicity Classifier – By Talking to GitHub Copilot

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • toxicity

    The world's largest social media toxicity dataset.

  • > Rather than operating under a strict definition of toxicity, we asked our team to identify comments that they personally found toxic.

    [0]: https://github.com/surge-ai/toxicity

  • rockstar

    The Rockstar programming language specification

  • > OK, I am waiting for you to propose a basic language parser that can do it. There's a reason we're only now having this debate - it was unconceivable 5 years ago, in the era of basic language parsers.

    This is really untrue. In fact, making "English as a programming language" was a goal of many older programming languages such as COBOL[1], BASIC, and Pascal, going back as early as the 60s. It's hardly a new idea, and it was hardly inconceivable "5 years ago" for something to output a programming language.

    The sentence example here could easily be broken down by the ParseTalk model from the mid-90s[2].

    Here's a recent-ish example (2018) of someone developing a "fully English" programming language:

    https://osmosianplainenglishprogramming.blog/2018/05/02/plai...

    It's also a source of fun[3][4][5] for people.

    These are all examples of either programming languages straight up using English as syntax, or lexical parsers that can break down language and provide you with the programmatic ability to make this kind of output.

    The difference here is that while Copilot is pulling in Python examples based on its training data set, the one thing the author singled out for amazement could easily be done by these older non-ML methods. In that example, the only value Copilot adds over those methods is that it outputs Python. The real value is way larger than that: pulling in potentially more complex code to accomplish a complete task.

    It's a bit like seeing an all-electric cargo train and being amazed that a train can run on electricity, when electrified light rail has existed for a long time. The impressive part is not that a thing on rails can use electricity to move around; it's that it can pull heavy cargo efficiently enough to make electric power viable.

    [1]: https://en.wikipedia.org/wiki/COBOL#COBOL_60

    [2]: https://arxiv.org/abs/cmp-lg/9410017

    [3]: https://github.com/RockstarLang/rockstar/blob/main/examples/...

    [4]: https://en.wikipedia.org/wiki/Shakespeare_Programming_Langua...

    [5]: https://github.com/lhartikk/ArnoldC/wiki/ArnoldC

  • ArnoldC

    Arnold Schwarzenegger based programming language


NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Related posts

  • Perhaps It Is a Bad Thing That the Leading AI Companies Cannot Control Their AIs

    1 project | news.ycombinator.com | 12 Dec 2022
  • 30% of Google's Emotions Dataset Is Mislabeled

    1 project | news.ycombinator.com | 13 Jul 2022
  • The Toxicity Dataset – building the largest free dataset of online toxicity

    1 project | news.ycombinator.com | 9 Dec 2021
  • [Free] The Toxicity Dataset — building the world's largest free dataset of online toxicity [Github]

    1 project | /r/ArtificialInteligence | 9 Dec 2021
  • The Toxicity Dataset — building the world's largest free dataset of online toxicity

    1 project | /r/LanguageTechnology | 9 Dec 2021