Replit's new Code LLM was trained in 1 week

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • fauxpilot

    FauxPilot - an open-source alternative to GitHub Copilot server

  • Of course it's possible, just not officially

    See https://github.com/fauxpilot/fauxpilot

  • code-align-evals-data

  • My favorite line from the HumanEval paper

    > It is important for these tasks to be hand-written, since our models are trained on a large fraction of GitHub, which already contains solutions to problems from a variety of sources.

    So to answer your question, yes, the evaluation dataset is spoiled. You can find such unique and never before seen docstrings like

    > For a given list of input numbers calculate the Mean Absolute Deviation around the mean of this dataset. Mean Absolute Deviation is the absolute difference between each element and a centerpoint (mean in this case)[0]

    And here's a repo I found that is 8 years old[1]. But how about a more recent one that is even closer?[2] There are plenty more examples[3] (does anyone know how to actually limit the date to prior to 2021? `pushed:<2021` doesn't work, nor does using the `created` keyword. Date searching doesn't seem to work well).

    [0] https://github.com/openai/code-align-evals-data/blob/97446d9...

    [1] https://github.com/bertomartin/stat4701/blob/ec2b64f629cbbf6...

    [2] https://github.com/danielwatson6/hate-speech-project/blob/64...

    [3] https://github.com/search?q=abs%28x+-+mean%29+for+language%3...
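The docstring quoted in the comment above describes a small computation, and the search query in [3] looks for the `abs(x - mean) for` idiom. A minimal sketch of what such a solution might look like follows; the function name and the empty-input check are assumptions made for illustration, not taken from the HumanEval dataset or from any of the linked repositories.

```python
from statistics import fmean
from typing import List


def mean_absolute_deviation(numbers: List[float]) -> float:
    """Return the average absolute distance of each element from the mean."""
    if not numbers:
        # Assumption: reject empty input instead of dividing by zero.
        raise ValueError("numbers must be non-empty")
    mean = fmean(numbers)
    # The same idiom the GitHub search in [3] targets: abs(x - mean) for x in ...
    return sum(abs(x - mean) for x in numbers) / len(numbers)


if __name__ == "__main__":
    # Mean is 2.5; the absolute deviations are 1.5, 0.5, 0.5, 1.5.
    print(mean_absolute_deviation([1.0, 2.0, 3.0, 4.0]))  # 1.0
```

A solution this short is easy to find verbatim on GitHub, which is the commenter's point about the benchmark being contaminated.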

  • ReplitLM

    Inference code and configs for the ReplitLM model family

  • Some links:

    - Repo: https://github.com/replit/ReplitLM/tree/main/replit-code-v1-...

    - HuggingFace: https://huggingface.co/replit/replit-code-v1-3b

    - Demo: https://huggingface.co/spaces/replit/replit-code-v1-3b-demo

    - Early benchmark results: https://twitter.com/amasad/status/1651019556423598081

    A lot about this was surprising. We knew it was going to be good, but didn't expect it to be this good. Especially surprising were the finetuned performance boost and the fact that the model is decent (in some cases much better than much larger language models) at language tasks and reasoning.

    It feels like there is a lot more to do with this model, and I have a suspicion you can even make a half-decent chat bot (at least one focused on code) by finetuning.

    Will follow up with the UL2R version (fill-in-the-middle support).

  • text-generation-webui

    A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.

  • I don't think it's possible to point Copilot to other models. I don't think Microsoft would benefit much from that feature. You could use existing tools [0] to host your own model which in theory could be used by an extension your IDE uses. But I'm not sure if an extension like that exists.

    [0] https://github.com/oobabooga/text-generation-webui

  • mation-spec

  • Have you thought of finding or creating something like this [0]?

    I created this as the basis for my origami folding descriptive language. I tried to find something similar, requirements being both well structured and English-like but couldn't find any, so I created it.

    The origami folding app will hopefully be out in 2 weeks, so you can see how it's used.

    [0] https://github.com/fuzzthink/mation-spec

  • stat4701

    Final Project

  • hate-speech-project

  • trax

    Trax — Deep Learning with Clear Code and Speed

  • and the implementation https://github.com/google/trax/blob/master/trax/models/resea... if you are interested.

    Hope you get to look into this!

  • IF

  • >for any purpose, even commercially.

    Compare this to the latest release from StabilityAI lab DeepFloyd, "IF", which in addition to various restrictive clauses strictly prohibits commercial use: https://github.com/deep-floyd/IF/blob/develop/LICENSE-MODEL

    Repl.it's release is as open as it gets these days, in my book.

    Replit doesn't have special mod powers but a HN moderator downweighted this subthread, the same way we do any generic/indignant subthread. In this case we did so less than we normally would (because we moderate HN less when the topic is a YC co - see https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu... for lots of past explanation).

