Show HN: Phind Model beats GPT-4 at coding, with GPT-3.5 speed and 16k context

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • koboldcpp

    A simple one-file way to run various GGML and GGUF models with KoboldAI's UI

  • > We can achieve up to 100 tokens per second single-stream while GPT-4 runs around 20 tokens per second at best.

    Is that with batching? If so, that's quite impressive.

    > certain challenging questions where it is capable of getting the right answer, the Phind Model might take more generations to get to the right answer than GPT-4.

    Some of this is sampler tuning. Y'all should look at grammar based sampling if you aren't using it already, as well as some of the "dynamic" sampling like mirostat and dynatemp: https://github.com/LostRuins/koboldcpp/pull/464

    I'd guess you want a low temperature for coding, but it's still a tricky balance.
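The low-temperature intuition above can be sketched in a few lines. This is a toy illustration of temperature sampling, not koboldcpp's actual sampler; the logits are made up for the example.

```python
import math
import random

def sample_with_temperature(logits, temperature=0.2, rng=None):
    """Sample a token index from raw logits.

    Dividing logits by a temperature < 1 sharpens the distribution
    toward the argmax, which suits code generation; values near 1.0
    preserve the model's original diversity.
    """
    rng = rng or random.Random(0)
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    total = sum(weights)
    r = rng.random()
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w / total
        if r <= acc:
            return i
    return len(weights) - 1

# At temperature 0.2 the highest-logit token wins almost every draw.
logits = [1.0, 3.0, 0.5]
picks = [sample_with_temperature(logits, 0.2, random.Random(s)) for s in range(100)]
```

Grammar-based sampling goes further by masking out tokens that would violate a formal grammar before sampling at all, which is why it pairs well with code.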

  • hkhomekit

  • It's definitely not impossible at least.

    Someone is doing it in python here:

    https://pyatv.dev/

    GPT-4 actually sent me here:

    "Here is an example of a C# library that implements the HAP: CSharp.HomeKit (https://github.com/brutella/hkhomekit). You can use this library as a reference or directly use it in your project."

    Which, to no surprise given my experiences with LLMs for programming, does not exist and doesn't seem to have ever existed.

    I get that they aren't magic, but I guess I'm just bad at using LLMs to help with my programming. Apparently everything I do is obscure, or I'm just not good enough at prompting. But that's also a reflection of a weakness of LLMs: they need such perfect, specific prompting to give good answers.

  • pyatv

    A client library for Apple TV and AirPlay devices

  • vim-chatgpt

    Vim Plugin For ChatGPT

  • exllamav2

    A fast inference library for running LLMs locally on modern consumer-class GPUs

  • Without batching, I was actually thinking that's kind of modest.

    ExLlamaV2 will get 48 tokens/s on a 4090, a GPU that is much slower and cheaper than an H100:

    https://github.com/turboderp/exllamav2#performance

    I didn't test codellama, but the 3090 TI figures are in the ballpark of my generation speed on a 3090.
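    The comparison in this comment can be made concrete as a rough throughput-per-dollar estimate. The tokens/s figures come from the thread; the hardware prices below are illustrative assumptions, not quoted anywhere in it.

```python
# tokens/s figures from the thread; prices are rough illustrative assumptions
gpus = {
    "RTX 4090 (ExLlamaV2)": {"tokens_per_s": 48, "price_usd": 1600},
    "H100 (Phind, claimed)": {"tokens_per_s": 100, "price_usd": 30000},
}

for name, g in gpus.items():
    print(f"{name}: {g['tokens_per_s']} tok/s, "
          f"{g['tokens_per_s'] / g['price_usd']:.4f} tok/s per dollar")
```

    Even granting the claimed 100 tok/s, the consumer card comes out far ahead per dollar of hardware, which is the commenter's point.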

  • transformers

    πŸ€— Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

  • Too much money being thrown around on BS in the LLM space, hardly any of it is going to places where it matters.

    For example, the researchers working hard on better text sampling techniques, on better constraint techniques (e.g. https://arxiv.org/abs/2306.03081), or on actual negative prompting/CFG in LLMs (e.g. https://github.com/huggingface/transformers/issues/24536) are doing far, far more to advance the state of AI than the dozens of VC-backed LLM "prompt engineering" companies operating today.

    HN and the NLP community have some serious blind spots when it comes to exploiting their own technology. At least someone at Andreessen Horowitz got a clue and gave some funding to Oobabooga - still waiting for Automatic1111 to get any funding.
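The negative-prompting/CFG idea referenced above boils down to simple logit arithmetic. This is a toy sketch of that arithmetic, not the transformers implementation from the linked issue; assume `cond_logits` come from a forward pass on the full prompt and `uncond_logits` from a pass on the negative (or empty) prompt.

```python
def cfg_logits(cond_logits, uncond_logits, guidance_scale=1.5):
    """Classifier-free guidance over next-token logits.

    Extrapolates away from the unconditional (or negative-prompt)
    distribution: a scale of 1.0 recovers plain conditional sampling,
    while larger scales push harder against the negative prompt.
    """
    return [u + guidance_scale * (c - u)
            for c, u in zip(cond_logits, uncond_logits)]
```

The guided logits are then sampled from as usual; everything interesting happens in that one interpolation step.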

  • HapSharp

    HomeKit Accessory Server .Net bridge!

  • HomeKit

    Native C# Library for Apple's HomeKit Accessory Protocol (by ppumkin)

  • ChatGPT-AutoExpert

    πŸš€πŸ§ πŸ’¬ Supercharged Custom Instructions for ChatGPT (non-coding) and ChatGPT Advanced Data Analysis (coding).

  • Take a look at the AutoExpert custom instructions: https://github.com/spdustin/ChatGPT-AutoExpert

    It lets you specify verbosity from 1 to 5 (e.g. "V=1" in the prompt). Sometimes it will just ignore that, but it actually does work most of the time. I use a verbosity of 1 or 2 when I just want a quick answer.

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts