cria
-
Show HN: Speeding up LLM inference 2x times (possibly)
It originally started as a fork of Recmo's cria, a pure-numpy llama impl :)
https://github.com/recmo/cria
Took a whole night to compute just a few tokens.
-
Jsonformer: A bulletproof way to generate structured output from LLMs
Not op, but I can share my approach: I went line by line through Recmo's Cria: https://github.com/recmo/cria - an implementation of Llama in NumPy, so very low level. It took me, I think, 3-4 days of 10 hours each, plus 1-2 days of reading about Transformers, to understand what's going on. But from that you can see how models generate text, and you come away with a deep understanding of what's happening.
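The text-generation loop that such a line-by-line read reveals can be sketched in plain Python. A toy scoring function stands in for the real forward pass; the function names are illustrative, not taken from Cria:

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability, then normalize to probabilities.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def toy_logits(tokens, vocab_size=5):
    # Stand-in for a real model forward pass: strongly favors
    # (last_token + 1) mod vocab_size.
    scores = [0.0] * vocab_size
    scores[(tokens[-1] + 1) % vocab_size] = 5.0
    return scores

def generate(prompt_tokens, steps=4):
    tokens = list(prompt_tokens)
    for _ in range(steps):
        probs = softmax(toy_logits(tokens))
        # Greedy decoding: append the most probable next token.
        tokens.append(max(range(len(probs)), key=probs.__getitem__))
    return tokens

print(generate([0]))  # → [0, 1, 2, 3, 4]
```

A real implementation replaces `toy_logits` with the full Transformer stack, but the outer loop - score, softmax, pick, append - is the same.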
- LLaMA for poor
effort
-
Some scientists can't stop using AI to write research papers
My experience is exactly the opposite. AI has way more domain-specific knowledge than any translator I could imagine.
I recently published effort ( http://kolinko.github.io/effort/ - it got to the front page of HN two weeks ago ), and literally everything on that page was rewritten by ChatGPT.
The flow is that I write it however I can, sometimes so badly that even a human professional would have difficulty understanding it, then I ask ChatGPT to go paragraph by paragraph and rewrite and smooth out the text.
You can see how the website was reworked here:
https://chat.openai.com/share/e/10d7ba3f-f7eb-48cd-9250-d864...
GPT has way more domain knowledge in my area than most engineer friends I know, not to mention translators, and it managed to help not just with grammar but with the overall explanations I wrote.
Of course it fails at higher-level concepts, and at plenty of other things, but still - I can get more quality output from it in an hour than by working for a week with a dedicated translator/editor.
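The paragraph-by-paragraph flow described above can be sketched as follows. The prompt wording and the commented-out `chat.completions.create` call are assumptions based on the standard OpenAI Python client, not the author's actual workflow:

```python
def build_rewrite_prompts(text):
    # One prompt per paragraph; blank lines separate paragraphs.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    return [
        "Rewrite this paragraph in clear, natural English, "
        "keeping the technical meaning intact:\n\n" + p
        for p in paragraphs
    ]

# Hypothetical usage with the OpenAI client (requires an API key):
# from openai import OpenAI
# client = OpenAI()
# for prompt in build_rewrite_prompts(draft):
#     resp = client.chat.completions.create(
#         model="gpt-4",
#         messages=[{"role": "user", "content": prompt}],
#     )
#     print(resp.choices[0].message.content)

prompts = build_rewrite_prompts("First rough paragraph.\n\nSecond, rougher one.")
print(len(prompts))  # → 2
```

Working one paragraph at a time keeps each request small and makes it easy to review and accept or reject each rewrite individually.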
-
Show HN: Speeding up LLM inference 2x times (possibly)
I think it was somewhere around that tag:
https://github.com/kolinko/effort/releases/tag/5.0-last-mixt...
I can't easily rerun it any more, because the underlying model/weight names changed in the meantime. It doesn't help that Mixtral's published .safetensors files seem messed up, so I needed to hack together a conversion from the PyTorch weights, which added an extra layer of confusion to the project.
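The kind of weight-name drift mentioned above can be handled with a plain dict remap over the loaded state dict. The key names below are hypothetical, chosen only to illustrate the shape of the problem, not the actual Mixtral tensor names:

```python
def remap_weights(state_dict, key_map):
    # Rename tensors whose keys changed between model releases;
    # keys without a mapping are kept as-is.
    return {key_map.get(k, k): v for k, v in state_dict.items()}

# Hypothetical old -> new tensor names, for illustration only.
KEY_MAP = {
    "layers.0.attention.wq.weight": "model.layers.0.self_attn.q_proj.weight",
}

old = {"layers.0.attention.wq.weight": [0.1, 0.2], "norm.weight": [1.0]}
new = remap_weights(old, KEY_MAP)
print(sorted(new))  # → ['model.layers.0.self_attn.q_proj.weight', 'norm.weight']
```

When the published names change, only `KEY_MAP` needs updating; the tensors themselves pass through untouched.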
What are some alternatives?
transmogrifier - Unstructured data goes in, structured data comes out. Sometimes comedically.
clownfish - Constrained Decoding for LLMs against JSON Schema
magic - AI functions for TypeScript