refactor-benchmark vs codespin

| | refactor-benchmark | codespin |
|---|---|---|
| Mentions | 2 | 5 |
| Stars | 21 | 59 |
| Growth | - | - |
| Activity | 5.9 | 9.5 |
| Latest Commit | 3 months ago | 4 days ago |
| Language | Python | TypeScript |
| License | Apache License 2.0 | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
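For a concrete sense of what weighting recent commits more heavily could look like, here is a minimal sketch using exponential decay by commit age; the site does not publish its actual formula, so the half-life and scaling here are pure assumptions:

```python
import math
from datetime import datetime, timezone

def activity_score(commit_dates, half_life_days=30.0):
    """Hypothetical activity metric: each commit contributes a weight
    that halves every `half_life_days`, so recent commits count more.
    Illustrative sketch only, not the formula behind the table above.
    """
    now = datetime.now(timezone.utc)
    score = 0.0
    for d in commit_dates:  # timezone-aware datetimes
        age_days = (now - d).total_seconds() / 86400
        score += math.exp(-math.log(2) * age_days / half_life_days)
    return score
```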
refactor-benchmark
- GPT-4 Turbo with Vision is a step backwards for coding
FWIW, I agree with you that each model has its own personality and that models may do better or worse on different kinds of coding tasks. Aider leans into both of these concepts.
The GPT-4 Turbo models have a lazy coding personality, and I spent months of effort figuring out how to both measure and reduce that laziness. This resulted in aider supporting "unified diffs" as a code editing format, which cut such laziness by 3X [0], and in the aider refactoring benchmark as a way to quantify these benefits [1].
The benchmark results I just shared about GPT-4 Turbo with Vision cover both smaller, toy coding problems [2] and larger edits to larger source files [3]. The new model slightly underperforms on the smaller coding tasks, and significantly underperforms on the larger edits, where laziness is often the culprit.
[0] https://aider.chat/2023/12/21/unified-diffs.html
[1] https://github.com/paul-gauthier/refactor-benchmark
[2] https://aider.chat/2024/04/09/gpt-4-turbo.html#code-editing-...
[3] https://aider.chat/2024/04/09/gpt-4-turbo.html#lazy-coding
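For context on the editing format mentioned above: a unified diff expresses an edit as hunks of context, `-`, and `+` lines, which pushes the model to emit the actual replacement code rather than a placeholder comment. A small illustrative example (the file and code are hypothetical):

```diff
--- a/calculator.py
+++ b/calculator.py
@@ -1,3 +1,3 @@
 def total(items):
-    # TODO: implement tax handling
-    return sum(items)
+    subtotal = sum(items)
+    return round(subtotal * 1.08, 2)  # apply 8% sales tax
```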
- OpenAI: Memory and New Controls for ChatGPT
1-2 sentences: Rather than writing code, GPT-4 Turbo often inserts comments like "... finish implementing function here ...". I made a benchmark that provokes and quantifies that behavior.
1-2 paragraphs:
I found that I could provoke lazy coding by giving GPT-4 Turbo refactoring tasks, where I ask it to refactor a large method out of a large class. I analyzed 9 popular open-source Python repos, found 89 such methods that were conceptually easy to refactor, and built them into a benchmark [0].
GPT succeeds on a task if it can remove the method from its original class and add it to the top level of the file without an inappropriate change to the size of the abstract syntax tree (AST). By measuring the size of the AST, we infer that GPT didn't replace a bunch of code with a comment like "... insert original method here ...". I also gathered other laziness metrics, like counting the number of new comments that contained "...", which correlated well with the AST-size test.
[0] https://github.com/paul-gauthier/refactor-benchmark
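To make the AST-size check concrete, here is a minimal sketch of that style of test using Python's `ast` module; the function names and the 10% tolerance are illustrative assumptions, not the benchmark's actual code:

```python
import ast

def ast_size(source: str) -> int:
    """Count the nodes in a Python module's abstract syntax tree."""
    return sum(1 for _ in ast.walk(ast.parse(source)))

def looks_lazy(original: str, refactored: str, tolerance: float = 0.10) -> bool:
    """Flag a refactor whose AST shrank too much -- a sign that real
    code was replaced by a placeholder comment instead of being moved.
    The 10% tolerance is a made-up number for illustration.
    """
    return ast_size(refactored) < (1 - tolerance) * ast_size(original)

def ellipsis_comments(source: str) -> int:
    """Count comment lines containing "...", the other laziness signal
    mentioned above (the real benchmark counts only *new* comments)."""
    return sum(
        1
        for line in source.splitlines()
        if "#" in line and "..." in line.split("#", 1)[1]
    )
```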
codespin
- GPT-4 Turbo with Vision is a step backwards for coding
Shameless plug. I have a VS Code extension that's very nearly ready.
Codespin CLI tools (ready to use): https://github.com/codespin-ai/codespin
VS Code extension for the CLI tool (soon): https://www.youtube.com/watch?v=2TJqosFmkao
I'll do a Show HN in a week or two.
- LLMs and Programming in the first days of 2024
Shameless plug: https://github.com/codespin-ai/codespin-cli
It's similar to aider (which is a great tool btw) in goals, but with a different recipe.
- Copying Angry Birds with nothing but AI
That AI is transformative for development is no longer in doubt. Just this past week, I've been able to build two medium-sized services (a couple of thousand lines of code in Python, a language I hadn't used for more than a decade!). What's truly impressive is that, for the most part, it's better than the code I'd have written anyway. Want a nice README.md? Just provide the source code that contains routes/CLI args/whatever, and it'll generate it for you. Want tests? Sure. Developers have never had it so easy.
Another thing to note is that for code generation, GPT-4 runs circles around GPT-3.5. GPT-3.5 is alright at copying if you provide very tight examples, but GPT-4 kinda "thinks".
Shameless plug: I have this open source app which automates a lot of grunt work in prompt generation - https://github.com/codespin-ai/codespin-cli
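The README workflow described above amounts to pasting the relevant source into a single prompt. A minimal sketch with the OpenAI Python client (the file names and prompt wording are illustrative; tools like codespin automate assembling prompts like this):

```python
from pathlib import Path

from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical layout: point this at whatever file defines your
# routes / CLI args.
source = Path("app.py").read_text()

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system",
         "content": "You write concise, accurate README.md files."},
        {"role": "user",
         "content": "Write a README.md for this service, documenting "
                    "its routes and CLI arguments:\n\n" + source},
    ],
)

Path("README.md").write_text(response.choices[0].message.content)
```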
- An Open Source Node.js-based CLI tool for Generating Code using GPT
- CodeSpin: Code generation framework and tools using OpenAI APIs
What are some alternatives?
llama-cpp-python - Python bindings for llama.cpp
matter-js - a 2D rigid body physics engine for the web
nitter - Alternative Twitter front-end