swe-bump-bench

Benchmark test cases for version bumps (by xeol-io)

Swe-bump-bench Alternatives

Similar projects and alternatives to swe-bump-bench

  • ts-morph

    TypeScript Compiler API wrapper for static analysis and programmatic code changes.

  • bumpgen

    bumpgen is an AI agent that upgrades npm packages

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number indicates a better or more similar swe-bump-bench alternative.

swe-bump-bench reviews and mentions

Posts with mentions or reviews of swe-bump-bench. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-05-10.
  • Show HN: I built an AI agent that upgrades NPM packages
    3 projects | news.ycombinator.com | 10 May 2024
    We built our own benchmarks [1] to test against. It's a set of repos, each with a commit that introduces breaking changes; we run bumpgen on the previous commit and then compare the results.

    We are clocking in at around a 50% success rate on this benchmark.

    [1] https://github.com/xeol-io/swe-bump-bench
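
A minimal sketch of how a success rate like the one quoted above could be tallied. It assumes, following the posts on this page, that a case counts as a success only when the upgraded project builds and the automated change agrees with the human commit. The `BumpResult` type, its field names, and the sample data are hypothetical and are not taken from swe-bump-bench itself.

```typescript
// Hypothetical shape of one benchmark outcome; field names are illustrative only.
interface BumpResult {
  repo: string;             // repository under test
  buildPassed: boolean;     // did the project build after the automated upgrade?
  matchesHumanFix: boolean; // did the automated diff agree with the human commit?
}

// Assumption: a case succeeds only when both checks hold.
function isSuccess(r: BumpResult): boolean {
  return r.buildPassed && r.matchesHumanFix;
}

// Fraction of successful cases, in [0, 1]; returns 0 for an empty suite.
function successRate(results: BumpResult[]): number {
  if (results.length === 0) return 0;
  return results.filter(isSuccess).length / results.length;
}

// Made-up sample data, not real benchmark results.
const sample: BumpResult[] = [
  { repo: "example/a", buildPassed: true, matchesHumanFix: true },
  { repo: "example/b", buildPassed: true, matchesHumanFix: false },
  { repo: "example/c", buildPassed: false, matchesHumanFix: false },
  { repo: "example/d", buildPassed: true, matchesHumanFix: true },
];

console.log(successRate(sample)); // 0.5
```

How strictly "matches the human fix" is judged is left open here; the posts do not say whether the comparison is exact or approximate.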

  • Show HN: Can AI keep my code up-to-date?
    2 projects | news.ycombinator.com | 6 May 2024
    Hey HN,

    We are open sourcing bumpgen (https://github.com/xeol-io/bumpgen/), a tool that uses an LLM to upgrade your dependencies to their newer versions. We are building this for our customers (security teams) who need to upgrade their dependencies for security and compliance but lack engineering resources. Most patches tend to be painless, but major version upgrades cause breaking changes that require engineers to fix. We want bumpgen to reduce this engineering cost to security.

    It has been an interesting experience using AI to identify breaking changes, fix them, then propagate the fixes across a codebase. We thought for sure the biggest challenge would be overcoming an LLM’s coding shortcomings, but it turns out navigating the codebase correctly was the much bigger issue.

    When bumpgen fixes a breaking change, the fix itself needs to be applied to the rest of the codebase. This requires bumpgen to understand how the functions in a codebase interact with each other and how a change to one would affect the others. To address this, we drew inspiration from CodePlan (https://arxiv.org/abs/2309.12499), which describes a theoretical(?) way to apply changes consistently across an entire repository. In short, it combines a dependency graph and an oracle to understand the codebase and then verify a change.

    If I were to summarize my learnings, it would be that applying codegen to this problem is much more a “refactoring” problem than it is a “coding” problem (if that makes sense). So to answer my own question, I’d say AI CAN keep your code up-to-date but ONLY if it is guided carefully through the codebase.

    We benchmarked bumpgen against our test suite (https://github.com/xeol-io/swe-bump-bench) and it currently stands at around 50% accuracy. The bench suite is a set of repos with human commits for version bumps. We run bumpgen on the prior commit, then compare bumpgen’s code diffs to those of the human commit, along with passing builds, to determine success.

    There are a lot of improvements we want to test out in May to increase accuracy, starting with improving and tightening our core CodePlan logic. We think getting this right will unlock automating far more complex breaking changes and codebases, such as upgrading Vue 2 to Vue 3.

    We have been building bumpgen for a month and we would love to hear people’s thoughts, experiences, and suggestions!
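
The "dependency graph plus oracle" idea described in this post can be illustrated with a small worklist loop. This is only a sketch under stated assumptions, not bumpgen's or CodePlan's actual algorithm: the graph maps each code unit to the units that depend on it, the oracle is any check that reports whether a unit is still valid (for example a type check or a build), and the fixer stands in for the LLM-driven repair. All names (`DependencyGraph`, `Oracle`, `Fixer`, `propagate`) and the toy data are hypothetical.

```typescript
type NodeId = string;

// dependents.get(x) lists the code units that reference x (hypothetical structure).
type DependencyGraph = Map<NodeId, NodeId[]>;

// Oracle: reports whether a unit is currently valid (e.g. it type-checks).
type Oracle = (node: NodeId) => boolean;

// Fixer: attempts to repair a unit (stands in for an LLM-generated patch).
type Fixer = (node: NodeId) => void;

// Starting from the seed units, fix every unit the oracle rejects and
// enqueue its dependents, since a fix may change what they rely on.
function propagate(
  dependents: DependencyGraph,
  seeds: NodeId[],
  isValid: Oracle,
  fix: Fixer,
): NodeId[] {
  const queue: NodeId[] = [...seeds];
  const visited = new Set<NodeId>(seeds);
  const fixed: NodeId[] = [];

  while (queue.length > 0) {
    const node = queue.shift()!;
    if (isValid(node)) continue; // oracle is satisfied; propagation stops here

    fix(node);
    fixed.push(node);

    for (const dep of dependents.get(node) ?? []) {
      if (!visited.has(dep)) {
        visited.add(dep);
        queue.push(dep);
      }
    }
  }
  return fixed;
}

// Toy example: a set of "broken" units plays the role of build errors.
const graph: DependencyGraph = new Map([
  ["parseConfig", ["loadApp"]],
  ["loadApp", ["main"]],
  ["formatDate", []],
]);
const broken = new Set<NodeId>(["parseConfig", "loadApp"]);

const fixedUnits = propagate(
  graph,
  ["parseConfig"],
  (n) => !broken.has(n),
  (n) => { broken.delete(n); },
);

console.log(fixedUnits); // [ 'parseConfig', 'loadApp' ]
```

In this toy run the walk stops at `main` because the oracle accepts it, which is the sense in which the change is "guided" through the codebase rather than applied everywhere.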

  • Show HN: Bumpgen – upgrade NPM packages using AI
    3 projects | news.ycombinator.com | 30 Apr 2024
    - validate the fixes with a rebuild

    We also built our own benchmark suite (https://github.com/xeol-io/swe-bump-bench) to test bumpgen’s accuracy. It is a set of repos with human commits for version bumps. We run bumpgen on the prior commit, then compare bumpgen’s PR to the human commit to determine success. Our latest benchmark sits at around 50% accuracy.

    Up next, we want to better address challenge [2] by building embeddings for different dependency versions to give the LLM more context on the changes across versions. After that, we will be releasing a GitHub app with some quality-of-life features, such as configurable update cadences, before moving on to C# and Golang support.

    We covered a lot of our architecture and thought process for bumpgen, and we would love to hear your thoughts and feedback in the comments!

Stats

Basic swe-bump-bench repo stats
3
1
5.6
about 2 months ago

xeol-io/swe-bump-bench is an open source project licensed under the MIT License, which is an OSI-approved license.

The primary programming language of swe-bump-bench is TypeScript.

