Semgrep: Like Grep but for Code

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • semgrep

    Lightweight static analysis for many languages. Find bug variants with patterns that look like source code.

  • pre-commit

    A framework for managing and maintaining multi-language pre-commit hooks.

  • https://pre-commit.com/#pre-commit-autoupdate

    A person could easily `ln -s repo/.hooks/hook*.sh repo/.git/hooks/` after every git clone.

  • terraform-provider-aws

    The AWS Provider enables Terraform to manage AWS resources.

  • Hey, I work on Semgrep. As a real world example, I just noticed that Hashicorp uses a whole bunch of Semgrep rules on terraform-provider-aws[0].

    [0]: https://github.com/hashicorp/terraform-provider-aws/blob/mai...

  • refex

    A syntactically aware search-and-replace tool for Python.

  • There's a lot of confusion here about what semgrep does, which is kind of unfortunate. I haven't touched it much, but I have built a very similar tool (I'm one of the contributors to refex[1]).

    The starting point of semantic grep is very useful. When you have a big codebase, you often want to detect antipatterns, or not even antipatterns, but just uses of a thing, say you're renaming a method and want to track down the callers.

    Being able to act on the AST, instead of hoping you searched up all of the variants of whitespace and line breaks and, depending on the specific example, different uses of argument passing, is really useful.

    But often when you're semantically grepping, your goal is to replace something with something else (this is what refex was initially built for: to aid in large-scale changes in Python, as a sort of equivalent to the C++ tools that Google uses).

    But then you want to shift left even further: once you have a pattern that you want to replace once, you can just enforce that a linter yell at you when anyone does it again. So it's very natural to develop a linter-style thing on top of one of these[2].

    This is, as I understand it, sort of the same thing that happens in C++: clang-tidy and clang-format are written on top of AST libraries that can be used for ad-hoc analysis and transformations, but you can also just plug them into a linter.

    The thing is, for most organizations, enforcing code style and best practices is more valuable than applying a refactoring to 10M lines of code, because most organizations don't have 10M lines of code to refactor. That doesn't mean that these tools aren't also useful for ad-hoc transforms and exploratory analysis. They absolutely are!

    [1]: https://github.com/ssbr/refex

    [2]: https://github.com/ssbr/refex/tree/main/refex/fix
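To make the "act on the AST" point concrete, here is a minimal sketch using only Python's standard `ast` module. It is not how semgrep or refex are implemented; the function name `find_method_calls` and the sample snippet are made up for illustration. It reports every call to a given method name regardless of whitespace or line breaks, and ignores text that merely looks like a call inside a string.

```python
from __future__ import annotations

import ast


def find_method_calls(source: str, method_name: str) -> list[int]:
    """Return the line numbers of every `<expr>.method_name(...)` call in source."""
    tree = ast.parse(source)
    lines = []
    for node in ast.walk(tree):
        # A method call is a Call node whose callee is an attribute access.
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == method_name
        ):
            lines.append(node.lineno)
    return sorted(lines)


# Hypothetical sample: two real calls with different formatting, one string decoy.
SAMPLE = '''\
client.fetch(url)
client . fetch(
    url,
    timeout=5,
)
text = "client.fetch(url)"  # a string, not a call
'''

print(find_method_calls(SAMPLE, "fetch"))  # prints [1, 2]
```

A plain-text search for `client.fetch(` would miss the second call and wrongly match the string on the last line. Running the same function in a pre-commit hook or CI job and failing on a non-empty result is one way to get the linter-style enforcement described above.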

  • ocaml-tree-sitter-semgrep

    Generate parsers from tree-sitter grammars extended to support Semgrep patterns

  • https://github.com/returntocorp/ocaml-tree-sitter/blob/maste... appears to be the general answer to your question, but navigating to the tree-sitter docs shows that tree-sitter has one in progress: https://github.com/tree-sitter/tree-sitter-swift so hopefully the machinery to incorporate it into semgrep will not be horrific

  • tree-sitter-swift

    Discontinued Swift grammar for tree-sitter (by tree-sitter)

  • Bear

    Bear is a tool that generates a compilation database for clang tooling.

  • For C/C++ code, you can already do refactoring using clang-tidy scripts [0], or even write custom linters using libtooling [1] and leverage the AST Matchers [2], which work at the AST level.

    All that's needed is a compile_commands.json file, which can be easily generated via most build systems, or you can use Bear [3] or some other tool (or write a script that logs all syscalls and generates it yourself).

    [0] https://releases.llvm.org/12.0.0/tools/clang/tools/extra/doc...

    [1] https://releases.llvm.org/12.0.0/tools/clang/docs/LibTooling...

    [2] https://releases.llvm.org/12.0.0/tools/clang/docs/LibASTMatc...

    [3] https://github.com/rizsotto/Bear
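For readers who have not seen one, a compile_commands.json file is a JSON array with one entry per translation unit, each giving a working directory, a source file, and the compiler invocation. The sketch below builds such entries by hand; the helper name `make_compilation_database`, the build directory, the source file names, and the flags are all hypothetical, and in practice a build system or Bear would produce the real file.

```python
from __future__ import annotations

import json


def make_compilation_database(
    build_dir: str, sources: list[str], flags: list[str]
) -> list[dict]:
    """Build compilation-database entries, one per source file."""
    entries = []
    for src in sources:
        entries.append(
            {
                "directory": build_dir,  # directory the command runs in
                "file": src,             # the translation unit
                # Compiler invocation as an argument list (assumed: clang++).
                "arguments": ["clang++", *flags, "-c", src],
            }
        )
    return entries


# Hypothetical project layout, purely for illustration.
database = make_compilation_database(
    "/home/user/project/build",
    ["../src/main.cpp", "../src/util.cpp"],
    ["-std=c++17", "-I../include"],
)

# Writing this text to build/compile_commands.json is what clang tooling expects.
print(json.dumps(database, indent=2))
```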

  • checkr

    Custom static analysis rules for the lazy. Write project specific static analysis checks in a few lines of code.

  • I wrote a small VS code extension and pre-commit hook that might meet 80% of your needs:

    https://github.com/elanning/checkr

    It is just simple regex at this time, but hopefully I can add something like CCGrep syntax in the future:

    https://github.com/yuy-m/CCGrep

  • CCGrep

    Code Clone Detector like grep


NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Related posts

  • Terraform Tools and Testing

    5 projects | /r/Terraform | 30 Jun 2022
  • Using pre-commits hooks to improve terraform IaC code quality

    2 projects | /r/Terraform | 26 Oct 2021
  • Best approach to manage S3

    2 projects | /r/Terraform | 7 Oct 2021
  • Terraform v15.0 with AWS (EKS deployment)

    7 projects | dev.to | 17 Apr 2021
  • Pylyzer – A fast static code analyzer and language server for Python

    6 projects | news.ycombinator.com | 11 Apr 2024