Show HN: Codemodder – A new codemod library for Java and Python

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • asmoses

    MOSES Machine Learning: Meta-Optimizing Semantic Evolutionary Search for the AtomSpace (https://github.com/opencog/atomspace)

  • Thanks for your reply!

    I think they called it an FST, "Full Syntax Tree", which is probably very similar to a CST, "Concrete Syntax Tree". At the time MOSES was written, Python's built-in AST support did not offer enough machinery for mutating code to suit MOSES's design.

    MOSES: Meta-Optimizing Semantic Evolutionary Search :

    https://wiki.opencog.org/w/Meta-Optimizing_Semantic_Evolutio... :

    > All program evolution algorithms tend to produce bloated, convoluted, redundant programs ("spaghetti code"). To avoid this, MOSES performs reduction at each stage, to bring the program into normal form. The specific normalization used is based on Holman's "elegant normal form", which mixes alternate layers of linear and non-linear operators. The resulting form is far more compact than, say, for example, boolean disjunctive normal form. Normalization eliminates redundant terms, and tends to make the resulting code both more human-readable, and faster to execute.

    > The above two techniques, optimization and normalization, allow MOSES to outperform standard genetic programming systems.

    https://github.com/opencog/asmoses

    MOSES outputs Combo (a LISP), Python as an output transform IIUC, and now Atomese with asmoses, which links to a demo notebook: https://robert-haas.github.io/mevis-docs/code/examples/moses...

    Evolutionary algorithm > Convergence: https://en.wikipedia.org/wiki/Evolutionary_algorithm#Converg...

    /? mujoco learning to walk [with evolutionary selection / RL Reinforcement Learning]

  • LibCST

    A concrete syntax tree parser and serializer library for Python that preserves many aspects of Python's abstract syntax tree

  • RedBaron

    Bottom-up approach to refactoring in python

  • https://github.com/PyCQA/redbaron

    E.g. PyCQA/bandit does static analysis for security issues in Python code:

  • bandit

    Bandit is a tool designed to find common security issues in Python code.

  • codemodder-python

    Python implementation of the Codemodder framework

  • Hi! Great questions. I'm the lead maintainer of the Python version of the Codemodder framework so I'll do my best to answer.

    > How does libCST compare to e.g. pyCQA/redbaron?

    LibCST is similar to redbaron in the sense that it does preserve comments and whitespace. The "CST" in LibCST refers to "concrete syntax tree", which preserves comments and whitespace, as opposed to an "abstract syntax tree" or "AST", which does not. Our goal is to make the absolute minimal changes required to harden and improve code, and messing with whitespace would be counter to that goal. It's worth noting that redbaron no longer appears to be maintained and the most recent version of Python that it supported was 3.7 which is now itself EOL.

    > What about for EA Evolutionary Algorithms

    Can you elaborate? I am familiar with the concept of evolutionary algorithms but I'm not sure I understand what you mean in this context.

    > does it preserve comments, or update docstrings and type annotations in mutating the code under test?

    Codemodder does preserve comments. Currently none of our codemods update docstrings; I'm not sure we currently have any cases where that would make sense. We do make an effort to update type annotations where appropriate.

    > Is it necessary to run `black` (and `pre-commit run --all-files`) to format the code after mutating it?

    Yes, it is currently necessary to run `black` and `pre-commit` yourself if you're using them on your project. While `black` is incredibly popular, we can't assume that it's being used on any given project. Running `black` on a project that doesn't already use it would cause each updated file to be completely reformatted, which would lead to very noisy and difficult-to-review changes. I would like to explore better solutions to this issue going forward.

    I am familiar with `bandit`. It's a fairly simple security linter and is useful for finding some common issues. It's also pretty prone to false positives and noisy findings. Not every problem identified by `bandit` is something that can be automatically fixed; for example I can't replace a hard-coded password without making a lot of (breaking) assumptions about the structure of your application and the manner in which it is deployed.

    I'd love to get your feedback on Python Codemods! Give us a star on GitHub and feel free to open an issue or PR: https://github.com/pixee/codemodder-python
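The CST versus AST distinction drawn in the reply above can be seen with a short round trip. This is a minimal sketch, not Codemodder's own code: the sample source string and the helper names `ast_round_trip` and `cst_round_trip` are made up for illustration. The AST half uses only the standard library (`ast.unparse` requires Python 3.9+); the CST half assumes LibCST is installed and is skipped otherwise.

```python
import ast

# Illustrative input: a comment on each line and deliberately odd spacing.
SOURCE = (
    "import os   # keep this comment\n"
    "\n"
    "value = os.environ.get( 'HOME' )  # odd spacing on purpose\n"
)


def ast_round_trip(source: str) -> str:
    """Parse to an abstract syntax tree and regenerate source from it."""
    return ast.unparse(ast.parse(source))


def cst_round_trip(source: str):
    """Parse to a concrete syntax tree with LibCST and regenerate source.

    Returns None when LibCST is not installed (it is a third-party package).
    """
    try:
        import libcst
    except ImportError:
        return None
    return libcst.parse_module(source).code


ast_output = ast_round_trip(SOURCE)
cst_output = cst_round_trip(SOURCE)

# The AST round trip drops the comments and normalizes the spacing.
print(ast_output)

# The CST round trip, when available, reproduces the input exactly.
if cst_output is not None:
    print(cst_output == SOURCE)
```

An unmodified CST round trip leaves the file byte-for-byte identical, so a codemod built on it produces a diff that contains only the intended change.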

  • nagini

    Nagini is a static verifier for Python 3, based on the Viper verification infrastructure.

  • https://en.wikipedia.org/wiki/Semgrep links to OWASP Source Code Analysis Tools: https://owasp.org/www-community/Source_Code_Analysis_Tools

    But what is static or dynamic source code analysis without formal verification?

    "Nagini: A Static Verifier for Python": https://pm.inf.ethz.ch/publications/EilersMueller18.pdf https://github.com/marcoeilers/nagini :

    > However, there is currently virtually no tool support for reasoning about Python programs beyond type safety.

    > We present Nagini, a sound verifier for statically-typed, concurrent Python
