-
LibCST
A concrete syntax tree parser and serializer library for Python that preserves many aspects of Python's abstract syntax tree
> One of the problems of patching code using ast package of Python is that it loses all the formatting and comments of the original source code.
What you actually want here is a concrete syntax tree, which preserves all the comments and whitespace. There's a sort of built-in one in lib2to3, which is used by the likes of black and yapf. There's also a good third-party option now in the form of https://github.com/Instagram/LibCST.
There has been talk of adding a CST module to the Python standard library, but I don't think the proposal went anywhere.
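To make the loss concrete, here is a minimal sketch that uses only the standard library `ast` module (`ast.unparse` needs Python 3.9 or newer). The sample source string is made up for illustration.

```python
import ast

# A tiny module with a comment and some deliberately odd spacing.
source = (
    "# configuration\n"
    "x = 1  # the answer\n"
    "\n"
    "y   =   2\n"
)

# Parse to an abstract syntax tree and turn it back into source text.
tree = ast.parse(source)
regenerated = ast.unparse(tree)

print(regenerated)
```

The regenerated text keeps the two assignments but drops both comments, the blank line, and the original spacing, because none of that is stored in the AST. A concrete syntax tree keeps those details; with LibCST, for instance, `libcst.parse_module(source).code` should give back the original text unchanged.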
https://bugs.python.org/issue33337
I wrote a criminally underused library that does much the same kind of transformation: https://github.com/mbarkhau/lib3to6/tree/master/src/lib3to6
I have done something similar using https://pybowler.io/. This library wraps lib2to3 (through its fissix fork) and abstracts away some of the refactoring operations into nice wrapper functions.
I had a problem applying patches to files that had been re-indented in the meantime.
So I wrote a tool that takes a context diff and rebuilds the before and after text, then parses both into vectors of (leading whitespace, token) pairs. It diffs those two vectors while ignoring the whitespace, parses the target file into the same kind of vector, and applies the diff to it (again ignoring or copying whitespace as needed). Finally, it reconstructs the text from the patched vector.
The results are surprisingly robust, despite the relatively naive tokenization heuristic.
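The idea can be sketched in a few lines of Python. This is not the code from tbpatch, just an illustration under several assumptions: tokens are plain whitespace-separated words, the before and after texts are passed in directly instead of being rebuilt from a context diff, the before text must match the target exactly once whitespace is ignored, and the names `tokenize`, `find_hunk` and `apply_hunk` are hypothetical.

```python
import re
from difflib import SequenceMatcher

# Assumption: a token is any run of non-whitespace characters.
TOKEN_RE = re.compile(r"(\s*)(\S+)")


def tokenize(text):
    """Return a list of (leading_whitespace, token) pairs plus trailing whitespace."""
    pairs = [(m.group(1), m.group(2)) for m in TOKEN_RE.finditer(text)]
    tail = text[len(text.rstrip()):]
    return pairs, tail


def words(pairs):
    """Drop the whitespace and keep only the tokens."""
    return [tok for _, tok in pairs]


def find_hunk(target_words, before_words):
    """Find where the 'before' tokens occur in the target, ignoring whitespace."""
    n = len(before_words)
    for i in range(len(target_words) - n + 1):
        if target_words[i:i + n] == before_words:
            return i
    raise ValueError("hunk context not found in target")


def apply_hunk(target, before, after):
    """Apply the before -> after change to target, keeping target's whitespace."""
    t_pairs, t_tail = tokenize(target)
    b_pairs, _ = tokenize(before)
    a_pairs, _ = tokenize(after)
    start = find_hunk(words(t_pairs), words(b_pairs))

    out = t_pairs[:start]
    matcher = SequenceMatcher(a=words(b_pairs), b=words(a_pairs), autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged tokens: copy them from the target, whitespace included.
            out.extend(t_pairs[start + i1:start + i2])
            continue
        # Replaced or inserted tokens: reuse the target's whitespace where a
        # replaced token exists, otherwise fall back to the patch's whitespace.
        # Deleted tokens contribute nothing because a_pairs[j1:j2] is empty.
        old = t_pairs[start + i1:start + i2]
        for k, (ws, tok) in enumerate(a_pairs[j1:j2]):
            out.append((old[k][0] if k < len(old) else ws, tok))
    out.extend(t_pairs[start + len(b_pairs):])
    return "".join(ws + tok for ws, tok in out) + t_tail


# The patch was written against space-indented code; the target uses tabs.
before = "if x:\n    foo(1)\n"
after = "if x:\n    foo(2)\n"
target = "def f():\n\tif x:\n\t\tfoo(1)\n\treturn 0\n"
result = apply_hunk(target, before, after)
```

In this example `result` is the tab-indented target with `foo(1)` changed to `foo(2)`, even though the patch text was indented with spaces at a different depth.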
https://github.com/ayourtch/tbpatch