Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
wimsey
-
Classic Data science pipelines built with LLMs
I'm definitely biased because my day job is writing ETL pipelines and supporting software, and my current side project is a data contracts library for helping with the above[0]. Still, I'm not sure I see this happening.
80% of the focus of an ETL pipeline is in ensuring edge cases are handled appropriately (e.g. not producing models from potentially erroneous data, dead-letter queuing unknown fields, etc.).
I think an LLM would be great for "take this json and make it a pandas dataframe", but a lot less great for "interact with this billing API to produce auditable payment tables".
For areas that are reliability focused, LLMs still need a lot more improvements to be useful.
[0] https://github.com/benrutter/wimsey
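To make the "dead-letter queuing unknown fields" edge case from the comment above concrete, here is a minimal sketch in plain Python. It does not use Wimsey or any other library's API. The schema, the field names and the `route_records` helper are all hypothetical, chosen only for illustration.

```python
from typing import Any

# Hypothetical schema for illustration only: field names and types are assumptions.
EXPECTED_FIELDS = {"invoice_id": str, "amount": float, "currency": str}


def route_records(
    records: list[dict[str, Any]],
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    """Split records into (accepted, dead_letter) instead of failing or guessing."""
    accepted: list[dict[str, Any]] = []
    dead_letter: list[dict[str, Any]] = []
    for record in records:
        # Fields the schema does not know about.
        unknown = sorted(set(record) - set(EXPECTED_FIELDS))
        # Fields the schema requires but the record lacks.
        missing = sorted(set(EXPECTED_FIELDS) - set(record))
        # Fields that are present but hold the wrong type.
        wrong_type = sorted(
            name
            for name, expected in EXPECTED_FIELDS.items()
            if name in record and not isinstance(record[name], expected)
        )
        if unknown or missing or wrong_type:
            # Keep the original record plus the reasons, so it can be audited later.
            dead_letter.append(
                {
                    "record": record,
                    "unknown": unknown,
                    "missing": missing,
                    "wrong_type": wrong_type,
                }
            )
        else:
            accepted.append(record)
    return accepted, dead_letter
```

Only records that match the schema exactly reach the accepted list. Everything else is set aside with its reasons attached rather than silently coerced, which is the kind of behaviour the comment suggests is hard to delegate to an LLM.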
-
The Data Engineering Handbook
Nice list! Although as somebody who works on open source tools for data engineering, it kills me a little to see "companies" as the list header rather than, say, "projects".
(also, shameless plug for my latest project Wimsey, which is non-company affiliated but does let you test data in a nice, lightweight way: https://github.com/benrutter/wimsey)
- Wimsey: A flexible, lightweight data contracts library
-
This Week In Python
wimsey – Easy and flexible data testing and documentation
Music
-
This Week In Python
Music – Algorithmic Music Generation with Python
- Algorithmic Music Generation with Python
What are some alternatives?
Scrapling - 🕷️ An undetectable, powerful, flexible, high-performance Python library that makes Web Scraping simple and easy again!
abacus-minimal - A minimal event-based ledger in Python that follows accounting rules
finstruments - Financial instrument definitions built with Python and Pydantic
play - Player interface for Generative.fm