mandala
beaver
| | mandala | beaver |
|---|---|---|
| Mentions | 8 | 1 |
| Stars | 228 | 1 |
| Growth | - | - |
| Activity | 6.3 | 9.1 |
| Last commit | about 2 months ago | 4 months ago |
| Language | Python | Ruby |
| License | Apache License 2.0 | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
mandala
- Mandala: A little playground for testing pixel logic patterns
I was so confused, expecting this to be some trickery related to the computational-graph-memoization-and-exploration tool "mandala" https://github.com/amakelov/mandala
- Mandala: Notebook memoization on steroids, used by Anthropic
- Improve Jupyter Notebook Reruns by Caching Cells
This is neat and self-contained! But as someone running experiments with a high degree of interactivity, I often have an orthogonal requirement: add more computations to the same cell without recomputing previous computations done in the cell (or in other cells).
For a concrete example, in an ML project you often want to study how several quantities vary across several parameters. A straightforward workflow for this is: write some nested loops, collect results in Python dictionaries, and finally put everything together in a dataframe and compare (by plotting or otherwise).
However, after looking at the results, maybe you spot some trend and wonder if it will continue if you tweak one of the parameters by using a new value for it; of course, you also want to look at the previous values and bring everything together in the same plot(s). You now have a problem: either re-run the cell (thus losing previous work, which is annoying even if you have to wait 1 minute - you know it's a wasted minute!), or write the new computation in a new cell, possibly with a lot of redundancy (which over time makes the notebook hard to navigate and keep consistent).
So, this and other considerations eventually convinced me that the function is more natural than the cell as an interface/boundary at which caching should be implemented, at least for my use cases (coming from ML research). I wrote a framework based on this idea, with lots of other features (some quite experimental/unusual) to turn this into a feasible experiment management tool - check it out at https://github.com/amakelov/mandala
P.S.: I notice you use `pickle` for the hashing - `joblib.dump` is faster with objects containing numpy arrays, which covers a lot of useful ML things
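The function-level caching idea in the comment above can be sketched with the standard library alone. This is a minimal illustration, not mandala's API: `memoize`, `train_and_score`, `sweep`, and the in-memory `_CACHE` are hypothetical names, and the "experiment" is a stand-in computation. It assumes arguments can be pickled and that equal arguments pickle to the same bytes, which holds for simple values like the numbers used here.

```python
import hashlib
import pickle
from functools import wraps

_CACHE = {}  # in-memory store for the sketch; a real tool would persist results


def memoize(func):
    """Cache results per function call, keyed by a hash of the arguments."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        # Assumption: pickled bytes are stable for equal arguments.
        payload = pickle.dumps((func.__qualname__, args, sorted(kwargs.items())))
        key = hashlib.sha256(payload).hexdigest()
        if key not in _CACHE:
            _CACHE[key] = func(*args, **kwargs)
        return _CACHE[key]
    return wrapper


CALLS = []  # records which parameter pairs were actually computed


@memoize
def train_and_score(lr, depth):
    """Stand-in for an expensive experiment (hypothetical)."""
    CALLS.append((lr, depth))
    return round(lr * depth, 6)


def sweep(lrs, depths):
    """Nested loop over the parameter grid, collecting results in a dict."""
    return {(lr, d): train_and_score(lr, d) for lr in lrs for d in depths}


first = sweep([0.1, 0.2], [2, 4])
# Re-running the same "cell" with one extra learning rate only computes
# the two new combinations; the four earlier ones come from the cache.
second = sweep([0.1, 0.2, 0.4], [2, 4])
```

Because the cache boundary is the function call rather than the cell, widening the grid and re-running the same loop keeps all earlier results available for a combined plot or dataframe.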
- ML Experiments Management with Git
Another option that manages versioning of your computational graph and its results, and provides extremely elegant queryable memoization, is Mandala: https://github.com/amakelov/mandala
It is a much simpler and much more magical piece of software that truly expanded how I think about writing, exploring, and experimenting with code. Even if you never use it, you would probably enjoy reading the blog posts the author wrote about the design of the tool: https://amakelov.github.io/blog/pl/
- Snakemake – A framework for reproducible data analysis
You might like mandala (https://github.com/amakelov/mandala). It is not a build recipe tool; rather, it tracks the history of how your builds / computational graph have changed, and ties it to what the data looked like at each step.
- Piper: A proposal for a graphy pipe-based build system
u/rust4yy: I've been building mandala, a Python framework for (among other things) incremental computing. One way to think of it is "a build system for Python objects", except the units of computation are Python functions.
beaver
- Piper: A proposal for a graphy pipe-based build system
Interesting read. I might implement some of the ideas in my own build system. (If you're interested: https://github.com/Jomy10/beaver)
What are some alternatives?
oxen-release - Lightning fast data version control system for structured and unstructured machine learning datasets. We aim to make versioning datasets as easy as versioning code.
curio - Good Curio!
snakemake-wrappers - This is the development home of the Snakemake wrapper repository, see
Ruby-Cheatsheet - The missing cheatsheet for Ruby
aim - Aim 💫 - An easy-to-use & supercharged open-source experiment tracker.
memo_wise - The wise choice for Ruby memoization
sdk - Metadata store for Production ML
fuzzily - Fast fuzzy string searching/matching for Rails
make-booster - Utility routines to simplify using GNU make and Python
chanCrawler - A simple gem that crawls chans and retrieves visual content
devise_masquerade - Extension for devise, enable login as functionality. Add link to the masquerade_path(resource) and use it.