| | hamilton | aipl |
|---|---|---|
| Mentions | 24 | 4 |
| Stars | 2,007 | 119 |
| Growth | 3.1% | 0.0% |
| Activity | 9.7 | 9.2 |
| Last commit | 5 days ago | over 1 year ago |
| Language | Jupyter Notebook | Python |
| License | BSD 3-clause Clear License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
hamilton
-
Show HN: I built an open-source data pipeline tool in Go
I always thought Hamilton [1] does a good job of giving enough visual hooks that draw you in.
I also noticed this pattern where library authors sometimes do a bit extra in terms of discussing and even promoting their competitors, and it makes me trust them more. A “here’s why ours is better and everyone else sucks …” section always comes across as the infomercial character who is having such a hard time peeling an apple that you wonder if this is the first time they’ve used hands.
One thing I wish for is a tool that’s essentially just Celery that doesn’t require a message broker (and can just use a database), and which is supported on Windows. There’s always a handful of edge cases where we’re pulling data from an old 32-bit system on Windows. And basically every system has some not-quite-ergonomic workaround that’s as much work as if you’d just built it yourself.
It seems like it’s just sending a JSON message over a queue or HTTP API and the worker receives it and runs the task. Maybe it’s way harder than I’m envisioning (but I don’t think so because I’ve already written most of it).
I guess that’s one thing I’m not clear on with Bruin: can I run workers in different physical locations and have them carry out the tasks in the right order? Or is this more of a centralized thing (meaning even if it’s K8s or Dask or Ray, those are all run in a cluster which happens to be distributed, but they’re all machines sitting in the same subnet, which isn’t the definition of a “distributed task” I’m going for)?
[1] https://github.com/DAGWorks-Inc/hamilton
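The broker-free queue the commenter describes (a JSON message stored in a database, picked up and run by a worker) can be sketched as follows. This is a minimal illustration using only the Python standard library, assuming SQLite as the database; the table layout, the `TASKS` registry, and the `enqueue` / `run_next` helpers are hypothetical names, and it is not the commenter's implementation or any existing tool's API.

```python
import json
import sqlite3

# Hypothetical task registry: maps a task name to a plain Python callable.
TASKS = {"add": lambda a, b: a + b}


def init(conn):
    """Create the queue table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        "id INTEGER PRIMARY KEY, payload TEXT NOT NULL, "
        "status TEXT NOT NULL DEFAULT 'pending', result TEXT)"
    )
    conn.commit()


def enqueue(conn, name, **kwargs):
    """Store the task as a JSON message and return its row id."""
    cur = conn.execute(
        "INSERT INTO tasks (payload) VALUES (?)",
        (json.dumps({"task": name, "kwargs": kwargs}),),
    )
    conn.commit()
    return cur.lastrowid


def run_next(conn):
    """Claim the oldest pending task, run it, and record the result."""
    row = conn.execute(
        "SELECT id, payload FROM tasks "
        "WHERE status = 'pending' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    task_id, payload = row
    # Claim step: the status check guards against another worker
    # having taken the same row in the meantime.
    claimed = conn.execute(
        "UPDATE tasks SET status = 'running' "
        "WHERE id = ? AND status = 'pending'",
        (task_id,),
    )
    conn.commit()
    if claimed.rowcount == 0:
        return None
    msg = json.loads(payload)
    result = TASKS[msg["task"]](**msg["kwargs"])
    conn.execute(
        "UPDATE tasks SET status = 'done', result = ? WHERE id = ?",
        (json.dumps(result), task_id),
    )
    conn.commit()
    return task_id, result


conn = sqlite3.connect(":memory:")
init(conn)
enqueue(conn, "add", a=2, b=3)
print(run_next(conn))  # (1, 5)
```

A sketch like this leaves out retries, task ordering across dependent tasks, and workers in different physical locations, which are the parts the comment is actually asking about.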
-
Greppability is an underrated code metric
Yep. When I was designing https://github.com/dagworks-inc/hamilton part of the idea was to make it easy to understand what and where. That is, enable one to grep for function definitions and their downstream use easily, and where people can't screw this up. You'd be surprised how easy it is to make a code base where grep doesn't help you all that much (at least in the python data transform world) ...
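To illustrate the greppability point, here is a minimal stdlib-only sketch of the naming convention the comment describes: each function is named after the value it produces, and each parameter names an upstream value. The `resolve` helper and the example functions are hypothetical and only mimic the idea; this is not Hamilton's actual API.

```python
import inspect


# Grepping for "def spend_per_signup" finds the definition; grepping for
# "spend_per_signup" finds every downstream function that consumes it.
def spend_per_signup(spend: float, signups: int) -> float:
    return spend / signups


def spend_is_high(spend_per_signup: float) -> bool:
    return spend_per_signup > 10.0


def resolve(target, functions, inputs):
    """Compute `target` by recursively resolving parameters by name."""
    if target in inputs:
        return inputs[target]
    fn = functions[target]
    args = {
        name: resolve(name, functions, inputs)
        for name in inspect.signature(fn).parameters
    }
    return fn(**args)


functions = {f.__name__: f for f in (spend_per_signup, spend_is_high)}
print(resolve("spend_is_high", functions, {"spend": 120.0, "signups": 10}))  # True
```

Because the dependency graph is implied by names alone, a plain text search is enough to trace where a value is defined and where it is used.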
-
Ask HN: What are you working on (August 2024)?
Graph-based libraries for building ML/AI systems:
- Burr -- build AI applications/agents as state machines https://github.com/dagworks-inc/burr
- Hamilton -- build dataflows as DAGs: https://github.com/dagworks-inc/hamilton
Looking for feedback -- we had some good initial traction on HN, and are looking for OS users/contributors/people who are building complementary tooling!
- Show HN: Hamilton's UI – observability, lineage, and catalog for data pipelines
-
Building an Email Assistant Application with Burr
Note that this uses simple OpenAI calls; you can replace this with Langchain, LlamaIndex, Hamilton (or something else) if you prefer more abstraction, and delegate to whatever LLM you like to use. And you should probably use something a little more concrete (e.g. instructor) to guarantee output shape.
-
Using IPython Jupyter Magic commands to improve the notebook experience
In this post, we’ll show how your team can turn any utility function(s) into reusable IPython Jupyter magics for a better notebook experience. As an example, we’ll use Hamilton, my open source library, to motivate the creation of a magic that facilitates better development ergonomics for using it. You needn’t know what Hamilton is to understand this post.
-
FastUI: Build Better UIs Faster
We built an app with it -- https://blog.dagworks.io/p/building-a-lightweight-experiment. You can see the code here https://github.com/DAGWorks-Inc/hamilton/blob/main/hamilton/....
Usually we've been prototyping with streamlit, but have found it clunky at times. FastUI still has rough edges, but we made it work for our lightweight app.
- Show HN: On Garbage Collection and Memory Optimization in Hamilton
-
Facebook Prophet: library for generating forecasts from any time series data
This library is old news? Is there anything new that they've added that's noteworthy to take it for another spin?
[disclaimer I'm a maintainer of Hamilton] Otherwise FYI Prophet gels well with https://github.com/DAGWorks-Inc/hamilton for setting up your features and dataset for fitting & prediction[/disclaimer].
- Show HN: Declarative Spark Transformations with Hamilton
aipl
-
Ask HN: Tell us about your project that's not done yet but you want feedback on
AIPL is an "Array-Inspired Pipeline Language", a tiny DSL in Python to make it easier to explore and experiment with AI pipelines.
https://github.com/saulpw/aipl
When you want to run some prompts through an LLM over a dataset, with some preprocessing and/or chaining prompts together, AIPL makes it much easier than writing a Python script.
-
The Problem with LangChain
Yes! This is why I started working on AIPL. The scripts are much more like recipes (linear, contained in a single-file, self-evident even to people who don't know the language). For instance, here's a multi-level summarizer of a webpage: https://github.com/saulpw/aipl/blob/develop/examples/summari...
The goal is to capture all that knowledge that langchain has, into consistent legos that you can combine and parameterize with the prompts, without all the complexity and boilerplate of langchain, nor having to learn all the Python libraries and their APIs. Perfect for prototypes and experiments (like a notebook, as you suggest), and then if you find something that really works, you can hand-off a single text file to an engineer and they can make it work in a production environment.
-
Langchain Is Pointless
I agree, and that's why I've been working on AIPL[0]. Our first v0.1 release should be in the next few days. https://github.com/saulpw/aipl
It's basically just a simple scripting language with array semantics and inline prompt construction, and you can drop into Python any time you like.
-
Re-implementing LangChain in 100 lines of code
I also was underwhelmed by langchain, and started implementing my own "AIPL" (Array-Inspired Pipeline Language) which turns these "chains" into straightforward, linear scripts. It's very early days but already it feels like the right direction for experimenting with this stuff. (I'm looking for collaborators if anyone is interested!)
https://github.com/saulpw/aipl
What are some alternatives?
phidata - Agno is a lightweight framework for building multi-modal Agents [Moved to: https://github.com/agno-agi/agno]
modelfusion - The TypeScript library for building AI applications.
tree-of-thought-llm - [NeurIPS 2023] Tree of Thoughts: Deliberate Problem Solving with Large Language Models
multi-gpt - A Clojure interface into the GPT API with advanced tools like conversational memory, task management, and more
awesome-pipeline - A curated list of awesome pipeline toolkits inspired by Awesome Sysadmin
llm - Access large language models from the command-line
llm-gpt4all - Plugin for LLM adding support for the GPT4All collection of models
snowpark-python - Snowflake Snowpark Python API
llm-api - Fully typed & consistent chat APIs for OpenAI, Anthropic, Groq, and Azure's chat models for browser, edge, and node environments.
vscode-reactive-jupyter - A simple Reactive Python Extension for Visual Studio Code
jehuty - Fluent API to interact with chat based GPT model