hamilton vs talk-transcripts

hamilton

A scalable general purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton (by stitchfix)

DISCONTINUED

talk-transcripts

Transcripts of Clojure-related talks (by matthiasn)

matthiasnehlsen.com

Project	hamilton	talk-transcripts
Mentions	26	35
Stars	878	2,854
Growth	-	-
Activity	8.1	4.7
Latest Commit	about 1 year ago	11 months ago
Language	Python	-
License	BSD 3-clause Clear License	GNU General Public License v3.0 or later

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
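
The page does not give the formula behind the activity number, so the following is only a sketch of one way such a score could work. It assumes an exponential decay for commit weights (the 90-day half-life is an invented parameter) and a percentile-rank mapping onto a 0-10 scale; both function names are hypothetical.

```python
from datetime import date, timedelta


def recency_weighted_activity(commit_dates, today, half_life_days=90.0):
    """Sum per-commit weights that halve every `half_life_days` (assumed decay)."""
    return sum(0.5 ** ((today - d).days / half_life_days) for d in commit_dates)


def relative_activity(raw_scores, project):
    """Map a raw score to 0-10 by the share of tracked projects it outranks."""
    mine = raw_scores[project]
    below = sum(1 for score in raw_scores.values() if score < mine)
    return round(10.0 * below / len(raw_scores), 1)


today = date(2024, 1, 1)
# A commit made today counts fully; one made 90 days ago counts half.
fresh = recency_weighted_activity([today, today - timedelta(days=90)], today)

# Made-up raw scores for ten tracked projects.
raw = {f"project_{i}": float(i) for i in range(1, 11)}
top = relative_activity(raw, "project_10")  # outranks 9 of the 10 projects
```

Under these assumptions a project that outranks 90% of the tracked projects scores 9.0, which matches the "top 10%" reading described above.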

hamilton

Posts with mentions or reviews of hamilton. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-02-27.

Write production grade pandas (and other libraries!) with Hamilton
2 projects | /r/Python | 27 Feb 2023

And find the repository here: https://github.com/dagworks-inc/hamilton/
Useful libraries for data engineering in various programming languages
1 project | /r/dataengineering | 16 Sep 2022

Python - https://github.com/stitchfix/hamilton (author here). It's great if you want your code to be always unit testable and documentation friendly, and you want to be able to visualize execution. Blog post on using it with Pandas https://link.medium.com/XhyYD9BAntb.
Cognitive Loads in Programming
5 projects | news.ycombinator.com | 31 Aug 2022

Yes! As one of the creators of https://github.com/stitchfix/hamilton this was one of the aims. Simplifying the cognitive burden for those developing and managing data transforms over the course of years, and in particular for ones they didn't write!
For example in Hamilton -- we force people to write "declarative functions" which then are stitched together to create a dataflow.
E.g. example function -- my guess is that you can read and understand/guess what it does very easily.
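
The example function the post refers to did not survive on this page. As a rough illustration of the idea only, the sketch below uses plain Python and the standard library, not Hamilton's own API: each function name is the output it produces, each parameter name is an input it depends on, and a small hypothetical `execute` helper stitches them together. The column names are invented.

```python
import inspect


def spend_per_signup(spend: list, signups: list) -> list:
    """Marketing spend divided by signups, element-wise."""
    return [s / n for s, n in zip(spend, signups)]


def avg_spend_per_signup(spend_per_signup: list) -> float:
    """Mean of spend_per_signup."""
    return sum(spend_per_signup) / len(spend_per_signup)


def execute(functions, output, inputs):
    """Compute `output` by recursively resolving the parameters each function names."""
    by_name = {f.__name__: f for f in functions}
    cache = dict(inputs)  # raw inputs count as already-computed nodes

    def resolve(name):
        if name not in cache:
            fn = by_name[name]
            kwargs = {p: resolve(p) for p in inspect.signature(fn).parameters}
            cache[name] = fn(**kwargs)
        return cache[name]

    return resolve(output)


result = execute(
    [spend_per_signup, avg_spend_per_signup],
    "avg_spend_per_signup",
    {"spend": [10.0, 20.0, 30.0], "signups": [1, 2, 3]},
)
```

Because every transform is an ordinary function with named inputs, each one can be unit tested in isolation by calling it directly.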
Prefect vs other things question
2 projects | /r/mlops | 3 Aug 2022

For (1) there are quite a few options - prefect is one, metaflow is another, airflow, dagster, even https://github.com/stitchfix/hamilton (core contributor here), etc.
Field Lineage
4 projects | /r/dataengineering | 2 Aug 2022

If you want to do more in Python, https://github.com/stitchfix/hamilton lets you model dependencies at a columnar (field) level.
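
To make "dependencies at a field level" concrete, here is a minimal sketch that derives lineage from function signatures. It is an assumption-laden illustration in plain Python, not Hamilton's API; the fields and the `upstream_fields` helper are hypothetical.

```python
import inspect


def height_m(height_cm):
    return height_cm / 100


def bmi(weight_kg, height_m):
    return weight_kg / height_m ** 2


def is_overweight(bmi):
    return bmi > 25


def upstream_fields(functions, field):
    """Return every field that `field` depends on, directly or transitively."""
    deps = {f.__name__: list(inspect.signature(f).parameters) for f in functions}
    seen = set()
    stack = list(deps.get(field, []))
    while stack:
        name = stack.pop()
        if name not in seen:
            seen.add(name)
            # Raw inputs have no entry in `deps`, so the walk stops at them.
            stack.extend(deps.get(name, []))
    return seen


lineage = upstream_fields([height_m, bmi, is_overweight], "is_overweight")
```

Here `lineage` contains both the derived fields (`bmi`, `height_m`) and the raw inputs (`weight_kg`, `height_cm`) that feed `is_overweight`.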
Show HN
1 project | news.ycombinator.com | 1 Aug 2022
[D] Is anyone working on interesting ML libraries and looking for contributors?
4 projects | /r/MachineLearning | 17 Jun 2022

Take a look at https://github.com/stitchfix/hamilton - we're after contributors who can help us grow the project, e.g. by making the documentation great, dogfooding features, and suggesting or contributing usability improvements.
Useful Python decorators for Data Scientists
1 project | /r/Python | 23 May 2022

For a real world example of their power, we built an entire framework (https://github.com/stitchfix/hamilton) at Stitch Fix, where a lot of cool magic is provided via decorators - see https://hamilton-docs.gitbook.io/docs/reference/api-reference/available-decorators and these two source files (https://github.com/stitchfix/hamilton/blob/main/hamilton/function_modifiers_base.py, https://github.com/stitchfix/hamilton/blob/main/hamilton/function_modifiers.py ). Note we do some non-trivial stuff via them.
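
As a small illustration of what a decorator can add to a data transform, the sketch below defines a hypothetical `check_range` decorator that validates a function's output. It is not taken from Hamilton's source; the name, parameters, and example column are assumptions.

```python
import functools
import inspect


def check_range(low, high):
    """Hypothetical decorator: require every output value to lie in [low, high]."""
    def decorator(fn):
        @functools.wraps(fn)  # keeps __name__ and the visible signature
        def wrapper(*args, **kwargs):
            result = fn(*args, **kwargs)
            bad = [v for v in result if not low <= v <= high]
            if bad:
                raise ValueError(f"{fn.__name__}: values out of range: {bad}")
            return result
        return wrapper
    return decorator


@check_range(0.0, 1.0)
def conversion_rate(signups: list, visits: list) -> list:
    """Signups divided by visits, element-wise."""
    return [s / v for s, v in zip(signups, visits)]


rates = conversion_rate([1, 2], [4, 4])
params = list(inspect.signature(conversion_rate).parameters)
```

Using `functools.wraps` matters for any tool that reads function names and parameters: the decorated function still reports `conversion_rate` and its original parameter list.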
unit tests
1 project | /r/mlops | 23 May 2022

For data processing/transform code, I would recommend looking at https://github.com/stitchfix/hamilton, especially if you're trying to test pandas code. Short getting started here - https://towardsdatascience.com/how-to-use-hamilton-with-pandas-in-5-minutes-89f63e5af8f5 (disclaimer: I'm one of the authors).
Dealing with hundreds of customer/computed columns
1 project | /r/dataengineering | 19 May 2022

The python package, hamilton, from Stitch Fix (https://hamilton-docs.gitbook.io/docs/) can help manage transformations on pandas dataframes. This DAG of transformations is managed separately in a file - so it can be versioned, in case the transformations change. The memory required is reduced, because only the API call tables and mapping parameter table have to be in memory. The calculated columns can be produced as needed. Just like dbt, transformations are separate from the source tables - but hamilton can be used on any python object - not just dataframes. dbt is SQL based.

talk-transcripts

Posts with mentions or reviews of talk-transcripts. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-01-15.

In praise of idleness – Bertrand Russell
1 project | news.ycombinator.com | 4 May 2024

Reminds me a little of hammock-driven development [1]
> the background mind is good at synthesizing things. It's good about strategy
[1] https://github.com/matthiasn/talk-transcripts/blob/master/Hi...
Teach Yourself Programming in Ten Years (1998)
3 projects | news.ycombinator.com | 15 Jan 2024

Thank you for this recommendation. I've never heard of it before and now I'm reading: https://github.com/matthiasn/talk-transcripts/blob/master/Hi...
It's giving me energy this Monday holiday(USA)!
Can't Be Fucked: Underrated Cause of Tech Debt
1 project | news.ycombinator.com | 12 Oct 2023

race?
> [Audience reply: Sprinter]
> Right, only somebody who runs really short races, okay?
> [Audience laughter]
> But of course, we are programmers, and we are smarter than runners, apparently, because we know how to fix that problem, right? We just fire the starting pistol every hundred yards and call it a new sprint.
https://github.com/matthiasn/talk-transcripts/blob/master/Hi...
Strong typing, a hill I'm willing to die on
9 projects | news.ycombinator.com | 4 Oct 2023

>So this is 10x, a full order of magnitude reduction in (?) severity before we get to the set of problems I think are more in the domain of what programming languages can help with, right? And because you can read these they'll all going to come up in a second as I go through each one on some slide so I'm not going to read them all out right now. But importantly there's another break where we get to trivialisms of problems in programming. Like typos and just being inconsistent, like, you thought you're going to have a list of strings and you put a number in there. That happens, you know, people make those kinds of mistakes, they're pretty inexpensive.
[0] Video: https://www.youtube.com/watch?v=2V1FtfBDsLU
[1] Slides and transcript: https://github.com/matthiasn/talk-transcripts/blob/master/Hi...
[2] Video https://www.youtube.com/watch?v=YR5WdGrpoug
[3] Slides and transcript https://github.com/matthiasn/talk-transcripts/blob/master/Hi...
Puzzle Languages
1 project | news.ycombinator.com | 4 Oct 2023

This is tangentially related to Puzzles-vs-Problems in Rich Hickey's Effective Programs
> Eventually I got back to scheduling and again wrote a new kind of scheduling system in Common Lisp, which again they did not want to run in production. And then I rewrote it in C++. Now at this point I was an expert C++ user and really loved C++, for some value of love. But as we'll see later I love the puzzle of C++. So I had to rewrite it in C++ and it took, you know, four times as long to rewrite it as it took to write it in the first place, it yielded five times as much code and it was no faster. And that's when I knew I was doing it wrong.
[...]
> So I mean for young programmers, if everybody's tired and old, this doesn't matter any more. But when I was young, when I was young, I really, you know, when you're young you've got lots of free space. I used to say "an empty head", but that's not right. You've got a lot of free space available and you can fill it with whatever you like. And these type systems they're quite fun, because from an endorphin standpoint solving puzzles and solving problems is the same, it gives you the same rush. Puzzle solving is really cool. But that's not what it should be about.
Talk: https://www.youtube.com/watch?v=2V1FtfBDsLU
Slides and transcript: https://github.com/matthiasn/talk-transcripts/blob/master/Hi...
All the ways to capture changes in Postgres
12 projects | news.ycombinator.com | 22 Sep 2023

Using triggers + history tables (aka audit tables) is the right answer 98% of the time. Just do it. If you're not already doing it, start today. It is a proven technique, in use for _over 30 years_.
Here's a quick rundown of how to do it generically https://gist.github.com/slotrans/353952c4f383596e6fe8777db5d... (trades off space efficiency for "being easy").
It's great if you can store immutable data. Really, really great. But you _probably_ have a ton of mutable data in your database and you are _probably_ forgetting a ton of it every day. Stop forgetting things! Use history tables.
cf. https://github.com/matthiasn/talk-transcripts/blob/master/Hi...
Do not use Papertrail or similar application-space history tracking libraries/techniques. They are slow, error-prone, and incapable of capturing any DB changes that bypass your app stack (which you probably have, and should). Worth remembering that _any_ attempt to capture an "updated" timestamp from your app is fundamentally incorrect, because each of your webheads has its own clock. Use the database clock! It's the only one that's correct!
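
The trigger-plus-history-table pattern can be sketched in a few lines. The example below uses Python's built-in `sqlite3` module as a stand-in for Postgres so that it runs anywhere; the trigger syntax differs in Postgres, and the `account` table and its columns are invented for illustration. Note that the timestamp comes from the database clock, not from the application.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (id INTEGER PRIMARY KEY, email TEXT NOT NULL);

-- History table: one row per change, stamped by the database clock.
CREATE TABLE account_history (
    id INTEGER,
    email TEXT,
    operation TEXT NOT NULL,
    changed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);

CREATE TRIGGER account_ins AFTER INSERT ON account BEGIN
    INSERT INTO account_history (id, email, operation)
    VALUES (NEW.id, NEW.email, 'INSERT');
END;

CREATE TRIGGER account_upd AFTER UPDATE ON account BEGIN
    INSERT INTO account_history (id, email, operation)
    VALUES (NEW.id, NEW.email, 'UPDATE');
END;

CREATE TRIGGER account_del AFTER DELETE ON account BEGIN
    INSERT INTO account_history (id, email, operation)
    VALUES (OLD.id, OLD.email, 'DELETE');
END;
""")

# The application only touches `account`; the triggers record every change.
conn.execute("INSERT INTO account (id, email) VALUES (1, 'a@example.com')")
conn.execute("UPDATE account SET email = 'b@example.com' WHERE id = 1")
conn.execute("DELETE FROM account WHERE id = 1")

history = conn.execute(
    "SELECT operation, email FROM account_history ORDER BY rowid"
).fetchall()
```

Even after the row is deleted from `account`, `history` still holds the insert, the update, and the delete, which is the "stop forgetting things" point made above.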
G. Polya, How to Solve It
1 project | news.ycombinator.com | 22 Aug 2023

Rich Hickey (creator of Clojure) references Polya several times in his classic talk "Hammock Driven Development". Here's a transcript:
https://github.com/matthiasn/talk-transcripts/blob/master/Hi...
I've long been impressed by Hickey's problem solving skills, so I took much of this talk to heart, and even bought a copy of HTSI. Can't say it really helped me any more than Rich's talk (as a programmer) but I'm thinking I'll give it another look.
Interfaces All the Way Down
1 project | news.ycombinator.com | 23 Jul 2023

>Great product designs require no manual, and similarly, great interfaces need no documentation. Imagine having to read a manual on how to use a coffee mug.
This could not be more wrong.
Not everything is easy. If a library is addressing a complicated domain, solving by definition a complicated problem, it is fine if it requires some learning.
When did expertise and learning become bad things? If software is an engineering discipline, why would people in it ever promulgate the idea that any random cog can step in to any “engineer”s shoes?
Rich Hickey analogizes this mentality to the world of music, where it is taken for granted that learning an instrument requires a lot of study:
> We start with the cello. Should we make cellos that auto tune? Like, no matter where you put your finger, it's just going to play something good, play a good note.
> [Audience laughter]
> Like, you're good. We'll just fix that.
> Should we have cellos with, like, red and green lights? Like, if you're playing the wrong note, you know, it's red. You slide around, and it's green. You're like, great! I'm good. I'm playing the right song. Right?
> Or maybe we should have cellos that don't make any sound at all. Until you get it right, there's nothing.
> [Audience laughter]
https://github.com/matthiasn/talk-transcripts/blob/master/Hi...
Slightly off-topic: Whose lectures do you recommend listening to, similar to Rich Hickey?
1 project | /r/Clojure | 6 Jun 2023

You might find adjacent talks and speakers here ... https://github.com/matthiasn/talk-transcripts
Functions vs. Procedures: Keep them separate.
2 projects | dev.to | 8 May 2023

Many languages merge the two concepts and implement procedures as functions that return void. This can muddle (complect) the distinction, leading programmers to call procedures from within functions, which makes those functions impure: they affect the world outside themselves through side effects such as I/O or mutating state. Avoid this, especially if you care about debuggability and Functional Core, Imperative Shell architectures, which make testing your system easier without mocking (see Gary Bernhardt's Boundaries talk at 31:56).
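
A minimal sketch of that separation, with hypothetical names: the function is pure and returns a value, while the procedure does the I/O and calls the function, never the other way around.

```python
def total_with_tax(prices, tax_rate):
    """Function: pure, returns a value, touches nothing outside itself."""
    return round(sum(prices) * (1 + tax_rate), 2)


def print_receipt(prices, tax_rate, write=print):
    """Procedure: performs I/O through `write` and returns nothing."""
    write(f"Total due: {total_with_tax(prices, tax_rate):.2f}")


# The pure core can be tested by simply calling it.
total = total_with_tax([10.0, 5.0], 0.1)

# The shell can be exercised by passing a different writer, with no mocking library.
lines = []
print_receipt([10.0, 5.0], 0.1, write=lines.append)
```

Passing the writer as a parameter is one design choice among several; the point is that all of the logic worth testing lives in the function that has no side effects.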

What are some alternatives?

When comparing hamilton and talk-transcripts you can also consider the following projects:

prosto - Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby

rich4clojure - Practice Clojure using Interactive Programming in your editor

versatile-data-kit - One framework to develop, deploy and operate data workflows with Python and SQL.

etaoin - Pure Clojure Webdriver protocol implementation

plumbing - Prismatic's Clojure(Script) utility belt

clj-chrome-devtools - Clojure API for controlling a Chrome DevTools remote

OpenLineage - An Open Standard for lineage metadata collection

codetour - VS Code extension that allows you to record and play back guided tours of codebases, directly within the editor.

composer - Supercharge Your Model Training

base - Unison base libraries

polars - Dataframes powered by a multithreaded, vectorized query engine, written in Rust

lumo - Fast, cross-platform, standalone ClojureScript environment
