kafka-delta-ingest
delta
Our great sponsors
kafka-delta-ingest | delta | |
---|---|---|
6 | 88 | |
319 | 20,617 | |
4.4% | - | |
7.5 | 8.4 | |
8 days ago | 12 days ago | |
Rust | Rust | |
Apache License 2.0 | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
kafka-delta-ingest
-
Using rust for DE activities?
Rust can offer incredible cost savings when you can use it in place of spark to interact with your delta lake. One such project was kafka-delta-ingest. The developers were able to reduce the cost of running the pipeline by over 90%. However, most of this stuff is still very experimental and not ready for production but you will definitely be seeing more projects like this just based on how much money can be saved.
-
Which lakehouse table format do you expect your organization will be using by the end of 2023?
This independence from a catalog allows for path based reads and writes. This is handy when writing from Kafka directly to Delta Lake for the first layer of ingestion. You don’t need a catalog (or even Spark). https://github.com/delta-io/kafka-delta-ingest/tree/main/src
-
Streaming Data and Postgres
As far as I know no. You certainly could use events on a streaming ledger like Kafka or Redpanda and then store to delta with https://github.com/delta-io/kafka-delta-ingest and process them with all the gis goodness of spark. However, this is fairly complicated and much different from a simple postgis drop in replacement. There are specialized meaning faster and more efficient systems out there for specialized tasks such as geo fencing in real-time
-
Rust is showing a lot of promise in the DataFrame / tabular data space
kafka-delta-ingest is a good project to get streaming data into a Delta Lake. Here's a great talk on the topic.
-
process millions of events per sec
What about https://github.com/delta-io/kafka-delta-ingest?
- Exactly once delivery from Kafka to Delta Lake with Rust
delta
- Difftastic, a structural diff tool that understands syntax
- Popular Git Config Options
-
So You Think You Know Git – Git Tips and Tricks by Scott Chacon
Thanks for the difftastic & zoxide tips.
However, I've been using this git pager/difftool: https://github.com/dandavison/delta
While it's not structural like difft, it does produce more readable output for me (at least when scrolling fast through git log -p /scanning quickly
-
Essential Command Line Tools for Developers
View on GitHub
- Potencializando Sua Experiência no Linux: Conheça as Ferramentas em Rust para um Desenvolvimento Eficiente
-
Unified versus Split Diff
I'm currently waiting on the integration between Delta and Difftastic:
https://github.com/dandavison/delta/issues/535
Difftastic now has JSON output, whic should make it much easier to build this.
- Delta, a syntax-highlighting pager for Git, diff, and grep output
- Ask HN: What's a new developer tool you recently started using?
-
Magit
I'm surely in the minority here. I've been using Emacs for almost a decade now, but I just can't get into the Magit workflow. I've tried several times, but always end up going back to Git on the command line. I have dozens of aliases, shell integrations, a nice diff viewer[1], etc., and interacting with Git has become muscle memory. I can commit, cherry-pick, rebase, bisect, fix conflicts, etc., in a fraction of the time it would take me to navigate Magit's UI. I'm sure with enough practice, a Magit user could do this more quickly and efficiently, but honestly, with some custom-built porcelain, Git's UI is not so bad. Though this could very well be Stockholm syndrome after using it for such a long time...
For whatever reason, Magit's opinionated workflows never clicked with me. A part of it is the concern that it will do something weird to my repo that I'll then have to waste more time undoing manually. I usually don't trust sugary wrappers around tools. And another is the fact I don't use Emacs on all machines, and setting up Git on a remote system is just a matter of copying over my config and some shell integrations.
Also, on a more personal note, I find the cultish fanboyism whenever Magit is brought up slightly offputting. Does anyone have anything bad to say about it? No software can realistically be this infallible. :)
[1]: https://github.com/dandavison/delta
-
How to use Git?
For looking at diffs I still prefer the command line though, and use delta to view diffs between commits or branches.
What are some alternatives?
delta-rs - A native Rust library for Delta Lake, with bindings into Python
diff-so-fancy - Good-lookin' diffs. Actually… nah… The best-lookin' diffs. :tada:
dipa - dipa makes it easy to efficiently delta encode large Rust data structures.
difftastic - a structural diff that understands syntax 🟥🟩
kafka-rust - Rust client for Apache Kafka
vim-fugitive - fugitive.vim: A Git wrapper so awesome, it should be illegal
rust-rdkafka - A fully asynchronous, futures-based Kafka client library for Rust based on librdkafka
lazygit - simple terminal UI for git commands
flowgger - A fast data collector in Rust
vim-gitgutter - A Vim plugin which shows git diff markers in the sign column and stages/previews/undoes hunks and partial hunks.
arrow2 - Transmute-free Rust library to work with the Arrow format
gitui - Blazing 💥 fast terminal-ui for git written in rust 🦀