seatunnel
qsv
seatunnel | qsv | |
---|---|---|
31 | 13 | |
7,388 | 2,228 | |
1.0% | - | |
9.8 | 9.9 | |
about 18 hours ago | 7 days ago | |
Java | Rust | |
Apache License 2.0 | The Unlicense |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
seatunnel
- SeaTunnel – super high-performance, distributed data integration tool
- Apache SeaTunnel: Next-generation high-performance, distributed integration tool
- FLaNK Weekly 31 December 2023
-
Five Apache projects you probably didn't know about
Apache SeaTunnel is a data integration platform that offers the three pillars of data pipelines: sources, transforms, and sinks. It offers an abstract API over three possible engines: the Zeta engine from SeaTunnel or a wrapper around Apache Spark or Apache Flink. Be careful, as each engine comes with its own set of features.
-
SymmetricDS: Open-Source, cross platform database replication software
looks that way. there is an other project that does similar things Apache SeaTunnel: https://seatunnel.apache.org/
- Breakthrough in the book search field! Use Apache SeaTunnel to improve the efficiency of book title similarity search
-
Questions Regarding design DW
https://seatunnel.apache.org/ Might be an overkill though...
-
SeaTunnel Zeta engine, the first choice for massive data synchronization, is officially released!
See the specific Change log: https://github.com/apache/incubator-seatunnel/releases/tag/2.3.0
-
The Ultimate Beginner’s Guide to Open Source Contribution
Apache SeaTunnel (Incubating) SeaTunnel is a very easy-to-use ultra-high-performance distributed data integration platform that supports real-time synchronization of massive data. It can synchronize tens of billions of data stably and efficiently every day, and has been used in the production of nearly 100 companies. Official website https://seatunnel.apache.org/ GitHub projects https://github.com/apache/incubator-seatunnel
- Major Release! SeaTunnel 2.3.0-beta supports the self-innovate SeaTunnel Engine and more connectors!
qsv
- FLaNK Weekly 31 December 2023
-
Qsv: Efficient CSV CLI Toolkit
Thanks for the detailed feedback @snidane!
As maintainer of qsv, here's my reply:
- Given qsv's rapid release cycle (173 releases over three years), the auto-update check is essential at the moment. Once we reach 1.0, I'll turn it off. For now, given your feedback, I've only made it check 10% of the time.
- Pivot is in the backlog and I'll be sure to add unpivot when I implement it. (https://github.com/jqnatividad/qsv/issues/799)
- I'll add a dedicated summing command with the group by (-by) and window by (-over) capability (https://github.com/jqnatividad/qsv/issues/1514). Do note that `stats` has basic sum as @ezequiel-garzon pointed out.
- With the `enum` command, qsv can achieve what you proposed with `laminate`. E.g. qsv enum --new-column newcol --constant newconstant mydata.csv --output laminated-data.csv
- With the cat rowskey command, qsv can already concatenate files with mismatched headers.
- other file formats. qsv supports parquet, csv, tsv, excel, ods, datapackage, sqlite and more (see https://github.com/jqnatividad/qsv/tree/master#file-formats). Fixed-format though is not supported yet and quite interesting, and have added it to the backlog (https://github.com/jqnatividad/qsv/issues/1515)
- as to "enable embedding outputs of commands", qsv is composable by design, so you can use standard stdin/stdout redirection/piping techniques to have it work with other CLI tools like jq, awk, etc.
Finally, just released v0.120.0 that already incorporates the less aggressive self-update check. https://github.com/jqnatividad/qsv/releases/tag/0.120.0
- Joining CSV Data Without SQL: An IP Geolocation Use Case
-
Why my favourite API is a zipfile on the European Central Bank's website
qsv [1] also has a sqlp command which lets you run Polars SQL queries (even on multiple files). Here I'll send the csv data from stdin (represented by -) and then (optionally) pipe the output to the table command for formatting. The shape of the result is also printed to stderr (the (4, 2) below).
[1] https://github.com/jqnatividad/qsv
$ echo 'Name,Department,Salary
- Qsv: Ultra-fast CSV data-wrangling toolkit
- Qsv: CSVs sliced, diced and analyzed (fork of xsv)
- Nushell.sh ls – where size > 10mb – –sort-by modified
-
Do Rust and Lua work well together?
It works quite well IMHO. Using the mlua crate, I’ve managed to integrate Luau as a very powerful data-wrangling DSL for qsv (https://github.com/jqnatividad/qsv)
-
How manipulate this CSV in Python?
Maybe this might be better done using this? https://github.com/jqnatividad/qsv
-
How to convert xslx to csv using Rust?
https://github.com/jqnatividad/qsv is another option.
What are some alternatives?
airbyte - The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
miller - Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON
kestra - Infinitely scalable, event-driven, language-agnostic orchestration and scheduling platform to manage millions of workflows declaratively in code.
calamine - A pure Rust Excel/OpenDocument SpreadSheets file reader: rust on metal sheets
Leetcode - Solutions to LeetCode problems; updated daily. Subscribe to my YouTube channel for more.
fortune-sheet - A drop-in javascript spreadsheet library that provides rich features like Excel and Google Sheets
hudi - Upserts, Deletes And Incremental Processing on Big Data.
tsv-utils - eBay's TSV Utilities: Command line tools for large, tabular data files. Filtering, statistics, sampling, joins and more.
com.openai.unity - A Non-Official OpenAI Rest Client for Unity (UPM)
csvquote
Apache Hive - Apache Hive
goawk - A POSIX-compliant AWK interpreter written in Go, with CSV support