dsq vs zed

dsq

Command-line tool for running SQL queries against JSON, CSV, Excel, Parquet, and more. (by multiprocessio)

zed

A novel data lake based on super-structured data (by brimdata)

zed.brimdata.io

Metric          dsq                                       zed
Mentions        20                                        13
Stars           3,638                                     1,312
Growth          1.9%                                      2.0%
Activity        4.3                                       9.4
Latest commit   7 months ago                              6 days ago
Language        Go                                        Go
License         GNU General Public License v3.0 or later  BSD 3-clause "New" or "Revised" License

Mentions is the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars is the number of stars that a project has on GitHub. Growth is the month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is among the top 10% of the most actively developed projects that we are tracking.

dsq

Posts with mentions or reviews of dsq. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-11-02.

Tracking SQLite Database Changes in Git
7 projects | news.ycombinator.com | 2 Nov 2023

You might want to look at tsv-utils, or a similar project: https://github.com/eBay/tsv-utils
For the SQL part, but maybe a lot heavier, you can use one of the projects listed on this page: https://github.com/multiprocessio/dsq (No longer maintained, but has links to lots of other projects)
DuckDB: Querying JSON files as if they were tables
9 projects | news.ycombinator.com | 3 Mar 2023

Welcome to the gang! :)
https://github.com/multiprocessio/dsq#comparisons
Ask HN: Programs that saved you 100 hours? (2022 edition)
69 projects | news.ycombinator.com | 20 Dec 2022
Command-line data analytics made easy
6 projects | news.ycombinator.com | 3 Nov 2022

SPyQL is really cool and its design is very smart, with it being able to leverage normal Python functions!
As far as similar tools go, I recommend taking a look at DataFusion[0], dsq[1], and OctoSQL[2].
DataFusion is a very (very very) fast command-line SQL engine but with limited support for data formats.
dsq is based on SQLite, which means it has to load data into SQLite first, but then gives you the whole breadth of SQLite. It also supports many data formats, but is slower at the same time.
OctoSQL is faster, extensible through plugins, and supports incremental query execution, so you can e.g. calculate a running group by + count while tailing a log file. It also supports normal databases, not just file formats, so you can e.g. join with a Postgres table.
[0]: https://github.com/apache/arrow-datafusion
[1]: https://github.com/multiprocessio/dsq
[2]: https://github.com/cube2222/octosql
Disclaimer: Author of OctoSQL
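
As a rough illustration of the load-then-query approach described in the comment above (not dsq's actual implementation), the following Python sketch copies a CSV into an in-memory SQLite table and then runs plain SQL against it. The function name query_csv, the table name t, and the sample data are made up for this example.

```python
import csv
import io
import sqlite3


def query_csv(csv_text: str, sql: str, table: str = "t") -> list:
    """Load CSV text into an in-memory SQLite table, then run `sql` against it."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    conn = sqlite3.connect(":memory:")
    try:
        # Quote column names so headers containing spaces still work.
        columns = ", ".join('"' + name.replace('"', '""') + '"' for name in header)
        conn.execute(f'CREATE TABLE "{table}" ({columns})')
        placeholders = ", ".join("?" for _ in header)
        # Every row is inserted before any query runs: this is the up-front load cost.
        conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


# Hypothetical sample data, for illustration only.
SAMPLE = "name,city\nada,London\nlin,Paris\nsam,London\n"
rows = query_csv(SAMPLE, "SELECT city, COUNT(*) FROM t GROUP BY city ORDER BY city")
print(rows)  # [('London', 2), ('Paris', 1)]
```

The trade-off the comment points out is visible here: the query itself gets everything SQLite offers, but the whole input is read and inserted even if the query only needs one column.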
Jq Internals: Backtracking
10 projects | news.ycombinator.com | 5 Oct 2022

> dsq registers go-sqlite3-stdlib so you get access to numerous statistics, url, math, string, and regexp functions that aren't part of the SQLite base. (https://github.com/multiprocessio/dsq#standard-library)
Ah, I wondered if they rolled their own SQL parser, but no, I now see the sqlite.go in the repo and all is made clear
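
The idea of registering extra functions on top of SQLite's built-ins can be sketched with Python's sqlite3 module. This is only an analogy to what the comment describes: demo_url_host and demo_median are hypothetical names invented here, not functions from go-sqlite3-stdlib, and the table and data are made up.

```python
import sqlite3
from urllib.parse import urlparse


def demo_url_host(value):
    """Scalar helper: return the host part of a URL, or None for NULL input."""
    if value is None:
        return None
    return urlparse(value).hostname


class DemoMedian:
    """Aggregate helper: median of the non-NULL numeric values in a group."""

    def __init__(self):
        self.values = []

    def step(self, value):
        if value is not None:
            self.values.append(value)

    def finalize(self):
        if not self.values:
            return None
        ordered = sorted(self.values)
        mid = len(ordered) // 2
        if len(ordered) % 2:
            return ordered[mid]
        return (ordered[mid - 1] + ordered[mid]) / 2


def connect_with_helpers() -> sqlite3.Connection:
    """Open an in-memory database with the two demo helpers registered."""
    conn = sqlite3.connect(":memory:")
    conn.create_function("demo_url_host", 1, demo_url_host)
    conn.create_aggregate("demo_median", 1, DemoMedian)
    return conn


conn = connect_with_helpers()
conn.execute("CREATE TABLE hits (url TEXT, ms INTEGER)")
conn.executemany(
    "INSERT INTO hits VALUES (?, ?)",
    [
        ("https://example.com/a", 10),
        ("https://example.com/b", 30),
        ("https://example.org/", 20),
    ],
)
result = conn.execute(
    "SELECT demo_url_host(url) AS host, demo_median(ms) "
    "FROM hits GROUP BY host ORDER BY host"
).fetchall()
print(result)
```

Because the SQL parser and planner are still SQLite's own, the registered helpers can be mixed freely with GROUP BY, ORDER BY, and the built-in functions.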
Run SQL on CSV, Parquet, JSON, Arrow, Unix Pipes and Google Sheet
9 projects | news.ycombinator.com | 24 Sep 2022

I am currently evaluating dsq and its partner desktop app DataStation. AIUI, the developer of DataStation realised that it would be useful to extract the underlying pieces into a standalone CLI, so they both support the same range of sources.
dsq CLI - https://github.com/multiprocessio/dsq
multiprocessio / dsq :
1 project | /r/golang | 1 Sep 2022
OctoSQL allows you to join data from different sources using SQL
13 projects | news.ycombinator.com | 14 Jul 2022

OctoSQL is an awesome project and Kuba has a lot of great experience to share from building it that I'm excited to learn from.
And while building a custom database engine does allow you to do pretty quick queries, there are a few issues.
First, the SQL implemented is nonstandard. As I was looking for documentation, it pointed me to `SELECT * FROM docs.functions fs`. I tried to count the number of functions but octosql crashed (a Go panic) when I ran `SELECT count(1) FROM docs.functions fs` and `SELECT count() FROM docs.functions fs` which is what I lazily do in standard SQL databases. (`SELECT count(fs.name) FROM docs.function fs` worked.)
This kind of thing will keep happening because this project just doesn't have as many resources today as SQLite, Postgres, DuckDB, etc. It will support a limited subset of SQL.
Second, the standard library seems pretty small. When I counted the builtin functions there were only 29. Now this is an easy thing to rectify over time but just noting about the state today.
And third this project only has builtin support for querying CSV and JSON files. Again this could be easy to rectify over time but just mentioning the state today.
octosql is a great project but there are also different ways to do the same thing.
I built dsq [0], which runs all queries through SQLite, so it avoids point 1. It has access to SQLite's standard builtin functions plus a battery of extra statistical aggregation, string manipulation, url manipulation, date manipulation, hashing, and math functions custom built to help with the kind of interactive querying developers commonly do [1].
And dsq supports not just CSV and JSON but parquet, excel, ODS, ORC, YAML, TSV, and Apache and nginx logs.
A downside to dsq is that it is slower for large files (say over 10GB) when you only want a few columns whereas octosql does better in some of those cases. I'm hoping to improve this over time by adding a SQL filtering frontend to dsq but in all cases dsq will ultimately use SQLite as the query engine.
You can find more info about similar projects in octosql's Benchmark section but I also have a comparison section in dsq [2] and an extension of the octosql benchmark with a different set of tools [3] including duckdb.
Everyone should check out duckdb. :)
[0] https://github.com/multiprocessio/dsq
[1] https://github.com/multiprocessio/go-sqlite3-stdlib
[2] https://github.com/multiprocessio/dsq#comparisons
[3] https://github.com/multiprocessio/dsq#benchmark
GitHub Actions are down again
2 projects | news.ycombinator.com | 29 Jun 2022

What's annoying about this is that the PR doesn't even say it's trying to run tests. It says everything is passing and just doesn't list the actions.
For a second I thought someone must have deleted the actions yaml files.
This is a dangerous failure mode.
https://github.com/multiprocessio/dsq/pull/82
Xlite: Query Excel, Open Document spreadsheets (.ods) as SQLite virtual tables
6 projects | news.ycombinator.com | 25 Jun 2022

This is a cool project! But if you query Excel and ODS files with dsq [0] you get the same thing plus a growing standard library of functions that don't come built into SQLite such as best-effort date parsing, URL parsing/extraction, statistical aggregation functions, math functions, string and regex helpers, hashing functions and so on [1].
[0] https://github.com/multiprocessio/dsq
[1] https://github.com/multiprocessio/go-sqlite3-stdlib

zed

Posts with mentions or reviews of zed. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-08-05.

Ask HN: What projects are trying to reinvent core software infrastructure?
2 projects | news.ycombinator.com | 5 Aug 2023
The Zed Project | Zed
1 project | /r/dataengineering | 25 May 2023

1 project | /r/programming | 25 May 2023
VAST 3.0 released. Open-Source Security Data Pipelines with Kusto-like syntax
2 projects | /r/cybersecurity | 15 Mar 2023

VAST is an open-source SecDataOps project for working with data from open-source security tools. Version 3.0 adds a pipeline syntax similar to splunk, Kusto, PRQL, and Zed.
The Magic of Small Databases
3 projects | news.ycombinator.com | 28 Jan 2023
zed
1 project | /r/devopspro | 20 May 2022
Super-Structured Data: Rethinking the Schema
3 projects | news.ycombinator.com | 17 May 2022

Cool, I didn't realize you used sqlite-utils for your performance demo!
It's not particularly designed for speed - it should be fast as far as Python code goes (I use some generator tricks to stream data and avoid having to load everything into memory at once) but I wouldn't expect "sqlite-utils insert" to win any performance competitions with tools written in other languages.
Those benchmarks against sqlite itself are definitely interesting. I'm looking forward to playing with the "native ZNG support for Python" mentioned on https://github.com/brimdata/zed/blob/main/docs/libraries/pyt... when that becomes available.
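
The comment above mentions streaming data with generators so that nothing has to be loaded into memory at once. A minimal Python sketch of that general idea follows. It is not the sqlite-utils implementation; the function names, the table layout, the newline-delimited JSON input, and the batch size are assumptions chosen for illustration.

```python
import itertools
import json
import sqlite3
from typing import Iterable, Iterator


def parse_lines(lines: Iterable[str]) -> Iterator[tuple]:
    """Lazily turn newline-delimited JSON into (name, size) tuples."""
    for line in lines:
        line = line.strip()
        if line:
            record = json.loads(line)
            yield record["name"], record["size"]


def insert_streaming(conn: sqlite3.Connection, rows: Iterable[tuple],
                     batch_size: int = 1000) -> int:
    """Insert rows batch by batch so at most `batch_size` rows are held at once."""
    conn.execute("CREATE TABLE IF NOT EXISTS files (name TEXT, size INTEGER)")
    total = 0
    rows = iter(rows)
    while True:
        # islice pulls only the next batch from the generator pipeline.
        batch = list(itertools.islice(rows, batch_size))
        if not batch:
            break
        with conn:  # commit one transaction per batch
            conn.executemany("INSERT INTO files VALUES (?, ?)", batch)
        total += len(batch)
    return total


conn = sqlite3.connect(":memory:")
# A generator stands in for a large file being read line by line.
lines = (json.dumps({"name": f"f{i}", "size": i}) for i in range(2500))
inserted = insert_streaming(conn, parse_lines(lines), batch_size=1000)
print(inserted)  # 2500
```

Both the input and the parser are generators, so memory use is bounded by the batch size rather than by the size of the input.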
Zq: An Easier (and Faster) Alternative to Jq
36 projects | news.ycombinator.com | 26 Apr 2022

Hi, all. Author here. Thanks for all the great feedback.
I've learned a lot from your comments and pointers.
The Zed project is broader than "a jq alternative" and my bad for trying out this initial positioning. I do know there are a lot of people out there who find jq really confusing, but it's clear if you become an expert, my arguments don't hold water.
We've had great feedback from many of our users who are really productive with the blend of search, analytics, and data discovery in the Zed language, and who find manipulating eclectic data in the ZNG format to be really easy.
Anyway, we'll write more about these other aspects of the Zed project in the coming weeks and months, and in the meantime, if you find any of this intriguing and want to kick the tires, feel free to hop on our slack with questions/feedback or file GitHub issues if you have ideas for improvements or find bugs.
Thanks a million!
https://github.com/brimdata/zed
The many uses of mock data
4 projects | dev.to | 1 Jan 2022

In my observation, mock data has tended to be used in a rather loose, slipshod, careless manner. Unlike documentation, it is treated as the garbage of software material. (Sometimes even referred to as "garbage data"). People will try to avoid writing it by using elaborate "generators" such as jFairy or zed.
Internet Object – A JSON alternative data serialization format
6 projects | news.ycombinator.com | 24 Oct 2021

There are a few examples in the ZSON spec...
https://github.com/brimdata/zed/blob/main/docs/formats/zson....
And you can easily see whatever data you'd like formatted as ZSON using the "zq" CLI tool, but I just made this gist (with some data from the brimdata/zed-sample-data repo) so you can have a quick look (the bstring stuff is a little noisy and an artifact of the data source being Zeek)... https://gist.github.com/mccanne/94865d557ca3de8abfd3eb09e8ac...

What are some alternatives?

When comparing dsq and zed you can also consider the following projects:

go-duckdb - go-duckdb provides a database/sql driver for the DuckDB database engine.

simdjson - Parsing gigabytes of JSON per second : used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks

q - Run SQL directly on delimited files and multi-file sqlite databases

sirix - SirixDB is an embeddable, bitemporal, append-only database system and event store, storing immutable lightweight snapshots. It keeps the full history of each resource. Every commit stores a space-efficient snapshot through structural sharing. It is log-structured and never overwrites data. SirixDB uses a novel page-level versioning approach.

querycsv - QueryCSV enables you to load CSV files and manipulate them using SQL queries then after you finish you can export the new values to a CSV file

yq - yq is a portable command-line YAML, JSON, XML, CSV, TOML and properties processor

octosql - OctoSQL is a query tool that allows you to join, analyse and transform data from multiple databases and file formats using SQL.

jid - json incremental digger

xlite - Query Excel spreadsheets (.xlsx, .xls, .ods) using SQLite

feedback - Public feedback discussions for: GitHub for Mobile, GitHub Discussions, GitHub Codespaces, GitHub Sponsors, GitHub Issues and more! [Moved to: https://github.com/github-community/community]

textql - Execute SQL against structured text like CSV or TSV

gojq - Pure Go implementation of jq
