octosql
datasette
Metric | octosql | datasette |
---|---|---|
Mentions | 34 | 186 |
Stars | 4,689 | 8,862 |
Growth | - | - |
Activity | 4.3 | 9.2 |
Last commit | 7 months ago | 7 days ago |
Language | Go | Python |
License | Mozilla Public License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
octosql
- Wazero: Zero dependency WebAssembly runtime written in Go
Never got it to anything close to a finished state, instead moving on to doing the same prototype in llvm and then cranelift.
That said, here's some of the wazero-based code on a branch - https://github.com/cube2222/octosql/tree/wasm-experiment/was...
It really is just a very very basic prototype.
- Analyzing multi-gigabyte JSON files locally
- DuckDB: Querying JSON files as if they were tables
This is really cool!
With their Postgres scanner[0] you can now easily query multiple datasources using SQL and join between them (e.g. a Postgres table with a JSON file). Something I strived to build with OctoSQL[1] before.
It's amazing to see how quickly DuckDB is adding new features.
I'm not a huge fan of C++, which is currently used for authoring extensions. It'd be really cool if somebody implemented a Rust extension SDK, or even something like what Steampipe[2] does for Postgres FDWs, which would provide a shim for quickly implementing non-performance-sensitive extensions for various things.
Godspeed!
[0]: https://duckdb.org/2022/09/30/postgres-scanner.html
[1]: https://github.com/cube2222/octosql
[2]: https://steampipe.io
- Show HN: ClickHouse-local – a small tool for serverless data analytics
Congrats on the Show HN!
It's great to see more tools in this area (querying data from various sources in-place) and the Lambda use case is a really cool idea!
I've recently done a bunch of benchmarking, including ClickHouse Local and the usage was straightforward, with everything working as it's supposed to.
One comment on performance, though: an area where I think ClickHouse could still improve - vs OctoSQL[0] at least - is the JSON datasource, which seems slower, especially if only a small part of each JSON object is used. If only a single field of many is used, OctoSQL lazily parses only that field and skips the others, which yields non-trivial performance gains on big JSON files with small queries.
Basically, for a query like `SELECT COUNT(*), AVG(overall) FROM books.json` with the Amazon Review Dataset, OctoSQL is twice as fast (3s vs 6s). That's a minor thing though (OctoSQL will slow down for more complicated queries, while for ClickHouse decoding the input is and remains the bottleneck).
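As a rough illustration of what the query above computes, here is a minimal Python sketch. It assumes `books.json` is newline-delimited JSON with a numeric `overall` field, and it uses a small made-up sample in place of the real dataset. The standard `json` module decodes each object in full, so this shows the result of the aggregation, not the lazy single-field parsing described in the comment.

```python
import io
import json


def count_and_avg(lines, field="overall"):
    """Return (row count, average of `field`) over newline-delimited JSON."""
    count = 0      # COUNT(*): every record counts
    total = 0.0
    seen = 0       # AVG skips records where the field is missing or null
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        count += 1
        value = record.get(field)
        if value is not None:
            total += value
            seen += 1
    return count, (total / seen if seen else None)


# Made-up sample standing in for books.json (an assumption, not real data).
sample = io.StringIO(
    '{"overall": 5.0, "reviewText": "great"}\n'
    '{"overall": 3.0, "reviewText": "ok"}\n'
    '{"reviewText": "no rating"}\n'
)
result = count_and_avg(sample)
print(result)  # (3, 4.0)
```

For a real file, the same function could be called with an open file handle instead of the `StringIO` sample.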
- Steampipe – Select * from Cloud;
To add somewhat of a counterpoint to the other response, I've tried the Steampipe CSV plugin and got 50x slower performance vs OctoSQL[0], which is itself 5x slower than something like DataFusion[1]. The CSV plugin doesn't contact any external APIs so it should be a good benchmark of the plugin architecture, though it might just not be optimized yet.
That said, I don't imagine this ever being a bottleneck for the main use case of Steampipe - in that case I think the APIs themselves will always be the limiting part. But it does - potentially - speak to what you can expect if you'd like to extend your usage of Steampipe to more than just DevOps data.
[0]: https://github.com/cube2222/octosql
[1]: https://github.com/apache/arrow-datafusion
Disclaimer: author of OctoSQL
- Go runtime: 4 years later
Actually, folks just use gRPC or Yaegi in Go.
See Terraform[0], Traefik[1], or OctoSQL[2].
That said, I agree plugins would be welcome, especially for performance reasons, and also to be able to compile and load Go code into a running Go process (JIT-ish).
[0]: https://github.com/hashicorp/terraform
[1]: https://github.com/traefik/traefik
[2]: https://github.com/cube2222/octosql
Disclaimer: author of OctoSQL
- Run SQL on CSV, Parquet, JSON, Arrow, Unix Pipes and Google Sheet
- Beginner interested in learning SQL. Have a few question that I wasn’t able to find on google.
Through more magic, you COULD of course use stuff like Spark, or easier with programs like TextQL, sq, OctoSQL.
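To give a feel for what running SQL on a CSV file means, here is a minimal sketch using only Python's standard library. It is not how TextQL, sq, or OctoSQL work internally; it simply loads a CSV into an in-memory SQLite table and queries it. The `load_csv` helper, the `people` table, and the sample rows are all made up for illustration.

```python
import csv
import io
import sqlite3


def load_csv(conn, table, csv_file):
    """Create `table` from the CSV header row and insert the remaining rows."""
    reader = csv.reader(csv_file)
    header = next(reader)
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)


conn = sqlite3.connect(":memory:")

# Made-up CSV content; a real file opened with open(path, newline="") works too.
data = io.StringIO(
    "name,city,age\n"
    "Ada,London,36\n"
    "Grace,New York,45\n"
    "Linus,Helsinki,28\n"
)
load_csv(conn, "people", data)

# CSV values arrive as text, so cast before comparing numerically.
rows = conn.execute(
    "SELECT name FROM people WHERE CAST(age AS INTEGER) > 30 ORDER BY name"
).fetchall()
print(rows)  # [('Ada',), ('Grace',)]
```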
- How I Used DALL·E 2 to Generate the Logo for OctoSQL
Hey, author here, happy to answer any questions!
The logo was created for OctoSQL[0] and in the article you can find a lot of sample phrase-image combinations, as it describes the whole path (generation, variation, editing) I went down. Let me know what you think!
datasette
- Show HN: TextQuery – Query and Visualize Your CSV Data in Minutes
- Little Data: How do we query personal data? (2013)
I'm a fan of simonw's datasette/dogsheep ecosystem https://datasette.io/
- LaTeX and Neovim for technical note-taking
I use Anki the exact same way. After a lifetime of learning I have accepted that I will never read over anything I write for myself voluntarily - so my two options are:
1. Write an article so good I can publish it and look it over myself later on. I did this last year with https://andrew-quinn.me/fzf/, for example.
2. Create Anki cards out of the material. Use the builtin Card Browser or even https://datasette.io/ on the underlying SQLite database in a pinch to search for my notes any time I have to.
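Searching notes stored in a SQLite database can be sketched with Python's built-in `sqlite3` module. The example below builds a small in-memory stand-in; the `notes` table, its `flds` column, and the sample rows are assumptions made for illustration and may not match the layout of a real Anki collection file.

```python
import sqlite3

# In-memory stand-in for a notes database (schema is an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, flds TEXT)")
conn.executemany(
    "INSERT INTO notes (flds) VALUES (?)",
    [
        ("What does fzf do? Fuzzy-finds lines from stdin",),
        ("Capital of Finland? Helsinki",),
        ("How to pipe into FZF? ls | fzf",),
    ],
)


def search_notes(conn, term):
    """Return the text of every note containing `term`.

    SQLite's LIKE is case-insensitive for ASCII characters by default.
    """
    pattern = f"%{term}%"
    cursor = conn.execute(
        "SELECT flds FROM notes WHERE flds LIKE ? ORDER BY id", (pattern,)
    )
    return [row[0] for row in cursor]


for note in search_notes(conn, "fzf"):
    print(note)
```

When pointing a script like this at a real database file, working on a copy avoids interfering with the application that owns it.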
- Daily Price Tracking for Trader Joes
Were you aware of, or tempted by https://datasette.io/ for creating your solution?
- SQLite-Web: Web-based SQLite database browser written in Python
- Ask HN: What two software products should have a kid?
Browsing HN, GitHub and the like we get to see a huge variety of software products and code bases.
I often see products and think - if this product X, got together with Y, it would be pretty cool - kind of like if they had a kid together.
Not too literally, but more on the conceptual level - my level of programming is low.
E.g. Just some....
- pocketbase.io & datasette (+ some more charting) [https://pocketbase.io, https://datasette.io]
- Ask HN: Looking for a project to volunteer on? (February 2024)
You might like the Datasette project: https://datasette.io/
I don't think they are desperate for contributions but it's a welcoming environment and a fun project to hack on. You'll learn a lot just from reading the source and the incredibly informative PRs. The creator is a really talented developer with a great blog which shows up on the HN front page often.
- Stuff I Learned during Hanukkah of Data 2023
Last year I worked through the challenges using VisiData, Datasette, and Pandas. I walked through my thought process and solutions in a series of posts.
- What We Watched: A Netflix Engagement Report – About Netflix
> uploads of boring raw Excel data and receive a nice UI
- Ask HN: What are some unpopular technologies you wish people knew more about?
Don't overlook https://datasette.io/ even though it does much more than endpoints.
What are some alternatives?
duckdb - DuckDB is an in-process SQL OLAP Database Management System
nocodb - 🔥 🔥 🔥 Open Source Airtable Alternative
q - q - Run SQL directly on delimited files and multi-file sqlite databases
trdsql - CLI tool that can execute SQL queries on CSV, LTSV, JSON, YAML and TBLN. Can output to various formats.
sql.js-httpvfs - Hosting read-only SQLite databases on static file hosters like Github Pages
sqlitebrowser - Official home of the DB Browser for SQLite (DB4S) project. Previously known as "SQLite Database Browser" and "Database Browser for SQLite". Website at:
litestream - Streaming replication for SQLite.
textql - Execute SQL against structured text like CSV or TSV
Sequel-Ace - MySQL/MariaDB database management for macOS
sqlite-utils - Python CLI utility and library for manipulating SQLite databases
beekeeper-studio - Modern and easy to use SQL client for MySQL, Postgres, SQLite, SQL Server, and more. Linux, MacOS, and Windows.