pysimdjson vs datasette

Project	pysimdjson	datasette
Mentions	6	187
Stars	628	8,881
Growth	-	-
Activity	5.3	9.2
Latest Commit	3 months ago	7 days ago
Language	Python	Python
License	GNU General Public License v3.0 or later	Apache License 2.0

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

pysimdjson

Posts with mentions or reviews of pysimdjson. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-03-18.

Analyzing multi-gigabyte JSON files locally
14 projects | news.ycombinator.com | 18 Mar 2023
I Use C When I Believe in Memory Safety
5 projects | news.ycombinator.com | 5 Feb 2023

Its magic function wrapping comes at a cost, trading runtime performance for ease of use. When you have a single C++ function to call that will run for a "long" time, pybind all the way. But pysimdjson tends to call a single function very quickly, and the overhead of a single function call is orders of magnitude higher than with cython when being explicit with types and signatures. Wrap a class in pybind11 and cython and compare the stack trace between the two, and the difference is startling.
Ex: https://github.com/TkTech/pysimdjson/issues/73
Processing JSON 2.5x faster than simdjson with msgspec
5 projects | /r/Python | 3 Oct 2022

simdjson
[package-find] lsp-bridge
5 projects | /r/emacs | 23 May 2022

You are aware that simdjson is available in Python if you really need some JSON crunching? That said, the json module in Python is itself implemented in C, so I don't understand why you think Python is slow there.
The fastest tool for querying large JSON files is written in Python (benchmark)
16 projects | news.ycombinator.com | 12 Apr 2022

json: 113.79130696877837 ms
While `orjson` is faster than `ujson`/`json` here, it's only ~6% faster (in this benchmark). `simdjson` and `msgspec` (my library, see https://jcristharif.com/msgspec/) are much faster because they avoid creating PyObjects for fields that are never used.
If spyql's query engine can determine the fields it will access statically before processing, you might find using `msgspec` for JSON gives a nice speedup (it'll also type check the JSON if you know the type of each field). If this information isn't known though, you may find using `pysimdjson` (https://pysimdjson.tkte.ch/) gives an easy speed boost, as it should be more of a drop-in for `orjson`.
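As a rough illustration of the point above about skipping unused fields, the sketch below decodes only a `name` field with `msgspec` and compares the result with the standard `json` module. The payload, the `NameOnly` struct, and the helper names are made up for this example, and it assumes `msgspec` is installed (`pip install msgspec`); if it is not, only the stdlib path runs.

```python
import json

try:
    import msgspec  # third-party; assumed to be installed for the typed path
except ImportError:
    msgspec = None

# Hypothetical payload: only "name" is needed, the other fields are never used.
RAW = b'{"id": 7, "name": "widget", "tags": ["a", "b"], "meta": {"x": 1}}'


def name_via_json(raw: bytes) -> str:
    # The stdlib decoder builds Python objects for every field, used or not.
    return json.loads(raw)["name"]


if msgspec is not None:

    class NameOnly(msgspec.Struct):
        # Only the declared field is materialized; by default msgspec
        # skips unknown fields instead of raising an error.
        name: str

    def name_via_msgspec(raw: bytes) -> str:
        return msgspec.json.decode(raw, type=NameOnly).name


print(name_via_json(RAW))
if msgspec is not None:
    print(name_via_msgspec(RAW))
```

Both paths should return the same string; any speed difference depends on the payload shape and would need to be measured on real data.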
How I cut GTA Online loading times by 70%
7 projects | /r/programming | 28 Feb 2021

I don't think JSON is really the problem - parsing 10MB of JSON is not so slow. For example, using Python's json.load takes about 800ms for a 47MB file on my system; using something like simdjson cuts that down to ~70ms.
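A minimal sketch of how a timing like the one quoted above could be reproduced with only the standard library. The synthetic payload and the `time_json_loads` helper are assumptions for illustration, not the commenter's actual file or method, so the numbers will differ from those in the post.

```python
import json
import time


def time_json_loads(payload: bytes, repeats: int = 5) -> float:
    """Return the best wall-clock time in seconds for json.loads(payload)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        json.loads(payload)
        best = min(best, time.perf_counter() - start)
    return best


# Synthetic stand-in for a large JSON document (hypothetical shape).
records = [
    {"id": i, "name": f"item-{i}", "values": [i, i + 1, i + 2]}
    for i in range(50_000)
]
payload = json.dumps(records).encode("utf-8")

elapsed = time_json_loads(payload)
print(f"{len(payload) / 1e6:.1f} MB parsed in {elapsed * 1000:.1f} ms")
```

Taking the best of several runs reduces noise from caching and scheduling; swapping `json.loads` for another parser's equivalent call would give a like-for-like comparison on the same payload.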

datasette

Posts with mentions or reviews of datasette. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-19.

Ask HN: High quality Python scripts or small libraries to learn from
12 projects | news.ycombinator.com | 19 Apr 2024

Simon Willison's github would be a great place to get started imo -
https://github.com/simonw/datasette
Show HN: TextQuery – Query and Visualize Your CSV Data in Minutes
3 projects | news.ycombinator.com | 2 Apr 2024
Little Data: How do we query personal data? (2013)
1 project | news.ycombinator.com | 1 Mar 2024

I'm a fan of simonw's datasette/dogsheep ecosystem https://datasette.io/
LaTeX and Neovim for technical note-taking
10 projects | news.ycombinator.com | 21 Feb 2024

I use Anki the exact same way. After a lifetime of learning I have accepted that I will never read over anything I write for myself voluntarily - so my two options are:
1. Write an article so good I can publish it and look it over myself later on. I did this last year with https://andrew-quinn.me/fzf/, for example.
2. Create Anki cards out of the material. Use the builtin Card Browser or even https://datasette.io/ on the underlying SQLite database in a pinch to search for my notes any time I have to.
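The second option amounts to running a text search against a SQLite file. A minimal sketch of that idea with Python's built-in `sqlite3` module follows; the `notes` table, its columns, and the sample rows are hypothetical and do not reflect Anki's real schema.

```python
import sqlite3


def search_notes(conn: sqlite3.Connection, term: str) -> list:
    """Return (id, text) rows whose text contains term."""
    # SQLite's LIKE is case-insensitive for ASCII characters by default.
    cur = conn.execute(
        "SELECT id, text FROM notes WHERE text LIKE ? ORDER BY id",
        (f"%{term}%",),
    )
    return cur.fetchall()


# In-memory database standing in for a real notes file (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, text TEXT)")
conn.executemany(
    "INSERT INTO notes (text) VALUES (?)",
    [
        ("fzf fuzzy finder basics",),
        ("SQLite full-text search",),
        ("Anki card templates",),
    ],
)
conn.commit()

matches = search_notes(conn, "sqlite")
print(matches)
```

Datasette offers the same kind of query through a web interface when it is pointed at a database file on disk.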
Daily Price Tracking for Trader Joes
7 projects | news.ycombinator.com | 8 Feb 2024

Were you aware of, or tempted by https://datasette.io/ for creating your solution?
SQLite-Web: Web-based SQLite database browser written in Python
7 projects | news.ycombinator.com | 7 Feb 2024
Ask HN: What two software products should have a kid?
2 projects | news.ycombinator.com | 5 Feb 2024

Browsing HN, GitHub and the like we get to see a huge variety of software products and code bases.
I often see products and think - if this product X, got together with Y, it would be pretty cool - kind of like if they had a kid together.
Not too literally, but more on the conceptual level - my level of programming is low.
E.g. Just some....
- pocketbase & datasette (+ with some more charting) [https://pocketbase.io, https://datasette.io]
Ask HN: Looking for a project to volunteer on? (February 2024)
15 projects | news.ycombinator.com | 1 Feb 2024

You might like the Datasette project: https://datasette.io/
I don't think they are desperate for contributions but it's a welcoming environment and a fun project to hack on. You'll learn a lot just from reading the source and the incredibly informative PRs. The creator is a really talented developer with a great blog which shows up on the HN front page often.
Stuff I Learned during Hanukkah of Data 2023
5 projects | dev.to | 18 Dec 2023

Last year I worked through the challenges using VisiData, Datasette, and Pandas. I walked through my thought process and solutions in a series of posts.
What We Watched: A Netflix Engagement Report – About Netflix
1 project | news.ycombinator.com | 12 Dec 2023

> uploads of boring raw excel data and receive a nice UI
https://datasette.io/

What are some alternatives?

When comparing pysimdjson and datasette you can also consider the following projects:

orjson - Fast, correct Python JSON library supporting dataclasses, datetimes, and numpy

nocodb - 🔥 🔥 🔥 Open Source Airtable Alternative

cysimdjson - Very fast Python JSON parsing library

duckdb - DuckDB is an in-process SQL OLAP Database Management System

ultrajson - Ultra fast JSON decoder and encoder written in C with Python bindings

sql.js-httpvfs - Hosting read-only SQLite databases on static file hosters like Github Pages

Fast JSON schema for Python - Fast JSON schema validator for Python.

litestream - Streaming replication for SQLite.

lupin is a Python JSON object mapper - Python document object mapper (load python object from JSON and vice-versa)

Sequel-Ace - MySQL/MariaDB database management for macOS

PyValico - Small python wrapper around https://github.com/rustless/valico

beekeeper-studio - Modern and easy to use SQL client for MySQL, Postgres, SQLite, SQL Server, and more. Linux, MacOS, and Windows.
