| DictDataBase | quokka
---|---|---
Mentions | 9 | 23
Stars | 219 | 1,082
Growth | - | -
Activity | 7.8 | 8.3
Last commit | about 2 months ago | 7 months ago
Language | Python | Python
License | MIT License | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
DictDataBase
-
LiliDB (JSON-based database)
Nice! Quite similar to a database I am working on: https://github.com/mkrd/DictDataBase
-
Sunday Daily Thread: What's everyone working on this week?
So in the end, this is an attempt to build a truly document-oriented equivalent to SQLite, and I think there is potential. I would love to run the same benchmark with the SQLite solution you proposed and compare the two, but I am short on time right now… Here is the GitHub link if you are interested: https://github.com/mkrd/DictDataBase
-
Reading data from a JSON file 5000 times faster with DictDataBase
Hi guys! I'm working on DictDataBase, and just wanted to share some performance numbers which might be interesting to you.
-
Would it make sense for Python to have a NoSQL database as part of the standard library?
Someone recently posted their project; I'd imagine you're after something like this? I haven't tried it, but I have a drop-in use case I might fool around with (a giant JSON flat-file store). DictDataBase
-
I updated DictDataBase, it's like SQLite but for JSON, now a lot better!
The project is available on [Github](https://github.com/mkrd/DictDataBase) and [PyPi](https://pypi.org/project/dictdatabase) if you wanna take a look!
-
This Week In Python
DictDataBase – A Python NoSQL database that uses dicts and provides thread and process safety
-
I made DictDataBase, it's like SQLite but for JSON!
If a thread holds a lock (so is in a with DDBSession(): block) for longer than 30 seconds, its hold is automatically revoked, but DDBSession.write doesn't detect this case. I would definitely recommend using operating system functions designed for file locking instead.
-
I've created DictDataBase, a JSON file based serverless DB for concurrent environments!
Here is the Github Link and the PyPi Page if you wanna take a look!
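One of the comments above recommends relying on operating system file locking rather than a timeout-based lock. As a rough illustration of that idea only, here is a minimal sketch using Python's standard `fcntl.flock` (POSIX only). It is not DictDataBase's implementation; the helper names `locked_file` and `increment_counter` are hypothetical and chosen just for this example.

```python
import fcntl
import json
from contextlib import contextmanager


@contextmanager
def locked_file(path, mode="r+"):
    """Open `path` and hold an exclusive advisory lock until the block exits.

    Hypothetical helper for illustration; works on POSIX systems only.
    """
    with open(path, mode) as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # blocks until the lock is granted
        try:
            yield f
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)


def increment_counter(path):
    """Read-modify-write a JSON document while holding the lock."""
    with locked_file(path) as f:
        data = json.load(f)
        data["count"] += 1
        f.seek(0)
        json.dump(data, f)
        f.truncate()  # drop leftover bytes if the new content is shorter
        f.flush()     # make sure the data is written before unlocking
        return data["count"]
```

Because the kernel releases the lock when the file is closed or the process dies, no revocation timeout is needed. Note that `flock` locks are advisory: they only protect against other code that also takes the lock.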
quokka
-
How Query Engines Work
An awesome read!
Something related that I found out about from HN a few months back is another engine called quokka. It's particularly interesting and applicable how quokka schedules distributed queries to outperform Spark https://github.com/marsupialtail/quokka/blob/master/blog/why...
- Quokka – Distributed Polars on Ray
-
Algorithmic Trading with Go
Hi Justin, you might be interested in my blog: https://github.com/marsupialtail/quokka/blob/master/blog/bac... advocating a cloud based approach.
You don't have to use the system I am building, but it's worth thinking about that design.
-
Daft: A High-Performance Distributed Dataframe Library for Multimodal Data
SQL support is very challenging.
I work on Quokka (https://github.com/marsupialtail/quokka). I support Iceberg reads. We have recently been adding SQL support by parsing the DuckDB logical plan, though that is very challenging as well.
The Python world lacks a standard for a plug and play SQL query optimizer. Apache Calcite is good for the JVM world, but not great if you are trying to cut out the JVM.
- Why your dataframe library needs to understand vector embeddings
-
The Inner Workings of Distributed Databases
In case people are interested, I wrote a post about fault tolerance strategies of data systems like Spark and Flink: https://github.com/marsupialtail/quokka/blob/master/blog/fau...
The key difference here is that these systems don't store data, so fault tolerance means recovering within a query instead of not losing data.
-
Launch HN: DAGWorks – ML platform for data science teams
would love to collaborate on an integration with pyquokka (https://github.com/marsupialtail/quokka) once I put out a stable release end of this month :-)
-
is spark always your go to solution ?
Then you should keep an eye on quokka. This may become the "Spark" for Polars/DuckDB. It seems to be under active development though I'm not sure how stable it is.
- Distributed fault tolerance made simple
- Fault tolerance for distributed data systems is quite simple
What are some alternatives?
sparrowci_web - ci.sparrowhub.io website
opteryx - 🦖 A SQL-on-everything Query Engine you can execute over multiple databases and file formats. Query your data, where it lives.
SparrowCI - SparrowCI - super fun and flexible CI system with many programming languages support
cempaka - "Write a trading bot which buys low and sells high." Sounds simple enough, right?
pysonDB - A Simple, ☁️ Lightweight, 💪 Efficient JSON based database for 🐍 Python. PysonDB-V2 has been released ⬇️
awesome-pipeline - A curated list of awesome pipeline toolkits inspired by Awesome Sysadmin
datasette - An open source multi-tool for exploring and publishing data
spyql - Query data on the command line with SQL-like SELECTs powered by Python expressions
dlib - The python dictionary library.
pg8000 - A Pure-Python PostgreSQL Driver
PiGreenHouse - Automated Greenhouse with a raspberry pi 4 and Pimoroni breakoutgarden
blog - Some notes on things I find interesting and important.