DictDataBase
spyql
| DictDataBase | spyql |
---|---|---|
Mentions | 9 | 23 |
Stars | 219 | 902 |
Growth | - | - |
Activity | 7.8 | 0.0 |
Last commit | about 2 months ago | over 1 year ago |
Language | Python | Jupyter Notebook |
License | MIT License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
DictDataBase
-
LiliDB (JSON-based database)
Nice! Quite similar to a database I am working on: https://github.com/mkrd/DictDataBase
-
Sunday Daily Thread: What's everyone working on this week?
So in the end, this is an attempt to build a truly document-oriented equivalent to SQLite, and I think there is potential. I would love to run the same benchmark with the SQLite solution you proposed and compare the two, but my time is limited right now… Here is the GitHub link if you are interested: https://github.com/mkrd/DictDataBase
-
Reading data from a JSON file 5000 times faster with DictDataBase
Hi guys! I'm working on DictDataBase, and just wanted to share some performance numbers which might be interesting to you.
-
Would it make sense for Python to have a NoSQL database as part of the standard library?
Someone recently posted their project; I'd imagine you're after something like this? I haven't tried it, but I have a drop-in use case I might fool around with (a giant JSON flat-file store). DictDataBase
-
I updated DictDataBase, it's like SQLite but for JSON, now a lot better!
The project is available on [GitHub](https://github.com/mkrd/DictDataBase) and [PyPI](https://pypi.org/project/dictdatabase) if you wanna take a look!
-
This Week In Python
DictDataBase – A Python NoSQL database that uses dicts and provides thread and process safety
-
I made DictDataBase, it's like SQLite but for JSON!
If a thread holds a lock (that is, it is inside a with DDBSession(): block) for longer than 30 seconds, its hold is automatically revoked, but DDBSession.write doesn't detect this case. I would definitely recommend using operating system functions designed for file locking instead.
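To illustrate the kind of operating system file locking the comment above recommends, here is a minimal sketch using POSIX advisory locks through Python's `fcntl.flock`. It is not DictDataBase's implementation or API: the helper names `locked_file` and `update_json` are hypothetical, and the approach assumes a Unix-like system and that every process accessing the file cooperates by taking the same lock.

```python
import fcntl
import json
import os
from contextlib import contextmanager


@contextmanager
def locked_file(path, exclusive):
    """Open `path` and hold an advisory flock on it for the duration of the block.

    Hypothetical helper for illustration only (Unix-only, cooperative locking).
    """
    # O_CREAT creates the file if it is missing, without truncating existing data.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+", encoding="utf-8") as f:
        # Blocks until the lock is granted: shared for readers, exclusive for writers.
        fcntl.flock(f, fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH)
        try:
            yield f
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)


def update_json(path, mutate):
    """Read, modify, and rewrite a JSON file while holding an exclusive lock."""
    with locked_file(path, exclusive=True) as f:
        raw = f.read()
        data = json.loads(raw) if raw else {}
        mutate(data)
        # Rewrite the file in place while the lock is still held.
        f.seek(0)
        f.truncate()
        json.dump(data, f)
        f.flush()
        os.fsync(f.fileno())
        return data
```

With this kind of lock there is no timeout-based revocation to detect: the kernel keeps the lock until it is released explicitly or the file descriptor is closed, which also happens when the holding process exits.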
-
I've created DictDataBase, a JSON file based serverless DB for concurrent environments!
Here is the GitHub link and the PyPI page if you wanna take a look!
spyql
-
Fq: Jq for Binary Formats
I prefer a SQL-like format. It's not as complete but it covers most of the day-to-day use cases. Take a look at https://github.com/dcmoura/spyql (I am the author). Congrats on fq!
-
Command-line data analytics made easy with SPyQL
SPyQL documentation: spyql.readthedocs.io
-
This Week In Python
spyql – Query data on the command line with SQL-like SELECTs powered by Python expressions
- Command-line data analytics made easy
-
Jc – JSONifies the output of many CLI tools
This is great!
I am the author of SPyQL [1]. Combining JC with SPyQL, you can easily query the JSON output and run Python commands on top of it from the command line :-) You can do aggregations and so forth in a much simpler and more intuitive way than with jq.
I just wrote a blogpost [2] that illustrates it. It is more focused on CSV, but the commands would be the same if you were working with JSON.
[1] https://github.com/dcmoura/spyql
- The fastest command-line tools for querying large JSON datasets
-
Working with more than 10gb csv
You can import the data into a PostgreSQL/MySQL/SQLite/... database and then query the database. However, even with the right choice of indexes, it might take a while to run queries on a table with hundreds of millions of records. You can easily import your data into these databases with SpyQL: $ spyql "SELECT * FROM csv TO sql(table=my_table_name)" | sqlite3 my.db (you would need to create the table my_table_name before running the command).
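As a rough alternative to the command-line route described above, the same idea (load the CSV into SQLite, then query it there) can be sketched with only the Python standard library. This does not use SpyQL; the function name `load_csv_into_sqlite` and the batch size are illustrative assumptions, and the sketch assumes the CSV has a header row and that every row has the same number of fields. All values are stored as text.

```python
import csv
import sqlite3
from itertools import islice


def load_csv_into_sqlite(csv_path, db_path, table, batch_size=50_000):
    """Stream a CSV file with a header row into a SQLite table; return rows inserted.

    Hypothetical helper for illustration: rows are read in batches so the
    whole file never has to fit in memory.
    """
    con = sqlite3.connect(db_path)
    try:
        with open(csv_path, newline="", encoding="utf-8") as f:
            reader = csv.reader(f)
            header = next(reader)
            # Identifiers cannot be bound as parameters, so quote them explicitly.
            cols = ", ".join('"' + c.replace('"', '""') + '"' for c in header)
            tbl = '"' + table.replace('"', '""') + '"'
            con.execute(f"CREATE TABLE IF NOT EXISTS {tbl} ({cols})")
            marks = ", ".join("?" for _ in header)
            insert = f"INSERT INTO {tbl} VALUES ({marks})"
            total = 0
            while True:
                batch = list(islice(reader, batch_size))
                if not batch:
                    break
                with con:  # commit one transaction per batch
                    con.executemany(insert, batch)
                total += len(batch)
        return total
    finally:
        con.close()
```

Once loaded, the table can be queried with ordinary SQL, for example `SELECT COUNT(*) FROM my_table_name`; numeric columns need a `CAST` because this sketch creates untyped columns.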
-
ClickHouse Cloud is now in Public Beta
https://github.com/dcmoura/spyql/blob/master/notebooks/json_...
And ClickHouse looks like a normal relational database - there is no need for multiple components for different tiers (like in Druid), no need for manual partitioning into "daily", "hourly" tables (like you do in Spark and Bigquery), no need for lambda architecture... It's refreshing how something can be both simple and fast.
- A SQLite extension for reading large files line-by-line
-
I want to convert a large JSON file into Tabular Format.
I thought this library was pretty nifty for json. It's also relatively fast compared to most json parsers: https://github.com/dcmoura/spyql
What are some alternatives?
sparrowci_web - ci.sparrowhub.io website
prql - PRQL is a modern language for transforming data — a simple, powerful, pipelined SQL replacement
SparrowCI - SparrowCI - super fun and flexible CI system with many programming languages support
malloy - Malloy is an experimental language for describing data relationships and transformations.
pysonDB - A Simple, ☁️ Lightweight, 💪 Efficient JSON based database for 🐍 Python. PysonDB-V2 has been released ⬇️
tresql - Shorthand SQL/JDBC wrapper language, providing nested results as JSON and more
datasette - An open source multi-tool for exploring and publishing data
Preql - An interpreted relational query language that compiles to SQL.
dlib - The python dictionary library.
prosto - Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby
PiGreenHouse - Automated Greenhouse with a raspberry pi 4 and Pimoroni breakoutgarden
pxi - 🧚 pxi (pixie) is a small, fast, and magical command-line data processor similar to jq, mlr, and awk.