huniq
fast-sqlite3-inserts
| | huniq | fast-sqlite3-inserts |
|---|---|---|
| Mentions | 3 | 11 |
| Stars | 230 | 363 |
| Growth | - | - |
| Activity | 2.7 | 0.0 |
| Last commit | 4 months ago | about 1 year ago |
| Language | Rust | Rust |
| License | - | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
huniq
- Zet 1.0 is out (compare to uniq and comm)
How does it compare with huniq and runiq?
- I/O is no longer the bottleneck
`sort | uniq` is really slow for this, as it has to sort the entire input first. I use `huniq` which is way faster for this. I'm sure there are many similar options.
https://github.com/koraa/huniq
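The comment's point about `sort | uniq` can be illustrated with a hash-set filter: it remembers the lines it has already seen, so it makes a single pass over the input and never sorts it. This is a minimal sketch of the general technique, not huniq's actual implementation; the function name `unique_lines` is hypothetical.

```python
from typing import Iterable, Iterator


def unique_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield each distinct line once, in first-seen order.

    Membership checks against a set are O(1) on average, so the whole
    pass is roughly linear in the input size, with no sorting step.
    """
    seen = set()
    for line in lines:
        if line not in seen:
            seen.add(line)
            yield line


if __name__ == "__main__":
    import sys

    # Usage: python dedup.py < input.txt
    for line in unique_lines(sys.stdin):
        sys.stdout.write(line)
```

The trade-off is memory: the set grows with the number of distinct lines, whereas `sort` can spill to disk.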
- What’s your favorite shell one liner?
For better speed, check out https://github.com/koraa/huniq
fast-sqlite3-inserts
- SQLite performance tuning: concurrent reads, multiple GBs and 100k SELECTs/s
I am experimenting with SQLite, where I try inserting 1B rows in under a minute. The current best is inserting 100M rows in 23s. I cut many corners to get performance, but the tweaks might suit your workload.
I have explained my rationale and approach here - https://avi.im/blag/2021/fast-sqlite-inserts/
the repo link - https://github.com/avinassh/fast-sqlite3-inserts
- I/O is no longer the bottleneck
I am working on a project [0] to generate 1 billion rows in SQLite in under a minute, and so far have inserted 100M rows in 33 seconds. First, I generate the rows and insert them into an in-memory database, then flush them to disk at the end. The flush takes only 2 seconds, so over 90% of the time is spent generating rows and adding them to the in-memory B-tree.
For Python optimisation, have you tried PyPy? I ran my same code (zero changes) using PyPy, and I got 3.5x better speed.
I published my findings here [1].
[0] - https://github.com/avinassh/fast-sqlite3-inserts
[1] - https://avi.im/blag/2021/fast-sqlite-inserts/
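The approach described in this comment (fill an in-memory database, then flush it to disk once) might look roughly like the sketch below, written with Python's standard `sqlite3` module. The table name `user`, its columns, the generated values, and the batch size are all assumptions made for illustration; they are not taken from the linked project.

```python
import os
import sqlite3
import tempfile


def build_in_memory_then_flush(path, n_rows, batch_size=10_000):
    """Fill an in-memory database in batches, then copy it to `path` once."""
    mem = sqlite3.connect(":memory:")
    # Hypothetical schema, chosen only for illustration.
    mem.execute(
        "CREATE TABLE user ("
        "id INTEGER PRIMARY KEY, area TEXT, age INTEGER, active INTEGER)"
    )
    for start in range(0, n_rows, batch_size):
        stop = min(start + batch_size, n_rows)
        # Synthetic rows; a real workload would generate its own data here.
        rows = [
            (f"{i % 1_000_000:06d}", 5 * (1 + i % 3), i % 2)
            for i in range(start, stop)
        ]
        with mem:  # one transaction per batch, committed on exit
            mem.executemany(
                "INSERT INTO user (area, age, active) VALUES (?, ?, ?)", rows
            )
    disk = sqlite3.connect(path)
    try:
        mem.backup(disk)  # the single flush from memory to disk
    finally:
        disk.close()
        mem.close()


def demo(n_rows=2_500):
    """Build a small database in a temporary directory; return its row count."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        build_in_memory_then_flush(path, n_rows, batch_size=1_000)
        conn = sqlite3.connect(path)
        try:
            return conn.execute("SELECT COUNT(*) FROM user").fetchone()[0]
        finally:
            conn.close()
```

`Connection.backup` copies the whole database in one call, which corresponds to the short final flush the comment mentions; the batched `executemany` loop is where most of the time would go.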
- Ask HN: Which personal projects got you hired?
- Is there any language that is as similar as possible to Python in syntax, readability, and features, but is statically typed?
I have a side project where I tried to insert one billion rows into SQLite. I was able to insert 100 million rows with Python in under 210 seconds. The same thing with PyPy took 120 seconds. I am wondering what kind of speed boost I would get with Cython.
- Ask for benchmark. The owner can’t verify an 18% perf gain, could you?
- Inserting One Billion Rows in SQLite Under A Minute
Measure, measure, measure! There is a PR which made really minor changes, but it got a 2x speed boost with the CPython version.
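As a small illustration of the "measure" advice, the sketch below times two ways of inserting the same rows into an in-memory SQLite database. The table `t`, its columns, and the function names are hypothetical, and the two variants are not the change made in the PR mentioned above; they only show how a minor code difference can be compared with a timer instead of guessed at.

```python
import sqlite3
import time


def insert_one_by_one(conn, rows):
    """Issue one execute() call per row."""
    for row in rows:
        conn.execute("INSERT INTO t (a, b) VALUES (?, ?)", row)
    conn.commit()


def insert_batched(conn, rows):
    """Hand the whole batch to executemany()."""
    conn.executemany("INSERT INTO t (a, b) VALUES (?, ?)", rows)
    conn.commit()


def measure(insert_fn, n_rows=100_000):
    """Return (elapsed_seconds, rows_inserted) for one insert strategy."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (a INTEGER, b TEXT)")
    rows = [(i, str(i)) for i in range(n_rows)]  # built before timing starts
    start = time.perf_counter()
    insert_fn(conn, rows)
    elapsed = time.perf_counter() - start
    count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    conn.close()
    return elapsed, count


if __name__ == "__main__":
    for fn in (insert_one_by_one, insert_batched):
        elapsed, count = measure(fn)
        print(f"{fn.__name__}: {count} rows in {elapsed:.3f}s")
```

Timings vary by machine and interpreter, so each variant is best run several times before drawing a conclusion.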
- Inserting One Billion Rows in SQLite Under a Minute
- Weekly Coders, Hackers & All Tech related thread - 17/07/2021
- How we achieved write speeds of 1.4 million rows per second
[Somewhat related] Recently, I was benchmarking SQLite inserts and managed to insert about 3M records per second (100M in 33-ish seconds) on my local machine (https://github.com/avinassh/fast-sqlite3-inserts). Of course, the comparison is not apples to apples, but I am sharing it here in case anyone finds it interesting.
What are some alternatives?
fzy - :mag: A simple, fast fuzzy finder for the terminal
tsbs - Time Series Benchmark Suite, a tool for comparing and evaluating databases for time series data
RAMCloud - **No Longer Maintained** Official RAMCloud repo
julia - The Julia Programming Language
repo
plum - Multiple dispatch in Python
napkin-math - Techniques and numbers for estimating system's performance from first-principles
sqlite_micro_logger_arduino - Fast and Lean Sqlite database logger for Arduino UNO and above
share-file-systems - Use a Windows/OSX like GUI in the browser to share files cross OS privately. No cloud, no server, no third party.
remixdb - RemixDB: A read- and write-optimized concurrent KV store. Fast point and range queries. Extremely low write-amplification.
runiq - An efficient way to filter duplicate lines from input, à la uniq.
dynamic-dns - An automated dynamic DNS solution for Docker and DigitalOcean