sedona
Tile38
| | sedona | Tile38 |
|---|---|---|
| Mentions | 8 | 9 |
| Stars | 1,771 | 8,902 |
| Growth | 2.5% | - |
| Activity | 9.6 | 7.0 |
| Last commit | 8 days ago | 8 days ago |
| Language | Java | Go |
| License | Apache License 2.0 | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
sedona
- Show HN: TG – Fast geometry library in C
This is awesome! I wonder how feasible it is to include TG in Apache Sedona (https://github.com/apache/sedona).
Although Sedona runs as a distributed system, TG might speed up local in-memory geometric computation on each worker node. Let me know your thoughts!
- Apache Sedona: Big Geospatial Data and AI Engine
- The Apache Software Foundation Announces New Top-Level Project Apache Sedona
Flexible deployment options, including standalone, local, and cluster modes.
ADDITIONAL RESOURCES
Website: https://sedona.apache.org/
- Is geomesa still the way to go for large scale geospatial data analytics?
Have you looked into Apache Sedona? It's good for spatial queries on DataFrames: https://sedona.apache.org/
- Apache Sedona for Processing Geospatial Data at Scale
Tile38
- Show HN: TG – Fast geometry library in C
[2] https://github.com/tidwall/tile38
- PostgreSQL: No More Vacuum, No More Bloat
Experimental format to help readability of a long rant:
1.
According to the OP, there's a "terrifying tale of VACUUM in PostgreSQL," dating back to "a historical artifact that traces its roots back to the Berkeley Postgres project." (1986?)
2.
Maybe the whole idea of "use X, it has been battle-tested for [TIME], is robust, all the bugs have been and keep being fixed," etc., should not really be that attractive or realistic for at least a large subset of projects.
3.
In the case of Postgres, on top of piles of "historic code" and cruft, there's the fact that each user of Postgres installs and runs a huge software artifact with hundreds or even thousands of features and dependencies, of which every particular user may only use a tiny subset.
4.
In Kleppmann's DDIA [1], after explaining why the declarative SQL language is "better," he writes: "in databases, declarative query languages like SQL turned out to be much better than imperative query APIs." I find this footnote to the paragraph a bit ironic: "IMS and CODASYL both used imperative query APIs. Applications typically used COBOL code to iterate over records in the database, one record at a time." So, SQL was better than CODASYL and COBOL in a number of ways... big surprise?
Postgres' own PL/pgSQL [2] is a language that (I imagine) most people would rather NOT use: hence a bunch of alternatives, including PL/v8, on its own a huge mass of additional complexity. SQL is definitely "COBOLESQUE" itself.
5.
Could we come up with something more minimal than SQL that looks less like COBOL? (Hopefully also getting rid of ORMs in the process.) Also, I have found it inspiring to see some people creating databases for themselves. Perhaps not a bad idea for small applications? For instance, I found BuntDB [3], which the developer seems to be using to run his own business [4]. Also, HYTRADBOI? :-) [5].
6.
A usual objection to using anything other than an established relational DB is "creating a database is too difficult for the average programmer." How about debugging PostgreSQL issues, developing new storage engines for it, or even building expertise on how to set up the instances properly and keep them alive and performant? Is that easier?
I personally feel more capable of implementing a small, well-tested, problem-specific B-Tree than of learning how to develop Postgres extensions, becoming an expert in its configuration and internals, or debugging its many issues.
Another common opinion is "SQL is easy to use for non-programmers." But every person who knows SQL had to learn it somehow. I'm 100% confident that anyone able to learn SQL should be able to learn a simple, domain-specific programming language designed for querying DBs. And how many of the people who are not able to program imperatively would be able to read a SQL EXPLAIN output and fix deficient queries? If they can, that supports even more the idea that they should be able to learn something other than SQL.
----
1: https://dataintensive.net/
2: https://www.postgresql.org/docs/7.3/plpgsql-examples.html
3: https://github.com/tidwall/buntdb
4: https://tile38.com/
5: https://www.hytradboi.com/
- Your Data Fits in RAM
I actually worked on a project that did this. We used a database called "Tile38" [1] which used an R-Tree to make geospatial queries speedy. It was pretty good.
Our dataset was ~150 GiB, I think? All in RAM. Took a while to start the server, as it all came off disk. Could have been faster. (It borrowed Redis's query language, and its storage was just "store the commands to recreate the DB, literally", IIRC. Dead simple, but a lot of slack/wasted space there.)
Overall not a bad database. Latency serving out of RAM was, as one should/would expect, very speedy!
[1]: https://tile38.com/
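The "store the commands to recreate the DB" idea in the comment above can be sketched roughly as follows. This is a minimal illustration of a command log that is replayed on startup, not Tile38's actual on-disk format or API: the `TinyStore` class, its `SET`/`DEL` commands, and the whitespace-free keys and values are all assumptions made for the example.

```python
import io


class TinyStore:
    """Hypothetical in-memory key/value store persisted as a log of write commands."""

    def __init__(self, log):
        self.data = {}
        self.log = log  # any text file-like object supporting write() and seek()

    def _apply(self, parts):
        # Mutate the in-memory state for one already-tokenized command.
        cmd = parts[0].upper()
        if cmd == "SET":
            self.data[parts[1]] = parts[2]
        elif cmd == "DEL":
            self.data.pop(parts[1], None)
        else:
            raise ValueError(f"unknown command: {parts[0]}")

    def execute(self, line):
        # Apply first so that invalid commands never reach the log.
        self._apply(line.split())
        self.log.write(line + "\n")

    @classmethod
    def replay(cls, log):
        # Rebuild the state by re-running every logged command in order.
        store = cls(log)
        log.seek(0)
        for line in log:
            if line.strip():
                store._apply(line.split())
        log.seek(0, io.SEEK_END)  # keep appending after the existing entries
        return store


# Usage: an in-memory buffer stands in for a file on disk.
log = io.StringIO()
store = TinyStore(log)
store.execute("SET truck1 33.5,-112.2")
store.execute("SET truck1 33.6,-112.1")  # overwrites the previous position
store.execute("DEL truck1")
store.execute("SET truck2 40.7,-74.0")

restored = TinyStore.replay(log)
print(restored.data)
print(len(log.getvalue().splitlines()), "logged commands")
```

In this sketch the log holds four commands while only one key is still live, which is the kind of slack the comment mentions, and startup time grows with the length of the log rather than with the size of the live data.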
- Redcon - Redis compatible server framework for Rust
I ported it from Go and use it for my Tile38 project.
- Tile38 - a geolocation data store, spatial index, and realtime geofence
- Path hints for B-trees can bring a performance increase of 150% – 300%
- How do I implement push notifications on a 10 mile radius from a certain user?
What are some alternatives?
esProc - esProc SPL is a scripting language for data processing, with a rich, well-designed function library and powerful syntax. It can be executed from a Java program through a JDBC interface or run computations independently.
vitess - Vitess is a database clustering system for horizontal scaling of MySQL.
geos-wasm - WASM + JS port of GEOS
go-mysql-elasticsearch - Sync MySQL data into elasticsearch
Graphhopper - Open source routing engine for OpenStreetMap. Use it as Java library or standalone web server.
ledisdb - A high performance NoSQL Database Server powered by Go
tg - Geometry library for C - Fast point-in-polygon
goleveldb - LevelDB key/value database in Go.
robust-predicates - Fast robust predicates for computational geometry in JavaScript
groupcache - groupcache is a caching and cache-filling library, intended as a replacement for memcached in many cases.
sqlite-tg - SQLite extension around tg, a geometric library for limited GIS operations
kingshard - A high-performance MySQL proxy