This isn't theoretical; many companies do PostgreSQL async 1:N physical replication using e.g. https://pgbackrest.org/ to have the primary push WAL segment files (i.e. "the last N milliseconds of writes" from the write-ahead log) as objects to S3, and then have all read replicas fetch from S3 and replay them.
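As a sketch of what that setup looks like, pgbackrest's S3 repository support is configured roughly like this (stanza name, bucket, and paths are placeholders, not a specific production config):

```ini
# pgbackrest.conf — archive WAL segments to an S3 bucket (values are placeholders)
[global]
repo1-type=s3
repo1-s3-bucket=my-wal-bucket
repo1-s3-endpoint=s3.us-east-1.amazonaws.com
repo1-s3-region=us-east-1
repo1-path=/pgbackrest

[main]
pg1-path=/var/lib/postgresql/16/main

# postgresql.conf on the primary, so every completed WAL segment gets pushed:
#   archive_mode = on
#   archive_command = 'pgbackrest --stanza=main archive-push %p'
```

Replicas then run `pgbackrest ... archive-get` (via `restore_command`) to pull and replay those segments.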
> You could do even better if you out-of-band signal the readiness so you do not need to poll while idle.
S3 and its clones have "event notifications", a push-based mechanism that informs you whenever a new object is put into the bucket.
But what do you have to do to get these notifications?
Subscribe to a message queue that S3 puts them into.
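To make that concrete: each notification delivered to the queue is a small JSON document describing the event. A minimal sketch of a consumer's parsing step (the payload below is shaped like a real `s3:ObjectCreated:Put` event, but trimmed to the fields used):

```python
import json

# Extract (bucket, key) pairs for newly created objects from an S3 event
# notification message body, as delivered to the subscribed queue.
def new_objects(message_body: str) -> list[tuple[str, str]]:
    event = json.loads(message_body)
    return [
        (rec["s3"]["bucket"]["name"], rec["s3"]["object"]["key"])
        for rec in event.get("Records", [])
        if rec.get("eventName", "").startswith("ObjectCreated")
    ]

# Trimmed example payload in the shape S3 emits for a PUT.
sample = json.dumps({
    "Records": [{
        "eventName": "ObjectCreated:Put",
        "s3": {"bucket": {"name": "wal-archive"},
               "object": {"key": "segments/000000010000000000000042"}},
    }]
})
print(new_objects(sample))  # → [('wal-archive', 'segments/000000010000000000000042')]
```

So "push-based" still bottoms out in reading messages off a queue; the win is that the consumer blocks on the queue instead of polling the bucket listing.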
-
Interesting. What makes you want to switch to the file system? I wrote one for a project[0] a while back, and it didn't seem like the database introduced too much complexity. I based my implementation on an existing solution, but it only took a couple hundred lines of easy-to-understand code.
[0] https://github.com/gchq/Bailo/tree/main/lib/p-mongo-queue
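Database-backed queues in this style generally work by claiming a message atomically (in MongoDB, via `findOneAndUpdate`) and hiding it for a visibility window until the consumer acks it. A toy in-memory sketch of that claim/ack cycle (class and method names here are mine, not the linked library's, and a lock stands in for the database's atomic update):

```python
import threading, time, uuid

# Toy sketch of the claim/ack cycle a DB-backed queue implements. A real
# version performs the claim as one atomic update (e.g. findOneAndUpdate).
class ToyQueue:
    def __init__(self, visibility=30.0):
        self._lock = threading.Lock()
        self._messages = []            # each: {id, payload, invisible_until}
        self._visibility = visibility  # seconds a claimed message stays hidden

    def add(self, payload):
        with self._lock:
            self._messages.append(
                {"id": str(uuid.uuid4()), "payload": payload, "invisible_until": 0.0})

    def get(self):
        """Claim the oldest visible message, hiding it for the visibility window."""
        now = time.monotonic()
        with self._lock:
            for msg in self._messages:
                if msg["invisible_until"] <= now:
                    msg["invisible_until"] = now + self._visibility
                    return msg
        return None

    def ack(self, msg_id):
        """Delete a processed message so it is never redelivered."""
        with self._lock:
            self._messages = [m for m in self._messages if m["id"] != msg_id]
```

If the consumer crashes before acking, the message becomes visible again once the window lapses, which is the at-least-once delivery guarantee these queues provide.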
-
-
My read time estimate code is here: https://github.com/Xe/site/blob/aa3608afa6c62695ca0ab139f823...
I've been trying to tweak the constants over the years to make the read-time estimate more "accurate", but it's a tough nut to crack in general. So that I can check my numbers: how long did it take you to read it?
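The linked file's actual constants are truncated above, but most read-time estimators reduce to word count divided by an assumed reading speed. A generic sketch (the 250 wpm figure is my illustrative assumption, not the value from Xe's code):

```python
import math
import re

# Generic read-time estimate: words / wpm, rounded up to whole minutes.
# 250 wpm is an illustrative constant, not the linked implementation's value.
def read_time_minutes(text: str, wpm: int = 250) -> int:
    words = len(re.findall(r"\S+", text))
    return max(1, math.ceil(words / wpm))

print(read_time_minutes("word " * 500))  # → 2
```

The hard part the comment alludes to is that real reading speed varies with code blocks, tables, and prose density, which a single constant can't capture.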
-
cstore_fdw is a columnar storage extension for Postgres built as a foreign data wrapper. Check out https://github.com/citusdata/citus for a modernized columnar storage implementation built as a table access method instead.
I'm definitely not from Citus data -- just a pg zealot fighting the culture war.
If you want to reach people who can actually help, you probably want to check this link:
-
Database config should be two connection strings: one for the admin user that creates the tables and another for the queue user. Everything else should be stored in the database itself. Each queue should be in its own set of tables. Large blobs may or may not be stored as references to external files.
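A minimal sketch of that two-role split (DSNs, role names, and database name are placeholders I picked for illustration):

```python
# Two-role config sketch: the admin DSN runs DDL/migrations; the queue DSN
# gets only the DML grants it needs. All other settings live in the database.
CONFIG = {
    "admin_dsn": "postgresql://queue_admin:secret@db:5432/queues",  # CREATE TABLE etc.
    "queue_dsn": "postgresql://queue_user:secret@db:5432/queues",   # SELECT/INSERT/DELETE only
}

assert CONFIG["admin_dsn"] != CONFIG["queue_dsn"]
```

Per-queue settings (visibility timeouts, retention, and so on) would then live in tables the admin role created, so the application never needs more config than the two DSNs.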
Shouldn't a message send be, at worst, a CAS? It really seems like all the work around garbage collection would have some use for in-memory high-speed queues.
Are you familiar with the LMAX Disruptor? It is a Java-based cross-thread messaging library used for day-trading applications.
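To make the CAS point concrete: a Disruptor-style queue pre-allocates a ring buffer, and a producer's only synchronized step is claiming the next sequence number; the slot write itself is unsynchronized. A toy sketch of that claim (Python has no user-level CAS, so a lock stands in for the atomic increment, and the consumer-side wraparound guard of the real Disruptor is omitted):

```python
import threading

# Toy Disruptor-style ring: producers claim monotonically increasing sequence
# numbers; only the claim is synchronized (the role a CAS plays in the real,
# lock-free LMAX implementation).
class Ring:
    def __init__(self, size=8):
        self.size = size
        self.slots = [None] * size
        self.next_seq = 0
        self._claim = threading.Lock()  # stand-in for an atomic CAS on next_seq

    def publish(self, item):
        with self._claim:               # the "worst case a CAS" step
            seq = self.next_seq
            self.next_seq += 1
        self.slots[seq % self.size] = item  # slot write happens outside the claim
        return seq

r = Ring()
print(r.publish("msg-0"))  # → 0
print(r.publish("msg-1"))  # → 1
```

The real Disruptor also tracks consumer sequences so producers never overwrite unread slots, and it avoids garbage entirely by reusing the pre-allocated entries, which is the GC connection the comment raises.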