pg-bulk-ingest vs asyncpg

| | pg-bulk-ingest | asyncpg |
|---|---|---|
| Mentions | 3 | 16 |
| Stars | 34 | 6,699 |
| Growth | - | 1.3% |
| Activity | 8.7 | 6.4 |
| Last commit | 15 days ago | 8 days ago |
| Language | Python | Python |
| License | MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
pg-bulk-ingest
- Show HN: pg-bulk-ingest – now with multi-table support
Ah the name - you're not the first to mention it! Do you (or anyone lurking...) have any suggestions as to what it might better be called?
On what it does/why it exists, we've kept the README quite light to avoid duplication, with the main bits of the docs at https://pg-bulk-ingest.docs.trade.gov.uk/
But to try to answer the question here:
On a plain set of INSERT statements: there are lots of cases where that would be fine, and pg-bulk-ingest (/its future name ;-) would be unnecessary, so you might as well use INSERT statements.
But pg-bulk-ingest does lots of things that a set of INSERT statements doesn't:
- It uses COPY, which in many cases is (much?) faster than INSERT
- Show HN: pg-bulk-ingest – Bulk ingest into PostgreSQL with high-watermarking
asyncpg
- PyPy has been working for me for several years now
- Ask HN: Is Python async/await some kind of joke?
- SQLAlchemy/asyncpg => you can’t use it if you’re using PgBouncer (necessary most of the time with Postgres) in transaction mode? What?? https://github.com/MagicStack/asyncpg/issues/1058
- Differences from Psycopg2
OK I stand corrected, asyncpg has these two C files:
https://github.com/MagicStack/asyncpg/blob/master/asyncpg/pr...
https://github.com/MagicStack/asyncpg/blob/master/asyncpg/pr...
If you are interested, here is a post by the psycopg author about psycopg2 and psycopg3 performance versus asyncpg.
https://www.varrazzo.com/blog/2020/05/19/a-trip-into-optimis...
- Asyncpg – A Fast PostgreSQL Database Client Library for Python/Asyncio
- Ruby Outperforms C: Breaking the Catch-22
This pure Python library claims quite fabulous performance: https://github.com/MagicStack/asyncpg
I believe it, because that team has done lots of great stuff, but I haven't used it; I just remember thinking it was interesting that the performance was so good. Not sure how much of that comes from running on the asyncio loop (or which loop they used for the benchmarks).
- PgBouncer is useful, important, and fraught with peril
What a great post. We have had a ton of issues with users using PgBouncer, and it's not because things are "broken" per se; the situation is just very complicated. PgBouncer's docs are also, IMO, in need of updating: more detailed overall, and in a few critical cases less misleading, specifically the prepared statements docs.
This blog post refers to this misleading nature at https://jpcamara.com/2023/04/12/pgbouncer-is-useful.html#pre... .
> PgBouncer says it doesn’t support prepared statements in either PREPARE or protocol-level format. What it actually doesn’t support are named prepared statements in any form.
That's also not really accurate. You can use a named prepared statement just fine in transaction mode: start a transaction (so you aren't in autocommit), use a named statement, and it works fine. You just can't use it again in another transaction, because by then it will be "gone" (more accurately, "unmoored": it might be in your session, or it might be in someone else's session). Making things worse, once the prepared statement is "unmoored", its name can conflict with another client attempting to use the same name.
So to use named prepared statements, you can (less ideally) name them with random strings to avoid conflicts, or you can DEALLOCATE the prepared statement(s) you used at the end of your transaction. For our users that use asyncpg, we have them use a UUID for prepared statement names to avoid these conflicts (asyncpg added this feature for us here: https://github.com/MagicStack/asyncpg/issues/837).

However, they could just as well use DEALLOCATE ALL: set it as their `server_reset_query`, and, so that it also runs in transaction mode, set `server_reset_query_always`, so that it's called at the end of every transaction. PgBouncer, IMO entirely misleadingly, documents this as "This setting is for working around broken setups that run applications that use session features over a transaction-pooled PgBouncer." That's why nobody uses it: PgBouncer claims such setups are "broken". It's not any more broken than switching out the PostgreSQL session underneath a connection that uses multiple transactions. PgBouncer can do better here and make this clearer and more accommodating of real-world database drivers.
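The two workarounds described above can be sketched as follows. The UUID-derived name avoids collisions with "unmoored" statements left behind by other clients, and the PREPARE/EXECUTE/DEALLOCATE sequence is scoped to a single transaction. The statement body and types here are hypothetical; only the naming and DEALLOCATE pattern is the point.

```python
# Sketch: conflict-free named prepared statements behind a
# transaction-pooled PgBouncer.
import uuid


def unique_statement_name() -> str:
    # A UUID-derived name is effectively guaranteed not to collide with
    # a statement left on the server session by another client.
    return "stmt_" + uuid.uuid4().hex


def prepare_and_release_sql(name: str) -> list[str]:
    # All of this must run inside ONE transaction when PgBouncer is in
    # transaction mode, and the DEALLOCATE must happen before COMMIT.
    return [
        "BEGIN",
        f"PREPARE {name} (int) AS SELECT $1 + 1",  # hypothetical statement
        f"EXECUTE {name}(41)",
        f"DEALLOCATE {name}",
        "COMMIT",
    ]
```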
- Library to connect Python to Postgresql
asyncpg is another great driver if you're using asyncio and want maximum performance (although it also breaks with the DBAPI, the tradeoff may be worth it).
- aiopg vs asyncpg vs psycopg3
asyncpg: 5.5k stars, last commit recently, ~150 issues, some incompatibility, few open PRs, extensive README. Includes a benchmark showing it's supposedly 3x faster than aiopg and psycopg2; psycopg3 is not mentioned in the benchmark.
- Announcing Quart-DB
Quart-DB uses asyncpg to manage the connections and buildpg to parse the named parameter bindings.
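What buildpg does here is rewrite `:name`-style parameters into the `$1, $2, ...` positional placeholders that asyncpg expects. A minimal illustration of that rewriting (my own naive sketch, not buildpg's actual implementation; it ignores complications like `::type` casts and quoted strings):

```python
# Naive sketch of named-parameter to positional-placeholder rewriting.
import re


def render(query: str, **params) -> tuple[str, list]:
    args: list = []

    def repl(match: re.Match) -> str:
        # Append the bound value and emit the next $N placeholder.
        args.append(params[match.group(1)])
        return f"${len(args)}"

    sql = re.sub(r":(\w+)", repl, query)
    return sql, args
```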
- Should I use TimescaleDB or partitioning is enough?
A major performance boost, specifically on inserts with TimescaleDB, actually came from starting to use https://github.com/MagicStack/asyncpg.