python-mysql-replication
Pyjion
 | python-mysql-replication | Pyjion
---|---|---
Mentions | 5 | 23
Stars | 2,255 | 1,411
Growth | - | -
Activity | 9.1 | 5.0
Last commit | about 1 month ago | about 1 month ago
Language | Python | C++
License | - | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
python-mysql-replication
-
Is anyone using PyPy for real work?
I'm maintaining an internal change-data-capture application that uses a Python library to decode the MySQL binlog and store the change records as JSON in the data lake (like Debezium). For our busiest databases a single CPython process couldn't process the incoming changes in real time (thousands of events per second). It's not something that can be easily parallelized, as the bulk of the work is happening in the binlog decoding library (https://github.com/julien-duponchelle/python-mysql-replicati...).
So we've made it configurable to run some instances with PyPy, which was able to work through the data in real time, i.e. without generating a lag in the data stream. The downside of using PyPy was increased memory usage (4-8x), which isn't really a problem. An actual problem that I never tracked down was that the test suite (running pytest) took 2-3 times longer with PyPy than with CPython.
A few months ago I upgraded the system to run with CPython 3.11, and the 10-20% performance improvement that comes with that version allowed us to drop PyPy and run only CPython, which is more convenient and makes the deployment and configuration less complex.
-
Why Binlog size grows drastically when isolation level set to "Repeatable Read" & When isolation level set to "Read Committed" the size of Binlog file reduces ?
doing the using Python, https://github.com/julien-duponchelle/python-mysql-replication, the recommended way of doing this
-
How to Use BinLogs to Make an Aurora MySQL Event Stream
The BinLogStreamReader has several inputs that we need to retrieve. First we'll retrieve the cluster's secret with the database host/username/password and then we'll fetch the serverId we stored in S3.
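As a rough sketch of what gathering those inputs might look like: the helper below takes a secret string and a server id that have already been retrieved, and assembles keyword arguments for `BinLogStreamReader`. The secret's JSON keys (`host`, `username`, `password`, `port`), the helper names, and the example values are assumptions made for illustration; the article's own retrieval code for the secret and for S3 is not shown here.

```python
import json


def build_reader_kwargs(secret_string: str, server_id: int) -> dict:
    """Build BinLogStreamReader keyword arguments from a JSON secret.

    The key names read from the secret are an assumption for this sketch.
    """
    secret = json.loads(secret_string)
    connection_settings = {
        "host": secret["host"],
        "port": int(secret.get("port", 3306)),
        "user": secret["username"],
        "passwd": secret["password"],
    }
    return {
        "connection_settings": connection_settings,
        "server_id": server_id,   # must be unique among replicas of the source
        "blocking": True,         # wait for new events instead of stopping at the end
        "resume_stream": True,    # start from the latest position (or a given one)
    }


def open_stream(reader_kwargs: dict):
    """Open the stream; imported lazily so the helper above works without the library."""
    from pymysqlreplication import BinLogStreamReader
    return BinLogStreamReader(**reader_kwargs)


# Hypothetical values standing in for the fetched secret and stored server id.
example_secret = json.dumps(
    {"host": "db.example.internal", "username": "repl",
     "password": "not-a-real-password", "port": 3306}
)
reader_kwargs = build_reader_kwargs(example_secret, server_id=1001)
```

Calling `open_stream(reader_kwargs)` would then need a reachable MySQL-compatible server with binary logging enabled.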
-
How is everyone ingesting backend relational data?
From backend relational tables to data warehouses, my team has mostly relied on change data capture replication. We use MySQL upstream, and historically used AWS DMS or Attunity Replicate to replicate directly to SQL Server. Recently we made the switch to Snowflake, and used mostly AWS DMS to replicate CDC data to S3 (it lists individual inserts, updates, and deletes), then Snowpipe to copy it into Snowflake, and then a job to merge that data into the target table to get the latest state. In addition we've used this library in production https://github.com/noplay/python-mysql-replication, and still use it today for one high-volume, critical data source. Generally we see data go end to end in a matter of minutes, but occasionally there are spikes in latency.
- Robust data transfer mechanism?
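The ingestion thread above mentions a job that merges individual inserts, updates, and deletes into a target table to get the latest state. A minimal in-memory sketch of that idea follows; the record layout (`seq`, `op`, `pk`, `row`) is an assumption for illustration and is not the format produced by AWS DMS or any other tool named above.

```python
def merge_changes(target: dict, changes: list) -> dict:
    """Apply change records to a copy of `target`, in sequence order.

    `target` maps primary key -> row. Each change is a dict with the
    hypothetical fields seq (ordering), op, pk, and row.
    """
    latest = dict(target)
    for change in sorted(changes, key=lambda c: c["seq"]):
        key = change["pk"]
        if change["op"] == "delete":
            latest.pop(key, None)          # deleting a missing row is a no-op
        elif change["op"] in ("insert", "update"):
            latest[key] = change["row"]    # last write for a key wins
        else:
            raise ValueError(f"unknown operation: {change['op']!r}")
    return latest


target = {1: {"name": "ada"}, 2: {"name": "bob"}}
changes = [
    {"seq": 3, "op": "delete", "pk": 2, "row": None},
    {"seq": 1, "op": "insert", "pk": 3, "row": {"name": "cy"}},
    {"seq": 2, "op": "update", "pk": 1, "row": {"name": "ada l."}},
]
merged = merge_changes(target, changes)
```

Sorting by a sequence value matters because change files can arrive out of order; a warehouse job would typically express the same logic as a SQL merge keyed on the primary key.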
Pyjion
-
Python 3.13 Gets a JIT
It exists, was created by Microsoft employees, and is referenced in the article: https://www.trypyjion.com/
-
Is anyone using PyPy for real work?
I've actually come across and started using Pyjion recently (https://github.com/tonybaloney/pyjion); how does PyPy compare, both in terms of performance and purpose? There seems to be a lot of overlap...
-
funAndEasyToUse
Python is capable of doing things at runtime that are really hard to statically compile around, such as monkeypatching methods onto existing objects. You can compile it, but it's complicated. One strategy is to use a JIT that observes application state at runtime and invalidates compiled code as changes make it obsolete, though that adds complexity of its own. See pyjion for an example.
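To make the monkeypatching point concrete, here is a small illustration (the class and function names are made up for this example). The same call site ends up running three different functions, so a compiler cannot safely bind `obj.greet()` to one target ahead of time.

```python
import types


class Greeter:
    def greet(self) -> str:
        return "hello"


def call_greet(obj: Greeter) -> str:
    # A static compiler would like to resolve this call once and for all.
    return obj.greet()


g = Greeter()
before = call_greet(g)

# Replace the method on the class at runtime; existing instances see it.
def shout(self) -> str:
    return "HELLO"

Greeter.greet = shout
after = call_greet(g)

# Patch a single instance; other instances keep the class-level method.
other = Greeter()
other.greet = types.MethodType(lambda self: "hi", other)
per_instance = call_greet(other)
```

A JIT that specialized `call_greet` on the first run would have to notice both assignments and discard or guard its compiled code.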
-
Javascript has Typescript. WHY WE DONT HAVE TYPY !
When I say "Python" I am referring to the standard CPython interpreter which most people use. But there is also PyPy, which includes a just-in-time (JIT) compiler that compiles selected code into machine language on the fly, as needed. pyjion is another JIT compiler that generates machine language on the fly, and you can install it with pip. Or you could work for Facebook and use Cinder. Cython, Nuitka and Pyston are other alternatives.
-
How is Golang websocket better than FastAPI websocket?
and if you need more speed you can try https://www.pypy.org/ or https://github.com/tonybaloney/Pyjion or https://www.pyston.org/
-
CPython vs PyPy
Finally, there is also Pyjion, which according to its website is “A drop-in JIT Compiler for Python 3.10” (https://www.trypyjion.com/). We will be covering it in a separate writeup. See you next time ;-).
- Accelerate Python code 100x by import taichi as ti
- Create CPython extensions in .NET?
-
Instant upvotes
Though some exciting stuff is happening over the next few years: Python is getting faster, and has been for a while, and stuff like Pyjion (https://www.trypyjion.com/), a drop-in .NET-powered JIT compiler, is starting to approach usable. Rust and Python seem to be best buds right now, so expect more extension libraries in Rust, a newer, more approachable language than, say, C/C++ but with similar speed. Sign me up!
-
You think python is slow ?
Pyjion: an easy-to-use, small compiler that increases the performance of our 🐌 CPython.
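Several of the mentions above describe Pyjion as a drop-in JIT installed with pip. A hedged sketch of what turning it on might look like is below: `pyjion.enable()` is the call shown in the project's README, while the wrapper function and the `fib` workload are illustrative names. The import is guarded because Pyjion targets specific Python versions and needs a .NET runtime, so it may not be installed.

```python
def enable_jit_if_available() -> bool:
    """Enable Pyjion when it can be imported; otherwise run on plain CPython."""
    try:
        import pyjion
    except ImportError:
        return False
    pyjion.enable()  # functions executed after this point may be JIT-compiled
    return True


def fib(n: int) -> int:
    # A small CPU-bound function, the kind of code a JIT is meant to speed up.
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


jit_enabled = enable_jit_if_available()
result = fib(30)
```

The program computes the same result either way; only the execution speed is expected to differ, and whether it improves depends on the workload.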
What are some alternatives?
AWS Data Wrangler - pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
Numba - NumPy aware dynamic Python compiler using LLVM
PyMySQL - MySQL client library for Python
Nuitka - Nuitka is a Python compiler written in Python. It's fully compatible with Python 2.6, 2.7, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9, 3.10, and 3.11. You feed it your Python app, it does a lot of clever things, and spits out an executable or extension module.
PonyORM - Pony Object Relational Mapper
cinder - Cinder is Meta's internal performance-oriented production version of CPython.
sparc-curation - code and files for SPARC curation workflows
graalpython - A Python 3 implementation built on GraalVM
preshed - 💥 Cython hash tables that assume keys are pre-hashed
Cython - The most widely used Python to C compiler
mycli - A Terminal Client for MySQL with AutoCompletion and Syntax Highlighting.
hpy - HPy: a better API for Python