Top 23 Python MySQL Projects
-
Redash
Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
-
airbyte
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
-
dev-setup
macOS development environment setup: Easy-to-understand instructions with automated setup scripts for developer tools like Vim, Sublime Text, Bash, iTerm, Python data analysis, Spark, Hadoop MapReduce, AWS, Heroku, JavaScript web development, Android development, common data stores, and dev-based OS X defaults.
-
AWS Data Wrangler
pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
-
python-mysql-replication
Pure Python implementation of the MySQL replication protocol, built on top of PyMySQL
-
prisma-client-py
Prisma Client Python is an auto-generated and fully type-safe database client designed for ease of use
-
nagios-plugins
450+ AWS, Hadoop, Cloud, Kafka, Docker, Elasticsearch, RabbitMQ, Redis, HBase, Solr, Cassandra, ZooKeeper, HDFS, Yarn, Hive, Presto, Drill, Impala, Consul, Spark, Jenkins, Travis CI, Git, MySQL, Linux, DNS, Whois, SSL Certs, Yum Security Updates, Kubernetes, Cloudera etc...
Project mention: Redash: Connect to data source, easily visualize, dashboard and share your data | news.ycombinator.com | 2024-03-20
Project mention: What’s the Difference Between Fine-tuning, Retraining, and RAG? | dev.to | 2024-04-08
Check us out on GitHub.
Project mention: Launch HN: Bracket (YC W22) – Two-Way Sync Between Salesforce and Postgres | news.ycombinator.com | 2023-12-12
I'll also give a shout-out to Airbyte (https://airbyte.com/), with which I've had some limited success integrating Salesforce with a local database. The particular pull for Airbyte is that we can self-host the open source version, rather than pay Fivetran a significant sum to do this for us.
It's an immature tool, so I can't yet claim we've spent _less_ than we would have on Fivetran once the additional engineering and ops time is counted, but it feels like it has the potential to get there once stabilized.
Project mention: Does anyone prefer the CLI over the shell, or other way around? If so, why? | /r/mysql | 2023-04-23
Also, check out MyCLI. https://github.com/dbcli/mycli "Terminal Client for MySQL with AutoCompletion and Syntax Highlighting"
Recommend checking out https://github.com/tobymao/sqlglot if you are interested in this capability for other SQL dialects
Tools like this are helpful for:
- Rendering SQL in a consistent way, eg for snapshot testing
Project mention: How to Connect a FastAPI Server to PostgreSQL and Deploy on GCP Cloud Run | dev.to | 2023-05-26
To do this, we can use the Tortoise-ORM. Begin by installing the package:
ibis – portable Python dataframe library
Project mention: Read files from s3 using Pandas/s3fs or AWS Data Wrangler? | /r/dataengineering | 2023-12-06
I had no problem with awswrangler (https://github.com/aws/aws-sdk-pandas). It supports reading and writing partitions, which was really helpful, along with a few other optimizations that made it a great tool.
If the issue happens a lot, there is also: https://github.com/datafold/data-diff
That is a nice tool for doing it across databases as well.
I think it's based on a checksum method.
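To make the "checksum method" guess above concrete, here is a minimal sketch of the general idea: hash a range of rows on both sides and only drill into ranges whose hashes disagree. This is not data-diff's actual implementation. The names `checksum` and `diff_keys` are hypothetical, and in-memory dicts stand in for two database tables keyed by primary key.

```python
import hashlib


def checksum(rows):
    """Return one SHA-256 digest for a list of row tuples."""
    digest = hashlib.sha256()
    for row in rows:
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()


def diff_keys(source, target):
    """Return the sorted keys whose rows differ between two tables.

    `source` and `target` are dicts mapping primary key -> row value
    (an assumption standing in for real tables).
    """

    def rows_for(table, keys):
        # Include a presence flag so a missing row never looks like a None value.
        return [(key, key in table, table.get(key)) for key in keys]

    def walk(keys):
        if checksum(rows_for(source, keys)) == checksum(rows_for(target, keys)):
            return []  # whole segment matches, no need to look inside it
        if len(keys) == 1:
            return list(keys)  # narrowed down to a single differing row
        mid = len(keys) // 2
        return walk(keys[:mid]) + walk(keys[mid:])

    return walk(sorted(set(source) | set(target)))


source = {1: "a", 2: "b", 3: "c", 4: "d"}
target = {1: "a", 2: "B", 3: "c", 5: "e"}
print(diff_keys(source, target))
```

When most rows match, large segments are dismissed with a single comparison, so only the checksums (not the rows themselves) need to be exchanged for the matching parts.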
for your reference: https://github.com/PyMySQL/mysqlclient/issues/672
I'm maintaining an internal change-data-capture application that uses a Python library to decode the MySQL binlog and store the change records as JSON in the data lake (like Debezium). For our busiest databases, a single CPython process couldn't process the incoming changes in real time (thousands of events per second). It's not something that can be easily parallelized, as the bulk of the work happens in the binlog decoding library (https://github.com/julien-duponchelle/python-mysql-replicati...).
So we made it configurable to run some instances with PyPy, which was able to work through the data in real time, i.e. without generating lag in the data stream. The downside of using PyPy was increased memory usage (4-8x), which isn't really a problem. An actual problem that I never tracked down was that the test suite (running pytest) took 2-3 times longer with PyPy than with CPython.
A few months ago I upgraded the system to run with CPython 3.11, and the 10-20% performance improvement that comes with that version allowed us to drop PyPy and run only CPython, which is more convenient and makes deployment and configuration less complex.
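The "store the change records as JSON" step described above might look roughly like the sketch below. It deliberately leaves out the binlog decoding library and assumes each decoded change arrives as a plain dict of column values. The function name `change_record` and the record fields (`schema`, `table`, `action`, `row`) are hypothetical, not the poster's format.

```python
import datetime
import json


def change_record(schema, table, action, row):
    """Serialize one decoded row change as a single JSON line.

    `row` is assumed to be a dict of column name -> value. Values that
    JSON cannot represent natively (dates, decimals) fall back to str().
    """
    record = {"schema": schema, "table": table, "action": action, "row": row}
    return json.dumps(record, sort_keys=True, default=str)


line = change_record(
    "shop",
    "orders",
    "insert",
    {"id": 7, "created": datetime.date(2024, 4, 19)},
)
print(line)
```

Writing one record per line keeps the output appendable, so a consumer can process the stream incrementally.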
- for important files, a separate box where I have borgmatic [1] in deduplication mode installed; this is updated once in a while
Just curious: Do you have any reason to believe that such a data corruption bug is likely in ZFS? It seems like saying that ext4 could have a bug and you should also store stuff on NTFS, just in case (which I think does not make sense..).
Python MySQL related posts
- What’s the Difference Between Fine-tuning, Retraining, and RAG?
- Merging data from multiple Excel files
- Echolocate your MySQL health with real-time monitoring in the terminal
- Vanna.ai: Chat with your SQL database
- Can't install mysqclient in python 3.12
- Show HN: Easily Visualize Your SQLAlchemy Data Models in a Nice SVG Diagram
- Cannot get Code to Run After Cloning a Repo
Index
What are some of the best open-source MySQL projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | Redash | 24,917 |
2 | MindsDB | 21,160 |
3 | airbyte | 13,821 |
4 | mycli | 11,251 |
5 | PyMySQL | 7,551 |
6 | dev-setup | 6,032 |
7 | sqlglot | 5,389 |
8 | tortoise-orm | 4,224 |
9 | ibis | 4,041 |
10 | AWS Data Wrangler | 3,797 |
11 | databases | 3,692 |
12 | PonyORM | 3,514 |
13 | data-diff | 2,830 |
14 | Gopherus | 2,644 |
15 | mysqlclient | 2,406 |
16 | python-mysql-replication | 2,253 |
17 | aiomysql | 1,697 |
18 | learning | 1,685 |
19 | borgmatic | 1,636 |
20 | prisma-client-py | 1,590 |
21 | fapro | 1,497 |
22 | nagios-plugins | 1,119 |
23 | eralchemy | 1,074 |