The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning. Learn more →
Top 16 Python Redshift Projects
-
Redash
Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
Project mention: Redash: Connect to data source, easily visualize, dashboard and share your data | news.ycombinator.com | 2024-03-20 -
airbyte
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
Project mention: Launch HN: Bracket (YC W22) – Two-Way Sync Between Salesforce and Postgres | news.ycombinator.com | 2023-12-12I'l also give a shout-out to Airbyte (https://airbyte.com/), with which I've had some limited success with integrating Salesforce to a local database. The particular pull for Airbyte is that we can self-host the open source version, rather than pay Fivetran a significant sum to do this for us.
It's an immature tool, so I don't yet know that I can claim we've spent _less_ than Fivetran on the additional engineering and ops time, but it feels like it has potential to do so once stabilized.
-
InfluxDB
Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
-
awesome-aws
A curated list of awesome Amazon Web Services (AWS) libraries, open source repos, guides, blogs, and other resources. Featuring the Fiery Meter of AWSome.
-
Recommend checking out https://github.com/tobymao/sqlglot if you are interested in this capability for other SQL dialects
Tools like this are helpful for:
- Rendering SQL in a consistent way, eg for snapshot testing
-
AWS Data Wrangler
pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
Project mention: Read files from s3 using Pandas/s3fs or AWS Data Wrangler? | /r/dataengineering | 2023-12-06I had no problem with awswrangler (https://github.com/aws/aws-sdk-pandas) and it supports reading and writing partitions which was really helpful and a few other optimizations that made it a great tool
-
Project mention: Show HN: JupySQL – a SQL client for Jupyter (ipython-SQL successor) | news.ycombinator.com | 2023-12-06
Hey, HN community!
We're stoked to launch JupySQL today! JupySQL is an open-source library that brings a modern SQL experience to Jupyter. JupySQL is compatible with all major databases, such as Snowflake, Redshift, PostgreSQL, MySQL, MariaDB, DuckDB, SQL Server, Clickhouse, Trino, and more!
To get started, check out our tutorial: https://jupysql.ploomber.io/en/latest/quick-start.html
SQL is the defacto language for data analysis; however, analysis often requires a mix of SQL and Python. JupySQL bridges this gap, allowing users to execute SQL queries seamlessly in Jupyter and continue their analysis in Python. Add %%sql to the top of your cell and start writing SQL.
Here are some of JupySQL's main features:
- Syntax highlighting
-
Project mention: Launch HN: Grai (YC S22) – Open-Source Data Observability Platform | news.ycombinator.com | 2023-07-17
Elastic v2 if one is interested in such things: https://github.com/grai-io/grai-core/blob/v0.1.33/LICENSE
-
WorkOS
The modern identity platform for B2B SaaS. The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.
-
dataall
A modern data marketplace that makes collaboration among diverse users (like business, analysts and engineers) easier, increasing efficiency and agility in data projects on AWS.
-
CueObserve
Timeseries Anomaly detection and Root Cause Analysis on data in SQL data warehouses and databases
-
Project mention: Is there something wrong with me, I hate dbt, what am I missing ? | /r/dataengineering | 2023-05-15
This just feels like you aren’t using the plentiful tools to make those “mind-numbingly slow” dev steps faster. For ex., using dbt-coves to generate the staging models with casting to types in a couple clicks. And pulling directly from Fivetran tables is just poor practice, with the additional steps needed to do it “right” being inconsequential at best.
-
dbt-ml-preprocessing
A SQL port of python's scikit-learn preprocessing module, provided as cross-database dbt macros.
-
pytest-mock-resources
Pytest Fixtures that let you actually test against external resource (Postgres, Mongo, Redshift...) dependent code.
-
prism
Prism is the easiest way to develop, orchestrate, and execute data pipelines in Python. (by runprism)
Project mention: Prism: the easiest way to create robust data workflows. Accessible via CLI | /r/coolgithubprojects | 2023-09-21 -
dbd
dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases.
-
TrueColorTools
GUI application for calculating human-visible colors of celestial bodies from their photometry data
-
-
SaaSHub
SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives
Python Redshift related posts
- New to wm, I choose i3-gaps, any suggestions?
- [Project] Open-source Anomaly detection on SQL data
- CueObserve - Anomaly detection on SQL data warehouses and databases
- CueObserve - Anomaly detection on SQL data warehouses and databases
- CueObserve - Anomaly detection on SQL data warehouses and databases
- Show HN: CueObserve – Open-source Anomaly detection on SQL data
-
A note from our sponsor - WorkOS
workos.com | 18 Apr 2024
Index
What are some of the best open-source Redshift projects in Python? This list will help you:
Project | Stars | |
---|---|---|
1 | Redash | 24,881 |
2 | airbyte | 13,821 |
3 | awesome-aws | 12,134 |
4 | sqlglot | 5,389 |
5 | AWS Data Wrangler | 3,791 |
6 | jupysql | 591 |
7 | grai-core | 267 |
8 | dataall | 208 |
9 | CueObserve | 205 |
10 | dbt-coves | 204 |
11 | dbt-ml-preprocessing | 175 |
12 | pytest-mock-resources | 171 |
13 | prism | 79 |
14 | dbd | 55 |
15 | TrueColorTools | 19 |
16 | dotfiles | 4 |