-
dbt-core
dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
-
sqlfluff
A modular SQL linter and auto-formatter with support for multiple dialects and templated code.
-
Apache Arrow
Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
-
delta
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs (by delta-io)
-
Trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Here is a list of open source projects that are said to be awesome for beginners.
Apache Spark
If you're still new to development in general and not yet comfortable with development tools (an IDE, the terminal, etc.), check out The Missing Semester of Your CS Education. It covers the practical side of coding that isn't taught in university courses. Learn it along the way.
Apache Airflow
dbt Core
Apache Parquet
Apache Avro
SQLFluff
Apache Arrow
Apache Cassandra
Apache Hadoop
Apache Kafka
Delta Lake
Apache Pinot
Apache NiFi
Apache Hudi
Although Trino (formerly PrestoSQL) is on the awesome-for-beginners list, it's also a really good data engineering (DE) project, because it is a distributed query engine that connects to most of the projects listed above. Depending on where you work in the project, you can gain depth of knowledge in the query engine, breadth across all the connectors, or go hybrid.
As our project grows, we've seen first-hand how difficult it is for others to contribute to open source: setting up the development environment, understanding the codebase, drafting a PR, and so on. We've learned a lot from helping others contribute successfully to our project, so we share our thoughts here in a blog post. Don't hesitate to reach out if you need help! We're happy to help you contribute to any of our projects, or to any other.