Launch HN: Airbyte (YC W20) – Open-Source ELT (Fivetran/Stitch Alternative)

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • airbyte

    The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

    Hi HN!

    Michel here with John, Shrif, Jared, Charles, and Chris. We are building an open-source ELT platform that replicates data from any applications, APIs, databases, etc. into your data warehouses, data lakes or databases: [https://airbyte.io](https://airbyte.io).

    I’ve been in data engineering for 11 years. Before Airbyte, I was the head of integrations at LiveRamp, where we built and scaled over 1,000 data ingestion connectors to replicate 100TB worth of data every day. John, on the other hand, has already built 3 startups with 2 exits. His latest one didn’t work out, though. He spent almost a year building ETL pipelines for an engineering management platform, but he eventually ran out of money before reaching product-market fit.

    By late 2019, we had known each other for 7 years and had always wanted to work together. When John’s third startup shut down, the timing was finally right for both of us. And we knew which problem we wanted to address: data integration, and ELT more specifically.

    We started interviewing customers of Fivetran, Stitchdata, and Matillion to see whether the existing solutions were solving their problems. We learned that they all fell short, and always in the same ways.

    Some limitations we identified are due to the fact that they are closed source. This prevents them from addressing the long tail of integrations because they will always have an ROI consideration when building and maintaining new connectors. A good example is Fivetran which, after 8 years, offers around 150 connectors. This is not a lot when you look at the number of existing tools out there (more than 10,000). In fact, all of their customers we talked to are building and maintaining their own connectors (along with orchestration, scheduling, monitoring, etc.) in-house, as the connectors they needed were either not supported in the way they needed or not supported at all.

    Some of those customers also tried to leverage existing open-source solutions, but the quality of the existing connectors is inconsistent, as many haven't been updated in years. Plus, they are not usable out of the box.

    That’s when we knew we wanted Airbyte to be open-source (MIT license), usable out of the box, and cover the long tail of integrations. By making it trivial to build new connectors on Airbyte in any language (they run as Docker containers), we hope the community will help us build and maintain the long tail of connectors. While open-source also enables us to address all use cases (including internal DBs and APIs), it also allows us to solve the problem inherent to cloud-based solutions: the security and privacy of your data. Companies don’t need to trust yet another 3rd-party vendor. Because it is self-hosted, it will disrupt the pricing of existing solutions.

    Here’s a 2-minute demo video if you want to check out how it looks: [https://www.youtube.com/watch?v=sKDviQrOAbU](https://www.you...

    Airbyte can run on a single node without any external infrastructure. We also integrate with Kubernetes (alpha), and will soon integrate with Airflow so you can run replication tasks across your cluster.

    Today, our early version supports about 41 [sources](https://docs.airbyte.io/integrations/sources) and 6 [destinations](https://docs.airbyte.io/integrations/destinations). We’re releasing [new connectors](https://docs.airbyte.io/integrations/integrations-changelog) every week (6 of them have already been contributed by the community). We bootstrapped some connectors using the highest-quality ones from Singer. Our connectors will always remain open-source.

    Our goal is to solve data integration for as many companies as possible, and the success of Airbyte is predicated on the open-source project becoming loved and ubiquitous. For this reason, we will focus the entirety of 2021 on strengthening the open-source edition; we are dedicated to making it amazing for all users. We will eventually create a paid edition (open core model) with enterprise-level features (support, SLA, hosting and management, privacy compliance, role and access management, SSO, etc.) to address the needs of our most demanding users.

    Give it a spin: [https://github.com/airbytehq/airbyte/](https://github.com/ai... & a [demo](https://demo.airbyte.io). Let us know what you think. This is our first time building an open-source technology, so we know we have a lot to learn!

  • supabase

    The open source Firebase alternative. Supabase gives you a dedicated Postgres database to build your web, mobile, and AI applications.

    The Open Source Fivetran alternative. Yay, it was about time! A simple license: MIT. Clear differentiation between free & paid plans. I am liking what I am seeing so far. One of our clients is in the advertising industry and is syncing data from 20 different API vendors to Postgres. So I am one of your potential customers.

    However, there is a big problem I'm noticing with "Open source alternatives" lately on HN. I had to mention this.

    Even a simple installation of airbyte on my local machine fails :( I tried docker-compose up! I simply want to know why a basic example is not working on such an important day for your company? :) Is this a genuine mistake? Sorry, this feedback will sound harsh, but companies are taking the words 'open source' for a complete ride. It's a great marketing trick. It gets you plenty of eyeballs, goodwill & trust to begin with. Then later we figure out it's not even self-hostable.

    Here is a bad example that you may not want to follow: Supabase, "The Open Source Firebase Alternative". The product is not self-hostable despite calling itself an open source Firebase all over the internet. The founders of Supabase have been disingenuous in not addressing self-hosting[1][2], and it's been a long time since their launch. The self-hosting section on their website[3] doesn't provide any details on how to self-host, and they are careless enough to even mention "how to migrate away" from Supabase in that section.

    [1] : https://github.com/supabase/supabase/discussions/219#discuss...

  • rosettable

    service to add postgres triggers on mysql CRUD events

    Check this out: https://github.com/francoisp/rosettable. It uses a binlog reader to call Postgres triggers, upon which you can build two-way real-time sync.

  • meltano

    At GitLab, we're not ready to give up on the Singer spec, community, and ecosystem yet, which is why I've been working on Meltano for the past year: https://meltano.com/

    We think that the biggest things holding back Singer are the lack of documentation and tooling around taking existing taps and targets to production, and around building, debugging, maintaining, and testing new or existing high-quality taps and targets.

    Meltano itself addresses the first problem, and provides a robust and reliable platform for building, running & orchestrating Singer- and dbt-based ELT pipelines.

    At the same time, we have been working with some members of the community on a new framework for building taps and targets: https://gitlab.com/meltano/meltano/-/issues/2401, which we have decided to call the Singer SDK: https://gitlab.com/meltano/singer-sdk
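
The Singer taps and targets discussed above exchange newline-delimited JSON messages. As a rough sketch only, assuming the three message types from the Singer spec (SCHEMA, RECORD, STATE) and a made-up `users` stream, a toy tap and target might look like the following. The names `run_tap`, `run_target`, and `ROWS` are hypothetical and are not part of Meltano or the Singer SDK.

```python
import io
import json
from typing import Dict, Iterable, List, TextIO, Tuple

# Hypothetical in-memory rows standing in for an API or database source.
ROWS = [{"id": 1, "name": "ada"}, {"id": 2, "name": "grace"}]


def run_tap(rows: Iterable[dict], out: TextIO) -> None:
    """Toy tap: write one Singer-style JSON message per line."""
    schema = {
        "type": "SCHEMA",
        "stream": "users",
        "schema": {
            "properties": {
                "id": {"type": "integer"},
                "name": {"type": "string"},
            }
        },
        "key_properties": ["id"],
    }
    out.write(json.dumps(schema) + "\n")

    last_id = None
    for row in rows:
        message = {"type": "RECORD", "stream": "users", "record": row}
        out.write(json.dumps(message) + "\n")
        last_id = row["id"]

    # STATE acts as a bookmark so a later run could resume incrementally.
    out.write(json.dumps({"type": "STATE", "value": {"users": last_id}}) + "\n")


def run_target(lines: Iterable[str]) -> Tuple[List[dict], Dict]:
    """Toy target: collect records and remember the latest state."""
    loaded: List[dict] = []
    state: Dict = {}
    for line in lines:
        message = json.loads(line)
        if message["type"] == "RECORD":
            loaded.append(message["record"])
        elif message["type"] == "STATE":
            state = message["value"]
    return loaded, state


# Wire the tap to the target through an in-memory buffer instead of a pipe.
buffer = io.StringIO()
run_tap(ROWS, buffer)
records, state = run_target(buffer.getvalue().splitlines())
print(records, state)
```

In a real setup the tap and target would be separate processes joined by a shell pipe, and the target would write to a warehouse rather than a Python list. The in-memory buffer here only keeps the sketch self-contained.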
