airbyte
grouparoo
DISCONTINUED
Metric | airbyte | grouparoo
---|---|---
Mentions | 139 | 27
Stars | 13,646 | 607
Growth | 5.2% | -
Activity | 10.0 | 9.9
Latest commit | 5 days ago | almost 2 years ago
Language | Python | JavaScript
License | GNU General Public License v3.0 or later | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
airbyte
-
Who's hiring developer advocates? (October 2023)
- All the ways to capture changes in Postgres
-
Is it impossible to contribute to open source as a data engineer?
You can try and contribute some new connectors/operators for workflow managers like Airflow or Airbyte
-
airbyte VS cloudquery - a user suggested alternative
2 projects | 2 Jun 2023
-
New age ETL products every data team needs to know
- https://airbyte.com/
2. Reverse ETL:
-
Is it safe to update docker/docker-compose?
Here's the docker-compose file https://github.com/airbytehq/airbyte/blob/master/docker-compose.yaml
I'm trying to install Airbyte (https://airbyte.com/), a great self-hosted ELT platform. In plain terms, it's an app that can access all kinds of APIs to scrape data and put it in a database. I really like the idea of being able to own my data and run all kinds of analyses on it.
-
Top 10 Best Open Source GitHub repos for Developers 2023
Airbyte GitHub: https://github.com/airbytehq/airbyte
grouparoo
-
Reference Data Stack for Data-Driven Startups
There are other tools that we will have to adopt in the future but haven’t yet because we haven't needed them. Specifically, one category that is popular in modern data stacks is Reverse ETL (Hightouch, Census, or Grouparoo). We currently don’t have a use case for piping data back into third-party tools, but it will definitely come up in the future.
-
Data pipeline suggestions
Reverse ETL: Grouparoo, Castled
-
Where can I find free data engineering ( big data) projects online?
- Ingestion / ETL: Airbyte, Singer, Jitsu
- Transformation: dbt
- Orchestration: Airflow, Dagster
- Testing: Great Expectations
- Observability: Monosi
- Reverse ETL: Grouparoo, Castled
- Visualization: Lightdash, Superset
-
Ask HN: Who is hiring? (December 2021)
Grouparoo | Remote (US) | Remote-OK | https://www.grouparoo.com
Grouparoo is a venture-backed software company building open source data tools that make data reliable, accessible, and actionable. We’re empowering teams to make great customer experiences, driven by data. While engineering teams have gotten good at storing and generating data about their customers, it’s rare that this data is used to its full potential in external applications. Grouparoo makes these integrations easy by providing a framework for defining your customer data and reliably syncing it to external tools.
To learn more about who we are, our engineering culture, and whether this is the right place for you, read our Key Values profile: https://www.keyvalues.com/grouparoo
Here are our open roles:
- Senior Backend / Lead Engineer: https://jobs.lever.co/grouparoo/6ba485d1-a5a4-41f0-9fa5-920a...
- Developer Advocate: https://jobs.lever.co/grouparoo/5e1531b4-7ec8-4c10-8e52-fc23...
Tech Stack: TypeScript / JavaScript / Node.js, Actionhero, React + Next.js, Postgres & Redis, and a whole lot of third-party APIs!
-
Launch HN: Hightouch (YC S19) – Sync data from data warehouses to SaaS tools
Congrats on the launch! Hightouch looks great and this need is real. Things seem to be going well, so I don't think I'm taking too much away by mentioning that we have been working on Grouparoo, an open source alternative that solves similar pain points.
A few differences: a focus on the git developer workflow (branches, CI, PRs, etc.), the ability to self-host, and segmentation in destinations (tagging people in Mailchimp based on rules, for example).
-
Ask HN: Who is hiring? (August 2021)
Grouparoo | Remote (US) | Remote-OK | https://www.grouparoo.com
Grouparoo is a venture-backed software company building the open-source reverse-ETL framework that makes it easy to have meaningful, data-driven conversations with customers. Do you want to keep product data in sync with tools like HubSpot, Marketo, or Zendesk? Do you want to be able to build, test, and deploy data sync code just like the rest of your tech stack? That’s the kind of thing Grouparoo does.
We started Grouparoo because we are done saying “no” to marketing teams asking for data and want to make it easy (and safe!) for everyone to use the data available at work. We are looking for a seasoned back-end engineer to join our US-based, fully remote team. The main components of our stack are TypeScript/JavaScript, Actionhero, Next.js, and React. Learn more about the position @ https://www.grouparoo.com/jobs and https://www.keyvalues.com/grouparoo. Check out our open-source framework (and see what you will be working on) @ https://github.com/grouparoo/grouparoo
-
Ask HN: Who is hiring? (July 2021)
Grouparoo | Remote (US) | Remote-OK | https://www.grouparoo.com
Grouparoo is a venture-backed software company building the open-source reverse-ETL framework that makes it easy to have meaningful, data-driven conversations with customers. Do you want to keep product data in sync with tools like HubSpot, Marketo, or Zendesk? Do you want to be able to build, test, and deploy data sync code just like the rest of your stack? That’s the kind of thing Grouparoo does.
We started Grouparoo because we are done saying “no” to marketing teams asking for data and want to make it easy (and safe!) for everyone to use the data available at work. We are looking for 2 seasoned engineers to join our US-based, fully remote team. The main components of our stack are TypeScript/JavaScript, Actionhero, Next.js, and React. Learn more about the positions @ https://www.grouparoo.com/jobs and https://www.keyvalues.com/grouparoo. Check out our open-source framework (and see what you will be working on) @ https://github.com/grouparoo/grouparoo
Here are our open roles:
* Senior Backend / Founding Engineer: https://jobs.lever.co/grouparoo/6ba485d1-a5a4-41f0-9fa5-920a...
* Senior Full Stack / Lead Engineer: https://jobs.lever.co/grouparoo/946e3407-6101-45f1-84a8-135d...
* Founding Community Manager / Developer Advocate: https://jobs.lever.co/grouparoo/19ef1a6b-6ad9-49f6-8512-90e3...
Tech Stack: TypeScript / JavaScript / Node.js, Actionhero, React + Next.js, Postgres & Redis, and a whole lot of third-party APIs!
-
Bundling and Distributing Next.js Sites via NPM
The final thing we learned is that not everything in the .next directory is needed by your visitors. We saw that we were shipping 300 MB packages to NPM for our Next.js UIs. We dug into the .next folder and learned that if you opt into Webpack v5 for your Next.js site, large .next/cache/*.pack files are created to speed up Webpack builds. This is normal behavior, but we were inadvertently publishing these large files to NPM! We added .next/cache/* to our .npmignore and our package sizes went down to a more reasonable 20 MB.
-
Using Typescript to create a Robust API between your frontend and backend
The Grouparoo application is stored in a monorepo, which means that the frontend and backend code always exist side-by-side. This means that we can reference the API code from our frontend code, and make a helper to check our response types. We don't need our API code at run-time, but we can import the types from it as we develop and compile the app to JavaScript.
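A minimal sketch of the pattern described above, not Grouparoo's actual code. The names `ProfileView`, `ActionResponse`, and `apiRequest` and the response shape are all hypothetical. The frontend derives its response type from the backend action's `run` method, so the two cannot silently drift apart. In a real monorepo the action would be pulled in with `import type`, which is erased at compile time; here both sides live in one file, and a JSON round trip stands in for the HTTP call.

```typescript
// Backend side: a hypothetical API action standing in for real API code.
class ProfileView {
  async run(params: { id: string }) {
    return { profile: { id: params.id, email: "person@example.com" } };
  }
}

// Anything with an async `run` method counts as an action in this sketch.
type Action = { run: (params: any) => Promise<unknown> };

// Helper type: the resolved return type of an action's `run` method.
type ActionResponse<A extends Action> = Awaited<ReturnType<A["run"]>>;

// Frontend side: a request helper whose params and result are typed from the action.
// The JSON round trip simulates sending the response over the wire.
async function apiRequest<A extends Action>(
  action: A,
  params: Parameters<A["run"]>[0]
): Promise<ActionResponse<A>> {
  const raw = await action.run(params);
  return JSON.parse(JSON.stringify(raw)) as ActionResponse<A>;
}

async function demoTypedRequest(): Promise<string> {
  const response = await apiRequest(new ProfileView(), { id: "abc" });
  // `response.profile.email` is typed as string;
  // `response.profile.name` would fail to compile.
  return response.profile.email;
}
```

The cast in `apiRequest` is only a compile-time check: nothing validates the JSON at run-time, so this guards against drift between frontend and backend code, not against malformed responses.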
-
Deferring Side-Effects in Node.js until the End of a Transaction
Looking deeper into how cls-hooked works, we can see that it is possible to tell if you are currently in a namespace, and to set and get values from the namespace. Think of this like a session... but for the callback or promise your code is within! With this in mind, we can write our run method to be transaction-aware. This means that we can use a pattern that knows to run a function in-line if we aren’t within a transaction, but if we are, defer it until the end. We’ve wrapped utilities to do this within Grouparoo’s CLS module.
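The transaction-aware pattern above can be sketched as follows. This is an illustration under stated assumptions, not Grouparoo's CLS module: it swaps cls-hooked for Node's built-in `AsyncLocalStorage`, which offers a similar per-call-chain store, and the names `afterCommit` and `transaction` are hypothetical. A real `transaction` wrapper would also open and commit a database transaction.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

type Deferred = () => Promise<void> | void;

// Each transaction gets its own queue of deferred side-effects.
const storage = new AsyncLocalStorage<Deferred[]>();

// Run `fn` in-line when outside a transaction; otherwise queue it
// until the surrounding transaction has finished.
async function afterCommit(fn: Deferred): Promise<void> {
  const queue = storage.getStore();
  if (queue === undefined) {
    await fn();
  } else {
    queue.push(fn);
  }
}

// Hypothetical transaction wrapper (no real database involved).
async function transaction<T>(work: () => Promise<T>): Promise<T> {
  const queue: Deferred[] = [];
  const result = await storage.run(queue, work);
  // Only reached if `work` did not throw, so a failed
  // transaction drops its queued side-effects.
  for (const fn of queue) {
    await fn();
  }
  return result;
}

async function demoDeferred(): Promise<string[]> {
  const log: string[] = [];
  await afterCommit(() => { log.push("outside"); }); // runs immediately
  await transaction(async () => {
    await afterCommit(() => { log.push("deferred"); }); // queued
    log.push("work done");
  });
  return log; // ["outside", "work done", "deferred"]
}
```

This sketch does not handle nested transactions: an inner `transaction` call would create its own queue and flush it when the inner block ends, not when the outer one does.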
What are some alternatives?
Airflow - Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
dagster - An orchestration platform for the development, production, and observation of data assets.
Prefect - The easiest way to build, run, and monitor data pipelines at scale.
meltano
jitsu - Jitsu is an open-source Segment alternative. Fully-scriptable data ingestion engine for modern data teams. Set-up a real-time data pipeline in minutes, not days
spark-rapids - Spark RAPIDS plugin - accelerate Apache Spark with GPUs
dbt-core - dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
supabase - The open source Firebase alternative.
dbt - dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications. [Moved to: https://github.com/dbt-labs/dbt-core]
n8n-docs - Documentation for n8n, a fair-code licensed automation tool with a free community edition and powerful enterprise options. Build AI functionality into your workflows.
superset - Apache Superset is a Data Visualization and Data Exploration Platform
incubator-seatunnel - SeaTunnel is a distributed, high-performance data integration platform for the synchronization and transformation of massive data (offline & real-time). [Moved to: https://github.com/apache/seatunnel]