grouparoo
DISCONTINUED
nifi
Metric | grouparoo | nifi
---|---|---
Mentions | 27 | 35
Stars | 607 | 4,316
Growth | - | 3.2%
Activity | 9.9 | 9.9
Last commit | almost 2 years ago | 5 days ago
Language | JavaScript | Java
License | MIT License | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
grouparoo
-
Reference Data Stack for Data-Driven Startups
There are other tools that we will have to adopt in the future but haven't yet due to lack of necessity. Specifically, one category that is popular in modern data stacks is Reverse ETL (Hightouch, Census, or Grouparoo). We currently don't have a use case for piping data back into 3rd party tools but it will definitely come up in the future.
-
Data pipeline suggestions
Reverse ETL: Grouparoo, Castled
-
Where can I find free data engineering ( big data) projects online?
Ingestion / ETL: Airbyte, Singer, Jitsu
Transformation: dbt
Orchestration: Airflow, Dagster
Testing: GreatExpectations
Observability: Monosi
Reverse ETL: Grouparoo, Castled
Visualization: Lightdash, Superset
-
Ask HN: Who is hiring? (December 2021)
Grouparoo | Remote (US) | Remote-OK | https://www.grouparoo.com
Grouparoo is a venture-backed software company building open source data tools that make data reliable, accessible, and actionable. We're empowering teams to make great customer experiences, driven by data. While engineering teams have gotten good at storing and generating data about their customers, it's rare that this data is used to its full potential in external applications. Grouparoo makes these integrations easy by providing a framework for defining your customer data and reliably syncing it to external tools.
To learn more about who we are, our engineering culture, and whether this is the right place for you, read our Key Values profile: https://www.keyvalues.com/grouparoo
Here are our open roles:
- Senior Backend / Lead Engineer: https://jobs.lever.co/grouparoo/6ba485d1-a5a4-41f0-9fa5-920a...
- Developer Advocate: https://jobs.lever.co/grouparoo/5e1531b4-7ec8-4c10-8e52-fc23...
Tech Stack: TypeScript / Javascript / Node.js, ActionHero, React + Next.js, Postgres & Redis, and a whole lot of third-party APIs!
-
Launch HN: Hightouch (YC S19) - Sync data from data warehouses to SaaS tools
Congrats on the launch! Hightouch looks great and this need is real. Things seem to be going well, so I don't think I'm taking too much away by mentioning that we have been working on Grouparoo, an open source alternative that solves similar pain points.
A few differences: git developer workflow focused (branches, CI, PRs, etc), ability to self host, segmentation in destinations (tagging people in mailchimp based on rules, for example)
-
Ask HN: Who is hiring? (August 2021)
Grouparoo | Remote (US) | Remote-OK | https://www.grouparoo.com
Grouparoo is a venture-backed software company building the open-source reverse-ETL framework that makes it easy to have meaningful, data-driven conversations with customers. Do you want to keep product data in-sync with tools like Hubspot, Marketo or Zendesk? Do you want to be able to build, test, and deploy data sync code just like the rest of your tech stack? That's the kind of thing Grouparoo does.
We started Grouparoo because we are done saying "no" to marketing teams asking for data and want to make it easy (and safe!) for everyone to use the data available at work. We are looking for a seasoned back-end engineer to join our US-based, fully remote team. The main components of our stack are Typescript/Javascript, Actionhero, Next.js, and React. Learn more about the position @ https://www.grouparoo.com/jobs and https://www.keyvalues.com/grouparoo. Check out our open-source framework (and see what you will be working on) @ https://github.com/grouparoo/grouparoo
-
Ask HN: Who is hiring? (July 2021)
Grouparoo | Remote (US) | Remote-OK | https://www.grouparoo.com
Grouparoo is a venture-backed software company building the open-source reverse-ETL framework that makes it easy to have meaningful, data-driven conversations with customers. Do you want to keep product data in-sync with tools like Hubspot, Marketo or Zendesk? Do you want to be able to build, test, and deploy data sync code just like the rest of your stack? That's the kind of thing Grouparoo does.
We started Grouparoo because we are done saying "no" to marketing teams asking for data and want to make it easy (and safe!) for everyone to use the data available at work. We are looking for 2 seasoned engineers to join our US-based, fully remote team. The main components of our stack are Typescript/Javascript, Actionhero, Next.js, and React. Learn more about the positions @ https://www.grouparoo.com/jobs and https://www.keyvalues.com/grouparoo. Check out our open-source framework (and see what you will be working on) @ https://github.com/grouparoo/grouparoo
Here are our open roles:
* Senior Backend / Founding Engineer: https://jobs.lever.co/grouparoo/6ba485d1-a5a4-41f0-9fa5-920a...
* Senior Full Stack / Lead Engineer: https://jobs.lever.co/grouparoo/946e3407-6101-45f1-84a8-135d...
* Founding Community Manager / Developer Advocate: https://jobs.lever.co/grouparoo/19ef1a6b-6ad9-49f6-8512-90e3...
Tech Stack: TypeScript / Javascript / Node.js, ActionHero, React + Next.js, Postgres & Redis, and a whole lot of third-party APIs!
-
Bundling and Distributing Next.js Sites via NPM
The final thing we learned is that while the contents of the .next directory are needed for your visitors, not everything is needed. We saw that we were shipping 300mb packages to NPM for our Next.js UIs. We dug into the .next folder and learned that if you opt-into Webpack v5 for your Next.js site, large .next/cache/*.pack files will be created to speed up how Webpack works. This is normal behavior, but we were inadvertently publishing these large files to NPM! We added the .next/cache/* directory to our .npmignore and our build sizes went down to a more reasonable 20mb.
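As a rough illustration of the fix described in this excerpt, the ignore file might contain a single pattern. The pattern itself comes from the post; the assumption here is that .npmignore sits at the package root, next to package.json.

```
# .npmignore (assumed to live at the package root)
# Keep Webpack 5's build cache out of the published package
.next/cache/*
```

Running `npm pack --dry-run` lists the files that would be published, which is one way to confirm the cache files are no longer included.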
-
Using Typescript to create a Robust API between your frontend and backend
The Grouparoo Application is stored in a monorepo, which means that the frontend and backend code always exist side-by-side. This means that we can reference the API code from our Frontend code, and make a helper to check our response types. We don't need our API code at run-time, but we can import the types from it as we develop and compile the app to Javascript.
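A minimal sketch of the idea in this excerpt, assuming a backend action exposes an async `run` method. The names `ProfileView`, `ActionResponse`, and `describeProfile` are hypothetical and are not taken from Grouparoo's code; the point is only that the frontend can derive its response type from the backend code instead of redeclaring it.

```typescript
// Backend side: a hypothetical API action. The class name and fields are
// illustrative assumptions, not Grouparoo's real code.
class ProfileView {
  async run(params: { id: string }) {
    return {
      profile: { id: params.id, email: `${params.id}@example.com` },
      total: 1,
    };
  }
}

// Helper types: unwrap a Promise, then take the resolved return type of run().
type Unwrap<T> = T extends Promise<infer U> ? U : T;
type ActionResponse<A extends { run: (...args: any[]) => unknown }> = Unwrap<
  ReturnType<A["run"]>
>;

// Frontend side: only the *type* of the action is used here, so in a monorepo
// this could be an `import type` that is erased when compiling to JavaScript.
function describeProfile(response: ActionResponse<ProfileView>): string {
  return `${response.profile.id} <${response.profile.email}> (${response.total} total)`;
}
```

With this shape, renaming a field in the backend's `run` result should cause the frontend function to fail type-checking, so the mismatch surfaces at compile time instead of at run time.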
-
Deferring Side-Effects in Node.js until the End of a Transaction
Looking deeper into how cls-hooked works, we can see that it is possible to tell if you are currently in a namespace, and to set and get values from the namespace. Think of this like a session... but for the callback or promise your code is within! With this in mind, we can write our run method to be transaction-aware. This means that we can use a pattern that knows to run a function in-line if we aren't within a transaction, but if we are, defer it until the end. We've wrapped utilities to do this within Grouparoo's CLS module.
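The pattern in this excerpt can be sketched roughly as follows. This is not Grouparoo's CLS module: it swaps cls-hooked for Node's built-in `AsyncLocalStorage`, which provides a similar per-call-chain store, and the names `wrap` and `afterCommit` are hypothetical. No real database transaction is involved, only the deferral logic.

```typescript
import { AsyncLocalStorage } from "async_hooks";

type Deferred = () => Promise<void> | void;

// One store per "transaction"; it follows the async call chain started by wrap().
const store = new AsyncLocalStorage<{ deferred: Deferred[] }>();

// Run `work` inside a transaction-like context, then run the deferred
// side-effects. If `work` throws, the deferred functions never run.
async function wrap<T>(work: () => Promise<T>): Promise<T> {
  const ctx = { deferred: [] as Deferred[] };
  const result = await store.run(ctx, work);
  // A real implementation would commit the database transaction here.
  for (const fn of ctx.deferred) await fn();
  return result;
}

// Run `fn` in-line when outside a context; queue it until the end when inside one.
async function afterCommit(fn: Deferred): Promise<void> {
  const ctx = store.getStore();
  if (ctx) ctx.deferred.push(fn);
  else await fn();
}
```

Callers use `afterCommit` the same way in both situations, for example to enqueue a background job, and the helper decides whether to run the function now or at the end of the surrounding `wrap` call.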
nifi
- FLaNK Stack Weekly 19 Feb 2024
- Ask HN: What are some unpopular technologies you wish people knew more about?
- FLaNK Stack Weekly for 13 November 2023
- Tool decision - What architecture would you choose and why?
-
Is there something like airflow but written in Scala/Java?
Apache Camel, Apache NiFi, Spring Cloud
-
Your opinion on Kong
This suggestion isn't a standard one, but when a coworker and I were looking for API gateways with a very specific feature set, we couldn't find a single one to do what we needed. We did, however, come across Apache NiFi. It's a flow-based programming tool that allowed us to translate an http-based request to streaming text sent via netcat.
-
S3 to S3 transform
For a simple sequential pipeline, my go-to would be Apache Camel. As soon as you want complexity, it's either Apache NiFi or a microservice architecture.
-
Analysing Github Stars - Extracting and analyzing data from Github using Apache NiFi®, Apache Kafka® and Apache Druid®
Spencer Kimball (now CEO at CockroachDB) wrote an interesting article on this topic in 2021 where they created spencerkimball/stargazers based on a Python script. So I started thinking: could I create a data pipeline using Nifi and Kafka (two OSS tools often used with Druid) to get the API data into Druid - and then use SQL to do the analytics? The answer was yes! And I have documented the outcome below. Here's my analytical pipeline for Github stars data using Nifi, Kafka and Druid.
-
Is there any automation solution that isn't "only" CI/CD except Jenkins?
For dataflow pipelines I'm really a fan of apache nifi https://nifi.apache.org/
-
Read database each 5 sec and dispatch event
Pretty sure you can do this with a NiFi connector. https://nifi.apache.org
What are some alternatives?
Logstash - Logstash - transport and process your logs, events, or other data
superset - Apache Superset is a Data Visualization and Data Exploration Platform
meltano - Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
rotki - A portfolio tracking, analytics, accounting and management application that protects your privacy
Pandas - Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
airbyte - The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
TileDB - The Universal Storage Engine
Metabase - The simplest, fastest way to get business intelligence and analytics to everyone in your company :yum:
streamlit - Streamlit - A faster way to build and share data apps.
PostHog - PostHog provides open-source product analytics, session recording, feature flagging and A/B testing that you can self-host.